[PATCH] D158463: [AMDGPU] Add IR-level pass to rewrite away address space 7
Matt Arsenault via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 13 04:45:08 PDT 2023
arsenm added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:1
+//===-- AMDGPULowerBufferFatPointers.cpp ------------------*- C++ -*-=//
+//
----------------
Not padded out to same column as the later header lines
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:13
+//
+// This pass returns, doing nothing, on r600.
+//
----------------
I wouldn't bother mentioning this and just not add the pass for r600
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:43
+// `ptr addrspace(8) %x.rsrc` and `ptr addrspace(6) %x.off`, which will be
+// combined into `{ptr addrspace(8), ptr addrspace(6)} %x = {%x.rsrc, %x.off}`
+// if needed. Similarly, `vector<Nxp7>` becomes `{vector<Nxp8>, vector<Nxp6>}`
----------------
addrspace(6) is not just 32-bit, it's 32-bit constant. We shouldn't be assuming these are only const accesses. Should the offset just not be a pointer at all?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:53
+//
+// The first phase is to rewrite away all loads and stors of `ptr addrspace(7)`
+// containing values to corresponding `i160` values. This is handled by
----------------
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:54
+// The first phase is to rewrite away all loads and stors of `ptr addrspace(7)`
+// containing values to corresponding `i160` values. This is handled by
+// `StoreFatPtrsAsIntsVisitor` , which visits loads, stores, and allocas
----------------
"handled by containing values"? Not sure what this means
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:63
+// Such a transformation allows the later phases of the pass to not need
+// to handle buffer fat pointers moving to and from memory, where we loud
+// have to handle the incompatibility between a `{Nxp8, Nxp6}` representation
----------------
where we 'loud'?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:67
+// of resources and vectors of offsets are concatentated before being stored to
+// memory) are handled through ilpmententing `inttoptr` and `ptrtoint` only.
+//
----------------
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:69-70
+//
+// Atomics involving `ptr addrspace(7)` are not supported, as such values are
+// too large to be atomically processed in any case.
+//
----------------
atomic load and store should work fine
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:79
+// This uses a `BufferFatPtrToStructTypeMap` and a `FatPtrConstMaterializer`
+// to, usually by way of `setType`ing values. Constans are handled here
+// because there isn't a good way to fix them up later.
----------------
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:107
+// Split users do not necessarily need to produce parts themselves (
+// a `load f32, ptr addrspace(7)` does not, for example), but, if they do not
+// generate fat buffer pointers, they must RAUW in their replacement
----------------
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:165
+// fragments of the underlying source language variable.
+// 2. All uses that were themselves split are replaced by an `undef` of the
+// struct type, as they will themselves be erased soon. This rule, combined
----------------
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:240-241
+protected:
+ virtual Type *remapScalar(PointerType *PT) = 0;
+ virtual Type *remapVector(VectorType *VT) = 0;
+
----------------
What's the point in having these be separate?
Also not sure why these are virtual
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:259
+ Type *remapScalar(PointerType *PT) override {
+ return Type::getIntNTy(PT->getContext(), DL.getPointerTypeSizeInBits(PT));
+ }
----------------
This is just getIntPtrType
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:263
+ return VT->getWithNewType(
+ Type::getIntNTy(VT->getContext(), DL.getPointerTypeSizeInBits(VT)));
+ }
----------------
This is just getIntPtrType
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:283
+ Type *Ty, SmallPtrSetImpl<StructType *> &Seen) {
+ Type **Entry = &Map[Ty];
+ if (*Entry)
----------------
Type *&
Could you also move all the core logic to a separate function, such that every return doesn't need to consider the assignment to the map?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:298
+ }
+ bool IsUniqued = !isa<StructType>(Ty) || cast<StructType>(Ty)->isLiteral();
+ // Base case for ints, floats, opaque pointers, and so on, which don't
----------------
dyn_cast? Also could make a predicate function and clarify what uniqued means
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:348
+ llvm_unreachable("Unknown type of type that contains elements");
+ return nullptr;
+}
----------------
remove dead return
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:360
+ PointerType::get(Ctx, AMDGPUAS::BUFFER_RESOURCE),
+ PointerType::get(Ctx, AMDGPUAS::CONSTANT_ADDRESS_32BIT));
+}
----------------
Don't really think CONSTANT_ADDRESS_32BIT is appropriate for this
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:412
+/// down into resource and offset parts). This has the downside of imposing
+/// marshaling costs when reading or storing these values, but since placing
+/// such pointers into memory is an uncommon operation at best, we feel that
----------------
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:587
+ if (auto *AZ = dyn_cast<ConstantAggregateZero>(C))
+ return std::make_pair(AZ->getStructElement(0), AZ->getStructElement(1));
+ if (auto *SC = dyn_cast<ConstantStruct>(C))
----------------
can use {}?
================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerBufferFatPointers.cpp:1831
+ F->replaceAllUsesWith(NewF);
+ F->setName("");
+ M.getFunctionList().insert(F->getIterator(), NewF);
----------------
Could also create an anonymous function and use takeName. Are you getting an added suffix now?
================
Comment at: llvm/lib/Target/AMDGPU/SIISelLowering.cpp:14604-14605
- if (AMDGPU::isFlatGlobalAddrSpace(AS) &&
+ if ((AMDGPU::isFlatGlobalAddrSpace(AS) ||
+ AS == AMDGPUAS::BUFFER_FAT_POINTER) &&
Subtarget->hasAtomicFaddNoRtnInsts()) {
----------------
Probably should have an additional predicate function for this. It's separate but not really from global
================
Comment at: llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-p7-in-memory.ll:34
+; CHECK-SAME: (ptr [[A:%.*]], ptr [[B:%.*]]) #[[ATTR0]] {
+; CHECK-NEXT: [[X:%.*]] = load <4 x i160>, ptr [[A]], align 128
+; CHECK-NEXT: [[TMP1:%.*]] = lshr <4 x i160> [[X]], <i160 32, i160 32, i160 32, i160 32>
----------------
Is this alignment correct? I thought p7 was 256 bit / align 32
================
Comment at: llvm/test/CodeGen/AMDGPU/lower-buffer-fat-pointers-p7-in-memory.ll:161
+ ret void
+}
----------------
Also load of array and load of unpacked struct?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158463/new/
https://reviews.llvm.org/D158463
More information about the llvm-commits
mailing list