[llvm] [AMDGPU] Add IR-level pass to rewrite away address space 7 (PR #77952)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Thu Feb 29 04:14:58 PST 2024
================
@@ -0,0 +1,1983 @@
+//===-- AMDGPULowerBufferFatPointers.cpp ---------------------------=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This pass lowers operations on buffer fat pointers (addrspace 7) to
+// operations on buffer resources (addrspace 8) and is needed for correct
+// codegen.
+//
+// # Background
+//
+// Address space 7 (the buffer fat pointer) is a 160-bit pointer that consists
+// of a 128-bit buffer descriptor and a 32-bit offset into that descriptor.
+// The buffer resource part needs to be a "raw" buffer resource (it must have
+// a stride of 0 and bounds checks must be in raw buffer mode or disabled).
+//
+// When these requirements are met, a buffer resource can be treated as a
+// typical (though quite wide) pointer that follows typical LLVM pointer
+// semantics. This allows the frontend to reason about such buffers (which are
+// often encountered in the context of SPIR-V kernels).
+//
+// However, because of their non-power-of-2 size, these fat pointers cannot be
+// present during translation to MIR (though this restriction may be lifted
+// during the transition to GlobalISel). Therefore, this pass is needed in order
+// to correctly implement these fat pointers.
+//
+// The resource intrinsics take the resource part (the address space 8 pointer)
+// and the offset part (the 32-bit integer) as separate arguments. In addition,
+// many users of these buffers manipulate the offset while leaving the resource
+// part alone. For these reasons, we typically want to keep the resource and
+// offset parts in separate variables, combining them only when required, such
+// as when inserting these values into aggregates or moving them to memory.
+//
+// Therefore, at a high level, `ptr addrspace(7) %x` becomes `ptr addrspace(8)
+// %x.rsrc` and `i32 %x.off`, which will be combined into `{ptr addrspace(8),
+// i32} %x = {%x.rsrc, %x.off}` if needed. Similarly, `vector<Nxp7>` becomes
+// `{vector<Nxp8>, vector<Nxi32>}` and its component parts.
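+//
+// As a rough sketch (the exact instructions and intrinsic arguments the pass
+// emits may differ), a load through a fat pointer such as
+// ```
+// %p = getelementptr i32, ptr addrspace(7) %x, i32 4
+// %v = load i32, ptr addrspace(7) %p
+// ```
+// ends up, after lowering, looking something like
+// ```
+// %p.off = add i32 %x.off, 16
+// %v = call i32 @llvm.amdgcn.raw.ptr.buffer.load(ptr addrspace(8) %x.rsrc,
+//     i32 %p.off, ...)
+// ```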
+//
+// # Implementation
+//
+// This pass proceeds in three main phases:
+//
+// ## Rewriting loads and stores of p7
+//
+// The first phase is to rewrite away all loads and stores of `ptr addrspace(7)`,
+// including aggregates containing such pointers, to ones that use `i160`. This
+// is handled by `StoreFatPtrsAsIntsVisitor`, which visits loads, stores, and
+// allocas and, if the loaded or stored type contains `ptr addrspace(7)`,
+// rewrites that type to one where the p7s are replaced by i160s, copying other
+// parts of aggregates as needed. In the case of a store, each pointer is
+// `ptrtoint`d to i160 before storing, and loaded integers are `inttoptr`d back.
+// This same transformation is applied to vectors of pointers.
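+//
+// For instance (a sketch; value names are illustrative), storing a fat pointer
+// to memory becomes
+// ```
+// %x.int = ptrtoint ptr addrspace(7) %x to i160
+// store i160 %x.int, ptr %mem
+// ```
+// and the corresponding load is rewritten as a load of i160 followed by an
+// `inttoptr` back to `ptr addrspace(7)`.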
+//
+// Such a transformation allows the later phases of the pass to not need
+// to handle buffer fat pointers moving to and from memory, where we would
+// have to handle the incompatibility between a `{Nxp8, Nxi32}` representation
+// and `Nxi160` directly. Instead, that transposing action (where the vectors
+// of resources and vectors of offsets are concatenated before being stored to
+// memory) is handled through implementing `inttoptr` and `ptrtoint` only.
+//
+// Atomic operations on `ptr addrspace(7)` values are not supported, as the
+// hardware does not include a 160-bit atomic.
+//
+// ## Type remapping
+//
+// We use a `ValueMapper` to mangle uses of [vectors of] buffer fat pointers
+// to the corresponding struct type, which has a resource part and an offset
+// part.
+//
+// This uses a `BufferFatPtrToStructTypeMap` and a `FatPtrConstMaterializer`
+// to perform the rewriting, usually by way of `setType`ing values. Constants
+// are handled here because there isn't a good way to fix them up later.
+//
+// This has the downside of leaving the IR in an invalid state (for example,
+// the instruction `getelementptr {ptr addrspace(8), i32} %p, ...` will exist),
+// but all such invalid states will be resolved by the third phase.
+//
+// Functions that don't take buffer fat pointers are modified in place. Those
+// that do take such pointers have their basic blocks moved to a new function
+// whose arguments and return values use {ptr addrspace(8), i32} instead.
+// This phase also records intrinsics so that they can be remangled or deleted
+// later.
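+//
+// For example (a rough sketch of the signature change only; attributes and
+// other details are omitted), a function such as
+// ```
+// define float @f(ptr addrspace(7) %p)
+// ```
+// is rewritten so that its argument has the remapped struct type:
+// ```
+// define float @f({ptr addrspace(8), i32} %p)
+// ```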
+//
+//
+// ## Splitting pointer structs
+//
+// The meat of this pass consists of defining semantics for operations that
+// produce or consume [vectors of] buffer fat pointers in terms of their
+// resource and offset parts. This is accomplished through the `SplitPtrStructs`
+// visitor.
+//
+// In the first pass through each function that is being lowered, the splitter
+// inserts new instructions to implement the split-structures behavior, which is
+// needed for correctness and performance. It records a list of "split users",
+// instructions that are being replaced by operations on the resource and offset
+// parts.
+//
+// Split users do not necessarily need to produce parts themselves
+// (a `load float, ptr addrspace(7)` does not, for example), but, if they do not
+// generate fat buffer pointers, they must RAUW in their replacement
+// instructions during the initial visit.
+//
+// When these new instructions are created, they use the split parts recorded
+// for their initial arguments in order to generate their replacements, creating
+// a parallel set of instructions that does not refer to the original fat
+// pointer values but instead to their resource and offset components.
+//
+// Instructions, such as `extractvalue`, that produce buffer fat pointers from
+// sources that do not have split parts, have such parts generated using
+// `extractvalue`. This is also the initial handling of PHI nodes, which
+// are then cleaned up.
+//
+// ### Conditionals
+//
+// PHI nodes are initially given resource parts via `extractvalue`. However,
+// this is not an efficient rewrite of such nodes, as, in most cases, the
+// resource part in a conditional or loop remains constant throughout the loop
+// and only the offset varies. Failing to optimize away these constant resources
+// would cause additional registers to be sent around loops and might lead to
+// waterfall loops being generated for buffer operations due to the
+// "non-uniform" resource argument.
+//
+// Therefore, after all instructions have been visited, the pointer splitter
+// post-processes all encountered conditionals. Given a PHI node or select,
+// getPossibleRsrcRoots() collects all values that the resource parts of that
+// conditional's inputs could come from, as well as all conditional
+// instructions encountered during the search. If, after filtering out the
+// initial node itself, the set of encountered conditionals is a subset of the
+// potential roots and there is a single potential resource that isn't in the
+// conditional set, that value is the only possible value the resource argument
+// could have throughout the control flow.
+//
+// If that condition is met, then a PHI node can have its resource part changed
+// to the singleton value and then be replaced by a PHI on the offsets.
+// Otherwise, each PHI node is split into two, one for the resource part and one
+// for the offset part, which replace the temporary `extractvalue` instructions
+// that were added during the first pass.
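+//
+// As a sketch (using the same pseudo-notation as the `select` example below),
+// a loop PHI whose resource part has a single possible root
+// ```
+// %p = phi ptr addrspace(7) [ %base, %entry ], [ %p.next, %loop ]
+// ```
+// becomes
+// ```
+// %p.rsrc = %base.rsrc
+// %p.off = phi i32 [ %base.off, %entry ], [ %p.next.off, %loop ]
+// ```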
+//
+// Similar logic applies to `select`, where
+// `%z = select i1 %cond, ptr addrspace(7) %x, ptr addrspace(7) %y`
+// can be split into `%z.rsrc = %x.rsrc` and
+// `%z.off = select i1 %cond, i32 %x.off, i32 %y.off`
+// if both `%x` and `%y` have the same resource part, but two `select`
+// operations will be needed if they do not.
+//
+// ### Final processing
+//
+// After conditionals have been cleaned up, the IR for each function is
+// rewritten to remove all the old instructions that have been split up.
+//
+// Any instruction that used to produce a buffer fat pointer (and therefore now
+// produces a resource-and-offset struct after type remapping) is
+// replaced as follows:
+// 1. All debug value annotations are cloned to reflect that the resource part
+// and offset parts are computed separately and constitute different
+// fragments of the underlying source language variable.
+// 2. All uses that were themselves split are replaced by a `poison` of the
+// struct type, as they will themselves be erased soon. This rule, combined
+// with debug handling, should leave the use lists of split instructions
+// empty in almost all cases.
+// 3. If a user of the original struct-valued result remains, the structure
+// needed for the new types to work is constructed out of the newly-defined
+// parts, and the original instruction is replaced by this structure
+// before being erased. Instructions requiring this construction include
+// `ret` and `insertvalue`.
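+//
+// As a sketch of rule 3 (value names are illustrative), a `ret` of a former
+// fat pointer might be rebuilt as
+// ```
+// %ret.0 = insertvalue {ptr addrspace(8), i32} poison, ptr addrspace(8) %p.rsrc, 0
+// %ret = insertvalue {ptr addrspace(8), i32} %ret.0, i32 %p.off, 1
+// ret {ptr addrspace(8), i32} %ret
+// ```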
+//
+// # Consequences
+//
+// This pass does not alter the CFG.
+//
+// Alias analysis information will become coarser, as the LLVM alias analyzer
+// cannot handle the buffer intrinsics. Specifically, while we can determine
+// that the following two loads do not alias:
+// ```
+// %y = getelementptr i32, ptr addrspace(7) %x, i32 1
+// %a = load i32, ptr addrspace(7) %x
+// %b = load i32, ptr addrspace(7) %y
+// ```
+// we cannot (except through some code that runs during scheduling) determine
+// that the rewritten loads below do not alias.
+// ```
+// %y.off = add i32 %x.off, 1
+// %a = call @llvm.amdgcn.raw.ptr.buffer.load(ptr addrspace(8) %x.rsrc,
+//     i32 %x.off, ...)
+// %b = call @llvm.amdgcn.raw.ptr.buffer.load(ptr addrspace(8) %x.rsrc,
+//     i32 %y.off, ...)
+// ```
+// However, existing alias information is preserved.
+//===----------------------------------------------------------------------===//
+
+#include "AMDGPU.h"
+#include "AMDGPUTargetMachine.h"
+#include "GCNSubtarget.h"
+#include "SIDefines.h"
+#include "llvm/ADT/SetOperations.h"
+#include "llvm/ADT/SmallVector.h"
+#include "llvm/Analysis/ConstantFolding.h"
+#include "llvm/CodeGen/TargetPassConfig.h"
+#include "llvm/IR/AttributeMask.h"
+#include "llvm/IR/Constants.h"
+#include "llvm/IR/DebugInfo.h"
+#include "llvm/IR/DerivedTypes.h"
+#include "llvm/IR/IRBuilder.h"
+#include "llvm/IR/InstIterator.h"
+#include "llvm/IR/InstVisitor.h"
+#include "llvm/IR/Instructions.h"
+#include "llvm/IR/Intrinsics.h"
+#include "llvm/IR/IntrinsicsAMDGPU.h"
+#include "llvm/IR/Metadata.h"
+#include "llvm/IR/Operator.h"
+#include "llvm/InitializePasses.h"
+#include "llvm/Pass.h"
+#include "llvm/Support/AtomicOrdering.h"
+#include "llvm/Support/Debug.h"
+#include "llvm/Support/ErrorHandling.h"
+#include "llvm/Transforms/Utils/Cloning.h"
+#include "llvm/Transforms/Utils/Local.h"
+#include "llvm/Transforms/Utils/ValueMapper.h"
+
+#define DEBUG_TYPE "amdgpu-lower-buffer-fat-pointers"
+
+using namespace llvm;
+
+static constexpr unsigned BufferOffsetWidth = 32;
+
+namespace {
+/// Recursively replace instances of ptr addrspace(7) and vector<Nxptr
+/// addrspace(7)> with some other type as defined by the relevant subclass.
+class BufferFatPtrTypeLoweringBase : public ValueMapTypeRemapper {
+ DenseMap<Type *, Type *> Map;
+
+ Type *remapTypeImpl(Type *Ty, SmallPtrSetImpl<StructType *> &Seen);
+
+protected:
+ virtual Type *remapScalar(PointerType *PT) = 0;
+ virtual Type *remapVector(VectorType *VT) = 0;
+
+ const DataLayout &DL;
+
+public:
+ BufferFatPtrTypeLoweringBase(const DataLayout &DL) : DL(DL) {}
+ Type *remapType(Type *SrcTy) override;
+ void clear() { Map.clear(); }
+};
+
+/// Remap ptr addrspace(7) to i160 and vector<Nxptr addrspace(7)> to
+/// vector<Nxi160> in order to correctly handle loading/storing these values
+/// from memory.
+class BufferFatPtrToIntTypeMap : public BufferFatPtrTypeLoweringBase {
+ using BufferFatPtrTypeLoweringBase::BufferFatPtrTypeLoweringBase;
+
+protected:
+ Type *remapScalar(PointerType *PT) override { return DL.getIntPtrType(PT); }
+ Type *remapVector(VectorType *VT) override { return DL.getIntPtrType(VT); }
+};
+
+/// Remap ptr addrspace(7) to {ptr addrspace(8), i32} (the resource and offset
+/// parts of the pointer) so that we can easily rewrite operations on these
+/// values that aren't loading them from or storing them to memory.
+class BufferFatPtrToStructTypeMap : public BufferFatPtrTypeLoweringBase {
+ using BufferFatPtrTypeLoweringBase::BufferFatPtrTypeLoweringBase;
+
+protected:
+ Type *remapScalar(PointerType *PT) override;
+ Type *remapVector(VectorType *VT) override;
+};
+} // namespace
+
+// This code is adapted from the type remapper in lib/Linker/IRMover.cpp
+Type *BufferFatPtrTypeLoweringBase::remapTypeImpl(
+ Type *Ty, SmallPtrSetImpl<StructType *> &Seen) {
+ Type **Entry = &Map[Ty];
+ if (*Entry)
+ return *Entry;
+ if (auto *PT = dyn_cast<PointerType>(Ty)) {
+ if (PT->getAddressSpace() == AMDGPUAS::BUFFER_FAT_POINTER) {
+ return *Entry = remapScalar(PT);
+ }
+ }
+ if (auto *VT = dyn_cast<VectorType>(Ty)) {
+ auto *PT = dyn_cast<PointerType>(VT->getElementType());
+ if (PT && PT->getAddressSpace() == AMDGPUAS::BUFFER_FAT_POINTER) {
+ return *Entry = remapVector(VT);
+ }
+ return *Entry = Ty;
+ }
+// Whether the type is one that is structurally uniqued - that is, if it is
+// not a named struct (the only kind of type where multiple structurally
+// identical types can have distinct `Type*`s).
+ StructType *TyAsStruct = dyn_cast<StructType>(Ty);
+ bool IsUniqued = !TyAsStruct || TyAsStruct->isLiteral();
+ // Base case for ints, floats, opaque pointers, and so on, which don't
+ // require recursion.
+ if (Ty->getNumContainedTypes() == 0 && IsUniqued)
+ return *Entry = Ty;
+ if (!IsUniqued) {
+ // Create a dummy type for recursion purposes.
+ if (!Seen.insert(TyAsStruct).second) {
+ StructType *Placeholder = StructType::create(Ty->getContext());
+ return *Entry = Placeholder;
+ }
+ }
+ bool Changed = false;
+ SmallVector<Type *> ElementTypes;
+ ElementTypes.reserve(Ty->getNumContainedTypes());
+ for (unsigned int I = 0, E = Ty->getNumContainedTypes(); I < E; ++I) {
+ Type *OldElem = Ty->getContainedType(I);
+ Type *NewElem = remapTypeImpl(OldElem, Seen);
+ ElementTypes.push_back(NewElem);
+ Changed |= (OldElem != NewElem);
+ }
+ if (!Changed) {
+ return *Entry = Ty;
+ }
+ if (auto *ArrTy = dyn_cast<ArrayType>(Ty))
+ return *Entry = ArrayType::get(ElementTypes[0], ArrTy->getNumElements());
+ if (auto *FnTy = dyn_cast<FunctionType>(Ty))
+ return *Entry = FunctionType::get(ElementTypes[0],
+ ArrayRef(ElementTypes).slice(1),
+ FnTy->isVarArg());
+ if (auto *STy = dyn_cast<StructType>(Ty)) {
+ // Genuine opaque types don't have a remapping.
+ if (STy->isOpaque())
+ return *Entry = Ty;
+ bool IsPacked = STy->isPacked();
+ if (IsUniqued)
+ return *Entry = StructType::get(Ty->getContext(), ElementTypes, IsPacked);
+ SmallString<16> Name(STy->getName());
+ STy->setName("");
+ Type **RecursionEntry = &Map[Ty];
+ if (*RecursionEntry) {
+ auto *Placeholder = cast<StructType>(*RecursionEntry);
+ Placeholder->setBody(ElementTypes, IsPacked);
+ Placeholder->setName(Name);
+ return *Entry = Placeholder;
+ }
+ return *Entry = StructType::create(Ty->getContext(), ElementTypes, Name,
+ IsPacked);
+ }
+ llvm_unreachable("Unknown type of type that contains elements");
+}
+
+Type *BufferFatPtrTypeLoweringBase::remapType(Type *SrcTy) {
+ SmallPtrSet<StructType *, 2> Visited;
+ return remapTypeImpl(SrcTy, Visited);
+}
+
+Type *BufferFatPtrToStructTypeMap::remapScalar(PointerType *PT) {
+ LLVMContext &Ctx = PT->getContext();
+ return StructType::get(PointerType::get(Ctx, AMDGPUAS::BUFFER_RESOURCE),
+ IntegerType::get(Ctx, BufferOffsetWidth));
+}
+
+Type *BufferFatPtrToStructTypeMap::remapVector(VectorType *VT) {
+ ElementCount EC = VT->getElementCount();
+ LLVMContext &Ctx = VT->getContext();
+ Type *RsrcVec =
+ VectorType::get(PointerType::get(Ctx, AMDGPUAS::BUFFER_RESOURCE), EC);
+ Type *OffVec = VectorType::get(IntegerType::get(Ctx, BufferOffsetWidth), EC);
+ return StructType::get(RsrcVec, OffVec);
+}
+
+static bool isBufferFatPtrOrVector(Type *Ty) {
+ if (auto *PT = dyn_cast<PointerType>(Ty->getScalarType()))
+ return PT->getAddressSpace() == AMDGPUAS::BUFFER_FAT_POINTER;
+ return false;
+}
+
+// True if the type is {ptr addrspace(8), i32} or a struct containing vectors of
+// those types. Used to quickly skip instructions we don't need to process.
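+// For example, {ptr addrspace(8), i32} and {<4 x ptr addrspace(8)>, <4 x i32>}
+// are split fat pointer types, while {ptr addrspace(8), i64} is not.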
+static bool isSplitFatPtr(Type *Ty) {
+ auto *ST = dyn_cast<StructType>(Ty);
+ if (!ST)
+ return false;
+ if (!ST->isLiteral() || ST->getNumElements() != 2)
+ return false;
+ auto *MaybeRsrc =
+ dyn_cast<PointerType>(ST->getElementType(0)->getScalarType());
+ auto *MaybeOff =
+ dyn_cast<IntegerType>(ST->getElementType(1)->getScalarType());
+ return MaybeRsrc && MaybeOff &&
+ MaybeRsrc->getAddressSpace() == AMDGPUAS::BUFFER_RESOURCE &&
+ MaybeOff->getBitWidth() == BufferOffsetWidth;
+}
+
+// True if the type of the constant or the types of any of its operands are
+// buffer fat pointers.
+static bool isBufferFatPtrConst(Constant *C) {
+ Type *T = C->getType();
+ return isBufferFatPtrOrVector(T) ||
+ llvm::any_of(C->operands(), [](const Use &U) {
+ return isBufferFatPtrOrVector(U.get()->getType());
+ });
+}
+
+namespace {
+/// Convert [vectors of] buffer fat pointers to integers when they are read from
+/// or stored to memory. This ensures that these pointers will have the same
+/// memory layout as before they are lowered, even though they will no longer
+/// have their previous layout in registers/in the program (they'll be broken
+/// down into resource and offset parts). This has the downside of imposing
+/// marshalling costs when reading or storing these values, but since placing
+/// such pointers into memory is an uncommon operation at best, we feel that
+/// this cost is acceptable for better performance in the common case.
+class StoreFatPtrsAsIntsVisitor
+ : public InstVisitor<StoreFatPtrsAsIntsVisitor, bool> {
+ BufferFatPtrToIntTypeMap *TypeMap;
+
+ ValueToValueMapTy ConvertedForStore;
+
+ IRBuilder<> IRB;
+
+ // Convert all the buffer fat pointers within the input value to integers
+ // so that it can be stored in memory.
+ Value *fatPtrsToInts(Value *V, Type *From, Type *To, const Twine &Name);
+ // Convert all the i160s that need to be buffer fat pointers (as specified
+ // by the To type) into those pointers to preserve the semantics of the rest
+ // of the program.
+ Value *intsToFatPtrs(Value *V, Type *From, Type *To, const Twine &Name);
+
+public:
+ StoreFatPtrsAsIntsVisitor(BufferFatPtrToIntTypeMap *TypeMap, LLVMContext &Ctx)
+ : TypeMap(TypeMap), IRB(Ctx) {}
+ bool processFunction(Function &F);
+
+ bool visitInstruction(Instruction &I) { return false; }
+ bool visitAllocaInst(AllocaInst &I);
+ bool visitLoadInst(LoadInst &LI);
+ bool visitStoreInst(StoreInst &SI);
+ bool visitGetElementPtrInst(GetElementPtrInst &I);
+};
+} // namespace
+
+Value *StoreFatPtrsAsIntsVisitor::fatPtrsToInts(Value *V, Type *From, Type *To,
+ const Twine &Name) {
+ if (From == To)
+ return V;
+ ValueToValueMapTy::iterator Find = ConvertedForStore.find(V);
+ if (Find != ConvertedForStore.end())
+ return Find->second;
+ if (isBufferFatPtrOrVector(From)) {
+ Value *Cast = IRB.CreatePtrToInt(V, To, Name + ".int");
+ ConvertedForStore[V] = Cast;
+ return Cast;
+ }
+ if (From->getNumContainedTypes() == 0)
+ return V;
+ // Structs, arrays, and other compound types.
+ Value *Ret = PoisonValue::get(To);
+ if (auto *AT = dyn_cast<ArrayType>(From)) {
+ Type *FromPart = AT->getArrayElementType();
+ Type *ToPart = cast<ArrayType>(To)->getElementType();
+ for (uint64_t I = 0, E = AT->getArrayNumElements(); I < E; ++I) {
+ Value *Field = IRB.CreateExtractValue(V, I);
+ Value *NewField =
+ fatPtrsToInts(Field, FromPart, ToPart, Name + "." + Twine(I));
+ Ret = IRB.CreateInsertValue(Ret, NewField, I);
+ }
+ } else {
+ for (auto [Idx, FromPart, ToPart] :
+ enumerate(From->subtypes(), To->subtypes())) {
+ Value *Field = IRB.CreateExtractValue(V, Idx);
+ Value *NewField =
+ fatPtrsToInts(Field, FromPart, ToPart, Name + "." + Twine(Idx));
+ Ret = IRB.CreateInsertValue(Ret, NewField, Idx);
+ }
+ }
+ ConvertedForStore[V] = Ret;
+ return Ret;
+}
+
+Value *StoreFatPtrsAsIntsVisitor::intsToFatPtrs(Value *V, Type *From, Type *To,
+ const Twine &Name) {
+ if (From == To)
+ return V;
+ if (isBufferFatPtrOrVector(To)) {
+ Value *Cast = IRB.CreateIntToPtr(V, To, Name + ".ptr");
+ return Cast;
+ }
+ if (From->getNumContainedTypes() == 0)
+ return V;
+ // Structs, arrays, and other compound types.
+ Value *Ret = PoisonValue::get(To);
+ if (auto *AT = dyn_cast<ArrayType>(From)) {
+ Type *FromPart = AT->getArrayElementType();
+ Type *ToPart = cast<ArrayType>(To)->getElementType();
+ for (uint64_t I = 0, E = AT->getArrayNumElements(); I < E; ++I) {
+ Value *Field = IRB.CreateExtractValue(V, I);
+ Value *NewField =
+ intsToFatPtrs(Field, FromPart, ToPart, Name + "." + Twine(I));
+ Ret = IRB.CreateInsertValue(Ret, NewField, I);
+ }
+ } else {
+ for (auto [Idx, FromPart, ToPart] :
+ enumerate(From->subtypes(), To->subtypes())) {
+ Value *Field = IRB.CreateExtractValue(V, Idx);
+ Value *NewField =
+ intsToFatPtrs(Field, FromPart, ToPart, Name + "." + Twine(Idx));
+ Ret = IRB.CreateInsertValue(Ret, NewField, Idx);
+ }
+ }
+ return Ret;
+}
+
+bool StoreFatPtrsAsIntsVisitor::processFunction(Function &F) {
+ bool Changed = false;
+ // The visitors will mutate GEPs and allocas in place, and may replace and
+ // erase loads and stores, so iterate with make_early_inc_range to avoid
+ // iterator invalidation.
+ for (Instruction &I : make_early_inc_range(instructions(F))) {
+ Changed |= visit(I);
+ }
+ ConvertedForStore.clear();
+ return Changed;
+}
+
+bool StoreFatPtrsAsIntsVisitor::visitAllocaInst(AllocaInst &I) {
+ Type *Ty = I.getAllocatedType();
+ Type *NewTy = TypeMap->remapType(Ty);
+ if (Ty == NewTy)
+ return false;
+ I.setAllocatedType(NewTy);
+ return true;
+}
+
+bool StoreFatPtrsAsIntsVisitor::visitGetElementPtrInst(GetElementPtrInst &I) {
+ Type *Ty = I.getSourceElementType();
+ Type *NewTy = TypeMap->remapType(Ty);
+ if (Ty == NewTy)
+ return false;
+ // We'll be rewriting the type `ptr addrspace(7)` out of existence soon, so
+ // make sure GEPs don't have different semantics with the new type.
+ I.setSourceElementType(NewTy);
+ I.setResultElementType(TypeMap->remapType(I.getResultElementType()));
+ return true;
+}
+
+bool StoreFatPtrsAsIntsVisitor::visitLoadInst(LoadInst &LI) {
+ Type *Ty = LI.getType();
+ Type *IntTy = TypeMap->remapType(Ty);
+ if (Ty == IntTy)
+ return false;
+
+ IRB.SetInsertPoint(&LI);
+ auto *NLI = cast<LoadInst>(LI.clone());
+ NLI->mutateType(IntTy);
+ NLI = IRB.Insert(NLI);
+ copyMetadataForLoad(*NLI, LI);
+ NLI->takeName(&LI);
+
+ Value *CastBack = intsToFatPtrs(NLI, IntTy, Ty, NLI->getName());
+ LI.replaceAllUsesWith(CastBack);
+ LI.eraseFromParent();
+ return true;
+}
+
+bool StoreFatPtrsAsIntsVisitor::visitStoreInst(StoreInst &SI) {
+ Value *V = SI.getValueOperand();
+ Type *Ty = V->getType();
+ Type *IntTy = TypeMap->remapType(Ty);
+ if (Ty == IntTy)
+ return false;
+
+ IRB.SetInsertPoint(&SI);
+ Value *IntV = fatPtrsToInts(V, Ty, IntTy, V->getName());
+ for (auto *Dbg : at::getAssignmentMarkers(&SI))
+ Dbg->setValue(IntV);
+
+ SI.setOperand(0, IntV);
+ return true;
+}
+
+/// Return the ptr addrspace(8) and i32 (resource and offset parts) in a lowered
+/// buffer fat pointer constant.
+static std::pair<Constant *, Constant *>
+splitLoweredFatBufferConst(Constant *C) {
+ if (auto *AZ = dyn_cast<ConstantAggregateZero>(C))
+ return std::make_pair(AZ->getStructElement(0), AZ->getStructElement(1));
+ if (auto *SC = dyn_cast<ConstantStruct>(C))
+ return std::make_pair(SC->getOperand(0), SC->getOperand(1));
+ llvm_unreachable("Conversion should've created a {p8, i32} struct");
+}
+
+namespace {
+/// Handle the remapping of ptr addrspace(7) constants.
+class FatPtrConstMaterializer final : public ValueMaterializer {
+ BufferFatPtrToStructTypeMap *TypeMap;
+ BufferFatPtrToIntTypeMap *IntTypeMap;
+ // An internal mapper that is used to recurse into the arguments of constants.
+ // While the documentation for `ValueMapper` specifies not to use it
+ // recursively, examination of the logic in mapValue() shows that it can
+ // safely be used recursively when handling constants, like it does in its own
+ // logic.
+ ValueMapper InternalMapper;
+
+ Constant *materializeBufferFatPtrConst(Constant *C);
+
+ const DataLayout &DL;
+
+public:
+ // UnderlyingMap is the value map this materializer will be filling.
+ FatPtrConstMaterializer(BufferFatPtrToStructTypeMap *TypeMap,
+ ValueToValueMapTy &UnderlyingMap,
+ BufferFatPtrToIntTypeMap *IntTypeMap,
+ const DataLayout &DL)
+ : TypeMap(TypeMap), IntTypeMap(IntTypeMap),
+ InternalMapper(UnderlyingMap, RF_None, TypeMap, this), DL(DL) {}
+ virtual ~FatPtrConstMaterializer() = default;
+
+ Value *materialize(Value *V) override;
+};
+} // namespace
+
+Constant *FatPtrConstMaterializer::materializeBufferFatPtrConst(Constant *C) {
+ Type *SrcTy = C->getType();
+ auto *NewTy = dyn_cast<StructType>(TypeMap->remapType(SrcTy));
+ if (C->isNullValue())
+ return ConstantAggregateZero::getNullValue(NewTy);
+ if (isa<PoisonValue>(C))
----------------
arsenm wrote:
Braces
https://github.com/llvm/llvm-project/pull/77952