[PATCH] D145441: [AMDGPU] Define data layout entries for buffers
Krzysztof Drewniak via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 6 15:23:35 PST 2023
krzysz00 created this revision.
krzysz00 added reviewers: nhaehnle, arsenm, b-sumner, piotr, sstefan1, jdoerfert.
Herald added subscribers: kosarev, jeroen.dobbelaere, foad, wenlei, okura, kuter, kerbowa, arphaman, zzheng, hiraditya, arichardson, tpr, dstuttard, yaxunl, jvesely, kzhuravl, MatzeB.
Herald added a project: All.
krzysz00 requested review of this revision.
Herald added subscribers: llvm-commits, cfe-commits, pcwang-thead, wdng.
Herald added projects: clang, LLVM.
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.
The first is address space 7, a non-integral address space (already
present in the data layout) whose pointers are 160 bits wide, 256-bit
aligned, and indexed with a 32-bit offset. Each pointer combines a
128-bit buffer descriptor with a 32-bit offset and will be usable with
normal LLVM operations (load, store, GEP). However, these pointers will
be rewritten out of existence before code generation.
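The layout described above can be modeled in a few lines of C++. This is only an illustrative sketch, not code from the patch: the names BufferFatPtr and advance are hypothetical, the descriptor is treated as four opaque 32-bit words, and the assumption is that a GEP-like step touches only the 32-bit offset.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Widths taken from the description: 128-bit descriptor + 32-bit offset.
constexpr unsigned kDescriptorBits = 128;
constexpr unsigned kOffsetBits = 32;
static_assert(kDescriptorBits + kOffsetBits == 160,
              "an address space 7 pointer is 160 bits wide");

// Hypothetical model of an address space 7 pointer (names are illustrative).
// alignas(32) mirrors the 256-bit alignment, so in-memory storage is padded
// beyond the 160 bits of payload.
struct alignas(32) BufferFatPtr {
  std::array<std::uint32_t, 4> descriptor; // opaque 128-bit buffer descriptor
  std::uint32_t offset;                    // 32-bit offset into the buffer
};

// GEP-like step (assumption): only the offset changes, using 32-bit
// wrapping arithmetic; the descriptor is carried along untouched.
inline BufferFatPtr advance(BufferFatPtr p, std::int32_t byteDelta) {
  p.offset += static_cast<std::uint32_t>(byteDelta);
  return p;
}
```

Keeping the descriptor and the offset as separate fields is what would let a later rewrite split such a pointer back into a resource operand and an offset computation.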
The second is address space 8, the address space for "buffer
resources". These pointers will represent the resource arguments to
buffer instructions, and the buffer intrinsics will, in the future, be
changed to take a ptr addrspace(8) instead of a <4 x i32> as the
resource argument. These pointers are 128 bits long and 128-bit
aligned. However, they must not be used as arguments to
getelementptr or otherwise used in address computations, since they
can have arbitrarily complex inherent addressing semantics that can't
be represented in LLVM. The address space is nevertheless integral,
since inttoptr and ptrtoint behave deterministically and reasonably.
Being integral runs the risk of GEPs being optimized into incorrect
pointer arithmetic, but address space 8 pointers (buffer resources)
must not appear in a GEP in the first place, so that risk does not
arise in valid IR.
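To make the sizes and alignments concrete, here is a small C++ sketch that parses pointer entries in the data layout syntax p[n]:<size>:<abi>[:<pref>][:<idx>]. The entry strings "p7:160:256:256:32" and "p8:128:128" used in the usage note are inferred from the description above rather than copied from the patch, and PointerSpec, parseUnsigned, and parsePointerSpec are hypothetical helper names, not LLVM APIs.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// One pointer entry of a data layout string (all widths in bits).
struct PointerSpec {
  unsigned addrSpace;
  unsigned sizeBits;
  unsigned abiAlignBits;
  unsigned prefAlignBits; // defaults to the ABI alignment when omitted
  unsigned indexBits;     // defaults to the pointer size when omitted
};

// Parses a non-empty run of decimal digits; rejects anything else.
inline std::optional<unsigned> parseUnsigned(const std::string &s) {
  if (s.empty()) return std::nullopt;
  unsigned value = 0;
  for (char c : s) {
    if (c < '0' || c > '9') return std::nullopt;
    value = value * 10 + static_cast<unsigned>(c - '0');
  }
  return value;
}

// Parses "p[n]:<size>:<abi>[:<pref>][:<idx>]"; returns nullopt if malformed.
inline std::optional<PointerSpec> parsePointerSpec(const std::string &spec) {
  if (spec.empty() || spec[0] != 'p') return std::nullopt;
  std::vector<std::string> fields;
  std::stringstream stream(spec.substr(1));
  std::string field;
  while (std::getline(stream, field, ':')) fields.push_back(field);
  if (fields.size() < 3 || fields.size() > 5) return std::nullopt;

  // An omitted address space number ("p:64:64") means address space 0.
  std::optional<unsigned> as =
      fields[0].empty() ? std::optional<unsigned>(0) : parseUnsigned(fields[0]);
  std::optional<unsigned> size = parseUnsigned(fields[1]);
  std::optional<unsigned> abi = parseUnsigned(fields[2]);
  if (!as || !size || !abi) return std::nullopt;

  PointerSpec result{*as, *size, *abi, *abi, *size};
  if (fields.size() >= 4) {
    std::optional<unsigned> pref = parseUnsigned(fields[3]);
    if (!pref) return std::nullopt;
    result.prefAlignBits = *pref;
  }
  if (fields.size() == 5) {
    std::optional<unsigned> idx = parseUnsigned(fields[4]);
    if (!idx) return std::nullopt;
    result.indexBits = *idx;
  }
  return result;
}
```

Under these assumptions, parsing "p7:160:256:256:32" yields a 160-bit pointer with 256-bit alignment and a 32-bit index, while "p8:128:128" yields a 128-bit pointer whose index width defaults to the full 128 bits.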
Future work includes:
- Upgrading (including auto-upgrading) buffer intrinsics from <4 x i32> to
ptr addrspace(8).
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.
This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource address space, and updates the alias
analysis table.
Depends on D143437 <https://reviews.llvm.org/D143437>
Repository:
rG LLVM Github Monorepo
https://reviews.llvm.org/D145441
Files:
clang/lib/Basic/Targets/AMDGPU.cpp
clang/test/CodeGen/target-data.c
clang/test/CodeGenOpenCL/amdgpu-env-amdgcn.cl
llvm/docs/AMDGPUUsage.rst
llvm/lib/IR/AutoUpgrade.cpp
llvm/lib/Target/AMDGPU/AMDGPU.h
llvm/lib/Target/AMDGPU/AMDGPUAliasAnalysis.cpp
llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
llvm/lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp
llvm/lib/Target/AMDGPU/SIISelLowering.cpp
llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f32-no-rtn.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f32-rtn.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.f64.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.v2f16-no-rtn.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/buffer-atomic-fadd.v2f16-rtn.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/irtranslator-non-integral-address-spaces.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.atomic.dim.a16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.dim.a16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.2d.d16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.2d.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.2darraymsaa.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.load.3d.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.a16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.d.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.g16.a16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.sample.g16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.store.2d.d16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.image.atomic.dim.mir
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.add.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.cmpswap.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.fadd-with-ret.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.atomic.fadd.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.format.f16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.format.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.load.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.format.f32.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.buffer.store.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.load.f16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.load.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.f16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.i8.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.raw.tbuffer.store.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.add.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.cmpswap.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.fadd-with-ret.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.atomic.fadd.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.format.f16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.format.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.load.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.format.f16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.format.f32.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.buffer.store.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.tbuffer.load.f16.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/llvm.amdgcn.struct.tbuffer.load.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.image.load.1d.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.image.sample.1d.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.raw.buffer.load.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.struct.buffer.load.ll
llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.struct.buffer.store.ll
llvm/test/CodeGen/AMDGPU/addrspacecast-captured.ll
llvm/test/CodeGen/AMDGPU/annotate-kernel-features-hsa.ll
llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f32-no-rtn.ll
llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f32-rtn.ll
llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.f64.ll
llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.v2f16-no-rtn.ll
llvm/test/CodeGen/AMDGPU/buffer-atomic-fadd.v2f16-rtn.ll
llvm/test/CodeGen/AMDGPU/buffer-intrinsics-mmo-offsets.ll
llvm/test/CodeGen/AMDGPU/cgp-addressing-modes.ll
llvm/test/CodeGen/AMDGPU/extract_subvector_vec4_vec3.ll
llvm/test/CodeGen/AMDGPU/force-alwaysinline-lds-global-address.ll
llvm/test/CodeGen/AMDGPU/loop-idiom.ll
llvm/test/CodeGen/AMDGPU/mdt-preserving-crash.ll
llvm/test/CodeGen/AMDGPU/noop-shader-O0.ll
llvm/test/CodeGen/AMDGPU/nullptr-long-address-spaces.ll
llvm/test/CodeGen/AMDGPU/nullptr.ll
llvm/test/CodeGen/AMDGPU/promote-alloca-lifetime.ll
llvm/test/CodeGen/AMDGPU/promote-alloca-to-lds-select.ll
llvm/test/CodeGen/AMDGPU/sgpr-copy-local-cse.ll
llvm/test/CodeGen/AMDGPU/splitkit-getsubrangeformask.ll
llvm/test/CodeGen/AMDGPU/unroll.ll
llvm/test/CodeGen/AMDGPU/unsupported-image-a16.ll
llvm/test/CodeGen/AMDGPU/unsupported-image-g16.ll
llvm/test/CodeGen/AMDGPU/vgpr-liverange-ir.ll
llvm/test/CodeGen/MIR/AMDGPU/custom-pseudo-source-values.ll
llvm/test/Instrumentation/AddressSanitizer/AMDGPU/adaptive_constant_global_redzones.ll
llvm/test/Instrumentation/AddressSanitizer/AMDGPU/adaptive_global_redzones.ll
llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_do_not_instrument_lds.ll
llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_do_not_instrument_scratch.ll
llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_instrument_constant_address_space.ll
llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_instrument_generic_address_space.ll
llvm/test/Instrumentation/AddressSanitizer/AMDGPU/asan_instrument_global_address_space.ll
llvm/test/Instrumentation/AddressSanitizer/AMDGPU/global_metadata_addrspacecasts.ll
llvm/test/Instrumentation/AddressSanitizer/AMDGPU/no_redzones_in_lds_globals.ll
llvm/test/Instrumentation/AddressSanitizer/AMDGPU/no_redzones_in_scratch_globals.ll
llvm/test/Transforms/AlignmentFromAssumptions/amdgpu-crash.ll
llvm/test/Transforms/AtomicExpand/AMDGPU/expand-atomic-i16.ll
llvm/test/Transforms/EarlyCSE/AMDGPU/memrealtime.ll
llvm/test/Transforms/IndVarSimplify/AMDGPU/no-widen-to-i64.ll
llvm/test/Transforms/InferAddressSpaces/AMDGPU/noop-ptrint-pair.ll
llvm/test/Transforms/InferAddressSpaces/AMDGPU/ptrmask.ll
llvm/test/Transforms/InferAddressSpaces/X86/noop-ptrint-pair.ll
llvm/test/Transforms/Inline/AMDGPU/amdgpu-inline-alloca-argument.ll
llvm/test/Transforms/InstCombine/AMDGPU/memcpy-from-constant.ll
llvm/test/Transforms/InstCombine/alloca-in-non-alloca-as.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/aa-metadata.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/adjust-alloca-alignment.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/complex-index.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/extended-index.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/gep-bitcast.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/insertion-point.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/interleaved-mayalias-store.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/invariant-load.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores-private.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/merge-stores.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/missing-alignment.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/multiple_tails.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/no-implicit-float.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/optnone.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/pointer-elements.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/selects-inseltpoison.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/selects.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/store_with_aliasing_load.ll
llvm/test/Transforms/LoadStoreVectorizer/AMDGPU/weird-type-accesses.ll
llvm/test/Transforms/LoopLoadElim/pr46854-adress-spaces.ll
llvm/test/Transforms/LoopStrengthReduce/AMDGPU/atomics.ll
llvm/test/Transforms/LoopStrengthReduce/AMDGPU/different-addrspace-addressing-mode-loops.ll
llvm/test/Transforms/LoopStrengthReduce/AMDGPU/lsr-invalid-ptr-extend.ll
llvm/test/Transforms/LoopStrengthReduce/AMDGPU/lsr-postinc-pos-addrspace.ll
llvm/test/Transforms/LoopStrengthReduce/AMDGPU/preserve-addrspace-assert.ll
llvm/test/Transforms/OpenMP/attributor_pointer_offset_crash.ll
llvm/test/Transforms/OpenMP/spmdization_constant_prop.ll
llvm/test/Transforms/OpenMP/values_in_offload_arrays.alloca.ll
llvm/test/Transforms/SLPVectorizer/AMDGPU/address-space-ptr-sze-gep-index-assert.ll
llvm/test/Transforms/VectorCombine/AMDGPU/as-transition-inseltpoison.ll
llvm/test/Transforms/VectorCombine/AMDGPU/as-transition.ll
llvm/unittests/Bitcode/DataLayoutUpgradeTest.cpp