[PATCH] D94648: [amdgpu] Implement lower function LDS pass

Jon Chesterfield via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 13 19:40:30 PST 2021


JonChesterfield created this revision.
JonChesterfield added reviewers: hsmhsm, scchan, b-sumner, madhur13490, yaxunl, t-tye, msearles, acmeman925, arsenm, rampitec.
Herald added subscribers: kerbowa, jfb, mgrang, hiraditya, tpr, dstuttard, mgorny, nhaehnle, jvesely, kzhuravl.
JonChesterfield requested review of this revision.
Herald added subscribers: llvm-commits, sstefan1, wdng.
Herald added a reviewer: jdoerfert.
Herald added a project: LLVM.

[amdgpu] Implement lower function LDS pass

LDS variables (globals in the local address space) are allocated at kernel
launch. This pass collects the LDS global variables that are used from
non-kernel functions, moves them into a new struct type, and allocates an
instance of that type in every kernel. Each use is then replaced with a
constantexpr offset into that struct.
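
As a rough illustration of the layout step described above, the following
standalone C++ sketch packs a set of variables into one struct and reports the
offset each use would be rewritten to. It does not use the LLVM API and is not
taken from the patch: the names LdsVar, Layout, layoutStruct and offsetOf are
hypothetical, and the field ordering (input order) and padding rules (natural
alignment) are assumptions rather than what the pass necessarily does.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an LDS global variable; not an LLVM type.
struct LdsVar {
  std::string name;
  uint64_t size;   // size in bytes
  uint64_t align;  // required alignment, a power of two
};

// One field of the combined struct and its byte offset.
struct Field {
  std::string name;
  uint64_t offset;
};

// Result of packing all variables into a single struct.
struct Layout {
  std::vector<Field> fields;
  uint64_t size = 0;   // total bytes each kernel would allocate
  uint64_t align = 1;  // alignment of the struct as a whole
};

// Round value up to the next multiple of align.
inline uint64_t alignTo(uint64_t value, uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0 &&
         "alignment must be a power of two");
  return (value + align - 1) & ~(align - 1);
}

// Assumption: fields are laid out in input order with natural alignment.
inline Layout layoutStruct(const std::vector<LdsVar> &vars) {
  Layout layout;
  for (const LdsVar &v : vars) {
    uint64_t offset = alignTo(layout.size, v.align);
    layout.fields.push_back({v.name, offset});
    layout.size = offset + v.size;
    if (v.align > layout.align)
      layout.align = v.align;
  }
  // Pad the tail so the struct size is a multiple of its alignment.
  layout.size = alignTo(layout.size, layout.align);
  return layout;
}

// The constant offset that a use of the named variable would be replaced
// with; returns -1 if the variable is not part of the struct.
inline int64_t offsetOf(const Layout &layout, const std::string &name) {
  for (const Field &f : layout.fields)
    if (f.name == name)
      return static_cast<int64_t>(f.offset);
  return -1;
}
```

In this model every kernel pays for layout.size bytes whether or not it
reaches a function that touches the variables, which is one way to read the
remark below that the lowering uses more LDS than strictly necessary.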

Prior to this pass, accesses from a function are compiled to a trap. With this
pass, most such accesses are removed before reaching codegen. The trap logic
is left unchanged by this pass. It is still reachable for the cases this pass
misses, notably the extern shared construct from HIP and variables marked
constant which survive the optimizer.

This is of interest to the OpenMP project because the deviceRTL runtime library
uses CUDA shared variables from functions that cannot be inlined. Trunk LLVM
therefore cannot compile some OpenMP kernels for AMDGPU. In addition to passing
the attached unit tests, this patch, when applied to ROCm LLVM with fixed-abi
enabled and the function pointer hashing scheme deleted, passes the OpenMP
suite.

This lowering will use more LDS than strictly necessary. It is intended to be
a functionally correct fallback for cases that are difficult to target from
future optimisation passes.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D94648

Files:
  llvm/lib/Target/AMDGPU/AMDGPU.h
  llvm/lib/Target/AMDGPU/AMDGPULowerFunctionLDSPass.cpp
  llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  llvm/lib/Target/AMDGPU/CMakeLists.txt
  llvm/test/CodeGen/AMDGPU/GlobalISel/lds-global-non-entry-func.ll
  llvm/test/CodeGen/AMDGPU/addrspacecast-initializer-unsupported.ll
  llvm/test/CodeGen/AMDGPU/lds-global-non-entry-func.ll
  llvm/test/CodeGen/AMDGPU/lower-function-lds-inactive.ll
  llvm/test/CodeGen/AMDGPU/lower-function-lds-used-list.ll
  llvm/test/CodeGen/AMDGPU/lower-function-lds.ll
  llvm/test/CodeGen/AMDGPU/promote-alloca-to-lds-constantexpr-use.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D94648.316557.patch
Type: text/x-patch
Size: 27593 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210114/ecc659c6/attachment.bin>

