[PATCH] D91516: [AMDGPU] Replace uses of LDS globals within non-kernel functions by pointers.

Jon Chesterfield via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Apr 12 07:12:56 PDT 2021


JonChesterfield added a comment.

The algorithm I had in mind was along the lines of:

  for each LDS variable:
    if should-transform
      create 16 bit integer in LDS
      initialize that global with (constexpr) address of variable
      replace all uses of variable with a (constexpr) access through new pointer
  
  where
  should-transform:
   if (sizeof) < 8ish return false
   if used by instruction in indirectly called function return false
   if only used by kernels return false
   probably other exclusions
   return true
   



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUReplaceLDSUseWithPointer.cpp:384
+      SmallPtrSet<Instruction *, 8> Insts;
+      getInstructions(U, Insts);
+
----------------
hsmhsm wrote:
> JonChesterfield wrote:
> > Why do we want to replace constexpr with instructions? This comment contradicts the implementation
> Again not clear about what you intended here - let's take it offline.
I was thinking the introduced 16 bit pointers will be initialised with constexpr from the corresponding variable.

This patch presently initialises them with undef, which I think thwarts using constexpr everywhere, and means we insert stores in the kernel entry basic block here.

If we fix the back end to handle LDS variables with initializers (at least the simple case of only used from kernel and initialized with address of some other variable), then quite a lot of the complexity of this patch drops out.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp:196
 
-static cl::opt<bool>
-    DisableLowerModuleLDS("amdgpu-disable-lower-module-lds", cl::Hidden,
-                          cl::desc("Disable lower module lds pass"),
-                          cl::init(false));
+static cl::opt<bool, true> EnableLowerModuleLDS(
+    "amdgpu-enable-lower-module-lds", cl::desc("Enable lower module lds pass"),
----------------
I think this test reads better as proposed here - 'enable-lower-module-lds=true' is better than 'disable-lower-module-lds=false'. Separable from the rest of this patch, we could land a patch that just inverts that commandline flag and updates the tests to match. That removes some noise from this review.


================
Comment at: llvm/test/CodeGen/AMDGPU/lower-module-lds.ll:22
 
+; The two i16 pointers which are introduced by the pass `amdgpu-replace-lds-use-with-pointer` should be removed by the pass `amdgpu-lower-module-lds`.
+; CHECK-NOT: @llvm.amdgcn.lds.pointer.1 = internal unnamed_addr addrspace(3) global i16 undef, align 2
----------------
This test should only check the behaviour of lower-module-lds. Separate tests check the behaviour of amdgpu-replace-lds-use-with-pointer.

Equally, running amdgpu-lower-module-lds by itself should not automatically run amdgpu-replace-lds-use-with-pointer and vice versa.


================
Comment at: llvm/test/CodeGen/AMDGPU/replace_lds_report_error_no_func_def.ll:3
+
+; ERROR: LLVM ERROR: The pass "Replace LDS Use With Pointer" assumes that the definitions of both caller and callee appear within same module. But, definition for the callee "callee_1" not available.
+
----------------
This is an error in the implementation, not something that should have a test checking the implementation is broken. Instead of assuming the definition of both are in the same module and crashing if they aren't, the pass should ignore a variable which doesn't meet that requirement.


================
Comment at: llvm/test/CodeGen/AMDGPU/replace_lds_test_direct_call_misc.ll:15
+; GCN-LABEL: entry:
+; GCN: %0 = load i16, i16 addrspace(3)* @llvm.amdgcn.lds.pointer.{{[0-9]+}}, align 2
+; GCN-NEXT: %1 = inttoptr i16 %0 to [1 x i32] addrspace(3)*
----------------
These tests would be more robust if the new pointer was named based on the global it is intended to reference, as then the regex can check that we created load from the correct pointer (as opposed to just one of the new pointers).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D91516/new/

https://reviews.llvm.org/D91516



More information about the llvm-commits mailing list