[PATCH] D102401: [AMDGPU] Allocate LDS globals in sorted order of their alignment and size.

Stanislav Mekhanoshin via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 13 10:34:00 PDT 2021


rampitec added a comment.

We probably need a test which only runs this lowering and showing a resulting transformed IR. Test the alignment, sorting and situation with multiple kernels using intersecting LDS variable sets.



================
Comment at: llvm/include/llvm/Support/Alignment.h:348
 
+inline Align max(Align Lhs, Align Rhs) { return Lhs > Rhs ? Lhs : Rhs; }
+
----------------
It does not follow the style around, make the body on separate lines.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:41
+// Collect all kernel functions within which `GV` is used.
+static std::set<const Function *> collectUsedKernels(const GlobalVariable *GV) {
+  std::set<const Function *> UsedKernels;
----------------
Is there a reason to use std::set and not one of llvm's collections?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:52
+
+    if (isa<GlobalVariable>(UU))
+      continue;
----------------
Can a GV hold a pointer to this GV?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:61
+    const auto *I = dyn_cast<Instruction>(UU);
+    assert(I && "Expected an instruction");
+    const Function *F = I->getFunction();
----------------
It can probably be Operator.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:70
+
+static std::vector<const GlobalVariable *>
+getSortedUsedLDSGlobals(const Function *K) {
----------------
Why std::vector?


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:82
+  for (const auto *GV : LDSGlobals) {
+    if (llvm::is_contained(collectUsedKernels(GV), K))
+      UsedLDSGlobals.push_back(GV);
----------------
Looks like you are traversing all users just to check if a GC used from this kernel. This is wasteful, you can just create a function checking that a GV is reachable from this kernel and early return if it is.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:149
+    for (const auto *GV : SortedLDSGlobals)
+      allocateLDSGlobal(F.getParent()->getDataLayout(), *GV);
+  }
----------------
Get DL outside of the loop.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:165
+
   /// TODO: We should sort these to minimize wasted space due to alignment
   /// padding. Currently the padding is decided by the first encountered use
----------------
This comment shall not be relevant anymore?


================
Comment at: llvm/lib/Target/AMDGPU/Utils/AMDGPULDSUtils.cpp:125
 
+std::vector<const GlobalVariable *> collectStaticLDSGlobals(const Module *M) {
+  std::vector<const GlobalVariable *> StaticLDSGlobals;
----------------
This is mostly a copy-paste of findVariablesToLower() above. I would suggest to factor it somehow.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102401/new/

https://reviews.llvm.org/D102401



More information about the llvm-commits mailing list