[PATCH] D138141: [amdgpu] Reimplement LDS lowering

Jon Chesterfield via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Nov 19 10:15:13 PST 2022


JonChesterfield marked an inline comment as done.
JonChesterfield added a comment.

Was sidetracked by trying to work out why global-variable-relocs.ll and waitcnt-looptest.ll have started failing, but neither do anything with LDS (no addrspace(3) vars plus lowering pass makes no change to the IR), and rolling back to main doesn't fix them. Buildbot is green so that's weird, attributing to local quirk / non-determinism.

Comments responded to above, lit test coverage could still be better.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp:161-190
+  static constexpr const char name[] = "amdgpu-lower-module-lds-strategy";
+
+  enum Opts { module, table, kernel, hybrid, unknown };
+
+  Opts Value = Opts::module;
+  operator Opts() const { return Value; }
+
----------------
arsenm wrote:
> You're re-inventing cl::opt with clEnumVal
So I was. Interesting. Changed to use clEnumValN (to preserve the namespacing at the use sites). 

Sadly the test case lds-reject-unknown-lowering-kind.ll - updated with the new error string - is now refusing to run under check-llvm despite running fine in a shell. That coverage seems less important when using a ready-made parser so I've deleted the test.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp:532
+      } else {
+        Elements.push_back(UndefValue::get(I16));
+      }
----------------
nlopes wrote:
> Please use PoisonValue here if possible. We are trying to remove undef values, and thus we want to avoid adding new uses that can be easily replaced with poison.
> Thank you!
Whether this can be easily replaced with poison depends on how the rest of the toolchain deals with poison. The string replacement is fine, it's whether it then blows up under testing due to interaction with the rest of the compiler that encourages caution.

The semantics of poison fit this behaviour - if this value is read, the compiler has a bug or runtime behaviour has diverged from that compiler's expectation. I would therefore like to change from undef to poison after this lands, so that if it does fail testing we have a much easier time triaging.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp:610
+
+      for (Use &U : llvm::make_early_inc_range(GV->uses())) {
+
----------------
arsenm wrote:
> Don't need llvm:: (not sure why this seems to be the most popular function for people to put it on)
Had it in a few other places too, on various types and on sort. Dropped all of them, then put it back on sort to resolve ambiguity with std::sort (which doesn't make masses of sense to me but will stay on task)


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp:640
+        Value *replacement =
+            Builder.CreateIntToPtr(extended, GV->getType(), GV->getName());
+
----------------
arsenm wrote:
> Can you add an assert somewhere that the maximum LDS size fits in 16-bits? I'm also wondering if we should just make LDS pointers 16-bit with 4 byte alignment, or add a 16-bit address space for this
The backend currently errors on that in user-facing fashion. A check here will be slightly less accurate than in the back end because this runs before promote alloca puts more things into LDS and I'm not sure we have more information to provide a better report either - what sort of error message do you have in mind?

I suppose the current code path would truncate the offsets to i16 before the back end crashes, but that seems harmless. We'd have slightly cleaner tests with a native sixteen bit type but there's not much in it.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMCInstLower.cpp:171
+  // Intercept LDS variables with known addresses
+  if (const GlobalVariable *GV = dyn_cast<GlobalVariable>(CV)) {
+    if (AMDGPUMachineFunction::isKnownAddressLDSGlobal(*GV)) {
----------------
arsenm wrote:
> Is this actually reachable?
Definitely yes. It's hit when lowering the lookup table for the table fallback.


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:128
+
+  if (GV.getName().equals(ModuleLDSName)) {
+    return 0;
----------------
arsenm wrote:
> Why not just ==?
iirc that did the wrong thing with a string literal, as opposed to a StringLiteral


================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.h:111-112
+  // LDS allocation should have an associated kernel function
+  static const Function *
+  getKernelLDSFunctionFromGlobal(const GlobalVariable &GV);
   static const GlobalVariable *
----------------
arsenm wrote:
> These aren't really gaining anything from being part of AMDGPUMachineFunction, but I guess it's where the other stuff already lives
I'd guess they're here because the dynamic lds alignment is tracked through the instance, but with that now mostly factored out they could be moved under some utility header.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D138141/new/

https://reviews.llvm.org/D138141



More information about the llvm-commits mailing list