[PATCH] D101310: [AMDGPU] Replace uses of LDS globals within non-kernel functions by pointers.

Mahesha S via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu May 6 22:46:11 PDT 2021


hsmhsm added a comment.

In D101310#2743123 <https://reviews.llvm.org/D101310#2743123>, @JonChesterfield wrote:

> I can't work out from the implementation, or the comments, what the condition is for deciding whether to transform an LDS variable. Exactly which cases is this intended to transform?

I in-fact implemented what you advocated-for in the abondened patch. that is,

for each LDS variable:

  if should-transform
    create 16 bit integer in LDS
    initialize that global with (constexpr) address of variable
    replace all uses of variable with a (constexpr) access through new pointer

where
should-transform:
 if (sizeof) < 8ish return false
 if used by instruction in indirectly called function return false
 if only used by kernels return false
 probably other exclusions
 return true

> Emitting stores in the entry block is probably not sufficient to ensure the pointers have the right value when later read from. The stores will look dead and can be eliminated or moved across the function calls. One of the attractions of implementing LDS initializers in the back end is ensuring that such transforms cannot break the lowering. Alternatively, the lower module pass + backend coupling could be updated so that an object can be used instead of zero, at which point the IR passes will natively understand the connection.

In that case, I will abandon this patch, and stop spending any time on it. I hope that you will continue to strenthen your patch accordingly.

> In order to exercise this code at runtime, we need executable code that uses LDS from functions. I think that will be necessary to be confident that the transform proposed here works correctly. Possibly also a microbenchmark to determine which cases this makes faster as well.

When we do not still have a common agreement on the implementation itself, threre is no point in microbenchmarking, and following the same anology your original patch needs it in the first place. So, I just stop working on it anymore by assuming that you will take care of updating your original patch when you feel that the optimal LDS usage required.



================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUReplaceLDSUseWithPointer.cpp:84
+
+  report_fatal_error(ErrStr);
+}
----------------
JonChesterfield wrote:
> This is an optimisation. Instead of a fatal error, it can always leave the variable unchanged. Therefore it should not abort compilation.
call to reportReplaceLDSError() is being made, when the pass cannot proceed further for one or the other reason (internal error situation). And, it is perfectly valid for any optimiation pass to abonden and report an error when it faces some internal error kind of situation. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101310/new/

https://reviews.llvm.org/D101310



More information about the llvm-commits mailing list