[PATCH] D94648: [amdgpu] Implement lower function LDS pass

Jon Chesterfield via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 16 04:23:08 PDT 2021


JonChesterfield added a comment.

Note to self - there is ongoing interest in minimising the LDS usage of applications. This patch allocates the struct in every kernel (see the call to markUsedByKernel, which is applied exactly once to each kernel) in order to support calls to functions that make use of that struct.

This could be refined. Kernels that make no calls don't need to allocate this struct unconditionally. If the kernel itself uses some LDS that was moved into the struct, that use will remain and suffice to trigger allocation of the struct as normal. Harder to compute (one for the Attributor?): kernels that call no functions that could refer to that struct also don't need to allocate it.
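The refinement above amounts to a reachability query over the call graph. A minimal sketch of that query, on a toy graph rather than LLVM's own data structures: the names CallGraph, kernelNeedsStruct and usesStruct are hypothetical and not part of this patch, and indirect calls are assumed away.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy call graph: function name -> names of its direct callees.
// Indirect calls are not modelled; a real analysis would have to treat them
// conservatively (e.g. as possibly reaching any address-taken function).
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns true if `kernel`, or any function reachable from it through direct
// calls, appears in `usesStruct` (the set of functions that refer to the
// combined LDS struct). A kernel for which this is false could skip the
// allocation.
bool kernelNeedsStruct(const CallGraph &graph,
                       const std::set<std::string> &usesStruct,
                       const std::string &kernel) {
  assert(!kernel.empty() && "kernel name must be non-empty");
  std::set<std::string> visited;
  std::vector<std::string> worklist{kernel};
  while (!worklist.empty()) {
    std::string fn = worklist.back();
    worklist.pop_back();
    if (!visited.insert(fn).second)
      continue; // already explored this function
    if (usesStruct.count(fn))
      return true; // a direct use of the struct is reachable
    auto it = graph.find(fn);
    if (it == graph.end())
      continue; // no recorded callees
    for (const std::string &callee : it->second)
      worklist.push_back(callee);
  }
  return false;
}
```

A kernel with no callees and no direct use falls out of this as the trivial case: the worklist empties after one step and the function returns false.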

A simplified variant on @hsmhsm's proposal: an LDS variable that is used from an internal function that has not had its address taken could be passed into the function by pointer from the caller, ultimately leaving the &var, i.e. the use of that variable, in the top level kernel. Access to that variable would be slower than in this patch - an extra dereference, and loss of an argument register to propagate the address down the call tree - but it would move the variable out of the combined struct for a saving of LDS in other kernels. For large variables and scarce LDS that is probably a win.
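To make the shape of that transform concrete, here is a host-side C++ sketch of the before and after. It is only an analogy: an ordinary global array stands in for the LDS variable, and all names (scratch, sumDirect, sumViaPointer, kernelBefore, kernelAfter) are hypothetical. The real rewrite would happen on IR, on internal functions whose address is not taken.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t kLen = 4;

// Stand-in for an LDS variable (hypothetical; on the GPU this would be a
// global in the local address space).
int scratch[kLen];

// Before: the helper names the variable directly, so the use of `scratch`
// lives in a non-kernel function.
int sumDirect() {
  int total = 0;
  for (std::size_t i = 0; i < kLen; ++i)
    total += scratch[i];
  return total;
}

// After: the helper receives the address as an argument. This costs one
// argument and an extra dereference, but the helper no longer refers to the
// variable itself.
int sumViaPointer(const int *buf, std::size_t len) {
  assert(buf != nullptr);
  int total = 0;
  for (std::size_t i = 0; i < len; ++i)
    total += buf[i];
  return total;
}

// Writes 1, 2, 3, 4 into the stand-in variable.
void fillScratch() {
  for (std::size_t i = 0; i < kLen; ++i)
    scratch[i] = static_cast<int>(i) + 1;
}

// Stand-ins for the top level kernel before and after the rewrite.
int kernelBefore() {
  fillScratch();
  return sumDirect();
}

int kernelAfter() {
  fillScratch();
  // The only remaining reference to the variable (&var) is here, in the kernel.
  return sumViaPointer(scratch, kLen);
}
```

Both entry points compute the same result; the difference is only where the reference to the variable lives, which is what would let it drop out of the combined struct.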

See also a note further up about maintaining the name of the variable through the IR, instead of using '0' directly, as that would make the IR easier to read. Particularly useful if we end up refining this pass further.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94648/new/

https://reviews.llvm.org/D94648
