[PATCH] D94648: [amdgpu] Implement lower function LDS pass

Jon Chesterfield via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Jan 28 16:50:16 PST 2021


JonChesterfield added a comment.

I think that addresses all the comments other than the objections to the inline asm, which still passes zero to an "s"-constrained register.

The inline asm makes the kernel use the newly created LDS structure, so that passes such as PromoteAlloca can see that the kernel uses that structure instance. Some alternatives are:

- Modify PromoteAlloca (and any other passes that use the size of LDS) to look for the magic variable. This means spreading knowledge of this transform across other passes.
- Add an IR intrinsic, with SDag and GlobalISel lowering to a pseudo instruction, and pseudo expansion to a no-op. Semantically very similar to the inline asm.
- Metadata - mark kernels as using +N bytes of LDS beyond what their uses suggest.
- Alternative lowering / transform
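
As a rough illustration of the metadata option, here is a toy model of how a pass that sizes LDS might account for the extra bytes. It is a sketch only and does not use the LLVM API: `Kernel`, `extraLdsBytes` and `ldsBytesFor` are hypothetical names, and the real metadata format is not decided here.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model, not LLVM API: every name here is hypothetical.
struct Kernel {
    std::string name;
    std::vector<std::string> directlyUsedLdsVars; // LDS variables the kernel body references
    std::uint64_t extraLdsBytes;                  // stand-in for the "+N bytes" metadata
};

// LDS size a pass would assume for the kernel: the sizes of the variables it
// visibly uses, plus whatever the (hypothetical) metadata adds on top.
std::uint64_t ldsBytesFor(const Kernel &k,
                          const std::map<std::string, std::uint64_t> &sizes) {
    std::uint64_t total = k.extraLdsBytes;
    for (const auto &var : k.directlyUsedLdsVars) {
        auto it = sizes.find(var);
        assert(it != sizes.end() && "unknown LDS variable");
        total += it->second;
    }
    return total;
}
```

With this shape, a pass only needs to read one number per kernel and does not need to know about the magic variable.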

There are various optimisations available. For example, metadata could mark functions as can't-use-lds, be propagated, and then be used to drop the 'use' of the variable from some kernels. Indirection could allow variables to be placed at different offsets in different kernels.
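
A minimal sketch of the can't-use-lds idea, assuming it amounts to reachability over the call graph: a kernel whose reachable functions never touch LDS does not need the 'use'. This is a toy model rather than LLVM code, and `CallGraph`, `touchesLds` and `mayUseLds` are hypothetical names.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph, not LLVM API: all names are hypothetical.
struct CallGraph {
    std::map<std::string, std::vector<std::string>> callees; // function -> functions it calls
    std::set<std::string> touchesLds;                        // functions that reference LDS directly
};

// Returns true if `root`, or anything reachable from it, references LDS.
// A kernel for which this is false could have its 'use' of the struct dropped.
bool mayUseLds(const CallGraph &g, const std::string &root) {
    std::set<std::string> seen;
    std::vector<std::string> work{root};
    while (!work.empty()) {
        std::string f = work.back();
        work.pop_back();
        if (!seen.insert(f).second)
            continue; // already visited, which also terminates on recursion
        if (g.touchesLds.count(f))
            return true;
        auto it = g.callees.find(f);
        if (it == g.callees.end())
            continue; // leaf function with no recorded callees
        for (const auto &callee : it->second)
            work.push_back(callee);
    }
    return false;
}
```

Indirect calls are ignored here; a real implementation would have to treat them conservatively.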


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94648/new/

https://reviews.llvm.org/D94648


