[PATCH] D94648: [amdgpu] Implement lower function LDS pass
Jon Chesterfield via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 28 16:50:16 PST 2021
JonChesterfield added a comment.
I think that addresses all the comments other than the objections to the inline asm, which still passes zero to an "s"-constrained register.
The inline asm makes the kernel use the newly created LDS structure, so that passes like PromoteAlloca can see that it uses said structure instance. Some alternatives are:
- Modify PromoteAlloca (and any other passes that use the size of LDS) to look for the magic variable. This means spreading knowledge of this transform across other passes.
- Add an IR intrinsic, with SDag and GlobalISel lowering to a pseudo instruction, and pseudo expansion to a no-op. Semantically very similar to the inline asm.
- Metadata: mark kernels as using +N bytes of LDS beyond what their uses suggest.
- Alternative lowering / transform
There are various optimisations available, e.g. metadata marking functions as can't-use-lds, which can be propagated and used to drop the 'use' of the variable from some kernels, or indirection to allow putting variables at different offsets in different kernels.
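
To make the trade-off concrete, here is a rough toy model in Python of why the kernel needs a visible use of the new structure. It is not LLVM code: the Kernel class, the function names, the variable name module_lds_struct and the sizes of 16 and 64 bytes are all assumptions made up for illustration. It only compares the "explicit use" approach (which the inline asm provides) with the "+N bytes" metadata alternative, from the point of view of a pass that sizes LDS by looking at a kernel's uses.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Kernel:
    """Hypothetical stand-in for a kernel as seen by an LDS-sizing pass."""
    name: str
    # LDS variables the kernel visibly uses: variable name -> size in bytes.
    direct_uses: Dict[str, int] = field(default_factory=dict)
    # Models the metadata alternative: LDS bytes beyond what uses suggest.
    extra_lds_bytes: int = 0


# Assumed name and size of the structure created by the lowering pass.
STRUCT_NAME = "module_lds_struct"
STRUCT_SIZE = 64


def visible_lds_bytes(kernel: Kernel) -> int:
    """LDS size a pass would infer from visible uses plus any metadata."""
    return sum(kernel.direct_uses.values()) + kernel.extra_lds_bytes


def add_explicit_use(kernel: Kernel) -> Kernel:
    """Models the inline asm: the kernel now visibly uses the structure."""
    uses = dict(kernel.direct_uses)
    uses[STRUCT_NAME] = STRUCT_SIZE
    return Kernel(kernel.name, uses, kernel.extra_lds_bytes)


def add_size_metadata(kernel: Kernel) -> Kernel:
    """Models the metadata option: record the extra bytes, add no use."""
    return Kernel(kernel.name, dict(kernel.direct_uses),
                  kernel.extra_lds_bytes + STRUCT_SIZE)


# A kernel with 16 bytes of its own LDS whose callees use the structure.
plain = Kernel("kern", {"kernel_local": 16})
with_use = add_explicit_use(plain)
with_metadata = add_size_metadata(plain)
```

In this model the untouched kernel under-reports its LDS, because the structure is only reached through callees. Both alternatives give the same total; the difference is that the metadata option only works if every size-computing pass knows to read the extra field.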
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94648/new/
https://reviews.llvm.org/D94648