[PATCH] D94648: [amdgpu] Implement lower function LDS pass
Jon Chesterfield via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 24 15:28:16 PST 2021
JonChesterfield added a comment.
Working again now. The mailing list revealed that the new pass manager isn't used for backends yet, so the last patch dropped the invocation from the opt pipeline. Left the plumbing in place, so the pass can still be run with the new pass manager, as in the tests. When the new pass manager is used for the amdgcn backend, we can slot this pass in at roughly the same place it runs now.
With this patch and amdgpu-fixed-function-abi=true, most of the generic OpenMP kernels in the aomp test suite pass with the function pointer hashing scheme disabled. That isn't quite the same as most generic kernels passing with trunk clang, though those that don't use printf or malloc would be expected to pass.
Started on the path to making this safe to run repeatedly, with more LDS introduced between runs. Running repeatedly makes the rewrite to access at 0 + offset unsafe. Would instead need to emit uses of the module variable directly, and patch allocateLDSGlobal to treat that specific variable as fine to access from within a non-kernel function, as well as allocating it at zero as we presently do. That would be equivalent to the current scheme, except that it is slightly more obvious what is going on in the IR, in exchange for being a more invasive change to the back end.
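A rough sketch of why offsets baked in as 0 + offset can go stale when more LDS shows up between runs. This is a toy model in Python, not the pass itself: the LdsVar type, the variable names, and the sort-by-alignment layout rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LdsVar:
    """Hypothetical stand-in for an LDS global: a name, a size and an alignment."""
    name: str
    size: int   # bytes
    align: int  # bytes, power of two


def align_to(offset, align):
    """Round offset up to the next multiple of align (align is a power of two)."""
    return (offset + align - 1) & ~(align - 1)


def layout_module_struct(variables):
    """Pack variables into one struct that is assumed to be allocated at address 0.

    Assumed layout rule: largest alignment first, ties broken by name, so the
    result is deterministic and padding stays small.
    """
    ordered = sorted(variables, key=lambda v: (-v.align, v.name))
    offsets = {}
    cursor = 0
    for v in ordered:
        cursor = align_to(cursor, v.align)
        offsets[v.name] = cursor
        cursor += v.size
    max_align = max((v.align for v in ordered), default=1)
    return offsets, align_to(cursor, max_align)


# First run: three variables are packed and their uses rewritten to 0 + offset.
first_vars = [LdsVar("flag", 1, 1), LdsVar("buffer", 64, 16), LdsVar("counter", 4, 4)]
first_offsets, first_size = layout_module_struct(first_vars)

# More LDS is introduced, then the layout is recomputed on a second run.
second_vars = first_vars + [LdsVar("scratch", 32, 32)]
second_offsets, second_size = layout_module_struct(second_vars)

# Any use that hard-coded the first-run offset of "buffer" now points elsewhere.
stale = {n: (first_offsets[n], second_offsets[n])
         for n in first_offsets if first_offsets[n] != second_offsets[n]}
```

In this model every first-run offset moves once "scratch" is added, which is the hazard with constant offsets. A use expressed against the module variable itself would be re-resolved against whatever layout is current.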
That change, plus renaming the variable if already present, would be correct for multiple invocations. Cleaner would be to also replace the module variable with scalars, SROA fashion, before starting the pass, to avoid the nested struct buildup. I'd like to leave those revisions for later, as this patch is already a couple of months in.
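To illustrate the nested struct buildup and the SROA-style alternative, here is a toy model in Python. The run_pass and flatten helpers and the tuple encoding of a struct are hypothetical and only mimic the shape of the problem, not the actual IR transformation.

```python
def run_pass(module_var, new_vars):
    """Model one invocation: wrap the existing module variable (if any)
    together with the newly found variables into a fresh struct."""
    fields = ([module_var] if module_var is not None else []) + list(new_vars)
    return ("struct", fields)


def flatten(node):
    """Split a (possibly nested) struct back into its scalar members, in order."""
    if isinstance(node, tuple) and node[0] == "struct":
        scalars = []
        for field in node[1]:
            scalars.extend(flatten(field))
        return scalars
    return [node]


# First invocation packs "a" and "b".
first = run_pass(None, ["a", "b"])

# Naive second invocation: the old struct becomes a member of the new one.
nested = run_pass(first, ["c"])

# Alternative: split the old module variable into scalars first, then pack everything.
flat = run_pass(None, flatten(first) + ["c"])
```

Here nested comes out as a struct containing a struct, gaining one level per invocation, while flat stays a single level however many times the pass runs.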
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D94648/new/
https://reviews.llvm.org/D94648