[PATCH] D109594: [AMDGPU] Initialize LDS pointers after alloca, but before call.

Jon Chesterfield via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 14 04:09:56 PDT 2021


JonChesterfield added a comment.

Modifying exec from IR is right out. I was referring back to the internal discussion on how to implement this where that was briefly considered.

Given we now know more than when the original decision was taken, let us revisit that decision and not split the block. MIR is a point where this and all other uniform stores could be optimised for power consumption by masking exec, without changing IR optimisations.

Moving all allocas to the entry block was proposed on the mailing list some years ago and implemented in at least one out-of-tree target. I suspect it is not already done in-tree because codegen is more efficient on most CPU targets without it. I doubt it is especially complicated to implement: the entry block dominates every other block in the function, so a hoisted alloca still dominates all of its uses.
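To make the shape of that transform concrete, here is a minimal sketch on a toy IR. It does not use LLVM's classes: `Inst`, `Block`, `Function` and `hoistAllocasToEntry` are hypothetical names invented for illustration. It assumes that only constant-sized allocas are moved (a dynamic size operand may not be available in the entry block) and it ignores allocas inside loops, where hoisting would merge separate allocations into one.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy IR for illustration only; these are not LLVM types.
struct Inst {
    std::string text;   // printable form of the instruction
    bool isAlloca;      // true for stack allocations
    bool constantSize;  // true when the allocation size is a compile-time constant
};

struct Block {
    std::string name;
    std::vector<Inst> insts;
};

// By convention in this sketch, blocks[0] is the entry block.
struct Function {
    std::vector<Block> blocks;
};

// Move constant-sized allocas out of non-entry blocks and place them in the
// entry block, directly after the allocas that already lead it.
// Returns the number of instructions moved.
std::size_t hoistAllocasToEntry(Function &f) {
    if (f.blocks.empty())
        return 0;

    Block &entry = f.blocks.front();

    // Insertion point: just past the leading run of allocas in the entry block.
    std::size_t insertAt = 0;
    while (insertAt < entry.insts.size() && entry.insts[insertAt].isAlloca)
        ++insertAt;

    std::size_t moved = 0;
    for (std::size_t b = 1; b < f.blocks.size(); ++b) {
        std::vector<Inst> kept;
        for (const Inst &inst : f.blocks[b].insts) {
            if (inst.isAlloca && inst.constantSize) {
                // The entry block dominates this block, so every use of the
                // alloca remains dominated by its new position.
                entry.insts.insert(
                    entry.insts.begin() + static_cast<std::ptrdiff_t>(insertAt),
                    inst);
                ++insertAt;
                ++moved;
            } else {
                // Dynamically sized allocas and all other instructions stay put.
                kept.push_back(inst);
            }
        }
        f.blocks[b].insts = kept;
    }
    return moved;
}
```

A real pass would work on the in-tree IR and would also have to decide what to do about dynamically sized allocas and lifetime markers; the sketch only shows why dominance makes the basic move straightforward.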

I'm not convinced the transform being enabled here is necessary, and have previously outlined a variety of alternatives which you remain unwilling to consider. My interest here is solely in avoiding your "temporary hacks" breaking OpenMP.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109594/new/

https://reviews.llvm.org/D109594


