[llvm] [AMDGPU][InsertWaitCnts] Optimize loadcnt insertion at function boundaries (PR #169647)
Jay Foad via llvm-commits
llvm-commits at lists.llvm.org
Wed Nov 26 07:38:31 PST 2025
jayfoad wrote:
Looks good overall, but the implementation is more complicated than it needs to be because of a pre-existing issue: on GFX12+ we never use VMEM_ACCESS, only VMEM_READ_ACCESS. (@Pierre-vh it might be a nice cleanup to combine those two into a single WaitEventType, since no subtarget uses both of them.)
https://github.com/llvm/llvm-project/pull/169647