[llvm] [AMDGPU][InsertWaitCnts] Optimize loadcnt insertion at function boundaries (PR #169647)

Pankaj Dwivedi via llvm-commits llvm-commits at lists.llvm.org
Wed Nov 26 08:13:44 PST 2025


PankajDwivedi-25 wrote:

> Looks good overall, but the implementation is more complicated than it needs to be because of a pre-existing issue: on GFX12+ we never use VMEM_ACCESS, only VMEM_READ_ACCESS.

Then I should probably avoid checking both VMEM_ACCESS and VMEM_READ_ACCESS here, or that cleanup can go in a separate patch.


https://github.com/llvm/llvm-project/pull/169647

