[llvm] [AMDGPU] Add DS loop wait optimization infrastructure (1/4) (PR #171942)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 12 03:18:23 PST 2025


jayfoad wrote:

I do have some high-level concerns about the whole series:
1. It's highly specific to your use case. I don't see why we can't do the same optimization for _all_ wait types in _all_ loops (or at least all inner loops). Why only DS? Why only loops with lots of WMMA? Etc.
2. It adds a lot of new code that is not integrated into the existing flow. For example, we already have FlushVmCnt, which does something pretty similar, but the implementation is completely separate.

https://github.com/llvm/llvm-project/pull/171942

