[llvm] [AMDGPU] Insert waitcnt for non-global fence release in GFX12 (PR #159282)

Fabian Ritter via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 19 06:12:39 PDT 2025


ritter-x2a wrote:

@Pierre-vh 

> check if other CacheControl classes have a similar issue

No. The early return for globals was only present for gfx12; the previous generations always called `insertWait`, so they don't have the issue.
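
As a rough illustration of the control-flow difference described above, here is a toy model. It is only a sketch, not the actual SIMemoryLegalizer code: the `AddrSpace` enum, the `release*` function names, and the `"writeback"`/`"wait"` strings are all hypothetical stand-ins, and the assumption that the gfx12 early return sat in front of both a cache writeback and the wait is inferred from the PR title, not copied from the source.

```cpp
#include <string>
#include <vector>

// Hypothetical toy model; none of these names exist in LLVM.
enum class AddrSpace { Global, Scratch, LDS };

// Sequence of (symbolic) operations emitted for a release fence.
using Emitted = std::vector<std::string>;

// Pre-gfx12 shape as described in the thread: the wait is always emitted,
// regardless of which address space the fence covers.
Emitted releasePreGfx12(AddrSpace) {
  Emitted Out;
  Out.push_back("wait");
  return Out;
}

// Assumed gfx12 shape before the fix: the early return for fences that do
// not cover the global address space also skips the wait.
Emitted releaseGfx12Before(AddrSpace AS) {
  Emitted Out;
  if (AS != AddrSpace::Global)
    return Out; // early return: nothing emitted, including no wait
  Out.push_back("writeback");
  Out.push_back("wait");
  return Out;
}

// Assumed gfx12 shape after the fix: only the writeback is conditional,
// the wait is emitted on every path.
Emitted releaseGfx12After(AddrSpace AS) {
  Emitted Out;
  if (AS == AddrSpace::Global)
    Out.push_back("writeback");
  Out.push_back("wait");
  return Out;
}
```

In this model, a scratch-only release fence yields no operations in `releaseGfx12Before` but a single wait in both `releasePreGfx12` and `releaseGfx12After`, which is the gap the PR title refers to.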

> if the acquire equivalent has a similar issue too

Also no (mostly): the waitcnt for the acquire case is inserted unconditionally, even before `insertAcquire` is called.
However, we currently wouldn't emit a GLOBAL_INV for scratch fences with GloballyAddressableScratch, which seems wrong to me. The updated PR changes that as well.

https://github.com/llvm/llvm-project/pull/159282

