[llvm] [AMDGPU] Insert waitcnt for non-global fence release in GFX12 (PR #159282)
Fabian Ritter via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 22 04:44:56 PDT 2025
================
@@ -2522,8 +2522,7 @@ bool SIGfx12CacheControl::insertRelease(MachineBasicBlock::iterator &MI,
   // sequentially consistent, and no other thread can access scratch
   // memory.
-  // Other address spaces do not have a cache.
-  if ((AddrSpace & SIAtomicAddrSpace::GLOBAL) == SIAtomicAddrSpace::NONE)
+  if (AddrSpace == SIAtomicAddrSpace::SCRATCH)
     return false;
----------------
ritter-x2a wrote:
@Pierre-vh
> I think a better fix is to move the code surrounding the switch below into something like if((AddrSpace & SIAtomicAddrSpace::GLOBAL) != SIAtomicAddrSpace::NONE || ((AddrSpace & SIAtomicAddrSpace::SCRATCH) != SIAtomicAddrSpace::NONE) && ST.hasGloballyAddressableScratch())
It looks like you recently ensured with #154710 that atomic scratch instructions are replaced by flat instructions when the subtarget has GloballyAddressableScratch, so the scratch part of the suggested condition should never trigger, right?
So wouldn't this be better handled with an `assert(!ST.hasGloballyAddressableScratch() || ((AddrSpace & SIAtomicAddrSpace::GLOBAL) != SIAtomicAddrSpace::NONE) || (AddrSpace & SIAtomicAddrSpace::SCRATCH) == SIAtomicAddrSpace::NONE)`?
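
To make the difference between the two early-return conditions and the proposed assert concrete, here is a minimal standalone sketch. It does not use LLVM headers: `AddrSpaceMask` is a hypothetical stand-in for `SIAtomicAddrSpace`, its bit values are assumptions, and `skipsReleaseOld`, `skipsReleaseNew`, and `assertHolds` are illustrative names that do not exist in the patch.

```cpp
#include <cstdint>

// Hypothetical stand-in for SIAtomicAddrSpace; the bit values are assumed
// for this sketch and are not taken from the LLVM sources.
enum class AddrSpaceMask : uint32_t {
  NONE = 0u,
  GLOBAL = 1u << 0,
  LDS = 1u << 1,
  SCRATCH = 1u << 2,
};

constexpr AddrSpaceMask operator&(AddrSpaceMask A, AddrSpaceMask B) {
  return static_cast<AddrSpaceMask>(static_cast<uint32_t>(A) &
                                    static_cast<uint32_t>(B));
}

constexpr AddrSpaceMask operator|(AddrSpaceMask A, AddrSpaceMask B) {
  return static_cast<AddrSpaceMask>(static_cast<uint32_t>(A) |
                                    static_cast<uint32_t>(B));
}

// Removed condition: return early whenever GLOBAL is not part of the mask.
constexpr bool skipsReleaseOld(AddrSpaceMask AS) {
  return (AS & AddrSpaceMask::GLOBAL) == AddrSpaceMask::NONE;
}

// Added condition: return early only when the mask is exactly SCRATCH.
constexpr bool skipsReleaseNew(AddrSpaceMask AS) {
  return AS == AddrSpaceMask::SCRATCH;
}

// Predicate of the proposed assert: with globally addressable scratch,
// a mask containing SCRATCH is expected to contain GLOBAL as well.
constexpr bool assertHolds(bool HasGloballyAddressableScratch,
                           AddrSpaceMask AS) {
  return !HasGloballyAddressableScratch ||
         (AS & AddrSpaceMask::GLOBAL) != AddrSpaceMask::NONE ||
         (AS & AddrSpaceMask::SCRATCH) == AddrSpaceMask::NONE;
}

// An LDS-only release returned early before the change, but not after it.
static_assert(skipsReleaseOld(AddrSpaceMask::LDS), "old: LDS-only skipped");
static_assert(!skipsReleaseNew(AddrSpaceMask::LDS), "new: LDS-only handled");

// Scratch-only still returns early in both versions.
static_assert(skipsReleaseOld(AddrSpaceMask::SCRATCH), "old: scratch skipped");
static_assert(skipsReleaseNew(AddrSpaceMask::SCRATCH), "new: scratch skipped");

// The assert would only fire for SCRATCH without GLOBAL on a subtarget
// with globally addressable scratch.
static_assert(!assertHolds(true, AddrSpaceMask::SCRATCH), "assert fires");
static_assert(assertHolds(true, AddrSpaceMask::SCRATCH | AddrSpaceMask::GLOBAL),
              "assert holds when GLOBAL is included");
static_assert(assertHolds(false, AddrSpaceMask::SCRATCH),
              "assert holds without globally addressable scratch");
```

The `static_assert`s are checked at compile time, so the file only needs to compile for the cases to be verified. Whether the scratch-only case is actually unreachable with GloballyAddressableScratch depends on #154710 behaving as described above.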
https://github.com/llvm/llvm-project/pull/159282