[llvm] [AMDGPU] introduce S_WAITCNT_FENCE_soft emitted by memory legalizer (PR #150167)
Pierre van Houtryve via llvm-commits
llvm-commits at lists.llvm.org
Thu Jul 24 03:52:04 PDT 2025
================
@@ -1381,6 +1383,32 @@ bool WaitcntGeneratorPreGFX12::applyPreexistingWaitcnt(
Modified = true;
} else
WaitcntInstr = &II;
+ } else if (Opcode == AMDGPU::S_WAITCNT_FENCE_soft) {
+ // Each direct load to LDS is also a store to LDS, but we do not have a
+ // separate counter for it. Instead these operations increment LOAD_CNT
+ // and need to be waited for at a release fence. So we treat a release
+ // fence as if it depends on any previous LDS DMA stores.
+ unsigned Ordering =
+ TII->getNamedOperand(II, AMDGPU::OpName::Ordering)->getImm();
+ unsigned Scope =
+ TII->getNamedOperand(II, AMDGPU::OpName::Scope)->getImm();
+ unsigned AddrSpace =
+ TII->getNamedOperand(II, AMDGPU::OpName::AddrSpace)->getImm();
+ if (isReleaseOrStronger((AtomicOrdering)Ordering) &&
----------------
Pierre-vh wrote:
Thinking about it, this part bothers me a bit because now InsertWaitCnt has to be aware of atomic orderings and deal with them accordingly. It blurs the separation of concerns between this pass and the MemoryLegalizer.
I know there is a good argument for doing that, but I think this is too generic for what we need at this stage. It's something that needs a lot of planning beforehand (and it's an item on my to-do list, though at a lower priority).
Can we consider adding a simple `s_wait_lds_dma_soft` instead, targeted exactly at this use case, and emit that? I would prefer making the minimum number of changes, and then removing that pseudo later in favor of a generic one, over locking us into a specific approach right now.
I think what I'm afraid of is that this sets a precedent, and over time I suspect we'll rely more and more on this pseudo here and elsewhere (e.g. instead of fixing something properly, we just check the pseudo elsewhere and hack a fix there instead), and end up with the memory model implementation being spread over multiple files, which will make it difficult to manage.
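To make the trade-off concrete, here is a toy sketch of the two designs. It is not LLVM code: every type and function name below (`ToyOrdering`, `ToyFencePseudo`, `toyIsReleaseOrStronger`, and so on) is hypothetical and only models where the ordering check would live. With the generic fence pseudo, the wait-count side has to interpret atomic orderings itself; with a dedicated marker such as the suggested `s_wait_lds_dma_soft`, that decision would stay in the legalizer and the wait-count side would only react to the marker.

```cpp
#include <cassert>

// Toy model only. None of these are LLVM classes or functions.
enum class ToyOrdering { Monotonic, Acquire, Release, AcquireRelease, SeqCst };

// Stand-in for an ordering predicate; written out here so the sketch is
// self-contained.
constexpr bool toyIsReleaseOrStronger(ToyOrdering O) {
  return O == ToyOrdering::Release || O == ToyOrdering::AcquireRelease ||
         O == ToyOrdering::SeqCst;
}

// Design A: a generic fence pseudo that carries the ordering and whether the
// fence covers LDS. The wait-count side must decode the memory-model meaning.
struct ToyFencePseudo {
  ToyOrdering Ord;
  bool CoversLDS;
};

constexpr bool waitcntNeedsLdsDmaWaitGeneric(ToyFencePseudo F) {
  // Memory-model knowledge leaks into the wait-count side here.
  return F.CoversLDS && toyIsReleaseOrStronger(F.Ord);
}

// Design B: the legalizer side makes the decision and emits a dedicated
// marker (modeled as a bool) only when an LDS DMA wait is required.
constexpr bool legalizerEmitsLdsDmaMarker(ToyOrdering Ord, bool CoversLDS) {
  return CoversLDS && toyIsReleaseOrStronger(Ord);
}

// The wait-count side only checks for the marker; no ordering logic needed.
constexpr bool waitcntNeedsLdsDmaWaitDedicated(bool MarkerPresent) {
  return MarkerPresent;
}

// Both designs reach the same decision; they differ in which side owns it.
inline bool designsAgree(ToyOrdering Ord, bool CoversLDS) {
  bool Generic = waitcntNeedsLdsDmaWaitGeneric(ToyFencePseudo{Ord, CoversLDS});
  bool Dedicated =
      waitcntNeedsLdsDmaWaitDedicated(legalizerEmitsLdsDmaMarker(Ord, CoversLDS));
  return Generic == Dedicated;
}
```

In this model the two designs are functionally equivalent for the LDS DMA case; the difference is only that Design A puts the ordering check in the wait-count side, which is the blurring of responsibilities described above.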
https://github.com/llvm/llvm-project/pull/150167