[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

Jay Foad via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 7 06:48:57 PST 2024


jayfoad wrote:

> This logic would need updating again for GFX12. It seems like it's duplicating a lot of knowledge which is already implemented in SIInsertWaitcnts.

Just to demonstrate, you could implement this feature in SIInsertWaitcnts for **all** supported architectures with something like:
```diff
diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 6ecb1c8bf6e1..910cd094f8f2 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -2299,6 +2299,12 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
 
     updateEventWaitcntAfter(Inst, &ScoreBrackets);
 
+    AMDGPU::Waitcnt Wait =
+        AMDGPU::Waitcnt::allZeroExceptVsCnt(ST->hasExtendedWaitCounts());
+    ScoreBrackets.simplifyWaitcnt(Wait);
+    Modified |= generateWaitcnt(Wait, std::next(Inst.getIterator()), Block,
+                                ScoreBrackets, /*OldWaitcntInstr=*/nullptr);
+
 #if 0 // TODO: implement resource type check controlled by options with ub = LB.
     // If this instruction generates a S_SETVSKIP because it is an
     // indexed resource, and we are on Tahiti, then it will also force
```
Handling VSCNT/STORECNT correctly is a little more complicated, but not by much.
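
For readers unfamiliar with the pass, the idea behind the patch can be sketched as a toy model outside LLVM: after every instruction, request a wait of zero on every counter except the store counter, drop the requests for counters that have nothing outstanding, and emit a wait only if something remains. Everything in this sketch (`ToyInst`, `ToyWait`, `simplify`, `insertWaits`, the counter names) is hypothetical and is not the SIInsertWaitcnts API; it assumes, as the name suggests, that `allZeroExceptVsCnt` leaves the store counter unconstrained.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Toy counters; the names only loosely mirror the hardware counters.
enum ToyCounter { VM_CNT, LGKM_CNT, EXP_CNT, VS_CNT, NUM_COUNTERS };

// Sentinel meaning "no wait requested" or "instruction touches no counter".
constexpr int NoWait = -1;

// Hypothetical instruction: a name plus the counter it increments.
struct ToyInst {
  std::string Name;
  int Counter; // a ToyCounter value, or NoWait
};

// For each counter: maximum outstanding operations allowed, or NoWait.
using ToyWait = std::array<int, NUM_COUNTERS>;

// Toy analogue of the request in the patch: zero on all but the store counter.
ToyWait allZeroExceptVsCnt() { return {0, 0, 0, NoWait}; }

// Drop every request that the current pending counts already satisfy.
void simplify(ToyWait &Wait, const std::array<int, NUM_COUNTERS> &Pending) {
  for (int C = 0; C < NUM_COUNTERS; ++C)
    if (Wait[C] != NoWait && Pending[C] <= Wait[C])
      Wait[C] = NoWait;
}

// Return the block with a textual wait inserted after each instruction
// that leaves something outstanding on a counter we wait for.
std::vector<std::string> insertWaits(const std::vector<ToyInst> &Block) {
  static const char *Names[NUM_COUNTERS] = {"vmcnt", "lgkmcnt", "expcnt",
                                            "vscnt"};
  std::array<int, NUM_COUNTERS> Pending{}; // outstanding ops per counter
  std::vector<std::string> Out;
  for (const ToyInst &I : Block) {
    assert(I.Counter < NUM_COUNTERS && "unknown counter");
    Out.push_back(I.Name);
    if (I.Counter != NoWait)
      ++Pending[I.Counter];

    ToyWait Wait = allZeroExceptVsCnt();
    simplify(Wait, Pending);

    std::string Text;
    for (int C = 0; C < NUM_COUNTERS; ++C) {
      if (Wait[C] == NoWait)
        continue;
      Text += (Text.empty() ? "wait " : " ") + std::string(Names[C]) + "(" +
              std::to_string(Wait[C]) + ")";
      Pending[C] = Wait[C]; // after the wait, at most Wait[C] remain
    }
    if (!Text.empty())
      Out.push_back(Text);
  }
  return Out;
}
```

Calling `insertWaits` on a block containing a load, an ALU instruction, a store and an LDS read would place a wait after the load and after the LDS read only. The store is left without a wait in this toy model, which is the gap the VSCNT/STORECNT remark refers to.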

https://github.com/llvm/llvm-project/pull/79236

