[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)
Jay Foad via cfe-commits
cfe-commits at lists.llvm.org
Wed Feb 7 06:48:57 PST 2024
jayfoad wrote:
> This logic would need updating again for GFX12. It seems like it's duplicating a lot of knowledge which is already implemented in SIInsertWaitcnts.
Just to demonstrate, you could implement this feature in SIInsertWaitcnts for **all** supported architectures with something like:
```diff
diff --git a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
index 6ecb1c8bf6e1..910cd094f8f2 100644
--- a/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
+++ b/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
@@ -2299,6 +2299,12 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
 
     updateEventWaitcntAfter(Inst, &ScoreBrackets);
 
+    AMDGPU::Waitcnt Wait =
+        AMDGPU::Waitcnt::allZeroExceptVsCnt(ST->hasExtendedWaitCounts());
+    ScoreBrackets.simplifyWaitcnt(Wait);
+    Modified |= generateWaitcnt(Wait, std::next(Inst.getIterator()), Block,
+                                ScoreBrackets, /*OldWaitcntInstr=*/nullptr);
+
 #if 0 // TODO: implement resource type check controlled by options with ub = LB.
     // If this instruction generates a S_SETVSKIP because it is an
     // indexed resource, and we are on Tahiti, then it will also force
```
Handling VSCNT/STORECNT correctly is a little more complicated, but not by much.
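To illustrate why stores need the extra handling, here is a toy model of the patch's three steps: request zero on every counter except the store counter, drop the waits that are already satisfied, and emit a wait only if something is left. This is a sketch and not LLVM code: `ToyWait`, `ToyBrackets`, `allZeroExceptStore` and `waitAfter` are hypothetical names, and it assumes a target where stores have their own counter.

```cpp
#include <array>
#include <cstddef>

// Toy counter kinds; illustrative names, not LLVM's enum.
enum Counter { Load = 0, Export = 1, Lgkm = 2, Store = 3, NumCounters = 4 };

// Sentinel meaning "no wait requested for this counter".
constexpr unsigned NoWait = ~0u;

// Hypothetical stand-in for a wait request: one threshold per counter.
struct ToyWait {
  std::array<unsigned, NumCounters> Cnt{NoWait, NoWait, NoWait, NoWait};

  bool hasWait() const {
    for (unsigned C : Cnt)
      if (C != NoWait)
        return true;
    return false;
  }
};

// Hypothetical stand-in for the pass's bookkeeping: outstanding operations
// per counter.
struct ToyBrackets {
  std::array<unsigned, NumCounters> Pending{};

  void issue(Counter C) { ++Pending[C]; }

  // Drop any requested wait that is already satisfied.
  void simplify(ToyWait &W) const {
    for (std::size_t I = 0; I < NumCounters; ++I)
      if (W.Cnt[I] != NoWait && Pending[I] <= W.Cnt[I])
        W.Cnt[I] = NoWait;
  }

  // Model the effect of executing the wait.
  void apply(const ToyWait &W) {
    for (std::size_t I = 0; I < NumCounters; ++I)
      if (W.Cnt[I] != NoWait && Pending[I] > W.Cnt[I])
        Pending[I] = W.Cnt[I];
  }
};

// Request zero on every counter except the store counter.
ToyWait allZeroExceptStore() {
  ToyWait W;
  W.Cnt = {0u, 0u, 0u, NoWait};
  return W;
}

// Issue one operation, then decide whether a wait would follow it.
// Returns true if a wait would be emitted.
bool waitAfter(ToyBrackets &B, Counter C) {
  B.issue(C);
  ToyWait W = allZeroExceptStore();
  B.simplify(W);
  if (!W.hasWait())
    return false;
  B.apply(W);
  return true;
}
```

In this model a load is followed by a wait, but a store is not, because the request never covers the store counter. That is the gap the remark about VSCNT/STORECNT refers to.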
https://github.com/llvm/llvm-project/pull/79236