[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

Matt Arsenault via cfe-commits cfe-commits at lists.llvm.org
Thu Mar 28 11:13:32 PDT 2024


================
@@ -2326,6 +2326,20 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
     }
 #endif
 
+    if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) {
+      AMDGPU::Waitcnt Wait;
+      if (ST->hasExtendedWaitCounts())
+        Wait = AMDGPU::Waitcnt(0, 0, 0, 0, 0, 0, 0);
+      else
+        Wait = AMDGPU::Waitcnt(0, 0, 0, 0);
+
+      if (!Inst.mayStore())
+        Wait.StoreCnt = ~0u;
----------------
arsenm wrote:

gfx11 doesn't have the separate load/store counters; that's gfx12. I would expect this to only use lgkmcnt here.
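
For readers following the hunk above, here is a minimal standalone sketch of what the quoted lines do with the counters. It is not the patch itself. `ToyWaitcnt` and `preciseMemoryWait` are hypothetical names made up for illustration, the struct only mimics the four-counter form, and it assumes the convention that a field value of `~0u` means "no wait on this counter".

```cpp
// Hypothetical, simplified stand-in for the four-counter wait description.
// Assumption: ~0u in a field means "do not wait on this counter",
// and 0 means "wait until this counter drains to zero".
struct ToyWaitcnt {
  unsigned LoadCnt = ~0u;
  unsigned ExpCnt = ~0u;
  unsigned DsCnt = ~0u;
  unsigned StoreCnt = ~0u;
};

// Mirrors the shape of the quoted hunk: start by waiting on every counter,
// then drop the store counter when the instruction cannot store.
ToyWaitcnt preciseMemoryWait(bool MayStore) {
  ToyWaitcnt Wait;
  Wait.LoadCnt = 0;
  Wait.ExpCnt = 0;
  Wait.DsCnt = 0;
  Wait.StoreCnt = 0;
  if (!MayStore)
    Wait.StoreCnt = ~0u; // no store in flight from this instruction
  return Wait;
}
```

With this sketch, `preciseMemoryWait(false)` still waits on the load, export, and DS counters, which is the behaviour the review comment questions: it only relaxes the store counter and does not narrow the wait to a single counter based on the kind of memory access.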

https://github.com/llvm/llvm-project/pull/79236

