[llvm] [clang] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #68932)

Jay Foad via cfe-commits cfe-commits at lists.llvm.org
Wed Nov 22 03:28:56 PST 2023


================
@@ -1708,6 +1710,19 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
     }
 
     ++Iter;
+    if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) {
+      auto Builder =
+          BuildMI(Block, Iter, DebugLoc(), TII->get(AMDGPU::S_WAITCNT))
+              .addImm(0);
+      if (IsGFX10Plus) {
+        Builder =
+            BuildMI(Block, Iter, DebugLoc(), TII->get(AMDGPU::S_WAITCNT_VSCNT))
+                .addReg(AMDGPU::SGPR_NULL, RegState::Undef)
+                .addImm(0);
+      }
+      OldWaitcntInstr = Builder.getInstr();
----------------
jayfoad wrote:

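Nit: if you're going to set OldWaitcntInstr, it really ought to point to the first waitcnt in the sequence, not the last. As written, on GFX10+ Builder is reassigned, so OldWaitcntInstr ends up pointing at the S_WAITCNT_VSCNT rather than the S_WAITCNT before it.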
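
To illustrate the idea outside of LLVM, here is a minimal standalone sketch. It is a toy model, not the pass itself: `Block` and `insertWaits` are hypothetical names, and a `std::list` of opcode strings stands in for the machine basic block. The only point is that the helper remembers the first instruction it inserts and returns that, whether or not a second one follows.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>

// Hypothetical stand-in for a basic block: an ordered list of opcode names.
using Block = std::list<std::string>;

// Inserts the wait sequence before `pos` and returns an iterator to the FIRST
// inserted instruction, so a caller tracking the sequence sees its start.
Block::iterator insertWaits(Block &block, Block::iterator pos,
                            bool isGFX10Plus) {
  // std::list::insert returns an iterator to the newly inserted element.
  Block::iterator first = block.insert(pos, std::string("S_WAITCNT"));
  if (isGFX10Plus) {
    // Second instruction of the sequence; deliberately not recorded.
    block.insert(pos, std::string("S_WAITCNT_VSCNT"));
  }
  return first;
}
```

In the real patch the equivalent change would presumably be to capture the result of the first BuildMI call before building the second, but that is an assumption about how the author might restructure it.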

https://github.com/llvm/llvm-project/pull/68932

