[llvm] AMDGPU/GFX12: Insert waitcnts before stores with scope_sys (PR #82996)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 26 05:51:49 PST 2024


================
@@ -2381,6 +2387,34 @@ bool SIGfx12CacheControl::enableVolatileAndOrNonTemporal(
   return Changed;
 }
 
+bool SIGfx12CacheControl::expandSystemScopeStore(
+    MachineBasicBlock::iterator &MI) const {
+
+  MachineOperand *CPol = TII->getNamedOperand(*MI, OpName::cpol);
+  if (CPol && ((CPol->getImm() & CPol::SCOPE) == CPol::SCOPE_SYS)) {
+    // Stores with system scope (SCOPE_SYS) need to wait for:
+    // - loads or atomics(returning) - wait for {LOAD|SAMPLE|BVH|KM}CNT==0
+    // - non-returning-atomics       - wait for STORECNT==0
+    //   TODO: SIInsertWaitcnts will not always be able to remove STORECNT waits
+    //   since it does not distinguish atomics-with-return from regular stores.
+
+    // There is no need to wait if memory is cached (mtype != UC).
+    // For example shader-visible memory is cached.
+    // TODO: implement flag for frontend to give us a hint not to insert waits.
+    MachineBasicBlock &MBB = *MI->getParent();
+    DebugLoc DL = MI->getDebugLoc();
----------------
arsenm wrote:

Make this a const ref: `const DebugLoc &DL = MI->getDebugLoc();`
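
For illustration, here is a minimal standalone sketch of why the reference binding is preferable on the `DebugLoc DL = MI->getDebugLoc();` line. `MachineInstr::getDebugLoc()` returns a `const DebugLoc &`, so binding it by value makes a copy that a const reference avoids. The types `FakeDebugLoc` and `FakeInstr` and the counter `CopyCount` are hypothetical stand-ins invented for this example, not LLVM classes; the real `DebugLoc` copy cost is not measured here.

```cpp
#include <cassert>

// Hypothetical counter, used only to make copies observable in this sketch.
int CopyCount = 0;

// Hypothetical stand-in for llvm::DebugLoc with an instrumented copy constructor.
struct FakeDebugLoc {
  int Line = 0;
  FakeDebugLoc() = default;
  explicit FakeDebugLoc(int L) : Line(L) {}
  FakeDebugLoc(const FakeDebugLoc &Other) : Line(Other.Line) { ++CopyCount; }
};

// Hypothetical stand-in for MachineInstr. As with the real accessor, the
// location is handed out by const reference rather than by value.
struct FakeInstr {
  FakeDebugLoc Loc{42};
  const FakeDebugLoc &getDebugLoc() const { return Loc; }
};

// Binding by value, as in the hunk under review: invokes the copy constructor.
int copiesWhenBoundByValue() {
  FakeInstr MI;
  CopyCount = 0;
  FakeDebugLoc DL = MI.getDebugLoc();
  assert(DL.Line == 42);
  (void)DL;
  return CopyCount;
}

// Binding by const reference, as suggested: no copy is made.
int copiesWhenBoundByConstRef() {
  FakeInstr MI;
  CopyCount = 0;
  const FakeDebugLoc &DL = MI.getDebugLoc();
  assert(DL.Line == 42);
  (void)DL;
  return CopyCount;
}
```

The reference stays valid for as long as the instruction it came from is alive, which should hold here as long as `DL` is only used to build the new wait instructions before `MI` is modified or erased.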

https://github.com/llvm/llvm-project/pull/82996

