[llvm] AMDGPU/GFX12: Insert waitcnts before stores with scope_sys (PR #82996)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 26 05:51:49 PST 2024
================
@@ -2381,6 +2387,34 @@ bool SIGfx12CacheControl::enableVolatileAndOrNonTemporal(
   return Changed;
 }
+bool SIGfx12CacheControl::expandSystemScopeStore(
+    MachineBasicBlock::iterator &MI) const {
+
+  MachineOperand *CPol = TII->getNamedOperand(*MI, OpName::cpol);
+  if (CPol && ((CPol->getImm() & CPol::SCOPE) == CPol::SCOPE_SYS)) {
+    // Stores with system scope (SCOPE_SYS) need to wait for:
+    // - loads or atomics(returning) - wait for {LOAD|SAMPLE|BVH|KM}CNT==0
+    // - non-returning-atomics - wait for STORECNT==0
+    //   TODO: SIInsertWaitcnts will not always be able to remove STORECNT waits
+    //   since it does not distinguish atomics-with-return from regular stores.
+
+    // There is no need to wait if memory is cached (mtype != UC).
+    // For example shader-visible memory is cached.
+    // TODO: implement flag for frontend to give us a hint not to insert waits.
+    MachineBasicBlock &MBB = *MI->getParent();
+    DebugLoc DL = MI->getDebugLoc();
----------------
arsenm wrote:
Use a const ref here (`const DebugLoc &DL`).
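
A minimal sketch of what the suggestion avoids, assuming the getter hands back a const reference (as `MachineInstr::getDebugLoc()` does). `Loc`, `Instr`, and `copyCount` are hypothetical stand-ins invented for illustration, not LLVM types; the copy counter only makes the extra copy visible.

```cpp
#include <cassert>

// Hypothetical global copy counter, used only to make copies observable.
inline int &copyCount() {
  static int N = 0;
  return N;
}

// Hypothetical stand-in for llvm::DebugLoc; counts copy constructions.
struct Loc {
  int Line = 0;
  Loc() = default;
  explicit Loc(int L) : Line(L) {}
  Loc(const Loc &Other) : Line(Other.Line) { ++copyCount(); }
  Loc &operator=(const Loc &) = default;
};

// Hypothetical stand-in for MachineInstr; the getter returns a const
// reference to the stored location.
struct Instr {
  Loc DL{42};
  const Loc &getDebugLoc() const { return DL; }
};

// As written in the patch: the local is a fresh copy of the location.
inline int lineByValue(const Instr &I) {
  Loc DL = I.getDebugLoc();
  return DL.Line;
}

// As suggested in review: bind to the existing object, no copy.
inline int lineByConstRef(const Instr &I) {
  const Loc &DL = I.getDebugLoc();
  return DL.Line;
}
```

Both helpers read the same value; only `lineByValue` goes through the copy constructor. The reference stays valid as long as the instruction it came from is alive.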
https://github.com/llvm/llvm-project/pull/82996