[llvm] AMDGPU/GFX12: Insert waitcnts before stores with scope_sys (PR #82996)

Jay Foad via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 28 06:23:59 PST 2024


================
@@ -589,6 +593,16 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
   bool setScope(const MachineBasicBlock::iterator MI,
                 AMDGPU::CPol::CPol Value) const;
 
+  // Stores with system scope (SCOPE_SYS) need to wait for:
+  // - loads or atomics(returning) - wait for {LOAD|SAMPLE|BVH|KM}CNT==0
+  // - non-returning-atomics       - wait for STORECNT==0
+  //   TODO: SIInsertWaitcnts will not always be able to remove STORECNT waits
+  //   since it does not distinguish atomics-with-return from regular stores.
+  // There is no need to wait if memory is cached (mtype != UC).
+  // For example shader-visible memory is cached.
----------------
jayfoad wrote:

I don't understand the statement that "shader-visible memory is cached". We are compiling a shader, so surely any memory the shader refers to is "shader-visible". Why, then, do we need to worry about uncached memory?
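
To make the rule in the quoted comment concrete, here is a minimal standalone sketch of how I read it. Every name in it (`Scope`, `MType`, `Counter`, `waitsBeforeStore`) is hypothetical and is not the AMDGPU backend's API. It also takes the memory type as an input, which is an assumption made purely for illustration; whether the compiler can actually know that is part of what I am asking about.

```cpp
#include <cstdint>

// Hypothetical stand-ins for illustration only; not LLVM's types.
enum class Scope { CU, SE, DEV, SYS };
enum class MType { UC, Cached };

// Bit flags for the counters named in the quoted comment.
enum Counter : std::uint8_t {
  LOADCNT = 1 << 0,
  SAMPLECNT = 1 << 1,
  BVHCNT = 1 << 2,
  KMCNT = 1 << 3,
  STORECNT = 1 << 4,
};

// Returns the set of counters that would have to reach zero before a store,
// following the quoted comment: only system-scope stores to uncached memory
// wait, and they wait on both the load-side counters and STORECNT.
std::uint8_t waitsBeforeStore(Scope S, MType M) {
  if (S != Scope::SYS)
    return 0; // Narrower scopes: the comment describes no extra wait.
  if (M != MType::UC)
    return 0; // Cached memory: "no need to wait" per the comment.
  return LOADCNT | SAMPLECNT | BVHCNT | KMCNT | STORECNT;
}
```

Under this reading, if all memory a shader touches were cached, the second early return would always fire and the waits would never be needed.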

https://github.com/llvm/llvm-project/pull/82996

