[llvm] [Doc][AMDGPU] Document the waitcnts required before SCOPE_SYS stores on GFX12 (PR #156424)

Pierre van Houtryve via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 3 00:26:55 PDT 2025


================
@@ -14510,6 +14510,14 @@ For GFX12:
 * A memory attached last level (MALL) cache exists for GPU memory.
   The MALL cache is fully coherent with GPU memory and has no impact on system
   coherence. All agents (GPU and CPU) access GPU memory through the MALL cache.
+* The wait instructions below must be added before any ``SCOPE_SYS`` store in
+  order for the store to remain in order with previous memory operations.
+
+  * ``s_wait_loadcnt 0x0``
+  * ``s_wait_storecnt 0x0``
+  * ``s_wait_kmcnt 0x0``
+  * ``s_wait_samplecnt 0x0``
+  * ``s_wait_bvhcnt 0x0``
----------------
Pierre-vh wrote:

No, dscnt isn't needed. This only deals with system-scope stores, because the reordering in this case occurs beyond L2.
LDS is always at workgroup (WG) level, so it causes no issues.
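
For illustration only, a minimal sketch of the sequence the documented rule implies before a system-scope store. The store opcode and the register operands (`v[0:1]`, `v2`) are arbitrary placeholders, not taken from the patch, and real compiler output may contain additional instructions:

```asm
; Wait for all outstanding counters listed in the patch.
; s_wait_dscnt is deliberately absent: LDS is workgroup-level only.
s_wait_loadcnt 0x0
s_wait_storecnt 0x0
s_wait_kmcnt 0x0
s_wait_samplecnt 0x0
s_wait_bvhcnt 0x0
; Hypothetical system-scope store that must stay ordered after earlier memory operations.
global_store_b32 v[0:1], v2, off scope:SCOPE_SYS
```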

https://github.com/llvm/llvm-project/pull/156424

