[llvm-branch-commits] [llvm] [AMDGPU] always emit a soft wait even if it is trivially ~0 (PR #147257)

Sameer Sahasrabuddhe via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Mon Jul 7 03:49:13 PDT 2025


================
@@ -669,6 +679,7 @@ define amdgpu_kernel void @global_volatile_store_1(
 ; GFX12-WGP-NEXT:    s_wait_kmcnt 0x0
 ; GFX12-WGP-NEXT:    s_wait_storecnt 0x0
 ; GFX12-WGP-NEXT:    global_store_b32 v0, v1, s[0:1] scope:SCOPE_SYS
+; GFX12-WGP-NEXT:    s_wait_loadcnt 0x3f
----------------
ssahasra wrote:

If we agree with the basic design, then these are expected. There's a whole bunch of tests that either stop at the memory legalizer or run llc with `-O0`, like this one. The "trivial" wait counts show up in all these tests because SIInsertWaitcnts did not get a chance to clean them up. In particular, see how `TrySimplify` in that pass controls whether or not to clean up these wait counts. They disappear in the optimized ISA output.
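
To make the above concrete, here is a rough sketch of the idea as a toy model. It is not the actual SIInsertWaitcnts code. The names `SoftWait`, `isTrivial`, `cleanUp`, and the constants are hypothetical, and the 6-bit field width is an assumption inferred from the `0x3f` in the hunk above. Only the `TrySimplify` name comes from the pass.

```cpp
#include <cstdint>
#include <vector>

// Toy model only; every name here except TrySimplify is hypothetical.
// Assumption: the load counter field is 6 bits wide, so its all-ones
// value is 0x3f, matching the `s_wait_loadcnt 0x3f` in the test diff.
constexpr unsigned kLoadCntBits = 6;
constexpr std::uint32_t kLoadCntMax = (1u << kLoadCntBits) - 1;

// A soft wait as the memory legalizer might emit it in this model.
struct SoftWait {
  std::uint32_t LoadCnt;
};

// In this model a wait is "trivial" when it names the largest encodable
// count, i.e. the field is ~0 and the wait cannot block.
constexpr bool isTrivial(const SoftWait &W) {
  return W.LoadCnt == kLoadCntMax;
}

// Stand-in for the cleanup step: trivial waits are dropped only when
// TrySimplify is set. With TrySimplify == false every wait is kept.
std::vector<SoftWait> cleanUp(const std::vector<SoftWait> &Waits,
                              bool TrySimplify) {
  std::vector<SoftWait> Kept;
  for (const SoftWait &W : Waits)
    if (!TrySimplify || !isTrivial(W))
      Kept.push_back(W);
  return Kept;
}
```

Calling `cleanUp` on `{0x0, 0x3f}` with `TrySimplify == false` keeps both waits, which corresponds to the new CHECK line in the unoptimized tests. With `TrySimplify == true` only the `0x0` wait is left.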

https://github.com/llvm/llvm-project/pull/147257