[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)

Pierre van Houtryve via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Mon Apr 14 00:49:12 PDT 2025


================
@@ -19,7 +19,7 @@ body: |
     ; GFX12-NEXT: {{  $}}
     ; GFX12-NEXT: renamable $vgpr0 = GLOBAL_LOAD_DWORD_SADDR renamable $sgpr2_sgpr3, killed $vgpr0, 0, 0, implicit $exec :: (load (s32), addrspace 1)
     ; GFX12-NEXT: GLOBAL_INV 16, implicit $exec
-    ; GFX12-NEXT: S_WAIT_LOADCNT 0
+    ; GFX12-NEXT: S_WAIT_LOADCNT 1
----------------
Pierre-vh wrote:

This looks incorrect. I think we need to wait for loadcnt to reach zero, because the load and the inv can complete out of order. If the inv somehow completed before the load, $vgpr0 would not be ready (?)
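
To make the concern concrete, here is a toy model of the counter, not a description of the hardware. It assumes the load and the inv both increment the same loadcnt counter at issue (as the changed wait suggests) and that the wait returns as soon as the counter is at or below its operand. The function name and the "load"/"inv" labels are made up for illustration.

```python
def load_done_after_wait(completion_order, wait_threshold):
    """Toy model: is the load complete when the wait on loadcnt returns?

    completion_order: the outstanding ops ("load", "inv") in the order
    they finish. wait_threshold: the operand of the wait instruction.
    Assumption: every listed op bumped the same counter when it issued.
    """
    outstanding = set(completion_order)  # counter value == len(outstanding)

    # The wait can already be satisfied before anything completes.
    if len(outstanding) <= wait_threshold:
        return "load" not in outstanding

    for op in completion_order:
        outstanding.remove(op)  # completion decrements the counter
        if len(outstanding) <= wait_threshold:
            # The wait returns here; check whether the load is among
            # the ops that have finished.
            return "load" not in outstanding

    return True


# In-order completion: waiting for loadcnt <= 1 is enough.
print(load_done_after_wait(["load", "inv"], 1))
# Out-of-order completion: the wait returns while the load is pending.
print(load_done_after_wait(["inv", "load"], 1))
# Waiting for zero is safe in either order.
print(load_done_after_wait(["inv", "load"], 0))
```

In this model, a wait on 1 is only safe if operations on the counter retire in issue order; whether that holds for the load and the inv is exactly the open question.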

https://github.com/llvm/llvm-project/pull/135340

