[llvm-branch-commits] [llvm] [AMDGPU][InsertWaitCnts] Track global_wb/inv/wbinv (PR #135340)
Pierre van Houtryve via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Mon Apr 14 00:49:12 PDT 2025
================
@@ -19,7 +19,7 @@ body: |
; GFX12-NEXT: {{ $}}
; GFX12-NEXT: renamable $vgpr0 = GLOBAL_LOAD_DWORD_SADDR renamable $sgpr2_sgpr3, killed $vgpr0, 0, 0, implicit $exec :: (load (s32), addrspace 1)
; GFX12-NEXT: GLOBAL_INV 16, implicit $exec
- ; GFX12-NEXT: S_WAIT_LOADCNT 0
+ ; GFX12-NEXT: S_WAIT_LOADCNT 1
----------------
Pierre-vh wrote:
This looks incorrect. I think we need to wait for loadcnt to reach zero, because the load and the inv can complete out of order. If the inv somehow completed before the load, the counter would already be at 1 and $vgpr0 would not be ready (?)
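
To illustrate the concern, here is a minimal toy model in Python. It is not the SIInsertWaitcnts implementation. It assumes, as the diff suggests, that both the load and GLOBAL_INV are counted against loadcnt, and that a wait returns as soon as the number of outstanding operations is at or below the target. The function name and the "load"/"inv" labels are hypothetical.

```python
def load_ready_after_wait(completion_order, wait_target):
    """Toy model: return True if the load has completed by the time the
    counter of outstanding operations first drops to wait_target.

    completion_order: tuple of op names in the order they complete.
    wait_target: the value the wait instruction waits for (e.g. 0 or 1).
    """
    outstanding = set(completion_order)  # every issued op starts outstanding
    for op in completion_order:
        if len(outstanding) <= wait_target:
            break  # the wait is satisfied; execution continues
        outstanding.discard(op)  # this op completes, counter decrements
    return "load" not in outstanding


# In-order completion: waiting for 1 is enough for the load result.
in_order = load_ready_after_wait(("load", "inv"), 1)
# Out-of-order completion: the inv finishes first, the wait for 1 is
# satisfied, but the load is still outstanding.
out_of_order = load_ready_after_wait(("inv", "load"), 1)
# Waiting for 0 is safe regardless of completion order.
wait_zero = load_ready_after_wait(("inv", "load"), 0)
```

Under these assumptions, a wait target of 1 only guarantees the load result if the two operations are known to complete in order; a target of 0 is safe either way.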
https://github.com/llvm/llvm-project/pull/135340