[llvm-branch-commits] [llvm] [AMDGPU] GFX12 VMEM instructions can write VGPR results out of order (PR #105549)
Pierre van Houtryve via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Thu Aug 22 01:37:23 PDT 2024
================
@@ -754,13 +754,21 @@ define amdgpu_kernel void @constant_load_v16i16_align2(ptr addrspace(4) %ptr0) #
; GFX12-NEXT: global_load_u16 v6, v8, s[0:1] offset:8
; GFX12-NEXT: global_load_u16 v5, v8, s[0:1] offset:4
; GFX12-NEXT: global_load_u16 v4, v8, s[0:1]
+; GFX12-NEXT: s_wait_loadcnt 0x7
----------------
Pierre-vh wrote:
I'm not sure I understand exactly what's happening here. Why do we need the extra `s_wait_loadcnt`? What happens when two `global_load_d16_hi_b16` instructions execute back-to-back?
https://github.com/llvm/llvm-project/pull/105549