[llvm-branch-commits] [llvm] [AMDGPU] GFX12 VMEM instructions can write VGPR results out of order (PR #105549)

Jay Foad via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Thu Aug 22 02:51:59 PDT 2024


================
@@ -4371,8 +4375,10 @@ define amdgpu_kernel void @global_sextload_v64i16_to_v64i32(ptr addrspace(1) %ou
 ; GCN-NOHSA-SI-NEXT:    buffer_store_dwordx4 v[8:11], off, s[0:3], 0 offset:48
 ; GCN-NOHSA-SI-NEXT:    buffer_store_dwordx4 v[4:7], off, s[0:3], 0
 ; GCN-NOHSA-SI-NEXT:    buffer_load_dword v0, off, s[12:15], 0 ; 4-byte Folded Reload
+; GCN-NOHSA-SI-NEXT:    s_waitcnt vmcnt(0)
----------------
jayfoad wrote:

The first RUN line does not specify a CPU, so it gets a generic CPU, which does not have the new feature.
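
For illustration only, here is a hypothetical pair of RUN lines. They are not copied from the test under review, and the check prefixes and the choice of tahiti are assumptions. They show the difference between relying on the default CPU and naming one explicitly:

```llvm
; Hypothetical sketch; prefixes and CPU name are illustrative assumptions.

; No -mcpu: llc uses a generic subtarget for the triple, so the generated
; code depends on whichever features that generic CPU has.
; RUN: llc -mtriple=amdgcn < %s | FileCheck -check-prefix=GENERIC %s

; Explicit -mcpu: the subtarget features are those defined for tahiti.
; RUN: llc -mtriple=amdgcn -mcpu=tahiti < %s | FileCheck -check-prefix=SI %s
```

With separate prefixes like these, a difference in generated waits between the generic and the named CPU would appear only under the prefix of the RUN line it affects.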

https://github.com/llvm/llvm-project/pull/105549

