[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Merge constrained sloads (PR #96162)

Matt Arsenault via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Wed Jul 3 09:19:13 PDT 2024


================
@@ -183,10 +183,10 @@ define <2 x half> @local_atomic_fadd_v2f16_rtn(ptr addrspace(3) %ptr, <2 x half>
 define amdgpu_kernel void @local_atomic_fadd_v2bf16_noret(ptr addrspace(3) %ptr, <2 x i16> %data) {
 ; GFX940-LABEL: local_atomic_fadd_v2bf16_noret:
 ; GFX940:       ; %bb.0:
-; GFX940-NEXT:    s_load_dwordx2 s[0:1], s[0:1], 0x24
+; GFX940-NEXT:    s_load_dwordx2 s[2:3], s[0:1], 0x24
----------------
arsenm wrote:

LSV should have gotten this case; I don't see why it didn't. Someone should look into this.

https://github.com/llvm/llvm-project/pull/96162

