[llvm] [AMDGPU]: Accept constant zero bytes in v_perm OrCombine (PR #66533)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 26 07:27:24 PST 2024


================
@@ -2466,8 +2465,8 @@ define <2 x i16> @load_global_v2i16_split(ptr addrspace(1) %in) #0 {
 ; GFX803-NEXT:    s_waitcnt vmcnt(0)
 ; GFX803-NEXT:    flat_load_ushort v1, v[2:3] glc
 ; GFX803-NEXT:    s_waitcnt vmcnt(0)
-; GFX803-NEXT:    v_lshlrev_b32_e32 v1, 16, v1
-; GFX803-NEXT:    v_or_b32_e32 v0, v0, v1
+; GFX803-NEXT:    s_mov_b32 s4, 0x1000504
+; GFX803-NEXT:    v_perm_b32 v0, v0, v1, s4
----------------
arsenm wrote:

This is still a regression, even without lshl_or.
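
For readers comparing the two sequences in the hunk above, here is a minimal sketch that emulates them in Python. It assumes the byte-select semantics of v_perm_b32 as described in AMD's ISA documentation: selector values 0-3 pick bytes of src1, 4-7 pick bytes of src0, and 0x0c yields a constant zero byte. The helper names (`v_perm_b32`, `old_sequence`, `new_sequence`) are hypothetical and used only for illustration; the other selector values are deliberately left out.

```python
MASK32 = 0xFFFFFFFF


def v_perm_b32(src0: int, src1: int, sel: int) -> int:
    """Emulate a subset of v_perm_b32 (assumed semantics, see lead-in)."""
    # Bytes 0-3 of the pool come from src1, bytes 4-7 from src0.
    pool = ((src0 & MASK32) << 32) | (src1 & MASK32)
    result = 0
    for i in range(4):
        s = (sel >> (8 * i)) & 0xFF
        if s <= 7:
            byte = (pool >> (8 * s)) & 0xFF
        elif s == 0x0C:
            byte = 0x00  # constant zero byte
        else:
            raise NotImplementedError(f"selector {s:#x} not modelled in this sketch")
        result |= byte << (8 * i)
    return result


def old_sequence(v0: int, v1: int) -> int:
    # v_lshlrev_b32 v1, 16, v1 ; v_or_b32 v0, v0, v1
    return (((v1 << 16) & MASK32) | v0) & MASK32


def new_sequence(v0: int, v1: int) -> int:
    # s_mov_b32 s4, 0x1000504 ; v_perm_b32 v0, v0, v1, s4
    return v_perm_b32(v0, v1, 0x01000504)


# Both inputs come from ushort loads, so their upper 16 bits are zero.
lo, hi = 0x1234, 0xABCD
print(hex(old_sequence(lo, hi)), hex(new_sequence(lo, hi)))
```

Under these assumptions both sequences produce the same packed value for zero-extended 16-bit inputs, so the objection concerns the cost of the generated code rather than its correctness.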

https://github.com/llvm/llvm-project/pull/66533

