[llvm] [AMDGPU]: Accept constant zero bytes in v_perm OrCombine (PR #66533)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Mon Feb 26 07:27:24 PST 2024
================
@@ -2466,8 +2465,8 @@ define <2 x i16> @load_global_v2i16_split(ptr addrspace(1) %in) #0 {
; GFX803-NEXT: s_waitcnt vmcnt(0)
; GFX803-NEXT: flat_load_ushort v1, v[2:3] glc
; GFX803-NEXT: s_waitcnt vmcnt(0)
-; GFX803-NEXT: v_lshlrev_b32_e32 v1, 16, v1
-; GFX803-NEXT: v_or_b32_e32 v0, v0, v1
+; GFX803-NEXT: s_mov_b32 s4, 0x1000504
+; GFX803-NEXT: v_perm_b32 v0, v0, v1, s4
----------------
arsenm wrote:
This is still a regression, even without lshl_or.
https://github.com/llvm/llvm-project/pull/66533
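
For readers checking the hunk above, here is a minimal sketch of why the new sequence computes the same value as the old one. It assumes the byte-select semantics documented for v_perm_b32: the two sources form a 64-bit value with the first source in bytes 4-7 and the second in bytes 0-3, and each selector byte in the range 0-7 picks one of those bytes. The function names `v_perm_b32` and `shl_or` are hypothetical helpers for this illustration, not LLVM code, and only selector codes 0-7 and 12 (constant zero) are modeled.

```python
def v_perm_b32(s0: int, s1: int, sel: int) -> int:
    """Simplified model of the byte permute (assumed semantics).

    Bytes 0-3 of the combined source come from s1 and bytes 4-7 from s0.
    Only selector codes 0-7 (pick a source byte) and 12 (constant 0x00)
    are handled here.
    """
    src = ((s0 & 0xFFFFFFFF) << 32) | (s1 & 0xFFFFFFFF)
    result = 0
    for i in range(4):
        code = (sel >> (8 * i)) & 0xFF
        if code <= 7:
            byte = (src >> (8 * code)) & 0xFF
        elif code == 12:
            byte = 0x00
        else:
            raise NotImplementedError(f"selector code {code} not modeled")
        result |= byte << (8 * i)
    return result


def shl_or(v0: int, v1: int) -> int:
    """The removed sequence: v_lshlrev_b32 v1, 16, v1 then v_or_b32 v0, v0, v1."""
    return (v0 | (v1 << 16)) & 0xFFFFFFFF


# Both inputs are 16-bit values, as produced by the ushort loads in the test.
lo, hi = 0x1234, 0xABCD
assert v_perm_b32(lo, hi, 0x01000504) == shl_or(lo, hi) == 0xABCD1234
```

Under these assumptions the two sequences agree for 16-bit inputs. The diff itself shows that the new form still takes two instructions, one of which is an s_mov_b32 that materializes a 32-bit literal into an SGPR.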