[llvm] [AMDGPU]: Accept constant zero bytes in v_perm OrCombine (PR #66533)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 6 21:18:43 PST 2024


================
@@ -357,9 +357,9 @@ define void @load_local_lo_v2i16_reglo_vreg_zexti8(ptr addrspace(3) %in, i16 %re
 ; GFX803-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; GFX803-NEXT:    s_mov_b32 m0, -1
 ; GFX803-NEXT:    ds_read_u8 v0, v0
-; GFX803-NEXT:    v_lshlrev_b32_e32 v1, 16, v1
+; GFX803-NEXT:    s_mov_b32 s4, 0x1000c04
----------------
arsenm wrote:

This is an 8 -> 16 byte code size regression, and it also requires an extra register for the constant.
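
For readers following the arithmetic, here is a rough sketch (not part of the patch) of where the 8 and 16 bytes likely come from and what the `0x1000c04` selector does. It rests on several assumptions not shown in the hunk: that the old sequence was the 4-byte `v_lshlrev_b32_e32` followed by a 4-byte `v_or_b32_e32`, that the new sequence is `s_mov_b32` with a 32-bit literal (8 bytes) followed by a VOP3-encoded `v_perm_b32` (8 bytes), and that the permute takes the loaded byte as src0 and the 16-bit register value as src1. The `v_perm_b32` helper below is a hypothetical emulation based on my reading of the ISA description, not LLVM code.

```python
# Assumed encoded sizes in bytes; the hunk above does not show full encodings.
OLD_SEQUENCE = {"v_lshlrev_b32_e32": 4, "v_or_b32_e32": 4}
NEW_SEQUENCE = {"s_mov_b32 (32-bit literal)": 8, "v_perm_b32 (VOP3)": 8}

old_size = sum(OLD_SEQUENCE.values())
new_size = sum(NEW_SEQUENCE.values())


def v_perm_b32(src0: int, src1: int, selector: int) -> int:
    """Hypothetical emulation of the byte permute.

    Selector values 0-3 pick bytes of src1, 4-7 pick bytes of src0,
    8-11 replicate a sign bit, 12 yields 0x00, and 13 or more yields 0xFF.
    """
    data = src1.to_bytes(4, "little") + src0.to_bytes(4, "little")
    result = 0
    for i in range(4):
        sel = (selector >> (8 * i)) & 0xFF
        if sel >= 13:
            byte = 0xFF
        elif sel == 12:
            byte = 0x00
        elif sel >= 8:
            # Sign bit of byte 1, 3, 5 or 7 replicated across the output byte.
            byte = 0xFF if data[2 * (sel - 8) + 1] & 0x80 else 0x00
        else:
            byte = data[sel]
        result |= byte << (8 * i)
    return result


def shift_or(loaded_byte: int, reg: int) -> int:
    """The sequence being replaced: (reg << 16) | zext(loaded_byte)."""
    return ((reg << 16) | loaded_byte) & 0xFFFFFFFF


loaded, reg = 0xAB, 0x1234
print(old_size, new_size)
print(hex(v_perm_b32(loaded, reg, 0x01000C04)), hex(shift_or(loaded, reg)))
```

Under these assumptions both sequences compute the same value, so the trade-off is purely the extra encoding bytes plus the SGPR that holds the selector.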

https://github.com/llvm/llvm-project/pull/66533

