[llvm] [AMDGPU]: Accept constant zero bytes in v_perm OrCombine (PR #66533)

Jeffrey Byrnes via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 14 12:50:30 PST 2024


================
@@ -357,9 +357,9 @@ define void @load_local_lo_v2i16_reglo_vreg_zexti8(ptr addrspace(3) %in, i16 %re
 ; GFX803-NEXT:    s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0)
 ; GFX803-NEXT:    s_mov_b32 m0, -1
 ; GFX803-NEXT:    ds_read_u8 v0, v0
-; GFX803-NEXT:    v_lshlrev_b32_e32 v1, 16, v1
+; GFX803-NEXT:    s_mov_b32 s4, 0x1000c04
----------------
jrbyrnes wrote:

Or did you mean the v_perm replacements of v_lshl_or_b32 (see load-local.128.ll:load_lds_v4i32_align1)?

https://github.com/llvm/llvm-project/pull/66533

