[llvm] [AMDGPU]: Accept constant zero bytes in v_perm OrCombine (PR #66533)

Matt Arsenault via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 27 22:35:28 PST 2024


================
@@ -1927,9 +1927,9 @@ define void @load_constant_lo_v2f16_reglo_vreg_zexti8(ptr addrspace(4) %in, i32
 ; GFX803-NEXT:    v_add_u32_e32 v0, vcc, 0xfffff001, v0
 ; GFX803-NEXT:    v_addc_u32_e32 v1, vcc, -1, v1, vcc
 ; GFX803-NEXT:    flat_load_ubyte v0, v[0:1]
-; GFX803-NEXT:    v_and_b32_e32 v1, 0xffff0000, v2
+; GFX803-NEXT:    s_mov_b32 s4, 0x3020c04
 ; GFX803-NEXT:    s_waitcnt vmcnt(0)
-; GFX803-NEXT:    v_or_b32_e32 v0, v0, v1
+; GFX803-NEXT:    v_perm_b32 v0, v0, v2, s4
----------------
arsenm wrote:

This is another code size regression.
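
For readers following the hunk above, here is a minimal Python sketch of why the new `v_perm_b32` computes the same value as the old `v_and_b32` / `v_or_b32` pair. It is an illustration, not the compiler's code: the helper name `v_perm_b32_model` is hypothetical, and it assumes the byte-permute semantics as commonly documented for GCN (the 64-bit source is `{S0, S1}` with `S1` in the low dword, selector values 0-7 pick a source byte, and 0x0c yields a constant zero byte). Other selector values are deliberately not modeled.

```python
def v_perm_b32_model(s0: int, s1: int, sel: int) -> int:
    """Hypothetical model of v_perm_b32 D, S0, S1, S2 (assumed semantics).

    Only selector bytes 0-7 (pick a source byte) and 0x0c (constant 0x00)
    are modeled; anything else raises, since the hunk does not use it.
    """
    # Assumption: S1 supplies bytes 0-3 and S0 supplies bytes 4-7.
    src = ((s0 & 0xFFFFFFFF) << 32) | (s1 & 0xFFFFFFFF)
    out = 0
    for i in range(4):
        code = (sel >> (8 * i)) & 0xFF
        if code == 0x0C:
            byte = 0x00                      # constant zero byte
        elif code <= 7:
            byte = (src >> (8 * code)) & 0xFF
        else:
            raise ValueError(f"selector byte {code:#x} not modeled")
        out |= byte << (8 * i)
    return out


def and_or_model(v0: int, v2: int) -> int:
    """The old sequence: v1 = v2 & 0xffff0000; v0 = v0 | v1."""
    return (v0 | (v2 & 0xFFFF0000)) & 0xFFFFFFFF


# v0 holds a zero-extended byte (flat_load_ubyte), v2 is arbitrary.
v0, v2 = 0xAB, 0x12345678
print(hex(v_perm_b32_model(v0, v2, 0x03020C04)))  # selector from the diff
print(hex(and_or_model(v0, v2)))
```

Under those assumptions both calls agree whenever `v0` is a zero-extended byte. On size, assuming the usual encodings (4-byte VOP2 and SOP1, 8-byte VOP3, plus 4 bytes for a 32-bit literal), the old pair would take 8 + 4 = 12 bytes, while `s_mov_b32` with a literal plus `v_perm_b32` would take 8 + 8 = 16 bytes and occupy an extra SGPR.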

https://github.com/llvm/llvm-project/pull/66533