[llvm] [AMDGPU]: Accept constant zero bytes in v_perm OrCombine (PR #66533)
Matt Arsenault via llvm-commits
llvm-commits at lists.llvm.org
Tue Feb 27 22:35:28 PST 2024
================
@@ -1927,9 +1927,9 @@ define void @load_constant_lo_v2f16_reglo_vreg_zexti8(ptr addrspace(4) %in, i32
; GFX803-NEXT: v_add_u32_e32 v0, vcc, 0xfffff001, v0
; GFX803-NEXT: v_addc_u32_e32 v1, vcc, -1, v1, vcc
; GFX803-NEXT: flat_load_ubyte v0, v[0:1]
-; GFX803-NEXT: v_and_b32_e32 v1, 0xffff0000, v2
+; GFX803-NEXT: s_mov_b32 s4, 0x3020c04
; GFX803-NEXT: s_waitcnt vmcnt(0)
-; GFX803-NEXT: v_or_b32_e32 v0, v0, v1
+; GFX803-NEXT: v_perm_b32 v0, v0, v2, s4
----------------
arsenm wrote:
This is another code size regression.
https://github.com/llvm/llvm-project/pull/66533
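
For readers checking the hunk above, here is a minimal sketch of why the new `v_perm_b32` with selector `0x3020c04` produces the same value as the old `v_and_b32`/`v_or_b32` pair. It is a simplified Python model, not the backend's code: the function names are hypothetical, it assumes the usual `v_perm_b32` byte-select semantics (selectors 0-3 pick bytes of the second source, 4-7 pick bytes of the first source, `0x0c` yields `0x00`, `0x0d` and above yield `0xff`), and it does not model the sign-replication selectors 8-11.

```python
def v_perm_b32(s0: int, s1: int, sel: int) -> int:
    """Simplified model: byte indices 0-3 address s1, 4-7 address s0."""
    src = ((s0 & 0xFFFFFFFF) << 32) | (s1 & 0xFFFFFFFF)
    out = 0
    for i in range(4):
        k = (sel >> (8 * i)) & 0xFF  # selector for destination byte i
        if k <= 7:
            b = (src >> (8 * k)) & 0xFF
        elif k == 0x0C:
            b = 0x00  # constant zero byte
        elif k >= 0x0D:
            b = 0xFF  # constant all-ones byte
        else:
            # Selectors 8..11 (sign replication) are intentionally not modeled.
            raise NotImplementedError("selectors 8..11 are not modeled")
        out |= b << (8 * i)
    return out


def old_sequence(loaded_byte: int, v2: int) -> int:
    """Old codegen: v_and_b32 v1, 0xffff0000, v2 ; v_or_b32 v0, v0, v1."""
    # flat_load_ubyte zero-extends, so only the low byte of v0 is set.
    return (loaded_byte & 0xFF) | (v2 & 0xFFFF0000)


SEL = 0x03020C04  # the s4 constant from the new check lines

# New codegen: v_perm_b32 v0, v0, v2, s4
result = v_perm_b32(0xAB, 0x12345678, SEL)
print(hex(result), result == old_sequence(0xAB, 0x12345678))
```

Reading the selector from the top byte down: `03` and `02` keep the high half of `v2`, `0c` forces byte 1 to zero, and `04` takes the loaded byte from `v0`. If the usual GFX8 encodings apply (a 4-byte VOP2 or SOP1 instruction plus a 4-byte literal, and an 8-byte VOP3 instruction), the old pair occupies 12 bytes and the new pair 16, which is presumably the regression being flagged.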