[PATCH] D134463: [AMDGPU] Use V_PERM to match buildvectors when inputs are not canonicalized (i.e. can't use V_PACK)

Thu Sep 29 01:09:33 PDT 2022

foad added inline comments.

================
Comment at: llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll:240
 ; GFX9-NEXT:    s_waitcnt vmcnt(0)
-; GFX9-NEXT:    v_bfi_b32 v1, v1, 0, v0
+; GFX9-NEXT:    v_bfi_b32 v1, s4, 0, v0
 ; GFX9-NEXT:    v_and_or_b32 v0, v0, s4, v1
----------------
rampitec wrote:
> jrbyrnes wrote:
> > This seems illegal to me -- using SGPR and literal as operands to VALU. Looking into it. 
> 0 is an inline constant and is free.
As a code-quality point, this could have been optimized to `v_and_b32 v1, 0xffff0000, v0`.
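
To sanity-check that suggestion, here is a minimal Python sketch that models the two instructions as 32-bit bitwise operations. It assumes that `s4` holds `0x0000ffff` at this point; the hunk does not show that value, so it is only inferred from the suggested `0xffff0000` mask. The helper names (`v_bfi_b32`, `v_and_b32`, `bfi_form`, `and_form`) are illustrative Python functions, not an LLVM API.

```python
MASK32 = 0xFFFFFFFF

def v_bfi_b32(s0, s1, s2):
    """Bitfield insert: bits of s1 where s0 is set, bits of s2 elsewhere."""
    return ((s0 & s1) | (~s0 & s2)) & MASK32

def v_and_b32(s0, s1):
    """Plain 32-bit bitwise AND."""
    return (s0 & s1) & MASK32

# Assumption: s4 holds the low-half mask. The review hunk does not show it.
S4 = 0x0000FFFF

def bfi_form(v0):
    # v_bfi_b32 v1, s4, 0, v0
    return v_bfi_b32(S4, 0, v0)

def and_form(v0):
    # v_and_b32 v1, 0xffff0000, v0
    return v_and_b32(0xFFFF0000, v0)

# Compare both forms on a handful of representative inputs.
samples = [0x00000000, 0xFFFFFFFF, 0x12345678, 0x0000FFFF, 0xFFFF0000, 0x80000001]
assert all(bfi_form(v) == and_form(v) for v in samples)
```

With a zero second operand, the `(s0 & s1)` term vanishes and the BFI reduces to `~s4 & v0`, which under the stated assumption is the same as masking `v0` with `0xffff0000`.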

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134463/new/

https://reviews.llvm.org/D134463