[PATCH] D134463: [AMDGPU] Use V_PERM to match buildvectors when inputs are not canonicalized (i.e. can't use V_PACK)

Wed Sep 28 11:35:08 PDT 2022

jrbyrnes updated this revision to Diff 463635.
jrbyrnes added a comment.

Use V_BFI for V[1].hi : V[0].low . This allows for a bitmask which is more likely to be reused by other instructions (0xffff vs 0x7060100), potentially enabling other optimizations (e.g. CSE)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134463/new/

https://reviews.llvm.org/D134463

Files:
  llvm/lib/Target/AMDGPU/SIInstructions.td
  llvm/test/CodeGen/AMDGPU/add.v2i16.ll
  llvm/test/CodeGen/AMDGPU/build-vector-packed-partial-undef.ll
  llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
  llvm/test/CodeGen/AMDGPU/divergence-driven-buildvector.ll
  llvm/test/CodeGen/AMDGPU/extract-subvector-16bit.ll
  llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.global.ll
  llvm/test/CodeGen/AMDGPU/fast-unaligned-load-store.private.ll
  llvm/test/CodeGen/AMDGPU/fcanonicalize.f16.ll
  llvm/test/CodeGen/AMDGPU/fmax_legacy.f16.ll
  llvm/test/CodeGen/AMDGPU/fmin_legacy.f16.ll
  llvm/test/CodeGen/AMDGPU/fshr.ll
  llvm/test/CodeGen/AMDGPU/idot4s.ll
  llvm/test/CodeGen/AMDGPU/idot4u.ll
  llvm/test/CodeGen/AMDGPU/idot8s.ll
  llvm/test/CodeGen/AMDGPU/idot8u.ll
  llvm/test/CodeGen/AMDGPU/insert_vector_elt.v2i16.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.gather4.a16.dim.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.a16.dim.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.a16.dim.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.g16.encode.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.cd.g16.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.a16.dim.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.encode.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.image.sample.g16.ll
  llvm/test/CodeGen/AMDGPU/load-hi16.ll
  llvm/test/CodeGen/AMDGPU/load-lo16.ll
  llvm/test/CodeGen/AMDGPU/pack.v2f16.ll
  llvm/test/CodeGen/AMDGPU/pack.v2i16.ll
  llvm/test/CodeGen/AMDGPU/partial-shift-shrink.ll
  llvm/test/CodeGen/AMDGPU/strict_fadd.f16.ll
  llvm/test/CodeGen/AMDGPU/strict_fma.f16.ll
  llvm/test/CodeGen/AMDGPU/strict_fmul.f16.ll
  llvm/test/CodeGen/AMDGPU/strict_fsub.f16.ll
  llvm/test/CodeGen/AMDGPU/sub.v2i16.ll
  llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D134463.463635.patch
Type: text/x-patch
Size: 413548 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220928/5de7d8e8/attachment-0001.bin>