[PATCH] D125238: [X86] Prefer MOVHLPS for shuffle(x, 1, -1) extraction patterns (PR26515)

Mon May 9 08:23:27 PDT 2022

RKSimon created this revision.
RKSimon added reviewers: craig.topper, spatel, pengfei.
Herald added subscribers: armkevincheng, eric-k256, StephenFan, hiraditya.
Herald added a reviewer: sjarus.
Herald added a project: All.
RKSimon requested review of this revision.
Herald added a project: LLVM.

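We currently lower these patterns to UNPCKH, but that requires the source vector to be in the same register as the destination, which costs an additional move. MOVHLPS takes the source as a separate operand and writes only the low half of the destination, so it can avoid that move.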
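
To illustrate the difference in operand constraints, here is a minimal scalar sketch of the two SSE instructions on a 2 x f64 view of a register. It is not code from the patch: the V2 type and the helper names are hypothetical, and it only models lane movement, not register allocation.

```cpp
#include <array>
#include <cassert>

// Hypothetical scalar model of a 128-bit XMM register viewed as 2 x f64.
using V2 = std::array<double, 2>;

// SSE UNPCKHPD dst, src: dst = { dst[1], src[1] }.
// The value that lands in element 0 comes from dst itself.
V2 unpckhpd(V2 dst, V2 src) { return {dst[1], src[1]}; }

// SSE MOVHLPS dst, src: dst[0] = src[1]; dst[1] is left unchanged.
V2 movhlps(V2 dst, V2 src) { return {src[1], dst[1]}; }

// shuffle(x, 1, -1) via UNPCKHPD: x must be the destination operand,
// so the register holding x is overwritten by the result.
V2 extract_high_unpckhpd(V2 x) { return unpckhpd(x, x); }

// shuffle(x, 1, -1) via MOVHLPS: x is only read as the source operand;
// 'scratch' stands for any destination register, since element 1 of the
// result is undef (-1) in the shuffle mask.
V2 extract_high_movhlps(V2 scratch, V2 x) { return movhlps(scratch, x); }

// Both forms place x[1] in element 0 of the result.
void check_equivalence() {
  V2 x{1.0, 2.0};
  V2 scratch{-7.0, -9.0};
  assert(extract_high_unpckhpd(x)[0] == 2.0);
  assert(extract_high_movhlps(scratch, x)[0] == 2.0);
}
```

In this model the UNPCKHPD form has to pass x as the destination, while the MOVHLPS form accepts an arbitrary destination, which is the property the patch relies on.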

Fixes #26889

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D125238

Files:
  llvm/lib/Target/X86/X86ISelLowering.cpp
  llvm/lib/Target/X86/X86InstrSSE.td
  llvm/test/CodeGen/X86/cast-vsel.ll
  llvm/test/CodeGen/X86/combine-fcopysign.ll
  llvm/test/CodeGen/X86/complex-fastmath.ll
  llvm/test/CodeGen/X86/extractelement-load.ll
  llvm/test/CodeGen/X86/fma.ll
  llvm/test/CodeGen/X86/fp-intrinsics-fma.ll
  llvm/test/CodeGen/X86/fp-round.ll
  llvm/test/CodeGen/X86/fp-roundeven.ll
  llvm/test/CodeGen/X86/fp128-extract.ll
  llvm/test/CodeGen/X86/fpclamptosat_vec.ll
  llvm/test/CodeGen/X86/fptosi-sat-vector-128.ll
  llvm/test/CodeGen/X86/fptoui-sat-vector-128.ll
  llvm/test/CodeGen/X86/ftrunc.ll
  llvm/test/CodeGen/X86/haddsub-2.ll
  llvm/test/CodeGen/X86/haddsub-3.ll
  llvm/test/CodeGen/X86/haddsub-shuf.ll
  llvm/test/CodeGen/X86/haddsub-undef.ll
  llvm/test/CodeGen/X86/haddsub.ll
  llvm/test/CodeGen/X86/half.ll
  llvm/test/CodeGen/X86/horizontal-reduce-fadd.ll
  llvm/test/CodeGen/X86/horizontal-sum.ll
  llvm/test/CodeGen/X86/inline-asm-x-i128.ll
  llvm/test/CodeGen/X86/insertps-combine.ll
  llvm/test/CodeGen/X86/load-partial-dot-product.ll
  llvm/test/CodeGen/X86/masked_compressstore.ll
  llvm/test/CodeGen/X86/masked_store.ll
  llvm/test/CodeGen/X86/pow.ll
  llvm/test/CodeGen/X86/pr11334.ll
  llvm/test/CodeGen/X86/scalar-int-to-fp.ll
  llvm/test/CodeGen/X86/split-vector-rem.ll
  llvm/test/CodeGen/X86/sse-intrinsics-fast-isel.ll
  llvm/test/CodeGen/X86/sse-scalar-fp-arith.ll
  llvm/test/CodeGen/X86/sse3-avx-addsub-2.ll
  llvm/test/CodeGen/X86/vec-strict-128.ll
  llvm/test/CodeGen/X86/vec-strict-cmp-128.ll
  llvm/test/CodeGen/X86/vec-strict-fptoint-128.ll
  llvm/test/CodeGen/X86/vec_fp_to_int.ll
  llvm/test/CodeGen/X86/vec_fpext.ll
  llvm/test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
  llvm/test/CodeGen/X86/vector-intrinsics.ll
  llvm/test/CodeGen/X86/vector-narrow-binop.ll
  llvm/test/CodeGen/X86/vector-reduce-fadd-fast.ll
  llvm/test/CodeGen/X86/vector-reduce-fadd.ll
  llvm/test/CodeGen/X86/vector-reduce-fmax-fmin-fast.ll
  llvm/test/CodeGen/X86/vector-reduce-fmax-nnan.ll
  llvm/test/CodeGen/X86/vector-reduce-fmax.ll
  llvm/test/CodeGen/X86/vector-reduce-fmin-nnan.ll
  llvm/test/CodeGen/X86/vector-reduce-fmin.ll
  llvm/test/CodeGen/X86/vector-reduce-fmul-fast.ll
  llvm/test/CodeGen/X86/vector-reduce-fmul.ll
  llvm/test/CodeGen/X86/vector-rem.ll
  llvm/test/CodeGen/X86/vector-shuffle-128-v2.ll
  llvm/test/CodeGen/X86/widen_conv-3.ll
  llvm/test/CodeGen/X86/widen_conv-4.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D125238.428094.patch
Type: text/x-patch
Size: 477857 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20220509/a1ea4a7b/attachment-0001.bin>