[PATCH] D21148: [X86][SSE] Added support for combining target shuffles to (V)PSHUFD/VPERMILPD/VPERMILPS immediate permute

Thu Jun 23 10:54:33 PDT 2016

RKSimon updated this revision to Diff 61699.
RKSimon added a comment.

Updated to prefer binary shuffles (mainly unpck) over permutes. Although this will prevent some folding, it shouldn't affect register pressure. We don't handle i32 unpcks, as pshufd typically has similar performance. The other changes you see from unary shuffles (e.g. movddup to pshufd) are typically because the target shuffle combine is a bit more ruthless at looking through bitcasts, which should hopefully reduce domain stalls.
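
For readers unfamiliar with the immediate permutes named in the title, here is a minimal sketch of how a 4 x 32-bit unary shuffle mask maps onto the 8-bit immediate taken by PSHUFD / VPERMILPS (two bits per result element). The helper names encodeShuffleImm8 and applyShuffleImm8 are hypothetical and used only for illustration; they are not taken from the patch or from X86ISelLowering.cpp, and undef mask elements are not modelled.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the patch): pack a 4-element unary shuffle
// mask into a PSHUFD / VPERMILPS style 8-bit immediate.
// Result element i reads source element mask[i]; each index uses 2 bits,
// with element 0 in the lowest bits.
inline uint8_t encodeShuffleImm8(const std::array<int, 4> &mask) {
  unsigned imm = 0;
  for (unsigned i = 0; i != 4; ++i) {
    assert(mask[i] >= 0 && mask[i] < 4 && "index must select one of 4 elements");
    imm |= static_cast<unsigned>(mask[i]) << (2 * i);
  }
  return static_cast<uint8_t>(imm);
}

// Hypothetical reference model: apply the immediate to four 32-bit elements,
// decoding one 2-bit source index per result element.
inline std::array<uint32_t, 4>
applyShuffleImm8(const std::array<uint32_t, 4> &src, uint8_t imm) {
  std::array<uint32_t, 4> dst{};
  for (unsigned i = 0; i != 4; ++i)
    dst[i] = src[(imm >> (2 * i)) & 3];
  return dst;
}
```

Under this encoding the mask {0,1,0,1}, which duplicates the low 64 bits in the same way as movddup, becomes the immediate 0x44, and the identity mask {0,1,2,3} becomes 0xE4.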

Repository:
  rL LLVM

http://reviews.llvm.org/D21148

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/2012-01-12-extract-sv.ll
  test/CodeGen/X86/2012-04-26-sdglue.ll
  test/CodeGen/X86/avx-intrinsics-fast-isel.ll
  test/CodeGen/X86/avx-intrinsics-x86.ll
  test/CodeGen/X86/avx-splat.ll
  test/CodeGen/X86/avx-vbroadcast.ll
  test/CodeGen/X86/pshufb-mask-comments.ll
  test/CodeGen/X86/sse3.ll
  test/CodeGen/X86/vector-compare-results.ll
  test/CodeGen/X86/vector-shuffle-128-v16.ll
  test/CodeGen/X86/vector-shuffle-256-v16.ll
  test/CodeGen/X86/vector-shuffle-256-v4.ll
  test/CodeGen/X86/vector-shuffle-256-v8.ll
  test/CodeGen/X86/vector-shuffle-combining-avx.ll
  test/CodeGen/X86/vector-shuffle-combining-avx2.ll
  test/CodeGen/X86/vector-shuffle-combining-avx512bw.ll
  test/CodeGen/X86/vector-shuffle-combining-ssse3.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D21148.61699.patch
Type: text/x-patch
Size: 32989 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160623/c1da4898/attachment.bin>