[PATCH] D50328: [X86][SSE] Combine (some) target shuffles with multiple uses

Simon Pilgrim via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Aug 6 05:51:51 PDT 2018


RKSimon created this revision.
RKSimon added reviewers: craig.topper, spatel, andreadb, lebedev.ri.

As discussed on https://reviews.llvm.org/D41794, we have many cases where we fail to combine shuffles because the input operands have other uses.

This patch permits these shuffles to be combined as long as they don't introduce additional variable shuffle masks, which should allow the total number of shuffles to still drop without increasing the constant pool.

However, this may mean that some memory folds no longer occur, and pre-AVX targets may require the occasional extra register move.
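The rule described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyShuffle` and `mayCombine` are hypothetical names invented for illustration and are not the data structures or functions in X86ISelLowering.cpp. It models just the stated condition, that a chain containing a multi-use shuffle is still combined unless the result would need a variable shuffle mask.

```cpp
#include <vector>

// Hypothetical toy model of a target shuffle node (not an LLVM type).
struct ToyShuffle {
  int numUses; // number of users of this shuffle's result
};

// Returns true if the chain may be merged into a single shuffle.
// combinedNeedsVariableMask: assumed to be known by the caller; true when the
// merged shuffle could only be emitted with a variable (non-immediate) mask.
bool mayCombine(const std::vector<ToyShuffle> &chain,
                bool combinedNeedsVariableMask) {
  bool anyMultiUse = false;
  for (const ToyShuffle &s : chain)
    anyMultiUse = anyMultiUse || s.numUses > 1;

  // A multi-use shuffle stays alive for its other users, so combining is
  // only allowed here when it does not add a new variable mask.
  return !(anyMultiUse && combinedNeedsVariableMask);
}
```

In this model a single-use chain is always eligible, while a chain with a multi-use input is eligible only when the combined shuffle needs no variable mask.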

This also exposes some poor PMULDQ/PMULUDQ codegen, which was doing unnecessary upper/lower calculations that in fact fold to zero/undef. I've included the fix in this patch, but can commit it separately as a follow-up if you wish to better show the effect.
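For readers less familiar with these instructions, a scalar model shows why work on the upper halves is wasted: PMULUDQ and PMULDQ read only the low 32 bits of each 64-bit lane. This is an illustrative sketch of the instruction semantics for a 128-bit vector, not code from the patch; the helper names `pmuludq` and `pmuldq` and the `V2u64`/`V2i64` aliases are chosen here for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V2u64 = std::array<std::uint64_t, 2>;
using V2i64 = std::array<std::int64_t, 2>;

// Scalar model of PMULUDQ: per 64-bit lane, multiply the zero-extended
// low 32 bits of each input; the upper 32 bits of the inputs are ignored.
V2u64 pmuludq(const V2u64 &a, const V2u64 &b) {
  V2u64 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = (a[i] & 0xFFFFFFFFull) * (b[i] & 0xFFFFFFFFull);
  return r;
}

// Scalar model of PMULDQ: same lanes, but the low 32 bits are sign-extended.
V2i64 pmuldq(const V2u64 &a, const V2u64 &b) {
  V2i64 r{};
  for (std::size_t i = 0; i < r.size(); ++i) {
    auto lo_a = static_cast<std::int32_t>(static_cast<std::uint32_t>(a[i]));
    auto lo_b = static_cast<std::int32_t>(static_cast<std::uint32_t>(b[i]));
    r[i] = static_cast<std::int64_t>(lo_a) * static_cast<std::int64_t>(lo_b);
  }
  return r;
}
```

Because the result depends only on the low halves, any shuffle or arithmetic that exists solely to populate the upper 32 bits of an operand can be dropped without changing the product.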


Repository:
  rL LLVM

https://reviews.llvm.org/D50328

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/2012-01-12-extract-sv.ll
  test/CodeGen/X86/avx2-intrinsics-fast-isel.ll
  test/CodeGen/X86/avx512-intrinsics-fast-isel.ll
  test/CodeGen/X86/bitcast-and-setcc-128.ll
  test/CodeGen/X86/bitcast-setcc-128.ll
  test/CodeGen/X86/combine-shl.ll
  test/CodeGen/X86/extractelement-load.ll
  test/CodeGen/X86/madd.ll
  test/CodeGen/X86/mmx-arith.ll
  test/CodeGen/X86/oddshuffles.ll
  test/CodeGen/X86/pmul.ll
  test/CodeGen/X86/pr29112.ll
  test/CodeGen/X86/pr34592.ll
  test/CodeGen/X86/shrink_vmul.ll
  test/CodeGen/X86/sse2-schedule.ll
  test/CodeGen/X86/sse41-intrinsics-fast-isel.ll
  test/CodeGen/X86/vec_insert-3.ll
  test/CodeGen/X86/vector-constrained-fp-intrinsics.ll
  test/CodeGen/X86/vector-reduce-mul.ll
  test/CodeGen/X86/vector-sext.ll
  test/CodeGen/X86/vector-shuffle-128-v4.ll
  test/CodeGen/X86/vector-shuffle-256-v4.ll
  test/CodeGen/X86/vector-shuffle-256-v8.ll
  test/CodeGen/X86/vector-shuffle-combining.ll
  test/CodeGen/X86/vector-trunc-math.ll
  test/CodeGen/X86/x86-interleaved-access.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D50328.159280.patch
Type: text/x-patch
Size: 181249 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180806/434395ae/attachment-0001.bin>

