[PATCH] D41794: [X86] Improve AVX1 shuffle lowering for v8f32 shuffles where the low half comes from V1 and the high half comes from V2 and the halves do the same operation

Wed Jul 4 12:45:45 PDT 2018

craig.topper updated this revision to Diff 154148.
craig.topper added a comment.

Replace the contents of lowerVectorShuffleByMerging128BitLanes. Now we try to form 2 lane correcting shuffles and a repeated shuffle mask.

The code goes through 2 passes to determine the repeated mask. First if looks for any lanes that need 2 source lanes and build a repeating mask from that if any exist. Then it will try to fit one source lanes to that mask duplicating the same source lane on both inputs if necessary to match the repeated mask.

This gets the case in PR36933 now too. I haven't added the test for it to the patch yet.

It's regressing some of the avx512 cases now though because we don't seem to be able to combine 2 source permi2q with a perm2f128 feeding it.

https://reviews.llvm.org/D41794

Files:
  lib/Target/X86/X86ISelLowering.cpp
  test/CodeGen/X86/avx512-shuffles/partial_permute.ll
  test/CodeGen/X86/prefer-avx256-mask-shuffle.ll
  test/CodeGen/X86/vector-shuffle-256-v16.ll
  test/CodeGen/X86/vector-shuffle-256-v32.ll
  test/CodeGen/X86/vector-shuffle-256-v4.ll
  test/CodeGen/X86/vector-shuffle-256-v8.ll
  test/CodeGen/X86/vector-shuffle-512-v64.ll
  test/CodeGen/X86/vector-shuffle-avx512.ll
  test/CodeGen/X86/vector-shuffle-combining.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D41794.154148.patch
Type: text/x-patch
Size: 109282 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180704/ee17596c/attachment.bin>