[all-commits] [llvm/llvm-project] 35dc91: [X86][SSE] lowerShuffleAsDecomposedShuffleBlend - ...
Simon Pilgrim via All-commits
all-commits at lists.llvm.org
Sat Sep 12 05:41:06 PDT 2020
Branch: refs/heads/master
Home: https://github.com/llvm/llvm-project
Commit: 35dc91aee2013ce1a57dfee965fa5fdee1987ee0
https://github.com/llvm/llvm-project/commit/35dc91aee2013ce1a57dfee965fa5fdee1987ee0
Author: Simon Pilgrim <llvm-dev at redking.me.uk>
Date: 2020-09-12 (Sat, 12 Sep 2020)
Changed paths:
M llvm/lib/Target/X86/X86ISelLowering.cpp
M llvm/test/CodeGen/X86/vector-shuffle-128-v16.ll
M llvm/test/CodeGen/X86/vector-shuffle-256-v16.ll
M llvm/test/CodeGen/X86/vector-shuffle-256-v32.ll
M llvm/test/CodeGen/X86/vector-shuffle-512-v32.ll
Log Message:
-----------
[X86][SSE] lowerShuffleAsDecomposedShuffleBlend - support decomposed unpacks for some vXi8/vXi16 cases
Follow up to D86429 to handle the remaining regressions.
This patch generalizes lowerShuffleAsDecomposedShuffleBlend to lowerShuffleAsDecomposedShuffleMerge, and attempts to use an UNPCKL shuffle mask instead of a blend for the cases where the inputs are coming from alternating vXi8/vXi16 sources. Technically they don't have to be alternating (they just need to fit into a lower lane half for the unpack), but I didn't find as many general cases, and handling them needed a lot more of the function to be altered.
For vXi32/vXi64 cases this could still be beneficial, but in most cases the existing permute+blend approach was better.
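To illustrate the idea of merging with an unpack rather than a blend, here is a minimal standalone sketch. It is not the code from X86ISelLowering.cpp: the names decomposeAsUnpackLo, DecomposedUnpack, shuffle, permute and unpackLo are hypothetical, and it assumes a single 128-bit lane with strictly alternating sources (even result elements from the first input, odd ones from the second). Each input is first permuted so that its needed elements sit in the low half, and an UNPCKL-style interleave then produces the result.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical result type: one single-input permute mask per source.
// An entry of -1 means "undefined / don't care".
struct DecomposedUnpack {
  std::vector<int> V1Mask;
  std::vector<int> V2Mask;
};

// Try to split a two-input shuffle mask (indices 0..N-1 select from V1,
// N..2N-1 select from V2) into two permutes followed by an unpack-low.
// Assumption: only the strictly alternating V1/V2 pattern is accepted.
std::optional<DecomposedUnpack>
decomposeAsUnpackLo(const std::vector<int> &Mask) {
  const int N = static_cast<int>(Mask.size());
  if (N == 0 || N % 2 != 0)
    return std::nullopt;
  DecomposedUnpack R{std::vector<int>(N, -1), std::vector<int>(N, -1)};
  for (int I = 0; I < N; ++I) {
    const int M = Mask[I];
    if (M < 0)
      continue; // undefined element, matches anything
    if (M >= 2 * N)
      return std::nullopt;
    const bool FromV2 = M >= N;
    if (FromV2 != (I % 2 == 1))
      return std::nullopt; // sources do not alternate
    // Unpack-low reads element I/2 of each permuted input.
    (FromV2 ? R.V2Mask : R.V1Mask)[I / 2] = FromV2 ? M - N : M;
  }
  return R;
}

// Reference semantics of the original two-input shuffle (no undefs handled).
std::vector<int> shuffle(const std::vector<int> &V1,
                         const std::vector<int> &V2,
                         const std::vector<int> &Mask) {
  assert(V1.size() == V2.size() && V1.size() == Mask.size());
  const int N = static_cast<int>(V1.size());
  std::vector<int> Out(N, 0);
  for (int I = 0; I < N; ++I)
    Out[I] = Mask[I] < N ? V1[Mask[I]] : V2[Mask[I] - N];
  return Out;
}

// Single-input permute; undefined (-1) entries produce 0 as a placeholder.
std::vector<int> permute(const std::vector<int> &V,
                         const std::vector<int> &Mask) {
  std::vector<int> Out(V.size(), 0);
  for (std::size_t I = 0; I < V.size(); ++I)
    if (Mask[I] >= 0)
      Out[I] = V[Mask[I]];
  return Out;
}

// Interleave the low halves of A and B: A0, B0, A1, B1, ...
std::vector<int> unpackLo(const std::vector<int> &A,
                          const std::vector<int> &B) {
  std::vector<int> Out(A.size(), 0);
  for (std::size_t K = 0; K < A.size() / 2; ++K) {
    Out[2 * K] = A[K];
    Out[2 * K + 1] = B[K];
  }
  return Out;
}
```

For an 8-element mask such as {3, 8, 0, 13, 2, 10, 1, 15}, this sketch yields the permute masks {3, 0, 2, 1, -1, -1, -1, -1} for the first input and {0, 5, 2, 7, -1, -1, -1, -1} for the second. The real lowering works per 128-bit lane and handles more than this model does.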
Differential Revision: https://reviews.llvm.org/D87405