[PATCH] D24683: [DAGCombine] Generalize build_vector -> vector_shuffle combine for more than 2 inputs
Michael Kuperstein via llvm-commits
llvm-commits at lists.llvm.org
Sat Sep 17 17:06:31 PDT 2016
mkuper added a comment.
In https://reviews.llvm.org/D24683#545621, @RKSimon wrote:
> Nice work! There's a lot going on here which makes we wonder whether this should be split into several patches:
>
> 1 - createBuildVecShuffle - generalise support for the shuffling of different sized inputs + outputs
> 2 - shuffling with constant values (not just zeros)
> 3 - support any number of inputs
Thanks, Simon!
I'm not sure what you mean by "split". :-)
This patch only contains item 3.
Re 1 - splitting the shuffle logic into "createBuildVecShuffle" doesn't generalize anything, it's just a refactoring, that only makes sense once we have more than 2 inputs. The restrictions on input sizes existed before this patch.
Re 2 - we have only been blending with zeroes for years. I agree we should probably blend with any constant vectors, and will be happy to add yet another TODO. However, that seems orthogonal to this patch.
In any case, I don't see a reason to gate (3) on either (1) or (2).
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12900
@@ -12977,13 +12899,3 @@
if (InVT1 != VT || InVT2 != VT) {
- // Both inputs and the output must have the same base element type.
- EVT ElemType = VT.getVectorElementType();
- if (ElemType != InVT1.getVectorElementType() ||
- ElemType != InVT2.getVectorElementType())
- return SDValue();
-
- // TODO: Canonicalize this so that if the vectors have different lengths,
- // VecIn1 is always longer.
-
- // The element types match, now figure out the lengths.
if (InVT1.getSizeInBits() * 2 == VT.getSizeInBits() && InVT1 == InVT2) {
// If both input vectors are exactly half the size of the output, concat
----------------
RKSimon wrote:
> Support (VT.getSizeInBits() % InVT1.getSizeInBits()) == 0 ? Not just x2?
There's a TODO for this in line 12934.
This isn't really new code, we've been doing this only for x2 for ages - some of it is really old, some I've added sometime in 2014. (I refactored it somewhat in r281283 to make this patch easier, but that didn't change the logic.)
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12906
@@ -12993,8 +12905,3 @@
VecIn2 = SDValue();
- // If we have one "real" input and are blending with zero, we need the
- // zero elements to come from the second input, not the undef part of the
- // first input.
- if (UsesZeroVector)
- Vec2Offset = NumElems;
} else if (InVT1.getSizeInBits() == VT.getSizeInBits() * 2) {
if (!TLI.isExtractSubvectorCheap(VT, NumElems))
----------------
RKSimon wrote:
> Support (InVT1.getSizeInBits() % VT.getSizeInBits()) == 0 ? Not just x2?
Same as above.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12981
@@ +12980,3 @@
+// at most two distinct vectors, turn this into a shuffle node.
+// TODO: Support more than two inputs by constructing a tree of shuffles.
+SDValue DAGCombiner::reduceBuildVecToShuffle(SDNode *N) {
----------------
RKSimon wrote:
> This comment needs a rewrite now that you're supporting any number of inputs.
Argh, right, missed it. Thanks.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13005
@@ +13004,3 @@
+ // TODO: Would it be better to replace VecMap with a search in VecIn?
+ // The vector is expected to be very small...
+ SmallVector<int, 8> VectorMask;
----------------
RKSimon wrote:
> Yes VecMap is probably overkill.
Ok, will rewrite with a search, unless someone objects. Too bad we don't have a SmallMap.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13006
@@ +13005,3 @@
+ // The vector is expected to be very small...
+ SmallVector<int, 8> VectorMask;
+ SmallVector<SDValue, 8> VecIn;
----------------
RKSimon wrote:
> Initialise to VectorMask(NumElems, -1) ?
As usual. :-)
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13020
@@ +13019,3 @@
+ // See if we can use a blend with a zero vector.
+ if (isNullConstant(Op) || isNullFPConstant(Op)) {
+ UsesZeroVector = true;
----------------
RKSimon wrote:
> Could we not generalise this to any constant? I'm thinking about cases such as float4 where the .w component is set to 1.0.
I think we could.
Note that this isn't really new - the blend with zero logic has been here for a while.
I assume the motivation was that blends with zero are especially cheap, because you don't even need to load the constant. But I agree that general constants should probably also be a win. That's for a separate patch, though. I'll add a TODO.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13050
@@ +13049,3 @@
+ // If we didn't find at least one input vector, bail out.
+ if (VecIn.size() < 2)
+ return SDValue();
----------------
RKSimon wrote:
> Rewrite the comment to explain its one input vector + zero vector.
I don't think it's true - it can really be a single input vector.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13073
@@ +13072,3 @@
+ SmallVector<SDValue, 4> Shuffles;
+ for (unsigned In = 0; In < (VecIn.size() / 2); ++In) {
+ unsigned LeftIdx = 2 * In + 1;
----------------
RKSimon wrote:
> for (unsigned In = 0, Len = (VecIn.size() / 2); In < Len; ++In) {
Right, thanks.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13123
@@ +13122,3 @@
+ }
+ for (unsigned In = 0; In < CurSize / 2; ++In) {
+ int Left = 2 * In;
----------------
RKSimon wrote:
> for (unsigned In = 0, Len = CurSize / 2; In < Len; ++In) {
Right, thanks.
https://reviews.llvm.org/D24683
More information about the llvm-commits
mailing list