[PATCH] D24491: [DAG] Allow build_vector-to-vector_shuffle combine to combine builds from two overly-wide vectors.
Michael Kuperstein via llvm-commits
llvm-commits at lists.llvm.org
Tue Sep 13 09:25:54 PDT 2016
mkuper added a comment.
Thanks Alexey, Simon, Andrea!
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12962-12973
@@ -12957,11 +12961,14 @@
// We can't generate a shuffle node with mismatched input and output types.
// Try to make the types match.
if (InVT1 != VT || InVT2 != VT) {
// Both inputs and the output must have the same base element type.
EVT ElemType = VT.getVectorElementType();
if (ElemType != InVT1.getVectorElementType() ||
ElemType != InVT2.getVectorElementType())
return SDValue();
+ // TODO: Canonicalize this so that if the vectors have different lengths,
+ // VecIn1 is always longer.
+
// The element types match, now figure out the lengths.
----------------
andreadb wrote:
> My understanding is that we check if VT is a legal type for the target at the very beginning of this method.
>
> However, (correct me if I am wrong) we don't check if InVT1 and InVT2 are legal types. So, we potentially always enable this canonicalization even on illegal vector types.
>
> I wonder if we should add a early exit for the case where the DAG is not "type legalized" and InVT1 and InVT2 are not legal types for the target. The legalizer would adjust/canonicalize those vector types. We would still end up running your code if eventually types still don't match VT.
>
> Not sure if this makes sense.
I'm not entirely sure. If the inputs are longer than the output, you're probably right.
The case that worries me, though, is something like a build_vector of <4 x i32> from 2 * <2 x i32>. I think we may want to hit this pre-legalization.
I'll add (yet another :-) ) TODO, and we can think about this post-commit. I'm going to keep working on this code anyway.
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12996
@@ +12995,3 @@
+ // output, split it in two.
+ VecIn2 = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VT, VecIn1,
+ DAG.getConstant(NumElems, dl, IdxTy));
----------------
RKSimon wrote:
> Its not necessarily twice the size here is it? I'm thinking xmm from zmm.
You're right, that should be another TODO. (This isn't a change made in this patch, we've always supported it only in the "twice the size" case).
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13054-13055
@@ +13053,4 @@
+ // only care about the low elements anyway. Pad the mask with undef.
+ for (unsigned i = NumElems; i != ShuffleNumElems; ++i)
+ Mask.push_back(-1);
+
----------------
RKSimon wrote:
> mkuper wrote:
> > ABataev wrote:
> > > ```
> > > if (ShuffleNumElems > NumElems)
> > > Mask.append(ShuffleNumElems-NumElems, -1)
> > > ```
> > > ?
> > Yeah, I keep writing these loops on auto-pilot. :-)
> > This should either be an append, or Mask should just be initialized to -1. Thanks!
> It'd be tidier if you just did SmallVector<int, 8> Mask(ShuffleNumElems, -1) and replaced the push_back() calls with Mask[i]
Yes, that's what I meant by "should just be initialized to -1".
I think we've already had this conversation in a previous patch, as I said, this is just auto-pilot. Thanks again!
================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13079
@@ +13078,3 @@
+ Shuffle = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VT, Shuffle,
+ DAG.getConstant(0, dl, IdxTy));
+
----------------
RKSimon wrote:
> Use ZeroIdx?
Right, just need to define it in the right scope.
https://reviews.llvm.org/D24491
More information about the llvm-commits
mailing list