[PATCH] D24491: [DAG] Allow build_vector-to-vector_shuffle combine to combine builds from two overly-wide vectors.

Tue Sep 13 09:25:54 PDT 2016

mkuper added a comment.

Thanks Alexey, Simon, Andrea!


================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12962-12973
@@ -12957,11 +12961,14 @@
 
   // We can't generate a shuffle node with mismatched input and output types.
   // Try to make the types match.
   if (InVT1 != VT || InVT2 != VT) {
     // Both inputs and the output must have the same base element type.
     EVT ElemType = VT.getVectorElementType();
     if (ElemType != InVT1.getVectorElementType() ||
         ElemType != InVT2.getVectorElementType())
       return SDValue();
 
+    // TODO: Canonicalize this so that if the vectors have different lengths,
+    // VecIn1 is always longer.
+
     // The element types match, now figure out the lengths.
----------------
andreadb wrote:
> My understanding is that we check if VT is a legal type for the target at the very beginning of this method.
> 
> However, (correct me if I am wrong) we don't check if InVT1 and InVT2 are legal types. So, we potentially always enable this canonicalization even on illegal vector types.
> 
> I wonder if we should add a early exit for the case where the DAG is not "type legalized" and InVT1 and InVT2 are not legal types for the target. The legalizer would adjust/canonicalize those vector types. We would still end up running your code if eventually types still don't match VT.
> 
> Not sure if this makes sense.
I'm not entirely sure. If the inputs are longer than the output, you're probably right.
The case that worries me, though, is something like a build_vector of <4 x i32> from 2 * <2 x i32>. I think we may want to hit this pre-legalization.
I'll add (yet another :-) ) TODO, and we can think about this post-commit. I'm going to keep working on this code anyway.

================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:12996
@@ +12995,3 @@
+        // output, split it in two.        
+        VecIn2 = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VT, VecIn1,
+                             DAG.getConstant(NumElems, dl, IdxTy));
----------------
RKSimon wrote:
> Its not necessarily twice the size here is it? I'm thinking xmm from zmm.
You're right, that should be another TODO. (This isn't a change made in this patch, we've always supported it only in the "twice the size" case). 

================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13054-13055
@@ +13053,4 @@
+  // only care about the low elements anyway. Pad the mask with undef.
+  for (unsigned i = NumElems; i != ShuffleNumElems; ++i)
+    Mask.push_back(-1);
+
----------------
RKSimon wrote:
> mkuper wrote:
> > ABataev wrote:
> > > ```
> > > if (ShuffleNumElems > NumElems)
> > >   Mask.append(ShuffleNumElems-NumElems, -1)
> > > ```
> > > ?
> > Yeah, I keep writing these loops on auto-pilot. :-)
> > This should either be an append, or Mask should just be initialized to -1. Thanks!
> It'd be tidier if you just did SmallVector<int, 8> Mask(ShuffleNumElems, -1) and replaced the push_back() calls with Mask[i]
Yes, that's what I meant by "should just be initialized to -1".
I think we've already had this conversation in a previous patch, as I said, this is just auto-pilot. Thanks again!

================
Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:13079
@@ +13078,3 @@
+    Shuffle = DAG.getNode(ISD::EXTRACT_SUBVECTOR, dl, VT, Shuffle,
+                          DAG.getConstant(0, dl, IdxTy));
+
----------------
RKSimon wrote:
> Use ZeroIdx?
Right, just need to define it in the right scope.


https://reviews.llvm.org/D24491