[llvm] r212517 - [x86, SDAG] Sink the logic for folding shuffles of splats more

Tue Jul 8 01:45:39 PDT 2014

Author: chandlerc
Date: Tue Jul  8 03:45:38 2014
New Revision: 212517

URL: http://llvm.org/viewvc/llvm-project?rev=212517&view=rev
Log:
[x86,SDAG] Sink the logic for folding shuffles of splats more
aggressively from the x86 shuffle lowering to the generic SDAG vector
shuffle formation code.

This code already tried to fold away shuffles of splats! It just had
lots of bugs and couldn't handle the case my new x86 shuffle lowering
needed.

First, it failed to correctly compute whether N2 was undef because it
pre-computed this, then did transformations which could *make* N2 undef,
then failed to ever re-consider the precomputed state.
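
The stale-flag problem can be shown outside LLVM with a toy model. This is only a sketch: `Operand`, `Canonicalized`, and `canonicalize` are hypothetical names invented for illustration, not SelectionDAG APIs, and the mask convention (-1 for an undef lane, indices >= the lane count selecting from the second operand) is an assumption modeled on shuffle masks.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for an SDValue; it only records undef-ness.
struct Operand {
  bool IsUndef;
};

struct Canonicalized {
  Operand N1, N2;
  bool CachedN2Undef; // snapshot taken before canonicalization
  bool FreshN2Undef;  // recomputed after canonicalization
};

// Toy canonicalization: if the mask only reads one operand, the other one
// becomes undef, and a lone right-hand operand is commuted to the left.
// (An all-undef mask is left alone here; real code would fold it to undef.)
Canonicalized canonicalize(Operand N1, Operand N2, std::vector<int> Mask) {
  const int NElts = static_cast<int>(Mask.size());
  const bool CachedN2Undef = N2.IsUndef;
  bool AllLHS = true, AllRHS = true;
  for (int &M : Mask) {
    if (M >= NElts) {
      if (CachedN2Undef)
        M = -1; // reading an undef operand yields an undef lane
      else
        AllLHS = false;
    } else if (M >= 0) {
      AllRHS = false;
    }
  }
  if (AllLHS && !AllRHS) {
    N2.IsUndef = true; // second operand is never read
  } else if (AllRHS && !AllLHS) {
    N1.IsUndef = true; // first operand is never read
    std::swap(N1, N2); // commute so the live operand is on the left
  }
  return {N1, N2, CachedN2Undef, N2.IsUndef};
}
```

With two defined operands and a mask that reads only the first, `CachedN2Undef` stays false while `FreshN2Undef` is true, so any later fold guarded by the cached flag would be skipped.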

Second, it didn't look through bitcasts at all, even in the safe cases
where they are just element-type bitcasts with no change to the number
of elements.
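
As a rough illustration of looking through bitcasts, here is a self-contained sketch. `Node`, `Opcode`, `stripBitcasts`, and `sameElementCount` are hypothetical names for a toy node graph, not the SelectionDAG classes; the point is only that stripping is unconditional and the element-count comparison is a separate, later step.

```cpp
#include <cassert>

enum class Opcode { Bitcast, BuildVector, Other };

// Hypothetical toy node: an opcode, a vector element count, and (for
// bitcasts only) the node being cast.
struct Node {
  Opcode Op;
  unsigned NumElts;
  const Node *Operand;
};

// Walk through any chain of bitcasts to the underlying node.
const Node *stripBitcasts(const Node *V) {
  while (V->Op == Opcode::Bitcast)
    V = V->Operand;
  return V;
}

// The "safe" case: the cast changed the element type but not the count.
bool sameElementCount(const Node &Outer, const Node &Inner) {
  return Outer.NumElts == Inner.NumElts;
}
```

A 4-element build vector cast to another 4-element type passes `sameElementCount`; the same vector cast to 8 elements does not, and needs the zero-splat special case instead.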

Third, it didn't handle bitcasts of all-zero splats the way my code on
the x86 side did, which is essential to getting good zext-shuffle
lowerings.
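
Why zero is special can be checked with plain integers. The sketch below is an assumption-laden illustration, not LLVM code: `V4I32`, `V8I16`, `asHalfLanes`, and `isSplat` are hypothetical helpers, and `std::memcpy` stands in for a bitcast that changes the number of lanes.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

using V4I32 = std::array<std::uint32_t, 4>;
using V8I16 = std::array<std::uint16_t, 8>;

// Reinterpret four 32-bit lanes as eight 16-bit lanes, the way a bitcast
// that changes the element count would.
V8I16 asHalfLanes(const V4I32 &V) {
  static_assert(4 * sizeof(std::uint32_t) == 8 * sizeof(std::uint16_t),
                "a bitcast preserves the total size");
  V8I16 Out;
  std::memcpy(Out.data(), V.data(), 4 * sizeof(std::uint32_t));
  return Out;
}

// True if every lane holds the same value.
bool isSplat(const V8I16 &V) {
  for (std::size_t I = 1; I != V.size(); ++I)
    if (V[I] != V[0])
      return false;
  return true;
}
```

A splat of zero is still a splat of zero at the narrower lane width, so any shuffle of it is a no-op. A splat of 1 becomes alternating 1 and 0 lanes (in either byte order), so a shuffle of the recast value can no longer be dropped.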

But all of these are generic. I just ported the code down to this layer
and fixed the surrounding bugs. Tests exercising this in the x86 backend
still pass and some silly code in widen_cast-6.ll gets better. I updated
that test to be a bit more precise but it's still pretty unclear what
the value of the test is in this day and age.

Modified:
    llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
    llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
    llvm/trunk/test/CodeGen/X86/widen_cast-6.ll

Modified: llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp?rev=212517&r1=212516&r2=212517&view=diff
==============================================================================

--- llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp (original)
+++ llvm/trunk/lib/CodeGen/SelectionDAG/SelectionDAG.cpp Tue Jul  8 03:45:38 2014
@@ -1496,6 +1496,11 @@ SDValue SelectionDAG::getVectorShuffle(E
     N1 = getUNDEF(VT);
     commuteShuffle(N1, N2, MaskVec);
   }
+  // Reset our undef status after accounting for the mask.
+  N2Undef = N2.getOpcode() == ISD::UNDEF;
+  // Re-check whether both sides ended up undef.
+  if (N1.getOpcode() == ISD::UNDEF && N2Undef)
+    return getUNDEF(VT);
 
   // If Identity shuffle return that node.
   bool Identity = true;
@@ -1506,11 +1511,36 @@ SDValue SelectionDAG::getVectorShuffle(E
     return N1;
 
   // Shuffling a constant splat doesn't change the result.
-  bool SplatHasUndefs;
-  if (N2Undef && N1.getOpcode() == ISD::BUILD_VECTOR)
-    if (cast<BuildVectorSDNode>(N1)->getConstantSplatNode(SplatHasUndefs) &&
-        !SplatHasUndefs)
-      return N1;
+  if (N2Undef) {
+    SDValue V = N1;
+
+    // Look through any bitcasts. We check below that these don't change the
+    // number (and size) of elements and just change their types.
+    while (V.getOpcode() == ISD::BITCAST)
+      V = V->getOperand(0);
+
+    // A splat should always show up as a build vector node.
+    if (auto *BV = dyn_cast<BuildVectorSDNode>(V)) {
+      bool SplatHasUndefs;
+      SDValue Splat = BV->getSplatValue(SplatHasUndefs);
+      // If this is a splat of an undef, shuffling it is also undef.
+      if (Splat && Splat.getOpcode() == ISD::UNDEF)
+        return getUNDEF(VT);
+
+      // We only have a splat which can skip shuffles if there is a splatted
+      // value and no undef lanes rearranged by the shuffle.
+      if (Splat && !SplatHasUndefs) {
+        // Splat of <x, x, ..., x>, return <x, x, ..., x>, provided that the
+        // number of elements match or the value splatted is a zero constant.
+        if (V.getValueType().getVectorNumElements() ==
+            VT.getVectorNumElements())
+          return N1;
+        if (auto *C = dyn_cast<ConstantSDNode>(Splat))
+          if (C->isNullValue())
+            return N1;
+      }
+    }
+  }
 
   FoldingSetNodeID ID;
   SDValue Ops[2] = { N1, N2 };

Modified: llvm/trunk/lib/Target/X86/X86ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/X86/X86ISelLowering.cpp?rev=212517&r1=212516&r2=212517&view=diff
==============================================================================
--- llvm/trunk/lib/Target/X86/X86ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/X86/X86ISelLowering.cpp Tue Jul  8 03:45:38 2014
@@ -7924,47 +7924,6 @@ static SDValue lowerVectorShuffle(SDValu
         return DAG.getVectorShuffle(VT, dl, V1, V2, NewMask);
       }
 
-  // Check for a shuffle of a splat, and return just the splat. While DAG
-  // combining will do a similar transformation, this shows up with the
-  // internally created shuffles and so we handle it specially here as we won't
-  // have another chance to DAG-combine the generic shuffle instructions.
-  if (V2IsUndef) {
-    SDValue V = V1;
-
-    // Look through any bitcasts. These can't change the size, just the number
-    // of elements which we check later.
-    while (V.getOpcode() == ISD::BITCAST)
-      V = V->getOperand(0);
-
-    // A splat should always show up as a build vector node.
-    if (V.getOpcode() == ISD::BUILD_VECTOR) {
-      SDValue Base;
-      bool AllSame = true;
-      for (unsigned i = 0; i != V->getNumOperands(); ++i)
-        if (V->getOperand(i).getOpcode() != ISD::UNDEF) {
-          Base = V->getOperand(i);
-          break;
-        }
-      // Splat of <u, u, ..., u>, return <u, u, ..., u>
-      if (!Base)
-        return V1;
-      for (unsigned i = 0; i != V->getNumOperands(); ++i)
-        if (V->getOperand(i) != Base) {
-          AllSame = false;
-          break;
-        }
-      // Splat of <x, x, ..., x>, return <x, x, ..., x>, provided that the
-      // number of elements match or the value splatted is a zero constant.
-      if (AllSame) {
-        if (V.getValueType().getVectorNumElements() == (unsigned)NumElements)
-          return V1;
-        if (auto *C = dyn_cast<ConstantSDNode>(Base))
-          if (C->isNullValue())
-            return V1;
-      }
-    }
-  }
-
   // For integer vector shuffles, try to collapse them into a shuffle of fewer
   // lanes but wider integers. We cap this to not form integers larger than i64
   // but it might be interesting to form i128 integers to handle flipping the

Modified: llvm/trunk/test/CodeGen/X86/widen_cast-6.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/CodeGen/X86/widen_cast-6.ll?rev=212517&r1=212516&r2=212517&view=diff
==============================================================================
--- llvm/trunk/test/CodeGen/X86/widen_cast-6.ll (original)
+++ llvm/trunk/test/CodeGen/X86/widen_cast-6.ll Tue Jul  8 03:45:38 2014
@@ -1,9 +1,13 @@
 ; RUN: llc < %s -march=x86 -mattr=+sse4.1 | FileCheck %s
-; CHECK: movd
 
 ; Test bit convert that requires widening in the operand.
 
 define i32 @return_v2hi() nounwind {
+; CHECK-LABEL: @return_v2hi
+; CHECK:      pushl
+; CHECK-NEXT: xorl %eax, %eax
+; CHECK-NEXT: popl
+; CHECK-NEXT: ret
 entry:
 	%retval12 = bitcast <2 x i16> zeroinitializer to i32		; <i32> [#uses=1]
 	ret i32 %retval12