[PATCH][X86] Fix assertion failure caused by a wrong folding of vector shifts by immediate count (regression introduced by my r198113).

Mon Jan 13 08:24:46 PST 2014

Hi,

This patch fixes an assertion failure caused by the algorithm
introduced by my r198113. Revision 198113 introduced an algorithm that
tries to fold a vector shift by immediate count into a build_vector if
the input vector to the shift is a known vector of constants.

However, the algorithm only works under the assumption that the input
vector type and the shift type are exactly the same.
This patch conservatively disables the folding of the vector shift by
immediate count if the input vector type and the shift valuetype are
not the same.

Thanks to Patrik Hägglund H for providing the test case.

Please let me know if ok to submit.

Thanks,
Andrea Di Biagio
SN Systems - Sony Computer Entertainment Group
-------------- next part --------------
Index: test/CodeGen/X86/vshift-6.ll
===================================================================

--- test/CodeGen/X86/vshift-6.ll	(revision 0)
+++ test/CodeGen/X86/vshift-6.ll	(revision 0)
@@ -0,0 +1,36 @@
+; RUN: llc < %s -mcpu=corei7 -march=x86-64 -mattr=+sse2  | FileCheck %s
+
+; This test makes sure that the compiler does not crash with an
+; assertion failure when trying to fold a vector shift left
+; by immediate count if the type of the input vector is different
+; to the result type.
+;
+; This happens for example when lowering a shift left of a MVT::v16i8 vector.
+; This is custom lowered into the following sequence:
+;     count << 5
+;     A =  VSHLI(MVT::v8i16, r & (char16)15, 4)
+;     B = BITCAST MVT::v16i8, A
+;     VSELECT(r, B, count);
+;     count += count
+;     C = VSHLI(MVT::v8i16, r & (char16)63, 2)
+;     D = BITCAST MVT::v16i8, C
+;     r = VSELECT(r, C, count);
+;     count += count
+;     VSELECT(r, r+r, count);
+;     count = count << 5;
+;
+; Where 'r' is a vector of type MVT::v16i8, and
+; 'count' is the vector shift count.
+
+define <16 x i8> @do_not_crash(i8*, i32*, i64*, i32, i64, i8) {
+entry:
+  store i8 %5, i8* %0
+  %L5 = load i8* %0
+  %I8 = insertelement <16 x i8> <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>, i8 %L5, i32 7
+  %B51 = shl <16 x i8> <i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1, i8 -1>, %I8
+  ret <16 x i8> %B51
+}
+
+; CHECK-LABEL: do_not_crash
+; CHECK: ret
+
Index: lib/Target/X86/X86ISelLowering.cpp
===================================================================
--- lib/Target/X86/X86ISelLowering.cpp	(revision 199114)
+++ lib/Target/X86/X86ISelLowering.cpp	(working copy)
@@ -11118,8 +11118,9 @@
          && "Unknown target vector shift-by-constant node");
 
   // Fold this packed vector shift into a build vector if SrcOp is a
-  // vector of ConstantSDNodes or UNDEFs.
-  if (ISD::isBuildVectorOfConstantSDNodes(SrcOp.getNode())) {
+  // vector of Constants or UNDEFs, and SrcOp valuetype is the same as VT.
+  if (VT == SrcOp.getSimpleValueType() &&
+      ISD::isBuildVectorOfConstantSDNodes(SrcOp.getNode())) {
     SmallVector<SDValue, 8> Elts;
     unsigned NumElts = SrcOp->getNumOperands();
     ConstantSDNode *ND;