[PATCH] D14261: [X86][SSE] Recursive search for zeroable shuffle elements

Andrea Di Biagio via llvm-commits llvm-commits at lists.llvm.org
Fri Nov 6 04:47:02 PST 2015


andreadb added a comment.

Hi Simon,


================
Comment at: lib/Target/X86/X86ISelLowering.cpp:6732-6734
@@ +6731,5 @@
+    // SHUFFLE_VECTOR - recursive call to computeZeroableShuffleElements.
+    else if (ShuffleVectorSDNode *S = dyn_cast<ShuffleVectorSDNode>(V))
+      return computeZeroableShuffleElements(S->getMask(), S->getOperand(0),
+                                            S->getOperand(1));
+
----------------
Do you think it would make sense to limit the recursion level?
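
To make the question concrete, here is a minimal standalone sketch of a depth-limited version of the recursion. It does not use the real SelectionDAG types: `Node`, `zeroable`, `Depth` and `MaxDepth` are hypothetical names introduced only for illustration, and the sketch assumes every operand has the same number of elements as the shuffle that uses it. Once the limit is reached, a nested shuffle is treated as opaque and none of its elements are reported as zeroable, which is the conservative answer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an SDValue: either a leaf with known per-element
// zero flags, or a two-input shuffle described by a mask.
struct Node {
  std::vector<bool> LeafZero;  // per-element "known zero" flags (leaf only)
  std::vector<int> Mask;       // empty for a leaf; -1 = undef,
                               // [0, N) selects from LHS, [N, 2N) from RHS
  const Node *LHS = nullptr;
  const Node *RHS = nullptr;
};

// Returns one flag per element: true if the element is known zero or undef.
// Nested shuffles are only looked through while Depth < MaxDepth; beyond
// that, the shuffle is treated as opaque and nothing is marked zeroable.
std::vector<bool> zeroable(const Node &N, unsigned Depth, unsigned MaxDepth) {
  if (N.Mask.empty())
    return N.LeafZero;

  std::vector<bool> Result(N.Mask.size(), false);
  if (Depth >= MaxDepth)
    return Result;  // recursion limit hit: conservative answer

  std::vector<bool> L = zeroable(*N.LHS, Depth + 1, MaxDepth);
  std::vector<bool> R = zeroable(*N.RHS, Depth + 1, MaxDepth);

  int Size = static_cast<int>(N.Mask.size());
  for (int i = 0; i != Size; ++i) {
    int M = N.Mask[i];
    if (M < 0)
      Result[i] = true;         // undef index
    else if (M < Size)
      Result[i] = L[M];         // element comes from the first operand
    else
      Result[i] = R[M - Size];  // element comes from the second operand
  }
  return Result;
}
```

With a limit like this, the worst case is a missed optimisation on very deep shuffle chains, in exchange for a bounded amount of work per query.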

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:6746-6749
@@ -6722,4 +6745,6 @@
 
   bool V1IsZero = ISD::isBuildVectorAllZeros(V1.getNode());
   bool V2IsZero = ISD::isBuildVectorAllZeros(V2.getNode());
+  SmallBitVector V1Zeroables = GetSubZeroable(V1);
+  SmallBitVector V2Zeroables = GetSubZeroable(V2);
 
----------------
I don't think you need to compute V1IsZero and V2IsZero anymore. The two calls to 'isBuildVectorAllZeros' are now made redundant by the calls to 'GetSubZeroable'.

You can also simplify the if statement at line 6754. In particular, you would only need to check for the presence of an undef index. All the remaining cases should be taken care of by the checks after line 6760.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:6777-6787
@@ -6745,1 +6776,13 @@
+    // If more (narrower) elements - all aliased 'sub-elements' must be zero.
+    if (Size < SubSize) {
+      assert(0 == (SubSize % Size) && "Bad scale");
+      unsigned Scale = SubSize / Size;
+      bool Zero = true;
+      for (unsigned j = 0; j != Scale; ++j)
+        Zero &= SubZeroable[(Idx * Scale) + j];
+      Zeroable[i] = Zero;
+      continue;
+    }
+
+    llvm_unreachable("Unexpected mask size");
   }
----------------
At line 6777, you don't need to check if 'Size < SubSize'. If control reaches line 6777, then Size can never be bigger than or equal to SubSize.

The reason is that the check for (Size < SubSize) is dominated by the earlier checks for (Size == SubSize) and (Size > SubSize): once both of those have failed, (Size < SubSize) is the only case left. You can also remove the llvm_unreachable at line 6787 as it is not needed.
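
A minimal standalone sketch of the resulting control flow, using plain std::vector<bool> instead of SmallBitVector. The helper name `isZeroableAt` is hypothetical, and the body of the (Size > SubSize) branch is an assumption, since that part of the patch is not quoted above; only the final fall-through case mirrors the quoted hunk.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: is element Idx of a vector viewed as Size elements
// zeroable, given zero flags for the same vector viewed as SubSize elements?
bool isZeroableAt(unsigned Idx, unsigned Size,
                  const std::vector<bool> &SubZeroable) {
  unsigned SubSize = static_cast<unsigned>(SubZeroable.size());

  // Same element count: direct lookup.
  if (Size == SubSize)
    return SubZeroable[Idx];

  // Fewer (wider) sub-elements: assumed here to inherit the flag of the
  // wider sub-element that contains element Idx.
  if (Size > SubSize) {
    assert(0 == (Size % SubSize) && "Bad scale");
    unsigned Scale = Size / SubSize;
    return SubZeroable[Idx / Scale];
  }

  // Only Size < SubSize can reach this point, so no explicit check and no
  // trailing llvm_unreachable are needed. All aliased narrower sub-elements
  // must be zero.
  assert(0 == (SubSize % Size) && "Bad scale");
  unsigned Scale = SubSize / Size;
  bool Zero = true;
  for (unsigned j = 0; j != Scale; ++j)
    Zero = Zero && SubZeroable[(Idx * Scale) + j];
  return Zero;
}
```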


Repository:
  rL LLVM

http://reviews.llvm.org/D14261




