[PATCH] [X86][SSE] psrl(w/d/q) and psll(w/d/q) bit shifts for SSE2

Mon Dec 15 10:56:59 PST 2014

REPOSITORY
  rL LLVM

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7755-7770
@@ +7754,18 @@
+                                            SelectionDAG &DAG) {
+  const MVT::SimpleValueType ShiftMapping[][2] = {
+    // SSE2
+    { MVT::v4i32, MVT::v2i64 },
+    { MVT::v8i16, MVT::v4i32 },
+    { MVT::v8i16, MVT::v2i64 },
+    { MVT::v16i8, MVT::v8i16 },
+    { MVT::v16i8, MVT::v4i32 },
+    { MVT::v16i8, MVT::v2i64 },
+    // AVX2
+    { MVT::v8i32, MVT::v4i64 },
+    { MVT::v16i16, MVT::v8i32 },
+    { MVT::v16i16, MVT::v4i64 },
+    { MVT::v32i8, MVT::v16i16 },
+    { MVT::v32i8, MVT::v8i32 },
+    { MVT::v32i8, MVT::v4i64 }
+  };
+
----------------
Is this clang-formatted? I would have expected a slightly different layout, but  I  don't want to quibble with  whatever clang-format chooses.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7772-7783
@@ +7771,14 @@
+
+  auto isSequential = [](int Base, int StartIndex, int EndIndex, int IndexOffset,
+    int MaskOffset, ArrayRef<int> Mask) {
+    StartIndex += IndexOffset;
+    EndIndex += IndexOffset;
+    for (int i = StartIndex; i < EndIndex; i++) {
+      if (Mask[i] < 0)
+        continue;
+      if (i + Base != Mask[i] - MaskOffset)
+        return false;
+    }
+    return true;
+  };
+
----------------
Have you checkde whether we already have such a  predicate? I thought we did, but maybe this is sligthly different. Either  way, a comment clarifying the exact thing it is  checking would be really helpful I think. At the very least, the argument set is quite surprising.

Also, for lambdas, I've been consistently naming them as variables, and so IsSequential.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7787
@@ +7786,3 @@
+
+  for (unsigned m = 0, e = array_lengthof(ShiftMapping); m != e; m++) {
+    if (VT.SimpleTy == ShiftMapping[m][0]) {
----------------
Does a range based for loop not work here?

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7788
@@ +7787,3 @@
+  for (unsigned m = 0, e = array_lengthof(ShiftMapping); m != e; m++) {
+    if (VT.SimpleTy == ShiftMapping[m][0]) {
+      MVT ShiftVT = MVT(ShiftMapping[m][1]);
----------------
Please use early exit patterns to reduce indentation, and test the inverse and continue.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7790-7791
@@ +7789,4 @@
+      MVT ShiftVT = MVT(ShiftMapping[m][1]);
+      unsigned ShiftSize = ShiftVT.getVectorNumElements();
+      unsigned ShiftScale = VT.getVectorNumElements() / ShiftSize;
+
----------------
I would prefer to just use integers to represent numbers unless you explicitly want modular arithmetic (here and throughout this patch)

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7793
@@ +7792,3 @@
+
+      for (unsigned Shift = 1; Shift != ShiftScale; Shift++) {
+        unsigned ShiftAmt = Shift * VT.getScalarSizeInBits();
----------------
I think it would be really helpful to actually comment on the algorithm you're using to search for bit-shift equivalent shuffle masks. It makes it a bit easier to check that the code is actually doing what is intended.

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7807-7808
@@ +7806,4 @@
+        if (ZeroableRight) {
+          bool ValidShiftRight1 = true;
+          bool ValidShiftRight2 = true;
+
----------------
Rather than use variables, maybe sink this into the predicate function so that you can early-exit from the loops?

================
Comment at: lib/Target/X86/X86ISelLowering.cpp:7812
@@ +7811,3 @@
+            ValidShiftRight1 &= isSequential(Shift, 0, ShiftScale - Shift, i*ShiftScale, 0, Mask);
+            ValidShiftRight2 &= isSequential(Shift, 0, ShiftScale - Shift, i*ShiftScale, Mask.size(), Mask);
+          }
----------------
80-columns.

http://reviews.llvm.org/D6649

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/