[PATCH] Generation of PSAD in LoopVectorizer

Wed Apr 1 05:43:50 PDT 2015

Hi Vijender,

Thanks for doing this! I have a bunch of comments below. High-level, why are you lowering this as an intrinsic rather than an IR-level pattern? I ask because we really do want this pattern recognized in straight-line (non vectorized) code too, and that is something the DAGCombiner could do.

Cheers,

James


REPOSITORY
  rL LLVM

================
Comment at: include/llvm/Analysis/TargetTransformInfo.h:455
@@ -454,1 +454,3 @@
 
+  /// \returns the cost of SAD instruction.
+  unsigned getSADInstrCost(Type *RetTy, Type *op1,
----------------
Please spell out "SAD" = "Sum of absolute differences" = "C += abs(A - B)" in the docstring, so it is clear (not all backends will use the same mnemonic).

================
Comment at: include/llvm/CodeGen/ISDOpcodes.h:647
@@ +646,3 @@
+    /// SAD - This corresponds to a Sum of Absolute Difference(SAD) instruction.
+    /// The operands are vectors of char type with length of vector being power
+    /// of 2 and the result is a scalar integer.
----------------
"of char type"? Do you mean i8? Why only i8? Why is the (scalar part of the) input type not the same as the output type?

================
Comment at: include/llvm/IR/Intrinsics.td:596
@@ +595,3 @@
+// Calculate the Sum of Absolute Differences (SAD) of the two input vectors.
+// Only vectors of char Type are allowed.
+def int_sad  : Intrinsic<[llvm_i32_ty],
----------------
Char? Why i32? why not i64? why not i8/i16?

================
Comment at: include/llvm/IR/Intrinsics.td:597
@@ +596,3 @@
+// Only vectors of char Type are allowed.
+def int_sad  : Intrinsic<[llvm_i32_ty],
+                         [llvm_anyvector_ty, llvm_anyvector_ty],
----------------
What is the signedness of this operation? Is there a floating point equivalent? If not, why not?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:630
@@ +629,3 @@
+    SpecialPatternDescriptor(SpecialPatternKind SPKind,
+                             SpecialActionKind SAK, unsigned miVF,
+                             unsigned maVF)
----------------
Just use "minVF" here, it won't conflict as long as you reference it in the initializer list.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:642
@@ +641,3 @@
+    // this pattern. For example, for x86 the minimum VF for SAD is 8.
+    unsigned minVF;
+    // maxVF - Specifies the maximum VF for which the target supports
----------------
I don't like the assumption here that the legality of an operation is a range [min,max]. I don't think we should assume continuity in the range. Perhaps this should be a bitset of supported types instead?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:767
@@ +766,3 @@
+  /// which is part of special pattern in a basic block.
+  typedef DenseMap<Instruction *, SpecialPatternDescriptor> ActionList;
+
----------------
Shouldn't this be a multimap? I can imagine that there might be complex patterns and simple patterns, and a complex pattern may subsume a simple pattern. So an instruction could be part of two patterns (until one is selected).

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1014
@@ -942,1 +1013,3 @@
+  /// If there is no special pattern then the widest type is returned. 
+  unsigned selectBestTypeForPattern();
 
----------------
Uh, surely this needs to be just the type of the phi? If smaller types are legal, the phi should have a smaller type, right?

Your proposed algorithm falls down in this case:

    int32_t a;
    int8_t b;
    int32_t *c;
    for (...) {
      a += abs(c[i] - c[i+1]);
      b += (int8_t)c[i];
    }

Oh dear, you'll select "i8" as your type but that is illegal (it's not part of this PHI).

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1050
@@ +1049,3 @@
+  /// for a given vector width.
+  unsigned SpecialPhiCost(PHINode *p, unsigned VF);
+
----------------
I generally don't like the use of "Special". Can we not think of a more descriptive name for it, like "Pattern"?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2556
@@ -2478,2 +2555,3 @@
   case RK_IntegerOr:
+  case RK_IntegerSpecial:
     // Adding, Xoring, Oring zero to a number does not change it.
----------------
Well this isn't right - Who says the identity for all pattern based reductions is integer zero?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2582
@@ -2503,1 +2581,3 @@
       return Instruction::Add;
+    case LoopVectorizationLegality::RK_IntegerSpecial:
+      return Instruction::Add;
----------------
If it assumed that all patterns can be added together, that isn't documented anywhere. 

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3014
@@ -2876,3 +3013,3 @@
 
-    if (VF > 1) {
+    if (VF > 1 && !isSAD) {
       // VF is a power of 2 so we can emit the reduction using log2(VF) shuffles
----------------
I don't like this at all. You've introduced a new generic pattern type then binned that and special-cased for SAD. Why is SAD special? why is it different from ADD for this case?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3295
@@ -3157,2 +3294,3 @@
   // For each instruction in the old loop.
+  SmallVector<Value *, 4> args;
   for (BasicBlock::iterator it = BB->begin(), e = BB->end(); it != e; ++it) {
----------------
Args

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:3312
@@ +3311,3 @@
+          assert(0 && "Cannot be Invalid here!!");
+          break;
+        }
----------------
No default case?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4862
@@ +4861,3 @@
+          switch(SPD.Kind) {
+            case LoopVectorizationLegality::SPK_Sad:
+              specialPatternWidth = std::min(specialPatternWidth,(unsigned)8);
----------------
I think here you're special casing the SpecialPattern stuff when really it shouldn't be special-cased - this is a problem (finding the smallest valid type for an operation) that applies in many cases to most operators, not just your patterns.

I think it gives the biggest impact in ARM/MIPS world though as i8 and i16 aren't legal scalar types for us.

http://reviews.llvm.org/D8136

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/