[PATCH] D21620: [LV] Don't widen trivial induction variables

Wed Jun 29 13:55:29 PDT 2016

anemet added inline comments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:308-314
@@ -307,8 +307,9 @@
 
   // Perform the actual loop widening (vectorization).
   // MinimumBitWidths maps scalar integer values to the smallest bitwidth they
   // can be validly truncated to. The cost model has assumed this truncation
   // will happen when vectorizing.
   void vectorize(LoopVectorizationLegality *L,
-                 const MapVector<Instruction *, uint64_t> &MinimumBitWidths) {
+                 const MapVector<Instruction *, uint64_t> &MinimumBitWidths,
+                 SmallPtrSetImpl<const Value *> &VecValuesToIgnore) {
     MinBWs = &MinimumBitWidths;
----------------
Please comment the new argument

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:316
@@ -314,2 +315,3 @@
     MinBWs = &MinimumBitWidths;
+    ValuesNotWidened = &VecValuesToIgnore;
     Legal = L;
----------------
Since you're holding on to this in a member, it's less surprising if the reference is live throughout the lifetime of the InnerLoopVectorizer.  I think that we should pass this in the ctor.  I see no particular reason not to do this way.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:420
@@ +419,3 @@
+  /// arithmetic instead. The results of the computation are inserted into a
+  /// new vector with VF elements. Step is a value.
+  Value *getScalarizedStepVector(Value *Start, int Index, Value *Step,
----------------
\p Step, please explain at least \p Index as well.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:426
@@ +425,3 @@
+  /// arithmetic instead. The results of the computation are inserted into a
+  /// new vector with VF elements. Step is a SCEV, and we use a SCEVExpander to
+  /// create a value from it.
----------------
same

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4219-4233
@@ -4164,7 +4218,17 @@
       }
+
+      // If an induction variable is only used for counting loop iterations or
+      // calculating addresses, it shouldn't be widened. Scalarize the step
+      // vector to give InstCombine a better chance of simplifying it.
+      if (VF > 1 && ValuesNotWidened->count(P)) {
+        for (unsigned part = 0; part < UF; ++part)
+          Entry[part] = getScalarizedStepVector(V, VF * part, II.getStep(), VF);
+        return;
+      }
+
       Value *Broadcasted = getBroadcastInstrs(V);
       // After broadcasting the induction variable we need to make the vector
       // consecutive by adding 0, 1, 2, etc.
       for (unsigned part = 0; part < UF; ++part)
         Entry[part] = getStepVector(Broadcasted, VF * part, II.getStep());
     } else {
----------------
Looks like this and the code below under trunc are quite similar.  Can we factor this is out to a helper in a prequel to this patch and then add the special case in the to the helper?

================
Comment at: test/Transforms/LoopVectorize/induction.ll:70-80
@@ -68,1 +69,13 @@
 
+; Make sure we don't widen induction variables that don't have interesting
+; users inside the loop. Doing so causes us to create unnecessary phi nodes and
+; computation inside the loop.
+;
+; for (int i = 0; i < n; ++i)
+;   sum += a[i];
+;
+; IND-LABEL:    @scalarize_induction_variable(
+; IND:          vector.body:
+; IND:            %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
+; IND-NOT:        add i64 {{.*}}, 2
+; IND:            getelementptr inbounds i64, i64* %a, i64 %index
----------------
I think that the testcase from the PR I mentioned would better highlight the problem of vectorizing the induction variable.  I.e. in this case we don't actually end up with a vector induction variable but a secondary unnecessary IV.


http://reviews.llvm.org/D21620