[PATCH] D32871: [LV] Using VPlan to model the vectorized code and drive its transformation

Fri Jun 9 18:21:23 PDT 2017

mkuper added inline comments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2265
+    while (!VPlans.empty()) {
+      VPlan *Plan = VPlans.back();
+      VPlans.pop_back();
----------------
Ayal wrote:
> rengolin wrote:
> > SmallVectorImpl should destroy its own range, you don't need to do it yourself.
> Leaving ~LoopVectorizationPlanner() {} does not seem to deallocate the VPlan objects created by buildVPlan(). Is there another way to prevent them from leaking?
A vector of unique_ptr, maybe?
Anyway, I'm fine with this as is.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:1953
   bool isProfitableToScalarize(Instruction *I, unsigned VF) const {
+    // Unroller also calls this method, but does not collectInstsToScalarize.
+    if (VF == 1)
----------------
Why?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2170

+public:
   /// Collects the instructions to scalarize for each predicated instruction in
----------------
As above, please group public/private members together.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:2324
+  /// returned value holds for the entire \p Range.
+  bool testVFRange(const std::function<bool(unsigned)> &Predicate,
+                   VFRange &Range);
----------------
This seems like a somewhat odd API.
Why do we only support cutting from the top, and not from the bottom or the middle? Do we expect to only ever prune from the top? Otherwise, I'd expect the range to be represented as a set, rather than an interval, and use a filter over that set. The current state may be fine for simplicity's sake, but i'd like to understand this better.

Regardless, please rename the method. It's really surprising that a bool test...() modifies one of its arguments.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4687
   case InductionDescriptor::IK_FpInduction:
-    return widenIntOrFpInduction(P);
+    llvm_unreachable("Integer/fp induction is handled elsewhere.");
   case InductionDescriptor::IK_PtrInduction: {
----------------
Why the split between how we handle int and ptr inductions? To minimize the patch?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:4735

 void InnerLoopVectorizer::vectorizeInstruction(Instruction &I) {
   switch (I.getOpcode()) {
----------------
Perhaps this should now be named widenInstruction?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:7634
+
+  for (auto *VPlanIter = VPlans.begin(); VPlanIter != VPlans.end();) {
+    VPlan *Plan = *VPlanIter;
----------------
I suppose this (and the rest of this code) is constructed this way because we intend to support more than one VPlan per VF/UF  pair soon?

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:7750

+namespace {
+/// VPWidenRecipe is a recipe for producing a copy of vector type for each
----------------
Maybe break all of the recipe stuff out into a separate file? Two files, even? (header/implementation)

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:8242
+  for (BasicBlock *BB : make_range(DFS.beginRPO(), DFS.endRPO())) {
+    // Relevent instructions from basic block BB will be grouped into VPRecipe
+    // ingredients and fill a new VPBasicBlock.
----------------
Relevent -> Relevant

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:8250
+
+    for (Instruction &I : *BB) {
+      Instruction *Instr = &I;
----------------
Please add a comment somewhere explaining that the order in which we try the recipes matters. (e.g. tryToInterleave needs to come first.)
I'm not sure how we should be handling recipe priority in general, if at all, but that's not for this review.

================
Comment at: lib/Transforms/Vectorize/LoopVectorize.cpp:8282
+          VPBB->appendRecipe(Recipe);
+        LastWidenRecipe = cast<VPWidenRecipe>(Recipe);
+        continue;
----------------
Shouldn't we be resetting LastWidenRecipe somewhere, if we encountered an instruction that uses a different recipe?

================
Comment at: lib/Transforms/Vectorize/VPlan.h:222
+
+  /// \return the closest ancestor starting from "this", which has successors.
+  /// Returns the root ancestor if all ancestors have no successors.
----------------
Ayal wrote:
> mkuper wrote:
> > The terminology here is somewhat confusing, since it's not immediately clear what the difference between an "ancestor" and a "predecessor" is. Maybe "enclosing region" instead of ancestor?
> Added the following definition to clarify the terminology:
>   /// An Ancestor of a block B is any block containing B, including B itself.
> So it's either an "enclosing region" or the block itself, which in turn may be a region or a basic-block.
I would strongly prefer less confusing terminology (I think ancestor very strongly evokes "direct or indirect predecessor"), but if I'm the only one getting confused by this, feel free to keep it.

================
Comment at: test/Transforms/LoopVectorize/if-pred-non-void.ll:222
 ; CHECK:   %[[T00:.+]] = extractelement <2 x i32> %wide.load, i32 0
-; CHECK:   %[[T01:.+]] = extractelement <2 x i32> %wide.load, i32 0
-; CHECK:   %[[T02:.+]] = add nsw i32 %[[T01]], %x
-; CHECK:   %[[T03:.+]] = udiv i32 %[[T00]], %[[T02]]
+; CHECK:   %[[T01:.+]] = add nsw i32 %[[T00]], %x
+; CHECK:   %[[T02:.+]] = extractelement <2 x i32> %wide.load, i32 0
----------------
The test changes look benign, but I'm curious about why we have them.

https://reviews.llvm.org/D32871