[PATCH] D147964: [VPlan] Introduce new entry block to VPlan for early SCEV expansion.

Florian Hahn via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 3 03:52:16 PDT 2023


fhahn added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:571
+  /// Used to set the trip count after ILV's construction and after the
+  /// preheader block has been executed.
+  void setTripCount(Value *TC) { TripCount = TC; }
----------------
Ayal wrote:
> nit: also worth clarifying that the same trip count - that of the original loop - is (generated once and) set for both ILV's of main and epilog loops?
added, thanks!


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3260
+       [ ] <-- old preheader - loop iteration number check and SCEVs in Plans
+    /   |       preheader are expanded here.
    /    v
----------------
Ayal wrote:
> nit: "SCEVs in Plans [preheader] are expanded here" >> "all SCEV expansions needed by vectorized loops"?
Added an extra sentence that eventually all SCEV expansion should happen there, as this is not the case yet.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3360
       VPValue *StepVPV = Plan.getSCEVExpansion(II.getStep());
-      Value *Step = StepVPV->getDefiningRecipe() ? State.get(StepVPV, 0)
+      Value *Step = StepVPV->getDefiningRecipe() ? State.get(StepVPV, {0, 0})
                                                  : StepVPV->getLiveInIRValue();
----------------
Ayal wrote:
> Curious about this needed change in handling invariant "live-ins"?
> > As a consequence some places needed to be updated to specifically requests Part & Lane 0 to handle live-ins properly in VPTransformState::get.
VPTransformState::get with part only for a live-in will try to create a vector broadcast of the live-in because the part-only `get` in general produces vectors as result. Here we need the scalar value, and requesting a part 0/lane 0 allows us to do that with the current API.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7692
+    State.Builder.SetInsertPoint(OrigLoop->getLoopPreheader()->getTerminator());
+    BestVPlan.getPreheader()->execute(&State);
+  }
----------------
Ayal wrote:
> How/do we prevent expanding trip count SCEV again needlessly when vectorizing epilog loop?
> 
> Other SCEV expands in preheader are connected directly to distinct VPUsers so need be expanded even if common to both main and epilog loop, hopefully cse'd later?
> How/do we prevent expanding trip count SCEV again needlessly when vectorizing epilog loop?

Updated the patch to remove the trip count expansion recipe in the epilogue plan together with an assertion that it's not used.

> Other SCEV expands in preheader are connected directly to distinct VPUsers so need be expanded even if common to both main and epilog loop, hopefully cse'd later?

Yes. We could transfer the values for the SCECV expansions from the state of the main vector loop to the state of the epilogue vector loop to re-use them without expanding them again. But if we decide to do that it would probably be better to do that separately.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7695
+  if (!ILV.getTripCount())
+    ILV.setTripCount(State.get(BestVPlan.getTripCount(), {0, 0}));
 
----------------
Ayal wrote:
> nit: worth a comment that the 'else' case refers to vectorizing the epilog loop, whose ILV borrows the trip count from that of the main loop's ILV (the latter being set here)?
Added an assert to the else case, thanks!


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8847
+      vputils::getOrCreateVPValueForSCEVExpr(*Plan, TripCount, *PSE.getSE());
+  Plan->setTripCount(TripCountVPV);
+  return Plan;
----------------
Ayal wrote:
> nit: there is a bit of chicken-and-egg when defining TripCount and Plan. Can let Plan's constructor create its own TripCount - it should be immutable/unsettable.
I updated the patch to make  createInitialVPlan a static function of VPlan and removed the setter.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9825
   if (!hasScalarValue(Def, {Part, LastLane})) {
     // At the moment, VPWidenIntOrFpInductionRecipes and VPScalarIVStepsRecipes can also be uniform.
     assert((isa<VPWidenIntOrFpInductionRecipe>(Def->getDefiningRecipe()) ||
----------------
Ayal wrote:
> nits: "... and VPExpandSCEVRecipes". "can" refers only to WidenIntOrFpInductions?
Added, thanks!


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10415
 
+        EpilogILV.setTripCount(MainILV.getTripCount());
+
----------------
Ayal wrote:
> nit: worth commenting about setting trip count explicitly here/now.
Added a comment here together with removing the expand-recipe if there's one to avoid expanding the SCEV again. + Assert that it is not used in the plan.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:1109
+    for (VPValue *Def : Recipe.definedValues())
+      assignSlot(Def);
+
----------------
Ayal wrote:
> nit: have assignSlots(VPBB)
Updated, thanks!


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:1131-1133
   if (auto *Expanded = Plan.getSCEVExpansion(Expr))
     return Expanded;
   VPValue *Expanded = nullptr;
----------------
Ayal wrote:
> nit: where were the caching get/addSCEVExpansion() added? (Independent of this patch).
The caching has been added in this pending patch: D147963. But if the current patch gets approved first, it will be easy to apply it without those changes.

Will address those separate.y


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:1139-1140
   else {
-    VPBasicBlock *Preheader = Plan.getEntry();
     Expanded = new VPExpandSCEVRecipe(Expr, SE);
-    Preheader->appendRecipe(Expanded->getDefiningRecipe());
+    Plan.getPreheader()->appendRecipe(Expanded->getDefiningRecipe());
   }
----------------
Ayal wrote:
> nit: can cast instead of getting defined recipe.
Will address separately.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:2222
+  /// VPBasicBlock corresponding to the original preheader. Used to place
+  /// VPExpandSCEV recipes for expressions used during skeleton creation.
+  VPBasicBlock *Preheader;
----------------
Ayal wrote:
> nit: "... during skeleton creation and (rest of) VPlan execution."
Extended, thanks!


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:2271
+  /// Construct a VPlan with original preheader \p Preheader and  \p Entry to
+  /// the plan.
+  VPlan(VPBasicBlock *Preheader, VPBasicBlock *Entry)
----------------
Ayal wrote:
> nit: can add that (currently) they are disconnected because bypass blocks between them are not (yet) modelled in VPlan.
added + assert that preheader is disconnected.


================
Comment at: llvm/test/Transforms/LoopVectorize/first-order-recurrence-sink-replicate-region.ll:16
+; CHECK-NEXT: ph:
+; CHECK-NEXT:   EMIT vp<%0> = EXPAND SCEV (1 smax (1 + (sext i8 %y to i32))<nsw>)
+; CHECK-NEXT: No successors
----------------
Ayal wrote:
> Is vp<%0> used, does it match Live-in VEC_TC noted above?
It should match the scalar trip count. Previously there have been no users so the live-in wasn't printed. I updated ::print() to always print the trip count as it is now always (implicitly) used during skeleton creation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147964/new/

https://reviews.llvm.org/D147964



More information about the llvm-commits mailing list