[PATCH] D113223: [VPlan] Add VPCanonicalIVRecipe, partly retire createInductionVariable.

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Dec 27 10:15:46 PST 2021


Ayal added inline comments.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:1066
 /// Return a value for Step multiplied by VF.
-static Value *createStepForVF(IRBuilder<> &B, Type *Ty, ElementCount VF,
-                              int64_t Step) {
+Value *createStepForVF(IRBuilder<> &B, Type *Ty, ElementCount VF,
+                       int64_t Step) {
----------------
(More formally, "getOrCreateStepForVF()", referring to scalable or non-scalable. Indep. of this patch.)


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3008
 
   IRBuilder<> B(&*Header->getFirstInsertionPt());
   Instruction *OldInst = getDebugLocFromInstOrOperands(OldInduction);
----------------
Initialize B to Latch->getTerminator(), instead of setting it there below after initializing it to first insertion point of Header?


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:4531
+  auto *IVR = Plan->getCanonicalIV();
+  PHINode *CanonicalIV = cast<PHINode>(State.get(IVR, VPIteration(0, 0)));
+
----------------
`State.get(IVR, 0)`, per-part 0?


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7936
+  // When vectorizing the epilogue loop, the canonical induction start value
+  // needs to be changed to the value after the main vector loop.
+  if (CanonicalIVStartValue) {
----------------
"... changed [from zero] to ..."


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8928
+// to the header block.
+static void addCanonicalIV(VPlan &Plan, Type *IdxTy, DebugLoc DL, bool HasNUW,
+                           bool IsVPlanNative) {
----------------
"addCanonicalIV" >> "addCanonicalIVRecipe"?


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8946
+
+  VPBasicBlock *EB = TopRegion->getExit()->getExitBasicBlock();
+  if (IsVPlanNative)
----------------
Can do `TopRegion->getExitBasicBlock()` directly, similar to TopRegion->getEntryBasicBlock() above.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:1308
+void VPCanonicalIVPHIRecipe::execute(VPTransformState &State) {
+  Value *Start = getOperand(0)->getLiveInIRValue();
+  PHINode *EntryPart = PHINode::Create(
----------------
getOperand(0) >> getStartValue()?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:919
+      auto *ICmp = cast<Instruction>(
+          State->Builder.CreateICmpEQ(Next, State->VectorTripCount));
+      TermBr->setCondition(ICmp);
----------------
fhahn wrote:
> Ayal wrote:
> > Introducing the ICmpEQ feeding TermBr should ideally be done by an appropriate (BCT) recipe in the normal course of VPlan::execute() stage 2 above, rather than during this backedge-rewiring stage 3.
> I added a TODO and this should be addressed soon in a follow-up patch. Or should those changes also be pulled in here?
> I added a TODO and this should be addressed soon in a follow-up patch. Or should those changes also be pulled in here?

As you prefer. A BranchOnCount recipe could potentially take care of the Increment[NUW]-feeding-the-ICmpEQ-feeding-the-branch better, while defining the single VPValue to feed back the CanonicalIVPHI. It could be introduced here instead of Increment[NUW] et al., or replace them in the TODO follow-up patch.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:933
+                       ? State->get(R.getVPSingleValue(), VPIteration(Part, 0))
+                       : State->get(PhiR, Part);
+      Value *Val =
----------------
fhahn wrote:
> Ayal wrote:
> > Storing per-part 0 rather than per-lane 0 and common base class should lead to a common State->get(PhiR, Part) for all, so the loop becomes
> > 
> > ```
> >     for (unsigned Part = 0; Part < NumPartsForNewPhi; ++Part) {
> >       Value *Phi = State->get(PhiR, Part);
> >       Value *Val = State->get(BackedgeValue, IsSinglePart ? SinglePart : Part);
> >       cast<PHINode>(Phi)->addIncoming(Val, VectorLatchBB);
> >     }
> > ```
> Storing a scalar value in the 'vector' part of State seems not ideal, but it does indeed simplify things a lot here. Updated
That is how uniform (/UAV) scalar values are stored in general, namely, in the 'vector' part of State.
The scalar canonical IV is conceptually UAV, only its lane 0 is used.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:824
 void VPlan::execute(VPTransformState *State) {
   // -1. Check if the backedge taken count is needed, and if so build it.
   if (BackedgeTakenCount && BackedgeTakenCount->getNumUsers()) {
----------------
fhahn wrote:
> Ayal wrote:
> > Should all invariant values that are generated in the pre-header be modelled as VPValues similar to BackedgeTakenCount, until the preheader is modelled as VPBB with recipes? I.e., treat VectorTripCount, [Canonical]IVStartValue and "CanonicalIVStepValue" (returned by createStepForVF()) similar to VPlan's BackedgeTakenCount, which is hooked-up to its State->TripCount IR Value here?
> I am not sure if it is worth it at the moment, as long as they are only used by a single recipe. Specifically, the start value is mostly a constant (0), the step is currently only used by the induction increment and the vector trip count is currently only used in post-processing. VectorTripCount is moved to a dedicated VPValue in the patch that introduces recipes for the exit condition. The step value could be moved out as follow up. WDYT? 
> I am not sure if it is worth it at the moment, as long as they are only used by a single recipe. Specifically, the start value is mostly a constant (0), the step is currently only used by the induction increment and the vector trip count is currently only used in post-processing. VectorTripCount is moved to a dedicated VPValue in the patch that introduces recipes for the exit condition. The step value could be moved out as follow up. WDYT?

Perhaps instead of passing everything via one VPTransformState into VPlan::execute(), it would be better to separate and first have VPlan initialize all its live-in/pre-header VPValues, and then dedicate VPTransformState to hold only general context required during code-gen?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:1099
            V->getVPValueID() == VPValue::VPVReductionPHISC;
   }
 
----------------
Keep above in Lex order?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:2297
+
+  VPCanonicalIVRecipe *getCanonicalIV() {
+    VPBasicBlock *EntryVPBB = getEntry()->getEntryBasicBlock();
----------------
Ayal wrote:
> Start by calling getVectorLoopRegion()?
> 
> Associate getCanonicalIV() with a (non replicating) Region, instead of 'entire' VPlan?
> Cache a pointer to VPCanonicalIVRecipe in Region?
> (IV's of replicating regions are eliminated by complete unrolling.)
Can add to VPlanVerifier a check that first recipe only is VPCanonicalIVPHIRecipe?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113223/new/

https://reviews.llvm.org/D113223



More information about the llvm-commits mailing list