[PATCH] D113223: [VPlan] Add VPCanonicalIVRecipe, partly retire createInductionVariable.
Ayal Zaks via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Dec 26 14:14:57 PST 2021
Ayal added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8938
+
+ auto *PrimaryInd = new VPCanonicalIVRecipe(StartV, DL);
+ VPRegionBlock *TopRegion = Plan.getVectorLoopRegion();
----------------
PrimaryInd - let's settle for CanonicalIV throughout?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3120
- // Create i+1 and fill the PHINode.
- //
- // If the tail is not folded, we know that End - Start >= Step (either
- // statically or through the minimum iteration checks). We also know that both
- // Start % Step == 0 and End % Step == 0. We exit the vector loop if %IV +
- // %Step == %End. Hence we must exit the loop before %IV + %Step unsigned
- // overflows and we can mark the induction increment as NUW.
- Value *Next = B.CreateAdd(Induction, Step, "index.next",
- /*NUW=*/!Cost->foldTailByMasking(), /*NSW=*/false);
- Induction->addIncoming(Start, L->getLoopPreheader());
- Induction->addIncoming(Next, Latch);
// Create the compare.
+ B.CreateCondBr(B.getTrue(), L->getUniqueExitBlock(), Header);
----------------
fhahn wrote:
> Ayal wrote:
> > Above comment is obsolete.
> > Create the latch terminator later along with its compare? Seems more like "replace" latch terminator than "create" it?
> > Above comment is obsolete.
>
> removed.
>
>
> > Create the latch terminator later along with its compare? Seems more like "replace" latch terminator than "create" it?
>
> Unfortunately `widenIntOrFpInduction` still creates and sets the incoming values for vector inductions. To set the incoming value from the backedge for a phi, the latch already needs to be connected to the header.
>>Create the latch terminator later along with its compare? Seems more like "replace" latch terminator than "create" it?
> Unfortunately widenIntOrFpInduction still creates and sets the incoming values for vector inductions. To set the incoming value from the backedge for a phi, the latch already needs to be connected to the header.
ok, and D113183 is where widenIntOrFpInduction will stop doing that, and join other header phi's which hook-up to their backedge values at the end of VPlan::execute.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8950
+ } else
+ cast<VPBasicBlock>(Header)->appendRecipe(PrimaryInd);
+
----------------
fhahn wrote:
> fhahn wrote:
> > Ayal wrote:
> > > In both cases PrimaryInd (CanonicalIV?) should be inserted as the first recipe of Header?
> > Yes, but unfortunately in the native path it looks like there's a pre-header block already. Perhaps we should make sure the native path also wraps the vector loop in a region.
> I looked into this and wrapping the loops in the native path into regions is a bit more tricky, because of the explicit edge from latch to header; this means the header cannot simply be used as entry block to the loop region. Once we have branch-exit instructions, we should be able to address this.
Sorry, didn't mean to modify the native path, just wondering if something like this would work:
```
if (IsVPlanNative)
Header = Header->getSingleSuccessor();
CanonicalIV->insertBefore(&*cast<VPBasicBlock>(Header)->begin());
```
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:902
+ for (VPRecipeBase &R : Header->phis()) {
+ auto *PhiR = dyn_cast<VPWidenPHIRecipe>(&R);
+ if (!isa<VPCanonicalIVRecipe>(&R) &&
----------------
Have VPCanonicalIVRecipe also inherit from VPWidenPHIRecipe, possibly renaming the latter VPHeaderPHIRecipe, so that it too could benefit from getBackedgeValue()?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:919
+ auto *ICmp = cast<Instruction>(
+ State->Builder.CreateICmpEQ(Next, State->VectorTripCount));
+ TermBr->setCondition(ICmp);
----------------
Introducing the ICmpEQ feeding TermBr should ideally be done by an appropriate (BCT) recipe in the normal course of VPlan::execute() stage 2 above, rather than during this backedge-rewiring stage 3.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:928
+ LastPartForNewPhi = SinglePartNeeded ? 1 : State->UF;
+ BackedgeValue = PhiR->getBackedgeValue();
+ }
----------------
This hopefully could fold into
```
// Is a single part fed across the backedge, or all UF parts.
IsSinglePart = isa<VPFirstOrderRecurrencePHIRecipe>(&R) ||
isa<VPCanonicalIVRecipe>(&R) ||
cast<VPReductionPHIRecipe>(&R)->isOrdered();
SinglePart = isa<VPCanonicalIVRecipe>(&R) ? 0 : State->UF-1;
NumPartsForNewPhi = IsSinglePart ? 1 : State->UF;
BackedgeValue = PhiR->getBackedgeValue();
```
Note that if a single part is fed across the backedge, it must be the last part UF-1 for ordered reductions and FOR, but only the first part 0 is set for the Canonical IV backedge value (it could also set part UF-1...).
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:933
+ ? State->get(R.getVPSingleValue(), VPIteration(Part, 0))
+ : State->get(PhiR, Part);
+ Value *Val =
----------------
Storing per-part 0 rather than per-lane 0 and common base class should lead to a common State->get(PhiR, Part) for all, so the loop becomes
```
for (unsigned Part = 0; Part < NumPartsForNewPhi; ++Part) {
Value *Phi = State->get(PhiR, Part);
Value *Val = State->get(BackedgeValue, IsSinglePart ? SinglePart : Part);
cast<PHINode>(Phi)->addIncoming(Val, VectorLatchBB);
}
```
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:1322
+ EntryPart->setDebugLoc(DL);
+ State.set(this, EntryPart, VPIteration(0, 0));
+}
----------------
fhahn wrote:
> Ayal wrote:
> > Store the Phi value per-part 0, rather than per-lane 0? That's how uniform (UAV) scalars are stored, and also how the induction increment is stored. Admittedly there should be a 'once/singleton' option in addition to per-lane and per-part.
> Done!
` State.set(this, EntryPart, Part);`
and elsewhere?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:1628
+
+ /// Generate the canonical vector induction phi of the vector loop.
+ void execute(VPTransformState &State) override;
----------------
"vector" still needs to be dropped too.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D113223/new/
https://reviews.llvm.org/D113223
More information about the llvm-commits
mailing list