[PATCH] D150398: [VPlan] Model branch cond to enter scalar epilogue in VPlan.

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Aug 25 04:26:51 PDT 2023


Ayal added a comment.

Further thoughts how to best stage this.



================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3067
       SplitBlock(LoopMiddleBlock, LoopMiddleBlock->getTerminator(), DT, LI,
                  nullptr, Twine(Prefix) + "scalar.ph");
 
----------------
nit: LoopMiddleBlock will end with a branch to LoopScalarPreHeader, following how the latter is constructed here, so it's redundant to replace this branch below with BrInst if a scalar epilog is not required.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3081
+  // FIXME: Model exit block and scalar preheader directly in VPlan to avoid
+  // creating initial branch here and fixing it up during VPlan execution.
   BranchInst *BrInst =
----------------
Can this FIXME be addressed so that VPlan execution will take care of setting the successors of this branch, naturally?

In addition to moving the generation of ICmpEQ from ILV::completeLoopSkeleton() to a VPInstruction introduced in VPlan::createInitialVPlan(), can the generation of LoopMiddleBlock and LoopScalarPreHeader block move from ILV::createVectorLoopSkeleton() to VPlan? Can both be applied as a VPlan transform, perhaps alongside optimizeForVFAndUF() - knowing VF*UF may help determine if it divides TC and thus middle block should jump to exit unconditionally (or have loop latch jump to exit directly). ILV will then detect the middle block and scalar loop pre-header built by VPlan and set its LoopMiddleBlock and LoopScalarPreHeader variables.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7659
   if (!BestVPlan.getPreheader()->empty()) {
     State.CFG.PrevBB = OrigLoop->getLoopPreheader();
     State.Builder.SetInsertPoint(OrigLoop->getLoopPreheader()->getTerminator());
----------------
A VPBB which is invariantly associated with an existing IRBB should perhaps record this association when created, rather than via State.CFG.PrevBB/ExitBB. This includes the (pre-)pre-header, exit block, and potentially original loop blocks from header to latch. Such VPBB's facilitate hooking VPlan's boundary to existing IR, injecting recipes into existing blocks, reusing original loop for default and/or remainder loops, and as a basis to clone and optimize it. Such VPBB's may contain an "in-place" recipe that represents the existing IR Instructions (w/o generating them/anything during execute()), to model their live-ins and live-outs, and as a placeholder for injecting other recipes before/after. The existence of such a recipe implies that its VPBB is "in-place", so the associated IRBB could be retrieved from it.


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8876
+  //    Thus if tail is to be folded, we know we don't need to run the
+  //    remainder and we can use the previous value for the condition (true).
+  // 3) Otherwise, construct a runtime check.
----------------
Here 2) conceptually includes both "tail is to be folded" and "there is no tail", i.e., when VF*UF is known to divide TC.
Can start with one "canonical" form, for arbitrary VF,UF and un/folded tail, and later revise VPlan when fixing VF,UF and (un)folding tail.

The three cases of (1) jump to scalar preheader, (2) jump to exit, (3) jump to either scalar preheader or exit, are compressed here into a single boolean RequiresScalarEpilogueCheck?


================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9091
+                          ScalarLatchTerm->getDebugLoc());
+    VPBB->appendRecipe(Cmp);
+    Plan->addLiveOut(nullptr, new VPLiveOutBr(Cmp));
----------------
fhahn wrote:
> Ayal wrote:
> > This is a first population of MiddleVPBB with recipes, following its introduction in D123457, right?
> > 
> > Presumably, all instructions of MiddleBlock should eventually be generated by VPlan recipes/VPInstructions.
> > 
> > The conditional branch whose condition will be backpatched to Cmp is generated by createVectorLoopSkeleton() as part of the vectorization process, rather than being an **original Instruction** that can be wrapped with a VPLiveOut - like the LCSSA Phi's of the Exit block.
> > Could this conditional branch terminating MiddleVPBB be generated as a recipe too (w/o any further live-out users), along with moving the generation of its compare from skeleton to VPlan, rather than fixing it via LiveOutBr? I.e., extend/have a forward BranchOnCond as mentioned when introduced in D126618 to represent loop/backward branches, and similar to BranchOnMask. Following VPlan's approach of rewiring branches when visiting their successors, the two successors of MiddleBlock (Exit block and ScalarPreHeader/VectorEpilogTripCountCheck) could be wrapped in VPBB's, similar to the current wrapping of MiddleVPBB (only) to rewire the vector loop latch branch. These two successors should reuse pre-built IRBB's, similar to MiddleVPBB, which in turn can have VPlan generate its IRBB instead of createVectorLoopSkeleton().
> > 
> > Other skeletal parts, including bypass basic blocks between Preheader and VecPreheader, will also need similar forward out-of-loop conditional branch recipes, when migrated to VPlan.
> > 
> > WDYT?
> Updated to use `BranchOnCond`; as `BasicBlock` is a subclass of `Value` it is possible to wrap them in live-in VPValues
Better take care of branches as recipes along with taking care of their destinations as VPBB's, consistently, even if it's possible to smuggle IRBB destinations as underlying values of live-in VPValues.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:405
 
   // Hook up the new basic block to its predecessors.
   for (VPBlockBase *PredVPBlock : getHierarchicalPredecessors()) {
----------------
Could the following logic of hooking up forwards branches when executing their target basic blocks be employed to backpatch the successors of the Middle block's branch targeting the exit block and/or scalar preheader? Or rather the complementary logic in VPBB::execute for IR BB's that are reused (and rewire branches to them there) instead of created here.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:595
+  return (ParentRegion && ParentRegion->getExitingBasicBlock() == this) ||
+         (!ParentRegion && getNumSuccessors() == 0);
 }
----------------
The term Exiting block typically means the first case only, i.e., a block inside a loop which can exit the loop (to an Exit block). The proposed change effectively checks if this VPBB has no successors?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:1000
     ICmpULE,
+    ICmpEQ,
     SLPLoad,
----------------
nit: try to keep in lex order?

Should the various condition bit Predicates be provided by VPRecipeWithIRFlags as yet another set of flags, having a derived VPInstructions with `ICmp` and `FCmp` opcodes?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp:273
+    Value *TC = State.get(getOperand(1), Part);
+    return Builder.CreateICmpEQ(IV, TC, Name);
+  }
----------------
nit: ICmpEQ, and ICmpULE above, should name their operands `A` and `B` as in BinOp, rather than `IV` and `TC`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150398/new/

https://reviews.llvm.org/D150398



More information about the llvm-commits mailing list