[PATCH] D150398: [VPlan] Model branch cond to enter scalar epilogue in VPlan.

Sun Aug 27 02:56:48 PDT 2023

Ayal added inline comments.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3081
+  // FIXME: Model exit block and scalar preheader directly in VPlan to avoid
+  // creating initial branch here and fixing it up during VPlan execution.
   BranchInst *BrInst =
----------------
fhahn wrote:
> Ayal wrote:
> > Can this FIXME be addressed so that VPlan execution will take care of setting the successors of this branch, naturally?
> > 
> > In addition to moving the generation of ICmpEQ from ILV::completeLoopSkeleton() to a VPInstruction introduced in VPlan::createInitialVPlan(), can the generation of LoopMiddleBlock and LoopScalarPreHeader block move from ILV::createVectorLoopSkeleton() to VPlan? Can both be applied as a VPlan transform, perhaps alongside optimizeForVFAndUF() - knowing VF*UF may help determine if it divides TC and thus middle block should jump to exit unconditionally (or have loop latch jump to exit directly). ILV will then detect the middle block and scalar loop pre-header built by VPlan and set its LoopMiddleBlock and LoopScalarPreHeader variables.
> Yes I think so, but it will possibly require some extra logic to update the dominator tree. In terms of ordering of changes, I think we would need to first move out the branch to be modeled in VPlan, before moving the block creating itself.
Moving out the branch first, requires a distinct way of hooking up the successors. Moving out the blocks first - even if disconnected at first - as in the pre-pre-header case, could potentially be done initially while keeping the branch creation outside VPlan - as a post-processing, followed by moving out the branch to a recipe - which would be simpler having prepared its successors?

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7659
   if (!BestVPlan.getPreheader()->empty()) {
     State.CFG.PrevBB = OrigLoop->getLoopPreheader();
     State.Builder.SetInsertPoint(OrigLoop->getLoopPreheader()->getTerminator());
----------------
fhahn wrote:
> Ayal wrote:
> > A VPBB which is invariantly associated with an existing IRBB should perhaps record this association when created, rather than via State.CFG.PrevBB/ExitBB. This includes the (pre-)pre-header, exit block, and potentially original loop blocks from header to latch. Such VPBB's facilitate hooking VPlan's boundary to existing IR, injecting recipes into existing blocks, reusing original loop for default and/or remainder loops, and as a basis to clone and optimize it. Such VPBB's may contain an "in-place" recipe that represents the existing IR Instructions (w/o generating them/anything during execute()), to model their live-ins and live-outs, and as a placeholder for injecting other recipes before/after. The existence of such a recipe implies that its VPBB is "in-place", so the associated IRBB could be retrieved from it.
> PrevBB is used during execution to figure out if blocks can be re-used (mostly around replicate regions now we have VPlan based merging). 
> 
> For the others, recording the association explicitly sounds good, but maybe even better to try to replace their uses by moving the logic to VPlan explicitly (e.g how to connect them). 
> 
> An `in-place` recipe could also help gradually moving parts of the skeleton into VPlan.  IIUC this is mostly a. suggestion for future improvements correct?
> PrevBB is used during execution to figure out if blocks can be re-used (mostly around replicate regions now we have VPlan based merging).

Agreed, and once replicate regions get unrolled as a VPlan-to-VPlan transform, their entry and exit VPBB's should be merged in VPlan as well.

> For the others, recording the association explicitly sounds good, but maybe even better to try to replace their uses by moving the logic to VPlan explicitly (e.g how to connect them).

Hmm, intrigued by an even better option - when a VPBB executes it needs to figure out whether to create a new IRBB for its recipes or whether the latter are to populate an existing IRBB?

> An in-place recipe could also help gradually moving parts of the skeleton into VPlan. IIUC this is mostly a. suggestion for future improvements correct?

Yes, possibly. Raised here to consider various pieces in order to help converge on an appropriate order of introducing them.

================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8876
+  //    Thus if tail is to be folded, we know we don't need to run the
+  //    remainder and we can use the previous value for the condition (true).
+  // 3) Otherwise, construct a runtime check.
----------------
fhahn wrote:
> Ayal wrote:
> > Here 2) conceptually includes both "tail is to be folded" and "there is no tail", i.e., when VF*UF is known to divide TC.
> > Can start with one "canonical" form, for arbitrary VF,UF and un/folded tail, and later revise VPlan when fixing VF,UF and (un)folding tail.
> > 
> > The three cases of (1) jump to scalar preheader, (2) jump to exit, (3) jump to either scalar preheader or exit, are compressed here into a single boolean RequiresScalarEpilogueCheck?
> This initially tries to just follow the existing logic, but can be broken down further in future patches.
Following existing logic first is fine. Complement the above explanation about the three options with how they relate to the boolean flag below, for now?

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:405

   // Hook up the new basic block to its predecessors.
   for (VPBlockBase *PredVPBlock : getHierarchicalPredecessors()) {
----------------
fhahn wrote:
> Ayal wrote:
> > Could the following logic of hooking up forwards branches when executing their target basic blocks be employed to backpatch the successors of the Middle block's branch targeting the exit block and/or scalar preheader? Or rather the complementary logic in VPBB::execute for IR BB's that are reused (and rewire branches to them there) instead of created here.
> Hmm, one issue is that we would need to rely on the IR to figure that out, which means we need to rely on the branch in the middle block implicitly here. The back-patching shouldn't be needed once the successors are also modeled in VPlan (and middle-block is created directly in VPlan). WDYT?
By backpatching I meant VPlan's existing mechanism of first generating the forward IR branch (before its IR successors are generated/visited) and then revisit the branch to set its destinations once the IR successors are generated/visited. Suggestion is indeed to model the successors in VPlan so that this mechanism could be employed.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:595
+  return (ParentRegion && ParentRegion->getExitingBasicBlock() == this) ||
+         (!ParentRegion && getNumSuccessors() == 0);
 }
----------------
fhahn wrote:
> Ayal wrote:
> > The term Exiting block typically means the first case only, i.e., a block inside a loop which can exit the loop (to an Exit block). The proposed change effectively checks if this VPBB has no successors?
> I think this check is currently only used to check if the block is supposed to have a conditional terminator, so maybe the function should be renamed to reflect that?
If this is the desired semantics, naming should indeed reflect it, but BranchOnCond implies a conditional terminator. Question is if a destination goes backwards - in which case it is hooked up when the branch is generated, or goes forwards - in which case it is hooked up when the destination is processed (after processing the branch). Currently BranchOnCond's that have backwards destinations are loop control branches having a backwards destination targeting the header block and a forwards destination targeting the exit/middle block.

================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:682
   State->CFG.PrevVPBB = nullptr;
   State->CFG.ExitBB = State->CFG.PrevBB->getSingleSuccessor();
   BasicBlock *VectorPreHeader = State->CFG.PrevBB;
----------------
fhahn wrote:
> Ayal wrote:
> > (Independent of this patch:) Note that this may be a confusing choice of names: CFG.ExitBB is set to "MiddleBlock", which is the **new** Exit-from-loop block, as opposed to the original Exit-from-loop-block(s), aka "Exit".
> Yes, should be updated to MiddleBlock for now? I think once the current patch is in, we could also handle exit block and scalar.ph creation in VPlan, to remove the need to create branches up front.
Or perhaps handle these blocks as VPBB's first, as discussed in other comments here.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D150398/new/

https://reviews.llvm.org/D150398