[PATCH] D147964: [VPlan] Introduce new entry block to VPlan for early SCEV expansion.
Ayal Zaks via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Apr 24 13:17:50 PDT 2023
Ayal added inline comments.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:614
+
+ void setTripCount(Value *TC) { TripCount = TC; }
----------------
nit: can do w/o setTripCount()? TripCount should conceptually be fixed before/when building VPlan and remain invariant when transforming it.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:3288
- [ ] <-- loop iteration number check.
- / |
+ [ ] <-- loop iteration number check and SCEVs in Plans preheader are
+ / | expanded here.
----------------
nit: add its name "(old) preheader"? Note another block called "new preheader" below.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7705
+ // before making any changes to the CFG.
+ VPTransformState State{BestVF, BestUF, LI, DT, ILV.Builder, &ILV, &BestVPlan};
+ State.CFG.PrevBB = OrigLoop->getLoopPreheader();
----------------
nit: State should be defined before 0, its PrevBB and insertion point set after checking if preheader is not empty right before executing it.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8748
// Create the active lane mask instruction in the vplan preheader.
- VPBasicBlock *Preheader = Plan.getEntry()->getEntryBasicBlock();
+ VPBasicBlock *Preheader =
+ cast<VPBasicBlock>(Plan.getVectorLoopRegion()->getSinglePredecessor());
----------------
nit: VecPreheader?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8903-8907
+ // Create initial VPlan skeleton, starting with a block for the pre-header
+ // which contains SCEV expansions that need to happen before the CFG is
+ // modified, followed by the vector pre-header, followed by a region for the
+ // vector loop, followed by the middle block. The skeleton vector loop region
+ // contains a header and latch block.
----------------
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8917
+ VPBasicBlock *VecPreheader = new VPBasicBlock("vector.ph");
+ auto Plan = std::make_unique<VPlan>(VecPreheader, Preheader, TripCount);
----------------
nit: perhaps Preheader should appear before VecPreheader, and maybe TripCount before both?
nit: would be good to outline this low-level part if reasonable as method is exceeding 250 LOC.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9107
// the vectorization pipeline.
assert(!OrigLoop->isInnermost());
assert(EnableVPlanNativePath && "VPlan-native path is not enabled.");
----------------
nit (independent of this patch): assert should have an error message.
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:9110
// Create new empty VPlan
auto Plan = std::make_unique<VPlan>();
----------------
nit: first create Trip Count, then Preheader, then define Plan given both - which should ideally be set at VPlan construction time w/o being (re)set later. HCFGBuilder will then set VecPreheader.
Can share outlined code with tryToBuildVPlanWithVPRecipes()?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:10432
+ // When vectorizing the epilogue loop, re-use the expanded values from
+ // the main vector loop.
+ for (auto &R : make_early_inc_range(*BestEpiPlan.getPreheader()))
----------------
This erasure empties the preheader thereby preventing re-execution of its recipes, and is probably responsible for checking if preheader is empty upon execution.
How do the users of the erased TripCount ExpandSCEV recipe get to re-use the (already) expanded values from the main vector loop, now that the recipe-VPValue feeding them is being deleted?
================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:8925
+ VPBasicBlock *VecPreheader = new VPBasicBlock("vector.ph");
+ VPBlockUtils::insertBlockAfter(VecPreheader, Preheader);
----------------
fhahn wrote:
> Ayal wrote:
> > Should Preheader be detached from VecPreheader (and any other block), given that they represent disparate blocks and also get executed separately? Retaining the latter as Plan's entryBlock, keeping the former as Plan's (currently) isolated block for this and all other ExpandSCEV recipes.
> > Perhaps worth verifying that all ExpandSCEV recipes reside in this unique Preheader VPBB.
> I updated the code to have the preheader separate (pre-header before skeleton creation). I *think* at the moment the skeleton creation code leaves the CFG in a valid state, so for now SCEV expansions only needed by the vector body can remain where they are at the moment.
Better place all SCEV expansions at preheader to ensure their correct generation before any CFG change, and potential reuse by epilog vector loop? It may incur an overhead when the vector loop is bypassed - but some subsequent pass may sink the expanded code into VecPreheader - if not reused by epilog vector loop. Otherwise how to set the insertion block of ExpandSCEV recipes needs to be explained.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.cpp:596
for (VPValue *VPV : VPLiveInsToFree)
delete VPV;
if (BackedgeTakenCount)
----------------
Where is TripCount deleted?
Where is Preheader deleted?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:2220
VPBasicBlock *Entry;
+ VPBasicBlock *Preheader;
----------------
/// ...
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:2264
public:
+ VPlan(VPBasicBlock *Entry = nullptr, VPBasicBlock *Preheader = nullptr,
----------------
/// ... \p Entry ..
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:2267
+ VPValue *TripCount = nullptr)
+ : Entry(Entry), Preheader(Preheader), TripCount(TripCount) {
if (Entry)
----------------
nits: Entry >> VecPreheader? Should Preheader and TripCount drop their defaults and asserted to be non-null?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:2293
+ void setPreheader(VPBasicBlock *PH) {
+ Preheader = PH;
----------------
nit: can be dropped? Is Preheader expected to change?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:2299
/// The trip count of the original loop.
- VPValue *getOrCreateTripCount() {
- if (!TripCount)
- TripCount = new VPValue();
+ VPValue *getTripCount() {
+ assert(TripCount && "trip count needs to be set before accessing it");
----------------
nit: can add a const version, is it useful now?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:2304
+ void setTripCount(VPValue *V) {
+ assert(!TripCount);
----------------
nit: can be dropped? Is TripCount expected to change?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlan.h:2763
/// pre-header already contains a recipe expanding \p Expr, return it. If not,
/// create a new one.
+VPValue *getOrCreateVPValueForSCEVExpr(VPlan &Plan, VPBasicBlock *Block,
----------------
/// <How does one decide what \p Block should be>
In all cases, it is set to Plan.getEntry(), because ExpandSCEV for TripCount (placed in Preheader) is created directly rather through this API.
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:34
ReversePostOrderTraversal<VPBlockDeepTraversalWrapper<VPBlockBase *>> RPOT(
- Plan->getEntry());
+ Plan->getEntry()->getSingleSuccessor());
for (VPBasicBlock *VPBB : VPBlockUtils::blocksOnly<VPBasicBlock>(RPOT)) {
----------------
Should this still be (Plan->getEntry())?
================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:400
+ if (PredVPBB && PredVPBB->getNumSuccessors() == 1 &&
+ PredVPBB != Plan.getEntry())
WorkList.push_back(VPBB);
----------------
Is this additional check needed - can PredVPBB be Plan.getEntry()?
================
Comment at: llvm/test/Transforms/LoopVectorize/AArch64/epilog-vectorization-widen-inductions.ll:161
; CHECK: vec.epilog.vector.body:
-; CHECK-NEXT: [[OFFSET_IDX:%.*]] = phi i64 [ [[VEC_EPILOG_RESUME_VAL]], [[VEC_EPILOG_PH]] ], [ [[INDEX_NEXT11:%.*]], [[VEC_EPILOG_VECTOR_BODY]] ]
+; CHECK-NEXT: [[INDEX7:%.*]] = phi i64 [ [[VEC_EPILOG_RESUME_VAL]], [[VEC_EPILOG_PH]] ], [ [[INDEX_NEXT11:%.*]], [[VEC_EPILOG_VECTOR_BODY]] ]
; CHECK-NEXT: [[VEC_IND8:%.*]] = phi <2 x i64> [ [[INDUCTION]], [[VEC_EPILOG_PH]] ], [ [[VEC_IND_NEXT10:%.*]], [[VEC_EPILOG_VECTOR_BODY]] ]
----------------
nit: are these OFFSET_IDX >> INDEX# changes needed?
================
Comment at: llvm/test/Transforms/LoopVectorize/X86/gather_scatter.ll:866
; AVX512-NEXT: [[TMP8:%.*]] = add nuw nsw i64 [[TMP7]], 8
-; AVX512-NEXT: [[UGLYGEP:%.*]] = getelementptr i8, ptr [[DEST:%.*]], i64 [[TMP8]]
+; AVX512-NEXT: [[SCEVGEP:%.*]] = getelementptr i8, ptr [[DEST:%.*]], i64 [[TMP8]]
; AVX512-NEXT: [[TMP9:%.*]] = shl nuw i64 [[TMP6]], 2
----------------
nit: are these UGLY >> SCEV changes needed?
================
Comment at: llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll:900
; UNROLL-NO-IC-NEXT: [[E_015:%.*]] = phi i32 [ poison, [[ENTRY]] ], [ [[E_1_LCSSA:%.*]], [[FOR_COND_CLEANUP3]] ]
-; UNROLL-NO-IC-NEXT: [[TMP0:%.*]] = add i32 [[INDVAR]], 1
-; UNROLL-NO-IC-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[TMP0]], 8
+; UNROLL-NO-IC-NEXT: [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[I_016]], 8
; UNROLL-NO-IC-NEXT: br i1 [[MIN_ITERS_CHECK]], label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
----------------
Nice LSR-type phi redundancy elimination as a by-product?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D147964/new/
https://reviews.llvm.org/D147964
More information about the llvm-commits
mailing list