[PATCH] D135017: [LV] Move exit cond simplification to separate transform.

Ayal Zaks via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Dec 20 02:05:35 PST 2022


Ayal accepted this revision.
Ayal added a comment.
This revision is now accepted and ready to land.

Looks good to me, adding various nits.

In the summary, perhaps indicate that the optimization is moved earlier to take place before creating the skeleton, as soon as the UF is available.



================
Comment at: llvm/lib/Transforms/Vectorize/LoopVectorize.cpp:7623
+  if (!IsEpilogueVectorization)
+    VPlanTransforms::optimizeForConcreteVFAndUF(BestVPlan, BestVF, BestUF, PSE);
+
----------------
fhahn wrote:
> Ayal wrote:
> > nit: can drop "Concrete".
> > 
> > Restrict the VF range of BestPlan to include only BestVF?
> > 
> > When printing BestPlan would be good to now indicate that it applies to BestUF only.
> > nit: can drop "Concrete".
> 
> Done!
> 
> > Restrict the VF range of BestPlan to include only BestVF?
> 
> Done!
> 
> > When printing BestPlan would be good to now indicate that it applies to BestUF only.
> 
> Should I add a field for that?
>> When printing BestPlan would be good to now indicate that it applies to BestUF only.
>
> Should I add a field for that?

Probably, to then print it instead of "UF >= 1". Can also add setUF(), and an assert when executing (the epilog loop) that the plan used works for the current UF.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:471
+  auto *Term = dyn_cast<VPInstruction>(&ExitingVPBB->back());
+  // Try to simplify the branch condition if BTC < VF * UF when preparing to
+  // execute the plan for the main vector loop. We only do this if the
----------------
So we retain "if TC <= VF * UF" ?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:473
+  // execute the plan for the main vector loop. We only do this if the
+  // terminator is:
+  //  1. BranchOnCount, or
----------------
nit: can fold this comment and logic into "canSimplifyLoopBranch(Term)" (independent of this patch).


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:483
+      Plan.getCanonicalIV()->getStartValue()->getLiveInIRValue()->getType();
+  const SCEV *ExitCount = createTripCountSCEV(IdxTy, PSE);
+  auto *C = dyn_cast<SCEVConstant>(ExitCount);
----------------
nit: ExitCount >> TripCount ?

nit: where is createTripCountSCEV defined? Should VPlan or its loop Region provide it, similar to providing the canonical IV, or even have the canonical IV recipe provide it? When optimizing the loop branch recipe, one can also optimize the canonical IV recipe...


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:486
+  ScalarEvolution &SE = *PSE.getSE();
+  if (!C || ExitCount->isZero() ||
+      C->getAPInt().getZExtValue() > BestVF.getKnownMinValue() * BestUF)
----------------
nit: is the isZero() check needed?


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp:474
+      new VPInstruction(VPInstruction::BranchOnCond,
+                        {Plan.getOrAddExternalDef(ConstantInt::getTrue(Ctx))});
+  Term->eraseFromParent();
----------------
fhahn wrote:
> Ayal wrote:
> > Worth an unconditional branch VPInstruction instead of the effort to generate a redundant True? Can be fixed independent of this patch, which is working a bit harder to generate it.
> It should probably be sufficient to just set a single successor. We will have to also replace & remove the header phis.
OK, right, there should be no need for an unconditional branch VPInstruction, once CFG is straightened, as in the 2nd TODO.


================
Comment at: llvm/lib/Transforms/Vectorize/VPlanTransforms.h:67
+
+  /// Optimize \p Plan based on \p BestVF and \p BestUF. This may restrict the
+  /// resulting plan to \p BestVF and \p BestUF.
----------------
nit: may restrict >> restricts ?


================
Comment at: llvm/test/Transforms/LoopVectorize/lcssa-crashes.ll:16
 ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
-; CHECK-NEXT:    br i1 true, label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK-NEXT:    [[TMP1:%.*]] = icmp eq i64 [[INDEX_NEXT]], 0
+; CHECK-NEXT:    br i1 [[TMP1]], label [[MIDDLE_BLOCK:%.*]], label [[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
----------------
fhahn wrote:
> Ayal wrote:
> > Is this a regression from previously folding the conditional branch?
> The test case is spurious, the vector body will never be executed, as the exit condition compares to undef.
nit: worth fixing the test somehow, to clarify that no folding is lost here?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135017/new/

https://reviews.llvm.org/D135017



More information about the llvm-commits mailing list