[llvm] [LoopVectorize] Enable vectorisation of early exit loops with live-outs (PR #120567)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 19 04:32:17 PST 2024
llvmbot wrote:
@llvm/pr-subscribers-vectorizers
@llvm/pr-subscribers-llvm-transforms
Author: David Sherwood (david-arm)
<details>
<summary>Changes</summary>
This work feeds part of PR #88385, and adds support for vectorising loops with uncountable early exits and outside users of loop-defined variables.
I've added a new fixupEarlyExitIVUsers to mirror what happens in fixupIVUsers when patching up outside users of induction variables in the early exit block. We have to handle these differently for two reasons:
1. We can't work backwards from the end value in the middle block because we didn't leave at the last iteration.
2. We need to generate different IR that calculates the vector lane that triggered the exit, and hence can determine the induction value at the point we exited.
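The two points above can be modelled in scalar code roughly as follows. This is only a sketch, assuming an integer induction of the form `Start + Index * Step`. `countTrailingZeroElems` and `earlyExitIVValue` are hypothetical helpers written for illustration, not functions from the patch, which emits IR through `CreateCountTrailingZeroElems` and `emitTransformedIndex` instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: index of the first true lane, or Mask.size() if no
// lane is set. It plays the role of the count-trailing-zero-elements call.
static std::size_t countTrailingZeroElems(const std::vector<bool> &Mask) {
  for (std::size_t Lane = 0; Lane < Mask.size(); ++Lane)
    if (Mask[Lane])
      return Lane;
  return Mask.size();
}

// Scalar model of the value seen by an outside user of the induction.
// CanonicalIV is the iteration index at the start of the vector iteration
// that took the early exit; Mask is the per-lane exit condition.
static std::int64_t earlyExitIVValue(std::int64_t Start, std::int64_t Step,
                                     std::int64_t CanonicalIV,
                                     const std::vector<bool> &Mask,
                                     bool PostInc) {
  std::size_t Lane = countTrailingZeroElems(Mask);
  assert(Lane < Mask.size() && "early exit taken with no active lane");
  std::int64_t Index = CanonicalIV + static_cast<std::int64_t>(Lane);
  if (PostInc)
    Index += 1; // the user sees the incremented value, not the PHI
  return Start + Index * Step;
}
```

With `Start = 3`, `Step = 2`, `CanonicalIV = 8` and the exit condition first true in lane 2, the model gives 23 for a user of the PHI and 25 for a user of its update.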
I've added a new 'null' VPValue as a dummy placeholder to manage the incoming operands of PHI nodes in the exit block. We can have situations where one of the incoming values is an induction variable (or its update) and the other is not. For example, both the latch and the early exiting block can jump to the same exit block. However, VPInstruction::generate walks through all predecessors of the PHI assuming the value is *not* an IV. To ensure that we process the right value for the right incoming block, we use this new 'null' value as a marker to indicate that the operand should be skipped, since it will be handled separately in fixupIVUsers or fixupEarlyExitIVUsers.
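The placeholder idea can be sketched in isolation as a shared sentinel that generic code compares against by address. This is a simplified illustration, not the patch's code: `Operand`, `NullOperand`, `isNull` and `operandsToProcess` are hypothetical names standing in for `VPValue`, `VPValue::Null`, `VPValue::isNull` and the operand walk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a VPValue; only its identity matters here.
struct Operand {
  int Payload = 0;
};

// One shared sentinel object, compared by address.
static Operand NullOperand;
static bool isNull(const Operand *Op) { return Op == &NullOperand; }

// Returns the predecessor indices whose incoming value the generic code
// should fill in. Slots holding the sentinel are left for the IV-specific
// fix-up routines.
static std::vector<std::size_t>
operandsToProcess(const std::vector<Operand *> &Incoming) {
  std::vector<std::size_t> Result;
  for (std::size_t Idx = 0; Idx < Incoming.size(); ++Idx) {
    if (isNull(Incoming[Idx]))
      continue; // induction value: handled separately
    Result.push_back(Idx);
  }
  return Result;
}
```

Keeping the sentinel in the operand list preserves the one-to-one mapping between operand index and predecessor index, which is why the slot is marked rather than dropped.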
All code for calculating the last value when exiting the loop early now lives in a new vector.early.exit block, which sits between the middle.split block and the original exit block. I also had to fix up the VPlan verifier because it assumed that the block containing a definition always dominated the parent of the user. That's no longer the case because we can arrive at the exit block via either the latch or the early exiting block.
I've added a new ExtractFirstActive VPInstruction that extracts the first active lane of a vector, i.e. the lane of the vector predicate that triggered the exit.
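As a rough scalar model of what ExtractFirstActive computes, the sketch below returns the element of a vector at the first lane whose predicate is true. `extractFirstActive` is a hypothetical helper for illustration only; in the patch the lowering is a count-trailing-zero-elements call on the mask followed by an extractelement.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns Vec[L], where L is the lowest lane with Mask[L] == true.
// The caller must guarantee that at least one lane is active, which holds
// in the early exit block because it is only reached when any-of is true.
template <typename T>
static T extractFirstActive(const std::vector<T> &Vec,
                            const std::vector<bool> &Mask) {
  assert(Vec.size() == Mask.size() && "vector and mask lane counts differ");
  for (std::size_t Lane = 0; Lane < Mask.size(); ++Lane)
    if (Mask[Lane])
      return Vec[Lane];
  assert(false && "no active lane in mask");
  return T();
}
```

For a vector `{10, 20, 30, 40}` and mask `{false, true, false, true}`, the first active lane is 1, so the result is 20. Later active lanes are ignored because the scalar loop would already have exited.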
---
Patch is 138.11 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/120567.diff
14 Files Affected:
- (modified) llvm/lib/Transforms/Vectorize/LoopVectorize.cpp (+164-25)
- (modified) llvm/lib/Transforms/Vectorize/VPlan.cpp (+3)
- (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+28)
- (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+17-5)
- (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+6-1)
- (modified) llvm/lib/Transforms/Vectorize/VPlanValue.h (+11)
- (modified) llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp (+5-1)
- (modified) llvm/test/Transforms/LoopVectorize/AArch64/simple_early_exit.ll (+256-57)
- (modified) llvm/test/Transforms/LoopVectorize/early_exit_legality.ll (-2)
- (modified) llvm/test/Transforms/LoopVectorize/multi_early_exit.ll (+3-3)
- (modified) llvm/test/Transforms/LoopVectorize/multi_early_exit_live_outs.ll (+6-6)
- (modified) llvm/test/Transforms/LoopVectorize/single_early_exit.ll (+21-13)
- (modified) llvm/test/Transforms/LoopVectorize/single_early_exit_live_outs.ll (+641-79)
- (modified) llvm/test/Transforms/LoopVectorize/uncountable-early-exit-vplan.ll (+15-6)
``````````diff
diff --git a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
index a8511483e00fbe..4962dff3ab1f8a 100644
--- a/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
+++ b/llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -563,6 +563,11 @@ class InnerLoopVectorizer {
Value *VectorTripCount, BasicBlock *MiddleBlock,
VPTransformState &State);
+ void fixupEarlyExitIVUsers(PHINode *OrigPhi, const InductionDescriptor &II,
+ BasicBlock *VectorEarlyExitBB,
+ BasicBlock *MiddleBlock, VPlan &Plan,
+ VPTransformState &State);
+
/// Iteratively sink the scalarized operands of a predicated instruction into
/// the block that was created for it.
void sinkScalarOperands(Instruction *PredInst);
@@ -2838,6 +2843,23 @@ BasicBlock *InnerLoopVectorizer::createVectorizedLoopSkeleton(
return LoopVectorPreHeader;
}
+static bool isValueIncomingFromBlock(BasicBlock *ExitingBB, Value *V,
+ Instruction *UI) {
+ PHINode *PHI = dyn_cast<PHINode>(UI);
+ assert(PHI && "Expected LCSSA form");
+
+ // If this loop has an uncountable early exit then there could be
+ // different users of OrigPhi with either:
+ // 1. Multiple users, because each exiting block (countable or
+ // uncountable) jumps to the same exit block, or ..
+ // 2. A single user with an incoming value from a countable or
+ // uncountable exiting block.
+ // In both cases there is no guarantee this came from a countable exiting
+ // block, i.e. the latch.
+ int Index = PHI->getBasicBlockIndex(ExitingBB);
+ return Index != -1 && PHI->getIncomingValue(Index) == V;
+}
+
// Fix up external users of the induction variable. At this point, we are
// in LCSSA form, with all external PHIs that use the IV having one input value,
// coming from the remainder loop. We need those PHIs to also have a correct
@@ -2853,6 +2875,7 @@ void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,
// We allow both, but they, obviously, have different values.
DenseMap<Value *, Value *> MissingVals;
+ BasicBlock *OrigLoopLatch = OrigLoop->getLoopLatch();
Value *EndValue = cast<PHINode>(OrigPhi->getIncomingValueForBlock(
OrigLoop->getLoopPreheader()))
@@ -2860,12 +2883,12 @@ void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,
// An external user of the last iteration's value should see the value that
// the remainder loop uses to initialize its own IV.
- Value *PostInc = OrigPhi->getIncomingValueForBlock(OrigLoop->getLoopLatch());
+ Value *PostInc = OrigPhi->getIncomingValueForBlock(OrigLoopLatch);
for (User *U : PostInc->users()) {
Instruction *UI = cast<Instruction>(U);
if (!OrigLoop->contains(UI)) {
- assert(isa<PHINode>(UI) && "Expected LCSSA form");
- MissingVals[UI] = EndValue;
+ if (isValueIncomingFromBlock(OrigLoopLatch, PostInc, UI))
+ MissingVals[cast<PHINode>(UI)] = EndValue;
}
}
@@ -2875,7 +2898,9 @@ void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,
for (User *U : OrigPhi->users()) {
auto *UI = cast<Instruction>(U);
if (!OrigLoop->contains(UI)) {
- assert(isa<PHINode>(UI) && "Expected LCSSA form");
+ if (!isValueIncomingFromBlock(OrigLoopLatch, OrigPhi, UI))
+ continue;
+
IRBuilder<> B(MiddleBlock->getTerminator());
// Fast-math-flags propagate from the original induction instruction.
@@ -2905,18 +2930,6 @@ void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,
}
}
- assert((MissingVals.empty() ||
- all_of(MissingVals,
- [MiddleBlock, this](const std::pair<Value *, Value *> &P) {
- return all_of(
- predecessors(cast<Instruction>(P.first)->getParent()),
- [MiddleBlock, this](BasicBlock *Pred) {
- return Pred == MiddleBlock ||
- Pred == OrigLoop->getLoopLatch();
- });
- })) &&
- "Expected escaping values from latch/middle.block only");
-
for (auto &I : MissingVals) {
PHINode *PHI = cast<PHINode>(I.first);
// One corner case we have to handle is two IVs "chasing" each-other,
@@ -2929,6 +2942,102 @@ void InnerLoopVectorizer::fixupIVUsers(PHINode *OrigPhi,
}
}
+void InnerLoopVectorizer::fixupEarlyExitIVUsers(PHINode *OrigPhi,
+ const InductionDescriptor &II,
+ BasicBlock *VectorEarlyExitBB,
+ BasicBlock *MiddleBlock,
+ VPlan &Plan,
+ VPTransformState &State) {
+ // There are two kinds of external IV usages - those that use the value
+ // computed in the last iteration (the PHI) and those that use the penultimate
+ // value (the value that feeds into the phi from the loop latch).
+ // We allow both, but they, obviously, have different values.
+ DenseMap<Value *, Value *> MissingVals;
+ BasicBlock *OrigLoopLatch = OrigLoop->getLoopLatch();
+ BasicBlock *EarlyExitingBB = Legal->getUncountableEarlyExitingBlock();
+ Value *PostInc = OrigPhi->getIncomingValueForBlock(OrigLoopLatch);
+
+ // Obtain the canonical IV, since we have to use the most recent value
+ // before exiting the loop early. This is unlike fixupIVUsers, which has
+ // the luxury of using the end value in the middle block.
+ VPBasicBlock *EntryVPBB = Plan.getVectorLoopRegion()->getEntryBasicBlock();
+ // NOTE: We cannot call Plan.getCanonicalIV() here because the original
+ // recipe created whilst building plans is no longer valid.
+ VPHeaderPHIRecipe *CanonicalIVR =
+ cast<VPHeaderPHIRecipe>(&*EntryVPBB->begin());
+ Value *CanonicalIV = State.get(CanonicalIVR->getVPSingleValue(), true);
+
+ // Search for the mask that drove us to exit early.
+ VPBasicBlock *EarlyExitVPBB = Plan.getVectorLoopRegion()->getEarlyExit();
+ VPBasicBlock *MiddleSplitVPBB =
+ cast<VPBasicBlock>(EarlyExitVPBB->getSinglePredecessor());
+ VPInstruction *BranchOnCond =
+ cast<VPInstruction>(MiddleSplitVPBB->getTerminator());
+ assert(BranchOnCond->getOpcode() == VPInstruction::BranchOnCond &&
+ "Expected middle.split block terminator to be a branch-on-cond");
+ VPInstruction *ScalarEarlyExitCond =
+ cast<VPInstruction>(BranchOnCond->getOperand(0));
+ assert(
+ ScalarEarlyExitCond->getOpcode() == VPInstruction::AnyOf &&
+ "Expected middle.split block terminator branch condition to be any-of");
+ VPValue *VectorEarlyExitCond = ScalarEarlyExitCond->getOperand(0);
+ // Finally get the mask that led us into the early exit block.
+ Value *EarlyExitMask = State.get(VectorEarlyExitCond);
+
+ // Calculate the IV step.
+ VPValue *StepVPV = Plan.getSCEVExpansion(II.getStep());
+ assert(StepVPV && "step must have been expanded during VPlan execution");
+ Value *Step = StepVPV->isLiveIn() ? StepVPV->getLiveInIRValue()
+ : State.get(StepVPV, VPLane(0));
+
+ auto FixUpPhi = [&](Instruction *UI, bool PostInc) -> Value * {
+ IRBuilder<> B(VectorEarlyExitBB->getTerminator());
+ assert(isa<PHINode>(UI) && "Expected LCSSA form");
+
+ // Fast-math-flags propagate from the original induction instruction.
+ if (isa_and_nonnull<FPMathOperator>(II.getInductionBinOp()))
+ B.setFastMathFlags(II.getInductionBinOp()->getFastMathFlags());
+
+ Type *CtzType = CanonicalIV->getType();
+ Value *Ctz = B.CreateCountTrailingZeroElems(CtzType, EarlyExitMask);
+ Ctz = B.CreateAdd(Ctz, cast<PHINode>(CanonicalIV));
+ if (PostInc)
+ Ctz = B.CreateAdd(Ctz, ConstantInt::get(CtzType, 1));
+
+ Value *Escape = emitTransformedIndex(B, Ctz, II.getStartValue(), Step,
+ II.getKind(), II.getInductionBinOp());
+ Escape->setName("ind.early.escape");
+ return Escape;
+ };
+
+ for (User *U : PostInc->users()) {
+ auto *UI = cast<Instruction>(U);
+ if (!OrigLoop->contains(UI)) {
+ if (isValueIncomingFromBlock(EarlyExitingBB, PostInc, UI))
+ MissingVals[UI] = FixUpPhi(UI, true);
+ }
+ }
+
+ for (User *U : OrigPhi->users()) {
+ auto *UI = cast<Instruction>(U);
+ if (!OrigLoop->contains(UI)) {
+ if (isValueIncomingFromBlock(EarlyExitingBB, OrigPhi, UI))
+ MissingVals[UI] = FixUpPhi(UI, false);
+ }
+ }
+
+ for (auto &I : MissingVals) {
+ PHINode *PHI = cast<PHINode>(I.first);
+ // One corner case we have to handle is two IVs "chasing" each-other,
+ // that is %IV2 = phi [...], [ %IV1, %latch ]
+ // In this case, if IV1 has an external use, we need to avoid adding both
+ // "last value of IV1" and "penultimate value of IV2". So, verify that we
+ // don't already have an incoming value for the middle block.
+ if (PHI->getBasicBlockIndex(VectorEarlyExitBB) == -1)
+ PHI->addIncoming(I.second, VectorEarlyExitBB);
+ }
+}
+
namespace {
struct CSEDenseMapInfo {
@@ -3062,6 +3171,13 @@ void InnerLoopVectorizer::fixVectorizedLoop(VPTransformState &State) {
OuterLoop->addBasicBlockToLoop(MiddleSplitBB, *LI);
PredVPBB = PredVPBB->getSinglePredecessor();
}
+
+ BasicBlock *OrigEarlyExitBB = Legal->getUncountableEarlyExitBlock();
+ if (Loop *EEL = LI->getLoopFor(OrigEarlyExitBB)) {
+ BasicBlock *VectorEarlyExitBB =
+ State.CFG.VPBB2IRBB[VectorRegion->getEarlyExit()];
+ EEL->addBasicBlockToLoop(VectorEarlyExitBB, *LI);
+ }
}
// After vectorization, the exit blocks of the original loop will have
@@ -3091,6 +3207,15 @@ void InnerLoopVectorizer::fixVectorizedLoop(VPTransformState &State) {
getOrCreateVectorTripCount(nullptr), LoopMiddleBlock, State);
}
+ if (Legal->hasUncountableEarlyExit()) {
+ VPBasicBlock *VectorEarlyExitVPBB =
+ cast<VPBasicBlock>(VectorRegion->getEarlyExit());
+ BasicBlock *VectorEarlyExitBB = State.CFG.VPBB2IRBB[VectorEarlyExitVPBB];
+ for (const auto &Entry : Legal->getInductionVars())
+ fixupEarlyExitIVUsers(Entry.first, Entry.second, VectorEarlyExitBB,
+ LoopMiddleBlock, Plan, State);
+ }
+
for (Instruction *PI : PredicatedInstructions)
sinkScalarOperands(&*PI);
@@ -8974,6 +9099,9 @@ static void addScalarResumePhis(VPRecipeBuilder &Builder, VPlan &Plan) {
auto *VectorPhiR = cast<VPHeaderPHIRecipe>(Builder.getRecipe(ScalarPhiI));
if (!isa<VPFirstOrderRecurrencePHIRecipe, VPReductionPHIRecipe>(VectorPhiR))
continue;
+ assert(!Plan.getVectorLoopRegion()->getEarlyExit() &&
+ "Cannot handle "
+ "first-order recurrences with uncountable early exits");
// The backedge value provides the value to resume coming out of a loop,
// which for FORs is a vector whose last element needs to be extracted. The
// start value provides the value if the loop is bypassed.
@@ -9032,8 +9160,7 @@ static SetVector<VPIRInstruction *> collectUsersInExitBlocks(
auto *P = dyn_cast<PHINode>(U);
return P && Inductions.contains(P);
}))) {
- if (ExitVPBB->getSinglePredecessor() == MiddleVPBB)
- continue;
+ V = VPValue::getNull();
}
ExitUsersToFix.insert(ExitIRI);
ExitIRI->addOperand(V);
@@ -9061,18 +9188,30 @@ addUsersInExitBlocks(VPlan &Plan,
for (const auto &[Idx, Op] : enumerate(ExitIRI->operands())) {
// Pass live-in values used by exit phis directly through to their users
// in the exit block.
- if (Op->isLiveIn())
+ if (Op->isLiveIn() || Op->isNull())
continue;
// Currently only live-ins can be used by exit values from blocks not
// exiting via the vector latch through to the middle block.
- if (ExitIRI->getParent()->getSinglePredecessor() != MiddleVPBB)
- return false;
-
LLVMContext &Ctx = ExitIRI->getInstruction().getContext();
- VPValue *Ext = B.createNaryOp(VPInstruction::ExtractFromEnd,
- {Op, Plan.getOrAddLiveIn(ConstantInt::get(
- IntegerType::get(Ctx, 32), 1))});
+ VPValue *Ext;
+ VPBasicBlock *PredVPBB =
+ cast<VPBasicBlock>(ExitIRI->getParent()->getPredecessors()[Idx]);
+ if (PredVPBB != MiddleVPBB) {
+ VPBasicBlock *VectorEarlyExitVPBB =
+ Plan.getVectorLoopRegion()->getEarlyExit();
+ VPBuilder B2(VectorEarlyExitVPBB,
+ VectorEarlyExitVPBB->getFirstNonPhi());
+ assert(ExitIRI->getParent()->getNumPredecessors() <= 2);
+ VPValue *EarlyExitMask =
+ Plan.getVectorLoopRegion()->getVectorEarlyExitCond();
+ Ext = B2.createNaryOp(VPInstruction::ExtractFirstActive,
+ {Op, EarlyExitMask});
+ } else {
+ Ext = B.createNaryOp(VPInstruction::ExtractFromEnd,
+ {Op, Plan.getOrAddLiveIn(ConstantInt::get(
+ IntegerType::get(Ctx, 32), 1))});
+ }
ExitIRI->setOperand(Idx, Ext);
}
}
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index 71f43abe534ec0..3578c268b2187f 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -83,6 +83,9 @@ Value *VPLane::getAsRuntimeExpr(IRBuilderBase &Builder,
llvm_unreachable("Unknown lane kind");
}
+static VPValue NullValue;
+VPValue *VPValue::Null = &NullValue;
+
VPValue::VPValue(const unsigned char SC, Value *UV, VPDef *Def)
: SubclassID(SC), UnderlyingVal(UV), Def(Def) {
if (Def)
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index 8dd94a292f7075..c6d9364cd1e9b4 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -1231,6 +1231,9 @@ class VPInstruction : public VPRecipeWithIRFlags,
// Returns a scalar boolean value, which is true if any lane of its single
// operand is true.
AnyOf,
+ // Extracts the first active lane of a vector, where the first operand is
+ // the predicate, and the second operand is the vector to extract.
+ ExtractFirstActive,
};
private:
@@ -3662,6 +3665,13 @@ class VPRegionBlock : public VPBlockBase {
/// VPRegionBlock.
VPBlockBase *Exiting;
+ /// Hold the Early Exit block of the SEME region, if one exists.
+ VPBasicBlock *EarlyExit;
+
+ /// If one exists, this keeps track of the vector early mask that triggered
+ /// the early exit.
+ VPValue *VectorEarlyExitCond;
+
/// An indicator whether this region is to generate multiple replicated
/// instances of output IR corresponding to its VPBlockBases.
bool IsReplicator;
@@ -3670,6 +3680,7 @@ class VPRegionBlock : public VPBlockBase {
VPRegionBlock(VPBlockBase *Entry, VPBlockBase *Exiting,
const std::string &Name = "", bool IsReplicator = false)
: VPBlockBase(VPRegionBlockSC, Name), Entry(Entry), Exiting(Exiting),
+ EarlyExit(nullptr), VectorEarlyExitCond(nullptr),
IsReplicator(IsReplicator) {
assert(Entry->getPredecessors().empty() && "Entry block has predecessors.");
assert(Exiting->getSuccessors().empty() && "Exit block has successors.");
@@ -3678,6 +3689,7 @@ class VPRegionBlock : public VPBlockBase {
}
VPRegionBlock(const std::string &Name = "", bool IsReplicator = false)
: VPBlockBase(VPRegionBlockSC, Name), Entry(nullptr), Exiting(nullptr),
+ EarlyExit(nullptr), VectorEarlyExitCond(nullptr),
IsReplicator(IsReplicator) {}
~VPRegionBlock() override {
@@ -3717,6 +3729,22 @@ class VPRegionBlock : public VPBlockBase {
ExitingBlock->setParent(this);
}
+ /// Sets the early exit vector mask.
+ void setVectorEarlyExitCond(VPValue *V) {
+ assert(!VectorEarlyExitCond);
+ VectorEarlyExitCond = V;
+ }
+
+ /// Gets the early exit vector mask
+ VPValue *getVectorEarlyExitCond() const { return VectorEarlyExitCond; }
+
+ /// Set the vector early exit block
+ void setEarlyExit(VPBasicBlock *ExitBlock) { EarlyExit = ExitBlock; }
+
+ /// Get the vector early exit block
+ const VPBasicBlock *getEarlyExit() const { return EarlyExit; }
+ VPBasicBlock *getEarlyExit() { return EarlyExit; }
+
/// Returns the pre-header VPBasicBlock of the loop region.
VPBasicBlock *getPreheaderVPBB() {
assert(!isReplicator() && "should only get pre-header of loop regions");
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index 7f8c560270bc0c..67c2595aabd081 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -57,7 +57,6 @@ bool VPRecipeBase::mayWriteToMemory() const {
case Instruction::Or:
case Instruction::ICmp:
case Instruction::Select:
- case VPInstruction::AnyOf:
case VPInstruction::Not:
case VPInstruction::CalculateTripCountMinusVF:
case VPInstruction::CanonicalIVIncrementForPart:
@@ -65,6 +64,8 @@ bool VPRecipeBase::mayWriteToMemory() const {
case VPInstruction::FirstOrderRecurrenceSplice:
case VPInstruction::LogicalAnd:
case VPInstruction::PtrAdd:
+ case VPInstruction::AnyOf:
+ case VPInstruction::ExtractFirstActive:
return false;
default:
return true;
@@ -645,7 +646,13 @@ Value *VPInstruction::generate(VPTransformState &State) {
Value *A = State.get(getOperand(0));
return Builder.CreateOrReduce(A);
}
-
+ case VPInstruction::ExtractFirstActive: {
+ Value *Vec = State.get(getOperand(0));
+ Value *Mask = State.get(getOperand(1));
+ Value *Ctz =
+ Builder.CreateCountTrailingZeroElems(Builder.getInt64Ty(), Mask);
+ return Builder.CreateExtractElement(Vec, Ctz);
+ }
default:
llvm_unreachable("Unsupported opcode for instruction");
}
@@ -654,7 +661,8 @@ Value *VPInstruction::generate(VPTransformState &State) {
bool VPInstruction::isVectorToScalar() const {
return getOpcode() == VPInstruction::ExtractFromEnd ||
getOpcode() == VPInstruction::ComputeReductionResult ||
- getOpcode() == VPInstruction::AnyOf;
+ getOpcode() == VPInstruction::AnyOf ||
+ getOpcode() == VPInstruction::ExtractFirstActive;
}
bool VPInstruction::isSingleScalar() const {
@@ -816,6 +824,9 @@ void VPInstruction::print(raw_ostream &O, const Twine &Indent,
case VPInstruction::AnyOf:
O << "any-of";
break;
+ case VPInstruction::ExtractFirstActive:
+ O << "extract-first-active";
+ break;
default:
O << Instruction::getOpcodeName(getOpcode());
}
@@ -833,8 +844,9 @@ void VPInstruction::print(raw_ostream &O, const Twine &Indent,
void VPIRInstruction::execute(VPTransformState &State) {
assert((isa<PHINode>(&I) || getNumOperands() == 0) &&
"Only PHINodes can have extra operands");
- for (const auto &[Idx, Op] : enumerate(operands())) {
- VPValue *ExitValue = Op;
+ for (const auto &[Idx, ExitValue] : enumerate(operands())) {
+ if (ExitValue->isNull())
+ continue;
auto Lane = vputils::isUniformAfterVectorization(ExitValue)
? VPLane::getFirstLane()
: VPLane::getLastLaneForVF(State.VF);
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index aacb27f9325d07..9b3b54fde5112e 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -1905,14 +1905,19 @@ void VPlanTransforms::handleUncountableEarlyExit(
VPValue *EarlyExitNotTakenCond = RecipeBuilder.getBlockInMask(
OrigLoop->contains(TrueSucc) ? TrueSucc : FalseSucc);
auto *EarlyExitTakenCond = Builder.createNot(EarlyExitNotTakenCond);
+ LoopRegion->setVectorEarlyExitCond(EarlyExitTakenCond);
IsEarlyExitTaken =
Builder.createNaryOp(VPInstruction::AnyOf, {EarlyExitTakenCond});
VPBasicBlock *NewMiddle = new VPBasicBlock("middle.split");
+ VPBasicBlock *EarlyExitVPBB = new VPBasicBlock("vector.early.exit");
VPBlockUtils::insertOnEdge(LoopRegion, MiddleVPBB, NewM...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/120567