[llvm] [LV] Expand VPWidenIntOrFpInductionRecipe into separate recipes (PR #118638)

Wed Dec 4 05:51:53 PST 2024

llvmbot wrote:



@llvm/pr-subscribers-vectorizers

@llvm/pr-subscribers-llvm-transforms

Author: Luke Lau (lukel97)

<details>
<summary>Changes</summary>

The motivation of this PR is to make #115274 easier to implement. It's similar to the idea in #82021 (but admittedly I didn't notice it until I had already written this!), and should allow us to add EVL support by just passing EVL to the VF operand.

The current difficulty with widening IVs with EVL is that VPWidenIntOrFpInductionRecipe generates its own backedge value. Since it's a VPHeaderPHIRecipe the VF operand must be in the preheader, which means we can't use the EVL since it's defined in the loop body.

The gist in this PR is to take the approach in #114305 and expand VPWidenIntOrFpInductionRecipe into several recipes for the initial value, phi and backedge value just before execution. I.e. this example:

```
  Live-in ir<%5> = VF

  vector.ph:
  Successor(s): vector loop

  <x1> vector loop: {
    vector.body:
      WIDEN-INDUCTION %i = phi %bc.resume.val, %i.next, ir<1>, ir<%5>
      ...
      EMIT branch-on-count vp<%index.next>, ir<%n.vec>
    No successors
  }
```

gets expanded to:

``` 
VPlan 'Final VPlan for VF={vscale x 1,vscale x 2},UF={1}' {
  Live-in ir<%5> = VF

  vector.ph:
    vp<%0> = WIDEN-INDUCTION-START ir<0>, ir<1>
  Successor(s): vector loop

  <x1> vector loop: {
    vector.body:
      ir<%i> = WIDEN-INDUCTION-PHI vp<%0>, vp<%4>
      ...
      vp<%4> = WIDEN-INDUCTION-INC ir<1>, ir<%5>, ir<%i>
      EMIT branch-on-count vp<%index.next>, ir<%n.vec>
    No successors
  }
```

This allows us to a value defined in the loop in the backedge value, and also means we can just reuse the existing backedge fixups in VPlan::execute without having to specially handle it ourselves.

I initially tried just splitting up VPWidenIntOrFpInductionRecipe immediately in `createWidenInductionRecipes`, but as pointed out in https://github.com/llvm/llvm-project/pull/114305#issuecomment-2466656054 it turns out to be a total pain trying to detect these in other places like `removeRedundantInductionCasts` and `unrollHeaderPHIByUF`. In a way it's kind of like a pseudo-instruction at the MachineInstr level.

I also tried to avoid the start and increment recipes by expanding them directly into VPInstructions/VPWidenIntrinsicRecipe/VPScalarCastRecipes, but I ran into some difficulties when trying to broadcast a scalar type to use in a widened binary op. (I'm happy to give this another try though!)

I hoped to make this an NFC, but unfortunately some of the splatted values get shuffled about in the preheaders.

I may be duplicating other work here, and this was just the solution I stumbled upon. Open to any other ideas or approaches!

---

Patch is 84.38 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/118638.diff


19 Files Affected:

- (modified) llvm/lib/Transforms/Vectorize/VPlan.cpp (+6-16) 
- (modified) llvm/lib/Transforms/Vectorize/VPlan.h (+151-6) 
- (modified) llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp (+11-8) 
- (modified) llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp (+67-43) 
- (modified) llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp (+79) 
- (modified) llvm/lib/Transforms/Vectorize/VPlanValue.h (+3) 
- (modified) llvm/test/Transforms/LoopVectorize/RISCV/blocks-with-dead-instructions.ll (+18-18) 
- (modified) llvm/test/Transforms/LoopVectorize/RISCV/mask-index-type.ll (+2-2) 
- (modified) llvm/test/Transforms/LoopVectorize/RISCV/pr87378-vpinstruction-or-drop-poison-generating-flags.ll (+3-3) 
- (modified) llvm/test/Transforms/LoopVectorize/RISCV/pr88802.ll (+1-1) 
- (modified) llvm/test/Transforms/LoopVectorize/RISCV/strided-accesses.ll (+5-5) 
- (modified) llvm/test/Transforms/LoopVectorize/RISCV/uniform-load-store.ll (+13-13) 
- (modified) llvm/test/Transforms/LoopVectorize/RISCV/vectorize-force-tail-with-evl-cond-reduction.ll (+2-2) 
- (modified) llvm/test/Transforms/LoopVectorize/create-induction-resume.ll (+5-5) 
- (modified) llvm/test/Transforms/LoopVectorize/epilog-vectorization-any-of-reductions.ll (+2-2) 
- (modified) llvm/test/Transforms/LoopVectorize/first-order-recurrence.ll (+2-2) 
- (modified) llvm/test/Transforms/LoopVectorize/float-induction.ll (+35-35) 
- (modified) llvm/test/Transforms/LoopVectorize/induction.ll (+4-4) 
- (modified) llvm/test/Transforms/LoopVectorize/reduction-inloop-pred.ll (+18-18) 


``````````diff

diff --git a/llvm/lib/Transforms/Vectorize/VPlan.cpp b/llvm/lib/Transforms/Vectorize/VPlan.cpp
index b801d1863e252c..4e4c4dfd461824 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlan.cpp
@@ -1043,18 +1043,12 @@ void VPlan::execute(VPTransformState *State) {
     if (isa<VPWidenPHIRecipe>(&R))
       continue;
 
-    if (isa<VPWidenPointerInductionRecipe>(&R) ||
-        isa<VPWidenIntOrFpInductionRecipe>(&R)) {
-      PHINode *Phi = nullptr;
-      if (isa<VPWidenIntOrFpInductionRecipe>(&R)) {
-        Phi = cast<PHINode>(State->get(R.getVPSingleValue()));
-      } else {
-        auto *WidenPhi = cast<VPWidenPointerInductionRecipe>(&R);
-        assert(!WidenPhi->onlyScalarsGenerated(State->VF.isScalable()) &&
-               "recipe generating only scalars should have been replaced");
-        auto *GEP = cast<GetElementPtrInst>(State->get(WidenPhi));
-        Phi = cast<PHINode>(GEP->getPointerOperand());
-      }
+    if (isa<VPWidenPointerInductionRecipe>(&R)) {
+      auto *WidenPhi = cast<VPWidenPointerInductionRecipe>(&R);
+      assert(!WidenPhi->onlyScalarsGenerated(State->VF.isScalable()) &&
+             "recipe generating only scalars should have been replaced");
+      auto *GEP = cast<GetElementPtrInst>(State->get(WidenPhi));
+      PHINode *Phi = cast<PHINode>(GEP->getPointerOperand());
 
       Phi->setIncomingBlock(1, VectorLatchBB);
 
@@ -1062,10 +1056,6 @@ void VPlan::execute(VPTransformState *State) {
       // consistent placement of all induction updates.
       Instruction *Inc = cast<Instruction>(Phi->getIncomingValue(1));
       Inc->moveBefore(VectorLatchBB->getTerminator()->getPrevNode());
-
-      // Use the steps for the last part as backedge value for the induction.
-      if (auto *IV = dyn_cast<VPWidenIntOrFpInductionRecipe>(&R))
-        Inc->setOperand(0, State->get(IV->getLastUnrolledPartOperand()));
       continue;
     }
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlan.h b/llvm/lib/Transforms/Vectorize/VPlan.h
index e1d828f038f9a2..513e973af8fc1a 100644
--- a/llvm/lib/Transforms/Vectorize/VPlan.h
+++ b/llvm/lib/Transforms/Vectorize/VPlan.h
@@ -2089,7 +2089,9 @@ class VPHeaderPHIRecipe : public VPSingleDefRecipe {
 };
 
 /// A recipe for handling phi nodes of integer and floating-point inductions,
-/// producing their vector values.
+/// producing their vector values. This won't execute any LLVM IR and will get
+/// expanded later into VPWidenIntOrFpInitialRecipe, VPWidenIntOrFpPHIRecipe and
+/// VPWidenIntOrFpBackedgeRecipe.
 class VPWidenIntOrFpInductionRecipe : public VPHeaderPHIRecipe {
   PHINode *IV;
   TruncInst *Trunc;
@@ -2122,9 +2124,10 @@ class VPWidenIntOrFpInductionRecipe : public VPHeaderPHIRecipe {
 
   VP_CLASSOF_IMPL(VPDef::VPWidenIntOrFpInductionSC)
 
-  /// Generate the vectorized and scalarized versions of the phi node as
-  /// needed by their users.
-  void execute(VPTransformState &State) override;
+  void execute(VPTransformState &State) override {
+    llvm_unreachable("cannot execute this recipe, should be expanded via "
+                     "expandVPWidenIntOrFpInductionRecipe");
+  }
 
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
   /// Print the recipe.
@@ -2180,10 +2183,152 @@ class VPWidenIntOrFpInductionRecipe : public VPHeaderPHIRecipe {
   }
 
   /// Returns the VPValue representing the value of this induction at
-  /// the last unrolled part, if it exists. Returns itself if unrolling did not
+  /// the last unrolled part, if it exists. Returns nullptr if unrolling did not
   /// take place.
   VPValue *getLastUnrolledPartOperand() {
-    return getNumOperands() == 5 ? getOperand(4) : this;
+    return getNumOperands() == 5 ? getOperand(4) : nullptr;
+  }
+};
+
+/// A recipe to compute the initial value for a widened IV, expanded from
+/// VPWidenIntOrFpInductionRecipe.
+class VPWidenIntOrFpInductionInitialRecipe : public VPSingleDefRecipe {
+  Instruction *IV;
+  const InductionDescriptor &ID;
+
+public:
+  VPWidenIntOrFpInductionInitialRecipe(Instruction *IV, VPValue *Start,
+                                       VPValue *Step,
+                                       const InductionDescriptor &ID)
+      : VPSingleDefRecipe(VPDef::VPWidenIntOrFpInductionStartSC, {Start, Step}),
+        IV(IV), ID(ID) {
+    assert((isa<PHINode>(IV) || isa<TruncInst>(IV)) &&
+           "Expected either an induction phi-node or a truncate of it!");
+  }
+
+  ~VPWidenIntOrFpInductionInitialRecipe() override = default;
+
+  VPWidenIntOrFpInductionInitialRecipe *clone() override {
+    return new VPWidenIntOrFpInductionInitialRecipe(IV, getOperand(0),
+                                                    getOperand(1), ID);
+  }
+
+  VP_CLASSOF_IMPL(VPDef::VPWidenIntOrFpInductionStartSC)
+
+  void execute(VPTransformState &State) override;
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+  /// Print the recipe.
+  void print(raw_ostream &O, const Twine &Indent,
+             VPSlotTracker &SlotTracker) const override;
+#endif
+
+  VPValue *getStartValue() { return getOperand(0); }
+  const VPValue *getStartValue() const { return getOperand(0); }
+
+  VPValue *getStepValue() { return getOperand(1); }
+  const VPValue *getStepValue() const { return getOperand(1); }
+
+  /// Returns the scalar type of the induction.
+  Type *getScalarType() const { return IV->getType(); }
+
+  bool onlyFirstLaneUsed(const VPValue *Op) const override {
+    assert(is_contained(operands(), Op) &&
+           "Op must be an operand of the recipe");
+    return true;
+  }
+};
+
+/// A recipe to generate the PHI of a widened IV, expanded from
+/// VPWidenIntOrFpInductionRecipe.
+class VPWidenIntOrFpInductionPHIRecipe : public VPHeaderPHIRecipe {
+  Instruction *IV;
+
+public:
+  VPWidenIntOrFpInductionPHIRecipe(Instruction *IV, VPValue *Start)
+      : VPHeaderPHIRecipe(VPDef::VPWidenIntOrFpInductionPHISC, IV, Start),
+        IV(IV) {
+    assert((isa<PHINode>(IV) || isa<TruncInst>(IV)) &&
+           "Expected either an induction phi-node or a truncate of it!");
+  }
+
+  ~VPWidenIntOrFpInductionPHIRecipe() override = default;
+
+  VPWidenIntOrFpInductionPHIRecipe *clone() override {
+    auto *R = new VPWidenIntOrFpInductionPHIRecipe(IV, getOperand(0));
+    R->addOperand(getBackedgeValue());
+    return R;
+  }
+
+  VP_CLASSOF_IMPL(VPDef::VPWidenIntOrFpInductionPHISC)
+
+  void execute(VPTransformState &State) override;
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+  /// Print the recipe.
+  void print(raw_ostream &O, const Twine &Indent,
+             VPSlotTracker &SlotTracker) const override;
+#endif
+};
+
+/// A recipe to compute the backedge value for a widened IV, expanded from
+/// VPWidenIntOrFpInductionRecipe.
+class VPWidenIntOrFpInductionBackedgeRecipe : public VPSingleDefRecipe {
+  Instruction *IV;
+  const InductionDescriptor &ID;
+
+public:
+  VPWidenIntOrFpInductionBackedgeRecipe(Instruction *IV, VPValue *Step,
+                                        VPValue *VF, VPValue *Prev,
+                                        VPValue *SplatVF,
+                                        const InductionDescriptor &ID)
+      : VPSingleDefRecipe(VPDef::VPWidenIntOrFpInductionSC, {Step, VF, Prev}),
+        IV(IV), ID(ID) {
+    assert((isa<PHINode>(IV) || isa<TruncInst>(IV)) &&
+           "Expected either an induction phi-node or a truncate of it!");
+    if (SplatVF)
+      addOperand(SplatVF);
+  }
+
+  ~VPWidenIntOrFpInductionBackedgeRecipe() override = default;
+
+  VPWidenIntOrFpInductionBackedgeRecipe *clone() override {
+    return new VPWidenIntOrFpInductionBackedgeRecipe(
+        IV, getOperand(0), getOperand(1), getOperand(2), getOperand(3), ID);
+  }
+
+  VP_CLASSOF_IMPL(VPDef::VPWidenIntOrFpInductionIncSC)
+
+  void execute(VPTransformState &State) override;
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+  /// Print the recipe.
+  void print(raw_ostream &O, const Twine &Indent,
+             VPSlotTracker &SlotTracker) const override;
+#endif
+
+  VPValue *getStepValue() { return getOperand(0); }
+  const VPValue *getStepValue() const { return getOperand(0); }
+
+  VPValue *getVFValue() { return getOperand(1); }
+  const VPValue *getVFValue() const { return getOperand(1); }
+
+  VPValue *getPrevValue() { return getOperand(2); }
+  const VPValue *getPrevValue() const { return getOperand(2); }
+
+  VPValue *getSplatVFValue() {
+    // If the recipe has been unrolled (4 operands), return the VPValue for the
+    // induction increment.
+    return getNumOperands() == 4 ? getOperand(3) : nullptr;
+  }
+
+  /// Returns the scalar type of the induction.
+  Type *getScalarType() const { return IV->getType(); }
+
+  bool onlyFirstLaneUsed(const VPValue *Op) const override {
+    assert(is_contained(operands(), Op) &&
+           "Op must be an operand of the recipe");
+    return Op == getOperand(0) || Op == getOperand(1);
   }
 };
 
diff --git a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
index 969d07b229e469..8a9e64b00850e2 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp
@@ -214,14 +214,17 @@ Type *VPTypeAnalysis::inferScalarType(const VPValue *V) {
           .Case<VPActiveLaneMaskPHIRecipe, VPCanonicalIVPHIRecipe,
                 VPFirstOrderRecurrencePHIRecipe, VPReductionPHIRecipe,
                 VPWidenPointerInductionRecipe, VPEVLBasedIVPHIRecipe,
-                VPScalarPHIRecipe>([this](const auto *R) {
-            // Handle header phi recipes, except VPWidenIntOrFpInduction
-            // which needs special handling due it being possibly truncated.
-            // TODO: consider inferring/caching type of siblings, e.g.,
-            // backedge value, here and in cases below.
-            return inferScalarType(R->getStartValue());
-          })
-          .Case<VPWidenIntOrFpInductionRecipe, VPDerivedIVRecipe>(
+                VPScalarPHIRecipe, VPWidenIntOrFpInductionPHIRecipe>(
+              [this](const auto *R) {
+                // Handle header phi recipes, except VPWidenIntOrFpInduction
+                // which needs special handling due it being possibly truncated.
+                // TODO: consider inferring/caching type of siblings, e.g.,
+                // backedge value, here and in cases below.
+                return inferScalarType(R->getStartValue());
+              })
+          .Case<VPWidenIntOrFpInductionRecipe, VPDerivedIVRecipe,
+                VPWidenIntOrFpInductionInitialRecipe,
+                VPWidenIntOrFpInductionBackedgeRecipe>(
               [](const auto *R) { return R->getScalarType(); })
           .Case<VPReductionRecipe, VPPredInstPHIRecipe, VPWidenPHIRecipe,
                 VPScalarIVStepsRecipe, VPWidenGEPRecipe, VPVectorPointerRecipe,
diff --git a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
index ef5f6e22f82206..8ae1e7382bce38 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -1631,47 +1631,73 @@ static Constant *getSignedIntOrFpConstant(Type *Ty, int64_t C) {
                            : ConstantFP::get(Ty, C);
 }
 
-void VPWidenIntOrFpInductionRecipe::execute(VPTransformState &State) {
+void VPWidenIntOrFpInductionInitialRecipe::execute(VPTransformState &State) {
   assert(!State.Lane && "Int or FP induction being replicated.");
 
-  Value *Start = getStartValue()->getLiveInIRValue();
-  const InductionDescriptor &ID = getInductionDescriptor();
-  TruncInst *Trunc = getTruncInst();
+  Value *Start = State.get(getStartValue(), true);
   IRBuilderBase &Builder = State.Builder;
-  assert(IV->getType() == ID.getStartValue()->getType() && "Types must match");
   assert(State.VF.isVector() && "must have vector VF");
 
-  // The value from the original loop to which we are mapping the new induction
-  // variable.
-  Instruction *EntryVal = Trunc ? cast<Instruction>(Trunc) : IV;
-
   // Fast-math-flags propagate from the original induction instruction.
   IRBuilder<>::FastMathFlagGuard FMFG(Builder);
-  if (ID.getInductionBinOp() && isa<FPMathOperator>(ID.getInductionBinOp()))
+  if (isa_and_nonnull<FPMathOperator>(ID.getInductionBinOp()))
     Builder.setFastMathFlags(ID.getInductionBinOp()->getFastMathFlags());
 
   // Now do the actual transformations, and start with fetching the step value.
   Value *Step = State.get(getStepValue(), VPLane(0));
 
-  assert((isa<PHINode>(EntryVal) || isa<TruncInst>(EntryVal)) &&
-         "Expected either an induction phi-node or a truncate of it!");
-
-  // Construct the initial value of the vector IV in the vector loop preheader
-  auto CurrIP = Builder.saveIP();
-  BasicBlock *VectorPH = State.CFG.getPreheaderBBFor(this);
-  Builder.SetInsertPoint(VectorPH->getTerminator());
-  if (isa<TruncInst>(EntryVal)) {
-    assert(Start->getType()->isIntegerTy() &&
-           "Truncation requires an integer type");
-    auto *TruncType = cast<IntegerType>(EntryVal->getType());
-    Step = Builder.CreateTrunc(Step, TruncType);
-    Start = Builder.CreateCast(Instruction::Trunc, Start, TruncType);
-  }
-
+  // Construct the initial value of the vector IV
   Value *Zero = getSignedIntOrFpConstant(Start->getType(), 0);
   Value *SplatStart = Builder.CreateVectorSplat(State.VF, Start);
   Value *SteppedStart = getStepVector(
       SplatStart, Zero, Step, ID.getInductionOpcode(), State.VF, State.Builder);
+  State.set(this, SteppedStart);
+}
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+void VPWidenIntOrFpInductionInitialRecipe::print(
+    raw_ostream &O, const Twine &Indent, VPSlotTracker &SlotTracker) const {
+  O << Indent;
+  printAsOperand(O, SlotTracker);
+  O << " = WIDEN-INDUCTION-START ";
+  printOperands(O, SlotTracker);
+}
+#endif
+
+void VPWidenIntOrFpInductionPHIRecipe::execute(VPTransformState &State) {
+  BasicBlock *VectorPH = State.CFG.getPreheaderBBFor(this);
+
+  Value *Start = State.get(getOperand(0));
+  PHINode *Phi = State.Builder.CreatePHI(Start->getType(), 2, "vec.ind");
+  Phi->addIncoming(Start, VectorPH);
+  Phi->setDebugLoc(IV->getDebugLoc());
+  State.set(this, Phi);
+}
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+void VPWidenIntOrFpInductionPHIRecipe::print(raw_ostream &O,
+                                             const Twine &Indent,
+                                             VPSlotTracker &SlotTracker) const {
+  O << Indent;
+  printAsOperand(O, SlotTracker);
+  O << " = WIDEN-INDUCTION-PHI ";
+  printOperands(O, SlotTracker);
+}
+#endif
+
+void VPWidenIntOrFpInductionBackedgeRecipe::execute(VPTransformState &State) {
+  IRBuilderBase &Builder = State.Builder;
+
+  // Fast-math-flags propagate from the original induction instruction.
+  IRBuilder<>::FastMathFlagGuard FMFG(Builder);
+  if (isa_and_nonnull<FPMathOperator>(ID.getInductionBinOp()))
+    Builder.setFastMathFlags(ID.getInductionBinOp()->getFastMathFlags());
+
+  Value *Step = State.get(getStepValue(), VPLane(0));
+
+  auto CurrIP = Builder.saveIP();
+  BasicBlock *VectorPH = State.CFG.getPreheaderBBFor(this);
+  Builder.SetInsertPoint(VectorPH->getTerminator());
 
   // We create vector phi nodes for both integer and floating-point induction
   // variables. Here, we determine the kind of arithmetic we will perform.
@@ -1706,29 +1732,27 @@ void VPWidenIntOrFpInductionRecipe::execute(VPTransformState &State) {
   }
 
   Builder.restoreIP(CurrIP);
-
-  // We may need to add the step a number of times, depending on the unroll
-  // factor. The last of those goes into the PHI.
-  PHINode *VecInd = PHINode::Create(SteppedStart->getType(), 2, "vec.ind");
-  VecInd->insertBefore(State.CFG.PrevBB->getFirstInsertionPt());
-  VecInd->setDebugLoc(EntryVal->getDebugLoc());
-  State.set(this, VecInd);
+  Value *PrevVal = State.get(getPrevValue());
 
   Instruction *LastInduction = cast<Instruction>(
-      Builder.CreateBinOp(AddOp, VecInd, SplatVF, "vec.ind.next"));
-  if (isa<TruncInst>(EntryVal))
-    State.addMetadata(LastInduction, EntryVal);
-  LastInduction->setDebugLoc(EntryVal->getDebugLoc());
+      Builder.CreateBinOp(AddOp, PrevVal, SplatVF, "vec.ind.next"));
+  if (isa<TruncInst>(IV))
+    State.addMetadata(LastInduction, IV);
+  LastInduction->setDebugLoc(IV->getDebugLoc());
 
-  VecInd->addIncoming(SteppedStart, VectorPH);
-  // Add induction update using an incorrect block temporarily. The phi node
-  // will be fixed after VPlan execution. Note that at this point the latch
-  // block cannot be used, as it does not exist yet.
-  // TODO: Model increment value in VPlan, by turning the recipe into a
-  // multi-def and a subclass of VPHeaderPHIRecipe.
-  VecInd->addIncoming(LastInduction, VectorPH);
+  State.set(this, LastInduction);
 }
 
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+void VPWidenIntOrFpInductionBackedgeRecipe::print(
+    raw_ostream &O, const Twine &Indent, VPSlotTracker &SlotTracker) const {
+  O << Indent;
+  printAsOperand(O, SlotTracker);
+  O << " = WIDEN-INDUCTION-INC ";
+  printOperands(O, SlotTracker);
+}
+#endif
+
 #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
 void VPWidenIntOrFpInductionRecipe::print(raw_ostream &O, const Twine &Indent,
                                           VPSlotTracker &SlotTracker) const {
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index cee83d1015b536..9abf5f28936d53 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -1820,12 +1820,91 @@ void VPlanTransforms::createInterleaveGroups(
   }
 }
 
+/// Expand a VPWidenIntOrFpInduction into separate recipes for the initial
+/// value, phi and backedge value. In the followng example:
+///
+///  vector.ph:
+///  Successor(s): vector loop
+///
+///  <x1> vector loop: {
+///    vector.body:
+///      WIDEN-INDUCTION %i = phi %bc.resume.val, %i.next, ir<1>, ir<%5>
+///      ...
+///      EMIT branch-on-count vp<%index.next>, ir<%n.vec>
+///    No successors
+///  }
+///
+/// WIDEN-INDUCTION will get expanded to:
+///
+///  vector.ph:
+///    vp<%0> = WIDEN-INDUCTION-START ir<0>, ir<1>
+///  Successor(s): vector loop
+///
+///  <x1> vector loop: {
+///    vector.body:
+///      ir<%i> = WIDEN-INDUCTION-PHI vp<%0>, vp<%4>
+///      ...
+///      vp<%4> = WIDEN-INDUCTION-INC ir<1>, ir<%5>, ir<%i>
+///      EMIT branch-on-count vp<%index.next>, ir<%n.vec>
+///    No successors
+///  }
+static void
+expandVPWidenIntOrFpInduction(VPWidenIntOrFpInductionRecipe *WidenIVR) {
+  VPlan *Plan = WidenIVR->getParent()->getPlan();
+  PHINode *PHI = WidenIVR->getPHINode();
+  VPValue *Start = WidenIVR->getStartValue();
+  VPValue *Step = WidenIVR->getStepValue();
+  VPValue *VF = WidenIVR->getVFValue();
+  const InductionDescriptor &ID = WidenIVR->getInductionDescriptor();
+  TruncInst *Trunc = WidenIVR->getTruncInst();
+
+  // The value from the original loop to which we are mapping the new induction
+  // variable.
+  Instruction *IV = Trunc ? cast<Instruction>(Trunc) : PHI;
+
+  // If the phi is truncated, truncate the start and step values.
+  VPBuilder Builder(Plan->getVectorPreheader());
+  if (isa<TruncInst>(IV)) {
+    assert(Start->getUnderlyingValue()->getType()->isIntegerTy() &&
+           "Truncation requires an integer type");
+    auto *TruncType = cast<IntegerType>(IV->getType());
+    Step = Builder.createScalarCast(Instruction::Trunc, Step, TruncType);
+    Start = Builder.createScalarCast(Instruction::Trunc, Start, TruncType);
+  }
+
+  // Construct the initial value of the vector IV in the vector loop preheader.
+  auto *StartR = new VPWidenIntOrFpInductionInitialRecipe(IV, Start, Step, ID);
+  Plan->getVectorPreheader()->insert(StartR, Builder.getInsertPoint());
+
+  // Create the widened phi of the vector IV.
+  auto *PhiR = new VPWidenIntOrFpInductionPHIRecipe(IV, StartR);
+  PhiR->insertBefore(WidenIVR);
+
+  // Create the backedge value for the vector IV.
+  VPValue *Prev = PhiR;
+  // If unrolled, use the last unrolled part in the increment.
+  if (auto *UnrolledPart = WidenIVR->getLastUnrolledPartOperand())
+    Prev = UnrolledPart;
+  auto *IncR = new VPWidenIntOrFpInductionBackedgeRecipe(
+      IV, Step, VF, Prev, WidenIVR->getSplatVFValue(), ID);
+  VPBasicBlock *ExitingBB = Plan->getVectorLoopRegion()->getExitingBasicBlock();
+  ExitingBB->insert(IncR, ExitingBB->getTer...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/118638