[llvm] [DFAJumpThreading] Rewrite the way paths are enumerated (PR #96127)

Usman Nadeem via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 19 17:52:11 PDT 2024


https://github.com/UsmanNadeem created https://github.com/llvm/llvm-project/pull/96127

I tried to add a limit to the number of blocks visited in the paths() function, but even with a very high limit the transformation coverage was reduced.
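For readers who want a concrete picture of such a limit, here is a minimal sketch of a depth-first path enumeration with a visit budget. It is a toy model, not the pass's code: `CFG`, `PathEnumerator`, `diamondCFG` and `countPaths` are hypothetical names, and blocks are plain strings. Only the budget check is modeled on the `++NumVisited > MaxNumVisitiedPaths` test in the patch.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Block = std::string;
using Path = std::vector<Block>;
using CFG = std::map<Block, std::vector<Block>>; // block -> successors

// Hypothetical toy enumerator: finds simple paths From -> ... -> To and
// stops exploring once more than MaxVisited blocks have been visited.
struct PathEnumerator {
  const CFG &G;
  unsigned MaxVisited;
  unsigned NumVisited = 0;

  PathEnumerator(const CFG &Graph, unsigned Budget)
      : G(Graph), MaxVisited(Budget) {}

  std::vector<Path> paths(const Block &From, const Block &To,
                          std::set<Block> &Visited) {
    std::vector<Path> Res;
    if (++NumVisited > MaxVisited)
      return Res; // Budget exhausted: give up on this subtree.
    Visited.insert(From);
    auto It = G.find(From);
    if (It != G.end()) {
      for (const Block &Succ : It->second) {
        if (Succ == To) { // Reached the target block.
          Res.push_back(Path{From, To});
          continue;
        }
        if (Visited.count(Succ)) // Already on the current path.
          continue;
        for (Path &P : paths(Succ, To, Visited)) {
          P.insert(P.begin(), From);
          Res.push_back(std::move(P));
        }
      }
    }
    Visited.erase(From);
    return Res;
  }
};

// A -> {B, C}, B -> D, C -> D.
CFG diamondCFG() { return {{"A", {"B", "C"}}, {"B", {"D"}}, {"C", {"D"}}}; }

std::size_t countPaths(const CFG &G, const Block &From, const Block &To,
                       unsigned Budget) {
  PathEnumerator E(G, Budget);
  std::set<Block> Visited;
  return E.paths(From, To, Visited).size();
}
```

On the diamond, a generous budget finds both paths from `A` to `D`, while a budget of 2 is spent before `C` is explored, so only the path through `B` is returned. A budget on visited blocks can therefore drop valid paths, which is why the set of blocks the search has to touch matters so much.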

After looking at the code it seemed that the function was trying to create paths of the form `SwitchBB...DeterminatorBB...SwitchPredecessor`. This is inefficient because a lot of nodes in those paths (nodes before DeterminatorBB) would be irrelevant to the optimization. We only care about paths of the form `DeterminatorBB_Pred DeterminatorBB...SwitchBB`. This weeds out a lot of visited nodes.

In this patch I have added a hard limit to the number of nodes visited and changed the algorithm for path calculation. Primarily I am traversing the use-def chain for the PHI nodes that define the state. If we have a hole in the use-def chain (no immediate predecessors) then I call the paths() function.
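The use-def traversal can be pictured with the following minimal sketch. It is a simplified toy model and an assumption-laden illustration, not the code in the patch: `Phi`, `Incoming`, `ThreadingPath`, `pathsToPhi` and `demoPaths` are hypothetical stand-ins that do not use the LLVM API. It also only follows PHIs that are defined in the incoming block itself and skips the "hole" case, where the patch falls back to paths().

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Phi;

// One incoming edge of a PHI: either a constant state or another PHI.
struct Incoming {
  std::string FromBB; // Predecessor block the value arrives from.
  int Const;          // Constant state; meaningful only when Def is null.
  const Phi *Def;     // Defining PHI when the value is not a constant.
};

struct Phi {
  std::string BB; // Block that contains this PHI.
  std::vector<Incoming> In;
};

struct ThreadingPath {
  std::vector<std::string> Blocks;
  int ExitValue;
  std::string Determinator;
};

// Walk the PHI use-def chain backwards from P. A constant incoming value
// makes P's block the determinator, and the path starts at the predecessor
// that supplies the constant.
std::vector<ThreadingPath> pathsToPhi(const Phi &P,
                                      std::set<const Phi *> &OnStack) {
  std::vector<ThreadingPath> Res;
  OnStack.insert(&P);
  for (const Incoming &I : P.In) {
    if (!I.Def) {
      ThreadingPath T{{I.FromBB, P.BB}, I.Const, P.BB};
      Res.push_back(T);
      continue;
    }
    if (OnStack.count(I.Def)) // Don't get into a cycle.
      continue;
    if (I.Def->BB != I.FromBB) // "Hole" in the chain: not modeled here.
      continue;
    for (ThreadingPath &T : pathsToPhi(*I.Def, OnStack)) {
      T.Blocks.push_back(P.BB); // Extend each predecessor path with P's block.
      Res.push_back(std::move(T));
    }
  }
  OnStack.erase(&P);
  return Res;
}

// Toy loop: the switch state in "sw" comes from a PHI in "latch", which
// receives constants from case1/case2 and the unchanged state from case3.
std::vector<ThreadingPath> demoPaths() {
  Phi Sw{"sw", {}};
  Phi Latch{"latch",
            {{"case1", 1, nullptr}, {"case2", 2, nullptr}, {"case3", 0, &Sw}}};
  Sw.In.push_back({"latch", 0, &Latch});
  std::set<const Phi *> OnStack;
  return pathsToPhi(Sw, OnStack);
}
```

In this toy loop the walk yields two paths, `case1 latch sw` and `case2 latch sw`, both with `latch` as the determinator. The `case3` edge feeds the switch PHI back into itself and is skipped by the cycle guard. No block that precedes a determinator's predecessor is ever visited, which is the saving described above.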

I also had to change the select instruction unfolding code to insert redundant one-input PHIs so that the use-def chain can be used to calculate the paths.

The test suite coverage with this patch (including a limit on nodes visited) is as follows:

    Geomean diff:
      dfa-jump-threading.NumTransforms: +13.4%
      dfa-jump-threading.NumCloned: +34.1%
      dfa-jump-threading.NumPaths: -80.7%

Compile time effect vs baseline (pass enabled by default) is mostly positive: https://llvm-compile-time-tracker.com/compare.php?from=ad8705fda25f64dcfeb6264ac4d6bac36bee91ab&to=5a3af6ce7e852f0736f706b4a8663efad5bce6ea&stat=instructions:u

Change-Id: I0fba9e0f8aa079706f633089a8ccd4ecf57547ed

From ace990da3e6d92ed4ad1bc81c3a3ea05af3740d3 Mon Sep 17 00:00:00 2001
From: "Nadeem, Usman" <mnadeem at quicinc.com>
Date: Wed, 19 Jun 2024 17:46:17 -0700
Subject: [PATCH] [DFAJumpThreading] Rewrite the way paths are enumerated

I tried to add a limit to the number of blocks visited in the paths()
function but even with a very high limit the transformation coverage was
being reduced.

After looking at the code it seemed that the function was trying to
create paths of the form `SwitchBB...DeterminatorBB...SwitchPredecessor`.
This is inefficient because a lot of nodes in those paths (nodes before
DeterminatorBB) would be irrelevant to the optimization. We only care
about paths of the form `DeterminatorBB_Pred DeterminatorBB...SwitchBB`.
This weeds out a lot of visited nodes.

In this patch I have added a hard limit to the number of nodes visited
and changed the algorithm for path calculation. Primarily I am
traversing the use-def chain for the PHI nodes that define the state. If
we have a hole in the use-def chain (no immediate predecessors) then I
call the paths() function.

I also had to change the select instruction unfolding code to insert
redundant one-input PHIs to allow the use of the use-def chain in
calculating the paths.

The test suite coverage with this patch (including a limit on nodes
visited) is as follows:

    Geomean diff:
      dfa-jump-threading.NumTransforms: +13.4%
      dfa-jump-threading.NumCloned: +34.1%
      dfa-jump-threading.NumPaths: -80.7%

Compile time effect vs baseline (pass enabled by default) is mostly
positive: https://llvm-compile-time-tracker.com/compare.php?from=ad8705fda25f64dcfeb6264ac4d6bac36bee91ab&to=5a3af6ce7e852f0736f706b4a8663efad5bce6ea&stat=instructions:u

Change-Id: I0fba9e0f8aa079706f633089a8ccd4ecf57547ed
---
 .../Transforms/Scalar/DFAJumpThreading.cpp    | 462 ++++++++++--------
 .../dfa-jump-threading-analysis.ll            |  44 +-
 .../dfa-jump-threading-transform.ll           |  84 ++--
 .../DFAJumpThreading/dfa-unfold-select.ll     | 204 +++++---
 .../DFAJumpThreading/max-path-length.ll       |  71 +--
 5 files changed, 465 insertions(+), 400 deletions(-)

diff --git a/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp b/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
index 4371b821eae63..42900642edb2c 100644
--- a/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
+++ b/llvm/lib/Transforms/Scalar/DFAJumpThreading.cpp
@@ -106,6 +106,12 @@ static cl::opt<unsigned> MaxPathLength(
     cl::desc("Max number of blocks searched to find a threading path"),
     cl::Hidden, cl::init(20));
 
+static cl::opt<unsigned> MaxNumVisitiedPaths(
+    "dfa-max-num-visited-paths",
+    cl::desc(
+        "Max number of blocks visited while enumerating paths around a switch"),
+    cl::Hidden, cl::init(2000));
+
 static cl::opt<unsigned>
     MaxNumPaths("dfa-max-num-paths",
                 cl::desc("Max number of paths enumerated around a switch"),
@@ -177,24 +183,6 @@ class DFAJumpThreading {
 
 namespace {
 
-/// Create a new basic block and sink \p SIToSink into it.
-void createBasicBlockAndSinkSelectInst(
-    DomTreeUpdater *DTU, SelectInst *SI, PHINode *SIUse, SelectInst *SIToSink,
-    BasicBlock *EndBlock, StringRef NewBBName, BasicBlock **NewBlock,
-    BranchInst **NewBranch, std::vector<SelectInstToUnfold> *NewSIsToUnfold,
-    std::vector<BasicBlock *> *NewBBs) {
-  assert(SIToSink->hasOneUse());
-  assert(NewBlock);
-  assert(NewBranch);
-  *NewBlock = BasicBlock::Create(SI->getContext(), NewBBName,
-                                 EndBlock->getParent(), EndBlock);
-  NewBBs->push_back(*NewBlock);
-  *NewBranch = BranchInst::Create(EndBlock, *NewBlock);
-  SIToSink->moveBefore(*NewBranch);
-  NewSIsToUnfold->push_back(SelectInstToUnfold(SIToSink, SIUse));
-  DTU->applyUpdates({{DominatorTree::Insert, *NewBlock, EndBlock}});
-}
-
 /// Unfold the select instruction held in \p SIToUnfold by replacing it with
 /// control flow.
 ///
@@ -212,89 +200,42 @@ void unfold(DomTreeUpdater *DTU, LoopInfo *LI, SelectInstToUnfold SIToUnfold,
   BranchInst *StartBlockTerm =
       dyn_cast<BranchInst>(StartBlock->getTerminator());
 
-  assert(StartBlockTerm && StartBlockTerm->isUnconditional());
+  assert(StartBlockTerm);
   assert(SI->hasOneUse());
 
-  // These are the new basic blocks for the conditional branch.
-  // At least one will become an actual new basic block.
-  BasicBlock *TrueBlock = nullptr;
-  BasicBlock *FalseBlock = nullptr;
-  BranchInst *TrueBranch = nullptr;
-  BranchInst *FalseBranch = nullptr;
-
-  // Sink select instructions to be able to unfold them later.
-  if (SelectInst *SIOp = dyn_cast<SelectInst>(SI->getTrueValue())) {
-    createBasicBlockAndSinkSelectInst(DTU, SI, SIUse, SIOp, EndBlock,
-                                      "si.unfold.true", &TrueBlock, &TrueBranch,
-                                      NewSIsToUnfold, NewBBs);
-  }
-  if (SelectInst *SIOp = dyn_cast<SelectInst>(SI->getFalseValue())) {
-    createBasicBlockAndSinkSelectInst(DTU, SI, SIUse, SIOp, EndBlock,
-                                      "si.unfold.false", &FalseBlock,
-                                      &FalseBranch, NewSIsToUnfold, NewBBs);
-  }
-
-  // If there was nothing to sink, then arbitrarily choose the 'false' side
-  // for a new input value to the PHI.
-  if (!TrueBlock && !FalseBlock) {
-    FalseBlock = BasicBlock::Create(SI->getContext(), "si.unfold.false",
-                                    EndBlock->getParent(), EndBlock);
-    NewBBs->push_back(FalseBlock);
-    BranchInst::Create(EndBlock, FalseBlock);
-    DTU->applyUpdates({{DominatorTree::Insert, FalseBlock, EndBlock}});
-  }
-
-  // Insert the real conditional branch based on the original condition.
-  // If we did not create a new block for one of the 'true' or 'false' paths
-  // of the condition, it means that side of the branch goes to the end block
-  // directly and the path originates from the start block from the point of
-  // view of the new PHI.
-  BasicBlock *TT = EndBlock;
-  BasicBlock *FT = EndBlock;
-  if (TrueBlock && FalseBlock) {
-    // A diamond.
-    TT = TrueBlock;
-    FT = FalseBlock;
-
-    // Update the phi node of SI.
-    SIUse->addIncoming(SI->getTrueValue(), TrueBlock);
-    SIUse->addIncoming(SI->getFalseValue(), FalseBlock);
-
-    // Update any other PHI nodes in EndBlock.
-    for (PHINode &Phi : EndBlock->phis()) {
-      if (&Phi != SIUse) {
-        Value *OrigValue = Phi.getIncomingValueForBlock(StartBlock);
-        Phi.addIncoming(OrigValue, TrueBlock);
-        Phi.addIncoming(OrigValue, FalseBlock);
-      }
-
-      // Remove incoming place of original StartBlock, which comes in a indirect
-      // way (through TrueBlock and FalseBlock) now.
-      Phi.removeIncomingValue(StartBlock, /* DeletePHIIfEmpty = */ false);
-    }
-  } else {
-    BasicBlock *NewBlock = nullptr;
+  if (StartBlockTerm->isUnconditional()) {
+    // Arbitrarily choose the 'false' side for a new input value to the PHI.
+    BasicBlock *NewBlock = BasicBlock::Create(
+        SI->getContext(), Twine(SI->getName(), ".si.unfold.false"),
+        EndBlock->getParent(), EndBlock);
+    NewBBs->push_back(NewBlock);
+    BranchInst::Create(EndBlock, NewBlock);
+    DTU->applyUpdates({{DominatorTree::Insert, NewBlock, EndBlock}});
+
+    // StartBlock
+    //   |  \
+    //   |  NewBlock
+    //   |  /
+    // EndBlock
     Value *SIOp1 = SI->getTrueValue();
     Value *SIOp2 = SI->getFalseValue();
 
-    // A triangle pointing right.
-    if (!TrueBlock) {
-      NewBlock = FalseBlock;
-      FT = FalseBlock;
-    }
-    // A triangle pointing left.
-    else {
-      NewBlock = TrueBlock;
-      TT = TrueBlock;
-      std::swap(SIOp1, SIOp2);
-    }
+    PHINode *NewPhi = PHINode::Create(SIUse->getType(), 1,
+                                      Twine(SIOp2->getName(), ".si.unfold.phi"),
+                                      NewBlock->getFirstInsertionPt());
+    NewPhi->addIncoming(SIOp2, StartBlock);
+
+    if (auto *OpSi = dyn_cast<SelectInst>(SIOp1))
+      NewSIsToUnfold->push_back(SelectInstToUnfold(OpSi, SIUse));
+    if (auto *OpSi = dyn_cast<SelectInst>(SIOp2))
+      NewSIsToUnfold->push_back(SelectInstToUnfold(OpSi, NewPhi));
 
     // Update the phi node of SI.
     for (unsigned Idx = 0; Idx < SIUse->getNumIncomingValues(); ++Idx) {
       if (SIUse->getIncomingBlock(Idx) == StartBlock)
         SIUse->setIncomingValue(Idx, SIOp1);
     }
-    SIUse->addIncoming(SIOp2, NewBlock);
+    SIUse->addIncoming(NewPhi, NewBlock);
 
     // Update any other PHI nodes in EndBlock.
     for (auto II = EndBlock->begin(); PHINode *Phi = dyn_cast<PHINode>(II);
@@ -302,11 +243,87 @@ void unfold(DomTreeUpdater *DTU, LoopInfo *LI, SelectInstToUnfold SIToUnfold,
       if (Phi != SIUse)
         Phi->addIncoming(Phi->getIncomingValueForBlock(StartBlock), NewBlock);
     }
+
+    StartBlockTerm->eraseFromParent();
+
+    // Insert the real conditional branch based on the original condition.
+    BranchInst::Create(EndBlock, NewBlock, SI->getCondition(), StartBlock);
+    DTU->applyUpdates({{DominatorTree::Insert, StartBlock, EndBlock},
+                       {DominatorTree::Insert, StartBlock, NewBlock}});
+  } else {
+    BasicBlock *NewBlockT = BasicBlock::Create(
+        SI->getContext(), Twine(SI->getName(), ".si.unfold.true"),
+        EndBlock->getParent(), EndBlock);
+    BasicBlock *NewBlockF = BasicBlock::Create(
+        SI->getContext(), Twine(SI->getName(), ".si.unfold.false"),
+        EndBlock->getParent(), EndBlock);
+
+    NewBBs->push_back(NewBlockT);
+    NewBBs->push_back(NewBlockF);
+
+    // Def only has one use in EndBlock.
+    // Before transformation:
+    // StartBlock(Def)
+    //   |      \
+    // EndBlock  OtherBlock
+    //  (Use)
+    //
+    // After transformation:
+    // StartBlock(Def)
+    //   |      \
+    //   |       OtherBlock
+    // NewBlockT
+    //   |     \
+    //   |   NewBlockF
+    //   |      /
+    //   |     /
+    // EndBlock
+    //  (Use)
+    BranchInst::Create(EndBlock, NewBlockF);
+    // Insert the real conditional branch based on the original condition.
+    BranchInst::Create(EndBlock, NewBlockF, SI->getCondition(), NewBlockT);
+    DTU->applyUpdates({{DominatorTree::Insert, NewBlockT, NewBlockF},
+                       {DominatorTree::Insert, NewBlockT, EndBlock},
+                       {DominatorTree::Insert, NewBlockF, EndBlock}});
+
+    Value *TrueVal = SI->getTrueValue();
+    Value *FalseVal = SI->getFalseValue();
+
+    PHINode *NewPhiT = PHINode::Create(
+        SIUse->getType(), 1, Twine(TrueVal->getName(), ".si.unfold.phi"),
+        NewBlockT->getFirstInsertionPt());
+    PHINode *NewPhiF = PHINode::Create(
+        SIUse->getType(), 1, Twine(FalseVal->getName(), ".si.unfold.phi"),
+        NewBlockF->getFirstInsertionPt());
+    NewPhiT->addIncoming(TrueVal, StartBlock);
+    NewPhiF->addIncoming(FalseVal, NewBlockT);
+
+    if (auto *TrueSI = dyn_cast<SelectInst>(TrueVal))
+      NewSIsToUnfold->push_back(SelectInstToUnfold(TrueSI, NewPhiT));
+    if (auto *FalseSi = dyn_cast<SelectInst>(FalseVal))
+      NewSIsToUnfold->push_back(SelectInstToUnfold(FalseSi, NewPhiF));
+
+    SIUse->addIncoming(NewPhiT, NewBlockT);
+    SIUse->addIncoming(NewPhiF, NewBlockF);
+    SIUse->removeIncomingValue(StartBlock);
+
+    // Update any other PHI nodes in EndBlock.
+    for (auto II = EndBlock->begin(); PHINode *Phi = dyn_cast<PHINode>(II);
+         ++II) {
+      if (Phi != SIUse) {
+        Phi->addIncoming(Phi->getIncomingValueForBlock(StartBlock), NewBlockT);
+        Phi->addIncoming(Phi->getIncomingValueForBlock(StartBlock), NewBlockF);
+        Phi->removeIncomingValue(StartBlock);
+      }
+    }
+
+    // Update the appropriate successor of the start block to point to the new
+    // unfolded block.
+    unsigned SuccNum = StartBlockTerm->getSuccessor(1) == EndBlock ? 1 : 0;
+    StartBlockTerm->setSuccessor(SuccNum, NewBlockT);
+    DTU->applyUpdates({{DominatorTree::Delete, StartBlock, EndBlock},
+                       {DominatorTree::Insert, StartBlock, NewBlockT}});
   }
-  StartBlockTerm->eraseFromParent();
-  BranchInst::Create(TT, FT, SI->getCondition(), StartBlock);
-  DTU->applyUpdates({{DominatorTree::Insert, StartBlock, TT},
-                     {DominatorTree::Insert, StartBlock, FT}});
 
   // Preserve loop info
   if (Loop *L = LI->getLoopFor(SI->getParent())) {
@@ -372,6 +389,11 @@ struct ThreadingPath {
   /// Path is a list of basic blocks.
   const PathType &getPath() const { return Path; }
   void setPath(const PathType &NewPath) { Path = NewPath; }
+  void push_back(BasicBlock *BB) { Path.push_back(BB); }
+  void push_front(BasicBlock *BB) { Path.push_front(BB); }
+  void appendExcludingFirst(const PathType &OtherPath) {
+    Path.insert(Path.end(), OtherPath.begin() + 1, OtherPath.end());
+  }
 
   void print(raw_ostream &OS) const {
     OS << Path << " [ " << ExitVal << ", " << DBB->getName() << " ]";
@@ -530,9 +552,9 @@ struct MainSwitch {
 
 struct AllSwitchPaths {
   AllSwitchPaths(const MainSwitch *MSwitch, OptimizationRemarkEmitter *ORE,
-                 LoopInfo *LI)
+                 LoopInfo *LI, Loop *L)
       : Switch(MSwitch->getInstr()), SwitchBlock(Switch->getParent()), ORE(ORE),
-        LI(LI) {}
+        LI(LI), SwitchOuterLoop(L) {}
 
   std::vector<ThreadingPath> &getThreadingPaths() { return TPaths; }
   unsigned getNumThreadingPaths() { return TPaths.size(); }
@@ -540,10 +562,7 @@ struct AllSwitchPaths {
   BasicBlock *getSwitchBlock() { return SwitchBlock; }
 
   void run() {
-    VisitedBlocks Visited;
-    PathsType LoopPaths = paths(SwitchBlock, Visited, /* PathDepth = */ 1);
-    StateDefMap StateDef = getStateDefMap(LoopPaths);
-
+    StateDefMap StateDef = getStateDefMap();
     if (StateDef.empty()) {
       ORE->emit([&]() {
         return OptimizationRemarkMissed(DEBUG_TYPE, "SwitchNotPredictable",
@@ -553,42 +572,118 @@ struct AllSwitchPaths {
       return;
     }
 
-    for (const PathType &Path : LoopPaths) {
-      ThreadingPath TPath;
-
-      const BasicBlock *PrevBB = Path.back();
-      for (const BasicBlock *BB : Path) {
-        if (StateDef.contains(BB)) {
-          const PHINode *Phi = dyn_cast<PHINode>(StateDef[BB]);
-          assert(Phi && "Expected a state-defining instr to be a phi node.");
-
-          const Value *V = Phi->getIncomingValueForBlock(PrevBB);
-          if (const ConstantInt *C = dyn_cast<const ConstantInt>(V)) {
-            TPath.setExitValue(C);
-            TPath.setDeterminator(BB);
-            TPath.setPath(Path);
-          }
-        }
+    auto *SwitchPhi = cast<PHINode>(Switch->getOperand(0));
+    auto *SwitchPhiDefBB = SwitchPhi->getParent();
+    VisitedBlocks VB;
+    // Get paths from the determinator BBs to SwitchPhiDefBB
+    std::vector<ThreadingPath> PathsToPhiDef =
+        getPathsFromStateDefMap(StateDef, SwitchPhi, VB);
+    if (SwitchPhiDefBB == SwitchBlock) {
+      TPaths = std::move(PathsToPhiDef);
+      return;
+    }
 
-        // Switch block is the determinator, this is the final exit value.
-        if (TPath.isExitValueSet() && BB == Path.front())
-          break;
+    // Find and append paths from SwitchPhiDefBB to SwitchBlock.
+    PathsType PathsToSwitchBB =
+        paths(SwitchPhiDefBB, SwitchBlock, VB, /* PathDepth = */ 1);
+    if (PathsToSwitchBB.empty())
+      return;
 
-        PrevBB = BB;
+    std::vector<ThreadingPath> TempList;
+    for (const ThreadingPath &Path : PathsToPhiDef) {
+      for (const PathType &PathToSw : PathsToSwitchBB) {
+        ThreadingPath PathCopy(Path);
+        PathCopy.appendExcludingFirst(PathToSw);
+        TempList.push_back(PathCopy);
       }
-
-      if (TPath.isExitValueSet() && isSupported(TPath))
-        TPaths.push_back(TPath);
     }
+    TPaths = std::move(TempList);
   }
 
 private:
   // Value: an instruction that defines a switch state;
   // Key: the parent basic block of that instruction.
   typedef DenseMap<const BasicBlock *, const PHINode *> StateDefMap;
+  std::vector<ThreadingPath> getPathsFromStateDefMap(StateDefMap &StateDef,
+                                                     PHINode *Phi,
+                                                     VisitedBlocks &VB) {
+    std::vector<ThreadingPath> Res;
+    auto *PhiBB = Phi->getParent();
+    VB.insert(PhiBB);
+
+    VisitedBlocks UniqueBlocks;
+    for (auto *IncomingBB : Phi->blocks()) {
+      if (!UniqueBlocks.insert(IncomingBB).second)
+        continue;
+      if (!SwitchOuterLoop->contains(IncomingBB))
+        continue;
+
+      Value *IncomingValue = Phi->getIncomingValueForBlock(IncomingBB);
+      // We found the determinator. This is the start of our path.
+      if (auto *C = dyn_cast<ConstantInt>(IncomingValue)) {
+        // SwitchBlock is the determinator, unsupported unless it's the def too.
+        if (PhiBB == SwitchBlock &&
+            SwitchBlock != cast<PHINode>(Switch->getOperand(0))->getParent())
+          continue;
+        ThreadingPath NewPath;
+        NewPath.setDeterminator(PhiBB);
+        NewPath.setExitValue(C);
+        // Don't add SwitchBlock at the start, this is handled later.
+        if (IncomingBB != SwitchBlock)
+          NewPath.push_back(IncomingBB);
+        NewPath.push_back(PhiBB);
+        Res.push_back(NewPath);
+        continue;
+      }
+      // Don't get into a cycle.
+      if (VB.contains(IncomingBB) || IncomingBB == SwitchBlock)
+        continue;
+      // Recurse up the PHI chain.
+      auto *IncomingPhi = dyn_cast<PHINode>(IncomingValue);
+      if (!IncomingPhi)
+        continue;
+      auto *IncomingPhiDefBB = IncomingPhi->getParent();
+      if (!StateDef.contains(IncomingPhiDefBB))
+        continue;
+
+      // Direct predecessor, just add to the path.
+      if (IncomingPhiDefBB == IncomingBB) {
+        std::vector<ThreadingPath> PredPaths =
+            getPathsFromStateDefMap(StateDef, IncomingPhi, VB);
+        for (ThreadingPath &Path : PredPaths) {
+          Path.push_back(PhiBB);
+          Res.push_back(std::move(Path));
+        }
+        continue;
+      }
+      // Not a direct predecessor, find intermediate paths to append to the
+      // existing path.
+      if (VB.contains(IncomingPhiDefBB))
+        continue;
 
-  PathsType paths(BasicBlock *BB, VisitedBlocks &Visited,
-                  unsigned PathDepth) const {
+      PathsType IntermediatePaths;
+      IntermediatePaths =
+          paths(IncomingPhiDefBB, IncomingBB, VB, /* PathDepth = */ 1);
+      if (IntermediatePaths.empty())
+        continue;
+
+      std::vector<ThreadingPath> PredPaths =
+          getPathsFromStateDefMap(StateDef, IncomingPhi, VB);
+      for (const ThreadingPath &Path : PredPaths) {
+        for (const PathType &IPath : IntermediatePaths) {
+          ThreadingPath NewPath(Path);
+          NewPath.appendExcludingFirst(IPath);
+          NewPath.push_back(PhiBB);
+          Res.push_back(NewPath);
+        }
+      }
+    }
+    VB.erase(PhiBB);
+    return Res;
+  }
+
+  PathsType paths(BasicBlock *BB, BasicBlock *ToBB, VisitedBlocks &Visited,
+                  unsigned PathDepth) {
     PathsType Res;
 
     // Stop exploring paths after visiting MaxPathLength blocks
@@ -603,11 +698,12 @@ struct AllSwitchPaths {
     }
 
     Visited.insert(BB);
+    if (++NumVisited > MaxNumVisitiedPaths)
+      return Res;
 
     // Stop if we have reached the BB out of loop, since its successors have no
     // impact on the DFA.
-    // TODO: Do we need to stop exploring if BB is the outer loop of the switch?
-    if (!LI->getLoopFor(BB))
+    if (!SwitchOuterLoop->contains(BB))
       return Res;
 
     // Some blocks have multiple edges to the same successor, and this set
@@ -617,9 +713,12 @@ struct AllSwitchPaths {
       if (!Successors.insert(Succ).second)
         continue;
 
-      // Found a cycle through the SwitchBlock
-      if (Succ == SwitchBlock) {
-        Res.push_back({BB});
+      // Found a cycle through the final block.
+      if (Succ == ToBB) {
+        PathType NewPath;
+        NewPath.push_back(BB);
+        NewPath.push_back(ToBB);
+        Res.push_back(NewPath);
         continue;
       }
 
@@ -627,11 +726,19 @@ struct AllSwitchPaths {
       if (Visited.contains(Succ))
         continue;
 
-      PathsType SuccPaths = paths(Succ, Visited, PathDepth + 1);
-      for (const PathType &Path : SuccPaths) {
-        PathType NewPath(Path);
-        NewPath.push_front(BB);
-        Res.push_back(NewPath);
+      auto *CurrLoop = LI->getLoopFor(BB);
+      // Unlikely to be beneficial.
+      if (Succ == CurrLoop->getHeader())
+        continue;
+      // Skip for now, revisit this condition later to see the impact on
+      // coverage and compile time.
+      if (LI->getLoopFor(Succ) != CurrLoop)
+        continue;
+
+      PathsType SuccPaths = paths(Succ, ToBB, Visited, PathDepth + 1);
+      for (PathType &Path : SuccPaths) {
+        Path.push_front(BB);
+        Res.push_back(Path);
         if (Res.size() >= MaxNumPaths) {
           return Res;
         }
@@ -648,18 +755,9 @@ struct AllSwitchPaths {
   ///
   /// Return an empty map if unpredictable values encountered inside the basic
   /// blocks of \p LoopPaths.
-  StateDefMap getStateDefMap(const PathsType &LoopPaths) const {
+  StateDefMap getStateDefMap() const {
     StateDefMap Res;
-
-    // Basic blocks belonging to any of the loops around the switch statement.
-    SmallPtrSet<BasicBlock *, 16> LoopBBs;
-    for (const PathType &Path : LoopPaths) {
-      for (BasicBlock *BB : Path)
-        LoopBBs.insert(BB);
-    }
-
     Value *FirstDef = Switch->getOperand(0);
-
     assert(isa<PHINode>(FirstDef) && "The first definition must be a phi.");
 
     SmallVector<PHINode *, 8> Stack;
@@ -674,7 +772,7 @@ struct AllSwitchPaths {
 
       for (BasicBlock *IncomingBB : CurPhi->blocks()) {
         Value *Incoming = CurPhi->getIncomingValueForBlock(IncomingBB);
-        bool IsOutsideLoops = LoopBBs.count(IncomingBB) == 0;
+        bool IsOutsideLoops = !SwitchOuterLoop->contains(IncomingBB);
         if (Incoming == FirstDef || isa<ConstantInt>(Incoming) ||
             SeenValues.contains(Incoming) || IsOutsideLoops) {
           continue;
@@ -691,67 +789,13 @@ struct AllSwitchPaths {
     return Res;
   }
 
-  /// The determinator BB should precede the switch-defining BB.
-  ///
-  /// Otherwise, it is possible that the state defined in the determinator block
-  /// defines the state for the next iteration of the loop, rather than for the
-  /// current one.
-  ///
-  /// Currently supported paths:
-  /// \code
-  /// < switch bb1 determ def > [ 42, determ ]
-  /// < switch_and_def bb1 determ > [ 42, determ ]
-  /// < switch_and_def_and_determ bb1 > [ 42, switch_and_def_and_determ ]
-  /// \endcode
-  ///
-  /// Unsupported paths:
-  /// \code
-  /// < switch bb1 def determ > [ 43, determ ]
-  /// < switch_and_determ bb1 def > [ 43, switch_and_determ ]
-  /// \endcode
-  bool isSupported(const ThreadingPath &TPath) {
-    Instruction *SwitchCondI = dyn_cast<Instruction>(Switch->getCondition());
-    assert(SwitchCondI);
-    if (!SwitchCondI)
-      return false;
-
-    const BasicBlock *SwitchCondDefBB = SwitchCondI->getParent();
-    const BasicBlock *SwitchCondUseBB = Switch->getParent();
-    const BasicBlock *DeterminatorBB = TPath.getDeterminatorBB();
-
-    assert(
-        SwitchCondUseBB == TPath.getPath().front() &&
-        "The first BB in a threading path should have the switch instruction");
-    if (SwitchCondUseBB != TPath.getPath().front())
-      return false;
-
-    // Make DeterminatorBB the first element in Path.
-    PathType Path = TPath.getPath();
-    auto ItDet = llvm::find(Path, DeterminatorBB);
-    std::rotate(Path.begin(), ItDet, Path.end());
-
-    bool IsDetBBSeen = false;
-    bool IsDefBBSeen = false;
-    bool IsUseBBSeen = false;
-    for (BasicBlock *BB : Path) {
-      if (BB == DeterminatorBB)
-        IsDetBBSeen = true;
-      if (BB == SwitchCondDefBB)
-        IsDefBBSeen = true;
-      if (BB == SwitchCondUseBB)
-        IsUseBBSeen = true;
-      if (IsDetBBSeen && IsUseBBSeen && !IsDefBBSeen)
-        return false;
-    }
-
-    return true;
-  }
-
+  unsigned NumVisited = 0;
   SwitchInst *Switch;
   BasicBlock *SwitchBlock;
   OptimizationRemarkEmitter *ORE;
   std::vector<ThreadingPath> TPaths;
   LoopInfo *LI;
+  Loop *SwitchOuterLoop;
 };
 
 struct TransformDFA {
@@ -905,9 +949,9 @@ struct TransformDFA {
     BasicBlock *SwitchBlock = SwitchPaths->getSwitchBlock();
     for (ThreadingPath &TPath : SwitchPaths->getThreadingPaths()) {
       LLVM_DEBUG(dbgs() << TPath << "\n");
-      PathType NewPath(TPath.getPath());
-      NewPath.push_back(SwitchBlock);
-      TPath.setPath(NewPath);
+      // TODO: Fix exit path creation logic so that we don't need this
+      // placeholder.
+      TPath.push_front(SwitchBlock);
     }
 
     // Transform the ThreadingPaths and keep track of the cloned values
@@ -1310,8 +1354,11 @@ bool DFAJumpThreading::run(Function &F) {
                       << " is a candidate\n");
     MainSwitch Switch(SI, LI, ORE);
 
-    if (!Switch.getInstr())
+    if (!Switch.getInstr()) {
+      LLVM_DEBUG(dbgs() << "\nSwitchInst in BB " << BB.getName() << " is not a "
+                        << "candidate for jump threading\n");
       continue;
+    }
 
     LLVM_DEBUG(dbgs() << "\nSwitchInst in BB " << BB.getName() << " is a "
                       << "candidate for jump threading\n");
@@ -1321,7 +1368,8 @@ bool DFAJumpThreading::run(Function &F) {
     if (!Switch.getSelectInsts().empty())
       MadeChanges = true;
 
-    AllSwitchPaths SwitchPaths(&Switch, ORE, LI);
+    AllSwitchPaths SwitchPaths(&Switch, ORE, LI,
+                               LI->getLoopFor(&BB)->getOutermostLoop());
     SwitchPaths.run();
 
     if (SwitchPaths.getNumThreadingPaths() > 0) {
diff --git a/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-analysis.ll b/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-analysis.ll
index 673e2d96693d2..e7b7dffa516c6 100644
--- a/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-analysis.ll
+++ b/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-analysis.ll
@@ -6,10 +6,10 @@
 ; state, and the block that determines the next state.
 ; < path of BBs that form a cycle > [ state, determinator ]
 define i32 @test1(i32 %num) {
-; CHECK: < for.body for.inc > [ 1, for.inc ]
-; CHECK-NEXT: < for.body case1 for.inc > [ 2, for.inc ]
-; CHECK-NEXT: < for.body case2 for.inc > [ 1, for.inc ]
-; CHECK-NEXT: < for.body case2 si.unfold.false for.inc > [ 2, for.inc ]
+; CHECK: < case2 for.inc for.body > [ 1, for.inc ]
+; CHECK-NEXT: < for.inc for.body > [ 1, for.inc ]
+; CHECK-NEXT: < case1 for.inc for.body > [ 2, for.inc ]
+; CHECK-NEXT: < case2 sel.si.unfold.false for.inc for.body > [ 2, sel.si.unfold.false ]
 entry:
   br label %for.body
 
@@ -43,16 +43,12 @@ for.end:
 ; complicated CFG. Here the FSM is represented as a nested loop, with
 ; fallthrough cases.
 define i32 @test2(i32 %init) {
-; CHECK: < loop.3 case2 > [ 3, loop.3 ]
-; CHECK-NEXT: < loop.3 case2 loop.1.backedge loop.1 loop.2 > [ 1, loop.1 ]
-; CHECK-NEXT: < loop.3 case2 loop.1.backedge si.unfold.false loop.1 loop.2 > [ 4, loop.1.backedge ]
-; CHECK-NEXT: < loop.3 case3 loop.2.backedge loop.2 > [ 0, loop.2.backedge ]
-; CHECK-NEXT: < loop.3 case3 case4 loop.2.backedge loop.2 > [ 3, loop.2.backedge ]
-; CHECK-NEXT: < loop.3 case3 case4 loop.1.backedge loop.1 loop.2 > [ 1, loop.1 ]
-; CHECK-NEXT: < loop.3 case3 case4 loop.1.backedge si.unfold.false loop.1 loop.2 > [ 2, loop.1.backedge ]
-; CHECK-NEXT: < loop.3 case4 loop.2.backedge loop.2 > [ 3, loop.2.backedge ]
-; CHECK-NEXT: < loop.3 case4 loop.1.backedge loop.1 loop.2 > [ 1, loop.1 ]
-; CHECK-NEXT: < loop.3 case4 loop.1.backedge si.unfold.false loop.1 loop.2 > [ 2, loop.1.backedge ]
+; CHECK: < loop.1.backedge loop.1 loop.2 loop.3 > [ 1, loop.1 ]
+; CHECK-NEXT: < case4 loop.1.backedge state.1.be2.si.unfold.false loop.1 loop.2 loop.3 > [ 2, loop.1.backedge ]
+; CHECK-NEXT: < case2 loop.1.backedge state.1.be2.si.unfold.false loop.1 loop.2 loop.3 > [ 4, loop.1.backedge ]
+; CHECK-NEXT: < case4 loop.2.backedge loop.2 loop.3 > [ 3, loop.2.backedge ]
+; CHECK-NEXT: < case3 loop.2.backedge loop.2 loop.3 > [ 0, loop.2.backedge ]
+; CHECK-NEXT: < case2 loop.3 > [ 3, loop.3 ]
 entry:
   %cmp = icmp eq i32 %init, 0
   %sel = select i1 %cmp, i32 0, i32 2
@@ -117,8 +113,8 @@ declare void @baz()
 ; current one.
 define i32 @wrong_bb_order() {
 ; CHECK-LABEL: DFA Jump threading: wrong_bb_order
-; CHECK-NOT: < bb43 bb59 bb3 bb31 bb41 > [ 77, bb43 ]
-; CHECK-NOT: < bb43 bb49 bb59 bb3 bb31 bb41 > [ 77, bb43 ]
+; CHECK-NOT: [ 77, bb43 ]
+; CHECK-NOT: [ 77, bb43 ]
 bb:
   %i = alloca [420 x i8], align 1
   %i2 = getelementptr inbounds [420 x i8], ptr %i, i64 0, i64 390
@@ -187,16 +183,12 @@ bb66:                                             ; preds = %bb59
 
 ; Value %init is not predictable but it's okay since it is the value initial to the switch.
 define i32 @initial.value.positive1(i32 %init) {
-; CHECK: < loop.3 case2 > [ 3, loop.3 ]
-; CHECK-NEXT: < loop.3 case2 loop.1.backedge loop.1 loop.2 > [ 1, loop.1 ]
-; CHECK-NEXT: < loop.3 case2 loop.1.backedge si.unfold.false loop.1 loop.2 > [ 4, loop.1.backedge ]
-; CHECK-NEXT: < loop.3 case3 loop.2.backedge loop.2 > [ 0, loop.2.backedge ]
-; CHECK-NEXT: < loop.3 case3 case4 loop.2.backedge loop.2 > [ 3, loop.2.backedge ]
-; CHECK-NEXT: < loop.3 case3 case4 loop.1.backedge loop.1 loop.2 > [ 1, loop.1 ]
-; CHECK-NEXT: < loop.3 case3 case4 loop.1.backedge si.unfold.false loop.1 loop.2 > [ 2, loop.1.backedge ]
-; CHECK-NEXT: < loop.3 case4 loop.2.backedge loop.2 > [ 3, loop.2.backedge ]
-; CHECK-NEXT: < loop.3 case4 loop.1.backedge loop.1 loop.2 > [ 1, loop.1 ]
-; CHECK-NEXT: < loop.3 case4 loop.1.backedge si.unfold.false loop.1 loop.2 > [ 2, loop.1.backedge ]
+; CHECK: < loop.1.backedge loop.1 loop.2 loop.3 > [ 1, loop.1 ]
+; CHECK-NEXT: < case4 loop.1.backedge state.1.be2.si.unfold.false loop.1 loop.2 loop.3 > [ 2, loop.1.backedge ]
+; CHECK-NEXT: < case2 loop.1.backedge state.1.be2.si.unfold.false loop.1 loop.2 loop.3 > [ 4, loop.1.backedge ]
+; CHECK-NEXT: < case4 loop.2.backedge loop.2 loop.3 > [ 3, loop.2.backedge ]
+; CHECK-NEXT: < case3 loop.2.backedge loop.2 loop.3 > [ 0, loop.2.backedge ]
+; CHECK-NEXT: < case2 loop.3 > [ 3, loop.3 ]
 entry:
   %cmp = icmp eq i32 %init, 0
   br label %loop.1
diff --git a/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll b/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
index 72a37802d888f..c38f81d0f046e 100644
--- a/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
+++ b/llvm/test/Transforms/DFAJumpThreading/dfa-jump-threading-transform.ll
@@ -10,7 +10,7 @@ define i32 @test1(i32 %num) {
 ; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
 ; CHECK:       for.body:
 ; CHECK-NEXT:    [[COUNT:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ]
-; CHECK-NEXT:    [[STATE:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ poison, [[FOR_INC]] ]
+; CHECK-NEXT:    [[STATE:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ [[STATE_NEXT:%.*]], [[FOR_INC]] ]
 ; CHECK-NEXT:    switch i32 [[STATE]], label [[FOR_INC_JT1:%.*]] [
 ; CHECK-NEXT:      i32 1, label [[CASE1:%.*]]
 ; CHECK-NEXT:      i32 2, label [[CASE2:%.*]]
@@ -29,21 +29,25 @@ define i32 @test1(i32 %num) {
 ; CHECK:       case2:
 ; CHECK-NEXT:    [[COUNT1:%.*]] = phi i32 [ [[COUNT_JT2]], [[FOR_BODY_JT2:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[COUNT1]], 50
-; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_INC_JT1]], label [[SI_UNFOLD_FALSE:%.*]]
-; CHECK:       si.unfold.false:
+; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_INC_JT1]], label [[SEL_SI_UNFOLD_FALSE_JT2:%.*]]
+; CHECK:       sel.si.unfold.false:
+; CHECK-NEXT:    br label [[FOR_INC]]
+; CHECK:       sel.si.unfold.false.jt2:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI_JT2:%.*]] = phi i32 [ 2, [[CASE2]] ]
 ; CHECK-NEXT:    br label [[FOR_INC_JT2]]
 ; CHECK:       for.inc:
+; CHECK-NEXT:    [[STATE_NEXT]] = phi i32 [ poison, [[SEL_SI_UNFOLD_FALSE:%.*]] ]
 ; CHECK-NEXT:    [[INC]] = add nsw i32 undef, 1
 ; CHECK-NEXT:    [[CMP_EXIT:%.*]] = icmp slt i32 [[INC]], [[NUM:%.*]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK:       for.inc.jt2:
-; CHECK-NEXT:    [[COUNT4:%.*]] = phi i32 [ [[COUNT1]], [[SI_UNFOLD_FALSE]] ], [ [[COUNT2]], [[CASE1]] ]
-; CHECK-NEXT:    [[STATE_NEXT_JT2]] = phi i32 [ 2, [[CASE1]] ], [ 2, [[SI_UNFOLD_FALSE]] ]
+; CHECK-NEXT:    [[COUNT4:%.*]] = phi i32 [ [[COUNT1]], [[SEL_SI_UNFOLD_FALSE_JT2]] ], [ [[COUNT2]], [[CASE1]] ]
+; CHECK-NEXT:    [[STATE_NEXT_JT2]] = phi i32 [ 2, [[CASE1]] ], [ [[DOTSI_UNFOLD_PHI_JT2]], [[SEL_SI_UNFOLD_FALSE_JT2]] ]
 ; CHECK-NEXT:    [[INC_JT2]] = add nsw i32 [[COUNT4]], 1
 ; CHECK-NEXT:    [[CMP_EXIT_JT2:%.*]] = icmp slt i32 [[INC_JT2]], [[NUM]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT_JT2]], label [[FOR_BODY_JT2]], label [[FOR_END]]
 ; CHECK:       for.inc.jt1:
-; CHECK-NEXT:    [[COUNT3:%.*]] = phi i32 [ [[COUNT1]], [[CASE2]] ], [ [[COUNT]], [[FOR_BODY]] ]
+; CHECK-NEXT:    [[COUNT3:%.*]] = phi i32 [ [[COUNT]], [[FOR_BODY]] ], [ [[COUNT1]], [[CASE2]] ]
 ; CHECK-NEXT:    [[STATE_NEXT_JT1]] = phi i32 [ 1, [[CASE2]] ], [ 1, [[FOR_BODY]] ]
 ; CHECK-NEXT:    [[INC_JT1]] = add nsw i32 [[COUNT3]], 1
 ; CHECK-NEXT:    [[CMP_EXIT_JT1:%.*]] = icmp slt i32 [[INC_JT1]], [[NUM]]
@@ -85,42 +89,46 @@ define i32 @test2(i32 %init) {
 ; CHECK-LABEL: @test2(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[INIT:%.*]], 0
-; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP_1:%.*]], label [[SI_UNFOLD_FALSE1:%.*]]
-; CHECK:       si.unfold.false:
+; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP_1:%.*]], label [[SEL_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       state.1.be2.si.unfold.false:
+; CHECK-NEXT:    [[STATE_1_BE_SI_UNFOLD_PHI:%.*]] = phi i32 [ poison, [[LOOP_1_BACKEDGE:%.*]] ]
 ; CHECK-NEXT:    br label [[LOOP_1]]
-; CHECK:       si.unfold.false.jt2:
-; CHECK-NEXT:    br label [[LOOP_1_JT2:%.*]]
-; CHECK:       si.unfold.false.jt4:
+; CHECK:       state.1.be2.si.unfold.false.jt4:
+; CHECK-NEXT:    [[STATE_1_BE_SI_UNFOLD_PHI_JT4:%.*]] = phi i32 [ [[STATE_1_BE_JT4:%.*]], [[LOOP_1_BACKEDGE_JT4:%.*]] ]
 ; CHECK-NEXT:    br label [[LOOP_1_JT4:%.*]]
-; CHECK:       si.unfold.false1:
+; CHECK:       state.1.be2.si.unfold.false.jt2:
+; CHECK-NEXT:    [[STATE_1_BE_SI_UNFOLD_PHI_JT2:%.*]] = phi i32 [ [[STATE_1_BE_JT2:%.*]], [[LOOP_1_BACKEDGE_JT2:%.*]] ]
+; CHECK-NEXT:    br label [[LOOP_1_JT2:%.*]]
+; CHECK:       sel.si.unfold.false:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI:%.*]] = phi i32 [ 2, [[ENTRY:%.*]] ]
 ; CHECK-NEXT:    br label [[LOOP_1]]
 ; CHECK:       loop.1:
-; CHECK-NEXT:    [[STATE_1:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ undef, [[SI_UNFOLD_FALSE:%.*]] ], [ 2, [[SI_UNFOLD_FALSE1]] ]
+; CHECK-NEXT:    [[STATE_1:%.*]] = phi i32 [ 0, [[ENTRY]] ], [ [[STATE_1_BE_SI_UNFOLD_PHI]], [[STATE_1_BE2_SI_UNFOLD_FALSE:%.*]] ], [ [[DOTSI_UNFOLD_PHI]], [[SEL_SI_UNFOLD_FALSE]] ]
 ; CHECK-NEXT:    br label [[LOOP_2:%.*]]
-; CHECK:       loop.1.jt2:
-; CHECK-NEXT:    [[STATE_1_JT2:%.*]] = phi i32 [ [[STATE_1_BE_JT2:%.*]], [[SI_UNFOLD_FALSE_JT2:%.*]] ]
-; CHECK-NEXT:    br label [[LOOP_2_JT2:%.*]]
 ; CHECK:       loop.1.jt4:
-; CHECK-NEXT:    [[STATE_1_JT4:%.*]] = phi i32 [ [[STATE_1_BE_JT4:%.*]], [[SI_UNFOLD_FALSE_JT4:%.*]] ]
+; CHECK-NEXT:    [[STATE_1_JT4:%.*]] = phi i32 [ [[STATE_1_BE_SI_UNFOLD_PHI_JT4]], [[STATE_1_BE2_SI_UNFOLD_FALSE_JT4:%.*]] ]
 ; CHECK-NEXT:    br label [[LOOP_2_JT4:%.*]]
+; CHECK:       loop.1.jt2:
+; CHECK-NEXT:    [[STATE_1_JT2:%.*]] = phi i32 [ [[STATE_1_BE_SI_UNFOLD_PHI_JT2]], [[STATE_1_BE2_SI_UNFOLD_FALSE_JT2:%.*]] ]
+; CHECK-NEXT:    br label [[LOOP_2_JT2:%.*]]
 ; CHECK:       loop.1.jt1:
-; CHECK-NEXT:    [[STATE_1_JT1:%.*]] = phi i32 [ 1, [[LOOP_1_BACKEDGE:%.*]] ], [ 1, [[LOOP_1_BACKEDGE_JT4:%.*]] ], [ 1, [[LOOP_1_BACKEDGE_JT2:%.*]] ]
+; CHECK-NEXT:    [[STATE_1_JT1:%.*]] = phi i32 [ 1, [[LOOP_1_BACKEDGE]] ], [ 1, [[LOOP_1_BACKEDGE_JT2]] ], [ 1, [[LOOP_1_BACKEDGE_JT4]] ]
 ; CHECK-NEXT:    br label [[LOOP_2_JT1:%.*]]
 ; CHECK:       loop.2:
 ; CHECK-NEXT:    [[STATE_2:%.*]] = phi i32 [ [[STATE_1]], [[LOOP_1]] ], [ poison, [[LOOP_2_BACKEDGE:%.*]] ]
 ; CHECK-NEXT:    br label [[LOOP_3:%.*]]
-; CHECK:       loop.2.jt2:
-; CHECK-NEXT:    [[STATE_2_JT2:%.*]] = phi i32 [ [[STATE_1_JT2]], [[LOOP_1_JT2]] ]
-; CHECK-NEXT:    br label [[LOOP_3_JT2:%.*]]
-; CHECK:       loop.2.jt3:
-; CHECK-NEXT:    [[STATE_2_JT3:%.*]] = phi i32 [ [[STATE_2_BE_JT3:%.*]], [[LOOP_2_BACKEDGE_JT3:%.*]] ]
-; CHECK-NEXT:    br label [[LOOP_3_JT3:%.*]]
 ; CHECK:       loop.2.jt0:
 ; CHECK-NEXT:    [[STATE_2_JT0:%.*]] = phi i32 [ [[STATE_2_BE_JT0:%.*]], [[LOOP_2_BACKEDGE_JT0:%.*]] ]
 ; CHECK-NEXT:    br label [[LOOP_3_JT0:%.*]]
+; CHECK:       loop.2.jt3:
+; CHECK-NEXT:    [[STATE_2_JT3:%.*]] = phi i32 [ [[STATE_2_BE_JT3:%.*]], [[LOOP_2_BACKEDGE_JT3:%.*]] ]
+; CHECK-NEXT:    br label [[LOOP_3_JT3:%.*]]
 ; CHECK:       loop.2.jt4:
 ; CHECK-NEXT:    [[STATE_2_JT4:%.*]] = phi i32 [ [[STATE_1_JT4]], [[LOOP_1_JT4]] ]
 ; CHECK-NEXT:    br label [[LOOP_3_JT4:%.*]]
+; CHECK:       loop.2.jt2:
+; CHECK-NEXT:    [[STATE_2_JT2:%.*]] = phi i32 [ [[STATE_1_JT2]], [[LOOP_1_JT2]] ]
+; CHECK-NEXT:    br label [[LOOP_3_JT2:%.*]]
 ; CHECK:       loop.2.jt1:
 ; CHECK-NEXT:    [[STATE_2_JT1:%.*]] = phi i32 [ [[STATE_1_JT1]], [[LOOP_1_JT1:%.*]] ]
 ; CHECK-NEXT:    br label [[LOOP_3_JT1:%.*]]
@@ -133,21 +141,21 @@ define i32 @test2(i32 %init) {
 ; CHECK-NEXT:      i32 0, label [[CASE0:%.*]]
 ; CHECK-NEXT:      i32 1, label [[CASE1:%.*]]
 ; CHECK-NEXT:    ]
-; CHECK:       loop.3.jt2:
-; CHECK-NEXT:    [[STATE_JT2:%.*]] = phi i32 [ [[STATE_2_JT2]], [[LOOP_2_JT2]] ]
-; CHECK-NEXT:    br label [[CASE2]]
 ; CHECK:       loop.3.jt0:
 ; CHECK-NEXT:    [[STATE_JT0:%.*]] = phi i32 [ [[STATE_2_JT0]], [[LOOP_2_JT0:%.*]] ]
 ; CHECK-NEXT:    br label [[CASE0]]
+; CHECK:       loop.3.jt3:
+; CHECK-NEXT:    [[STATE_JT3:%.*]] = phi i32 [ 3, [[CASE2]] ], [ [[STATE_2_JT3]], [[LOOP_2_JT3:%.*]] ]
+; CHECK-NEXT:    br label [[CASE3]]
 ; CHECK:       loop.3.jt4:
 ; CHECK-NEXT:    [[STATE_JT4:%.*]] = phi i32 [ [[STATE_2_JT4]], [[LOOP_2_JT4]] ]
 ; CHECK-NEXT:    br label [[CASE4]]
+; CHECK:       loop.3.jt2:
+; CHECK-NEXT:    [[STATE_JT2:%.*]] = phi i32 [ [[STATE_2_JT2]], [[LOOP_2_JT2]] ]
+; CHECK-NEXT:    br label [[CASE2]]
 ; CHECK:       loop.3.jt1:
 ; CHECK-NEXT:    [[STATE_JT1:%.*]] = phi i32 [ [[STATE_2_JT1]], [[LOOP_2_JT1]] ]
 ; CHECK-NEXT:    br label [[CASE1]]
-; CHECK:       loop.3.jt3:
-; CHECK-NEXT:    [[STATE_JT3:%.*]] = phi i32 [ 3, [[CASE2]] ], [ [[STATE_2_JT3]], [[LOOP_2_JT3:%.*]] ]
-; CHECK-NEXT:    br label [[CASE3]]
 ; CHECK:       case2:
 ; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP_3_JT3]], label [[LOOP_1_BACKEDGE_JT4]]
 ; CHECK:       case3:
@@ -155,21 +163,21 @@ define i32 @test2(i32 %init) {
 ; CHECK:       case4:
 ; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP_2_BACKEDGE_JT3]], label [[LOOP_1_BACKEDGE_JT2]]
 ; CHECK:       loop.1.backedge:
-; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP_1_JT1]], label [[SI_UNFOLD_FALSE]]
-; CHECK:       loop.1.backedge.jt2:
-; CHECK-NEXT:    [[STATE_1_BE_JT2]] = phi i32 [ 2, [[CASE4]] ]
-; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP_1_JT1]], label [[SI_UNFOLD_FALSE_JT2]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP_1_JT1]], label [[STATE_1_BE2_SI_UNFOLD_FALSE]]
 ; CHECK:       loop.1.backedge.jt4:
 ; CHECK-NEXT:    [[STATE_1_BE_JT4]] = phi i32 [ 4, [[CASE2]] ]
-; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP_1_JT1]], label [[SI_UNFOLD_FALSE_JT4]]
+; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP_1_JT1]], label [[STATE_1_BE2_SI_UNFOLD_FALSE_JT4]]
+; CHECK:       loop.1.backedge.jt2:
+; CHECK-NEXT:    [[STATE_1_BE_JT2]] = phi i32 [ 2, [[CASE4]] ]
+; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP_1_JT1]], label [[STATE_1_BE2_SI_UNFOLD_FALSE_JT2]]
 ; CHECK:       loop.2.backedge:
 ; CHECK-NEXT:    br label [[LOOP_2]]
-; CHECK:       loop.2.backedge.jt3:
-; CHECK-NEXT:    [[STATE_2_BE_JT3]] = phi i32 [ 3, [[CASE4]] ]
-; CHECK-NEXT:    br label [[LOOP_2_JT3]]
 ; CHECK:       loop.2.backedge.jt0:
 ; CHECK-NEXT:    [[STATE_2_BE_JT0]] = phi i32 [ 0, [[CASE3]] ]
 ; CHECK-NEXT:    br label [[LOOP_2_JT0]]
+; CHECK:       loop.2.backedge.jt3:
+; CHECK-NEXT:    [[STATE_2_BE_JT3]] = phi i32 [ 3, [[CASE4]] ]
+; CHECK-NEXT:    br label [[LOOP_2_JT3]]
 ; CHECK:       case0:
 ; CHECK-NEXT:    br label [[EXIT:%.*]]
 ; CHECK:       case1:
diff --git a/llvm/test/Transforms/DFAJumpThreading/dfa-unfold-select.ll b/llvm/test/Transforms/DFAJumpThreading/dfa-unfold-select.ll
index 696bd55d2dfdd..366446a1cc9e4 100644
--- a/llvm/test/Transforms/DFAJumpThreading/dfa-unfold-select.ll
+++ b/llvm/test/Transforms/DFAJumpThreading/dfa-unfold-select.ll
@@ -13,7 +13,7 @@ define i32 @test1(i32 %num) {
 ; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
 ; CHECK:       for.body:
 ; CHECK-NEXT:    [[COUNT:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ]
-; CHECK-NEXT:    [[STATE:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ poison, [[FOR_INC]] ]
+; CHECK-NEXT:    [[STATE:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ [[STATE_NEXT:%.*]], [[FOR_INC]] ]
 ; CHECK-NEXT:    switch i32 [[STATE]], label [[FOR_INC_JT1:%.*]] [
 ; CHECK-NEXT:      i32 1, label [[CASE1:%.*]]
 ; CHECK-NEXT:      i32 2, label [[CASE2:%.*]]
@@ -32,21 +32,25 @@ define i32 @test1(i32 %num) {
 ; CHECK:       case2:
 ; CHECK-NEXT:    [[COUNT1:%.*]] = phi i32 [ [[COUNT_JT2]], [[FOR_BODY_JT2:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i32 [[COUNT1]], 50
-; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_INC_JT1]], label [[SI_UNFOLD_FALSE:%.*]]
-; CHECK:       si.unfold.false:
+; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_INC_JT1]], label [[SEL_SI_UNFOLD_FALSE_JT2:%.*]]
+; CHECK:       sel.si.unfold.false:
+; CHECK-NEXT:    br label [[FOR_INC]]
+; CHECK:       sel.si.unfold.false.jt2:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI_JT2:%.*]] = phi i32 [ 2, [[CASE2]] ]
 ; CHECK-NEXT:    br label [[FOR_INC_JT2]]
 ; CHECK:       for.inc:
+; CHECK-NEXT:    [[STATE_NEXT]] = phi i32 [ poison, [[SEL_SI_UNFOLD_FALSE:%.*]] ]
 ; CHECK-NEXT:    [[INC]] = add nsw i32 undef, 1
 ; CHECK-NEXT:    [[CMP_EXIT:%.*]] = icmp slt i32 [[INC]], [[NUM:%.*]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK:       for.inc.jt2:
-; CHECK-NEXT:    [[COUNT4:%.*]] = phi i32 [ [[COUNT1]], [[SI_UNFOLD_FALSE]] ], [ [[COUNT2]], [[CASE1]] ]
-; CHECK-NEXT:    [[STATE_NEXT_JT2]] = phi i32 [ 2, [[CASE1]] ], [ 2, [[SI_UNFOLD_FALSE]] ]
+; CHECK-NEXT:    [[COUNT4:%.*]] = phi i32 [ [[COUNT1]], [[SEL_SI_UNFOLD_FALSE_JT2]] ], [ [[COUNT2]], [[CASE1]] ]
+; CHECK-NEXT:    [[STATE_NEXT_JT2]] = phi i32 [ 2, [[CASE1]] ], [ [[DOTSI_UNFOLD_PHI_JT2]], [[SEL_SI_UNFOLD_FALSE_JT2]] ]
 ; CHECK-NEXT:    [[INC_JT2]] = add nsw i32 [[COUNT4]], 1
 ; CHECK-NEXT:    [[CMP_EXIT_JT2:%.*]] = icmp slt i32 [[INC_JT2]], [[NUM]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT_JT2]], label [[FOR_BODY_JT2]], label [[FOR_END]]
 ; CHECK:       for.inc.jt1:
-; CHECK-NEXT:    [[COUNT3:%.*]] = phi i32 [ [[COUNT1]], [[CASE2]] ], [ [[COUNT]], [[FOR_BODY]] ]
+; CHECK-NEXT:    [[COUNT3:%.*]] = phi i32 [ [[COUNT]], [[FOR_BODY]] ], [ [[COUNT1]], [[CASE2]] ]
 ; CHECK-NEXT:    [[STATE_NEXT_JT1]] = phi i32 [ 1, [[CASE2]] ], [ 1, [[FOR_BODY]] ]
 ; CHECK-NEXT:    [[INC_JT1]] = add nsw i32 [[COUNT3]], 1
 ; CHECK-NEXT:    [[CMP_EXIT_JT1:%.*]] = icmp slt i32 [[INC_JT1]], [[NUM]]
@@ -89,15 +93,11 @@ define i32 @test2(i32 %num) {
 ; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
 ; CHECK:       for.body:
 ; CHECK-NEXT:    [[COUNT:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ]
-; CHECK-NEXT:    [[STATE:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ poison, [[FOR_INC]] ]
+; CHECK-NEXT:    [[STATE:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ [[STATE_NEXT:%.*]], [[FOR_INC]] ]
 ; CHECK-NEXT:    switch i32 [[STATE]], label [[FOR_INC_JT1:%.*]] [
 ; CHECK-NEXT:      i32 1, label [[CASE1:%.*]]
 ; CHECK-NEXT:      i32 2, label [[CASE2:%.*]]
 ; CHECK-NEXT:    ]
-; CHECK:       for.body.jt3:
-; CHECK-NEXT:    [[COUNT_JT3:%.*]] = phi i32 [ [[INC_JT3:%.*]], [[FOR_INC_JT3:%.*]] ]
-; CHECK-NEXT:    [[STATE_JT3:%.*]] = phi i32 [ [[STATE_NEXT_JT3:%.*]], [[FOR_INC_JT3]] ]
-; CHECK-NEXT:    br label [[FOR_INC_JT1]]
 ; CHECK:       for.body.jt2:
 ; CHECK-NEXT:    [[COUNT_JT2:%.*]] = phi i32 [ [[INC_JT2:%.*]], [[FOR_INC_JT2:%.*]] ]
 ; CHECK-NEXT:    [[STATE_JT2:%.*]] = phi i32 [ [[STATE_NEXT_JT2:%.*]], [[FOR_INC_JT2]] ]
@@ -106,46 +106,78 @@ define i32 @test2(i32 %num) {
 ; CHECK-NEXT:    [[COUNT_JT1:%.*]] = phi i32 [ [[INC_JT1:%.*]], [[FOR_INC_JT1]] ]
 ; CHECK-NEXT:    [[STATE_JT1:%.*]] = phi i32 [ [[STATE_NEXT_JT1:%.*]], [[FOR_INC_JT1]] ]
 ; CHECK-NEXT:    br label [[CASE1]]
+; CHECK:       for.body.jt3:
+; CHECK-NEXT:    [[COUNT_JT3:%.*]] = phi i32 [ [[INC_JT3:%.*]], [[FOR_INC_JT3:%.*]] ]
+; CHECK-NEXT:    [[STATE_JT3:%.*]] = phi i32 [ [[STATE_NEXT_JT3:%.*]], [[FOR_INC_JT3]] ]
+; CHECK-NEXT:    br label [[FOR_INC]]
 ; CHECK:       case1:
-; CHECK-NEXT:    [[COUNT5:%.*]] = phi i32 [ [[COUNT_JT1]], [[FOR_BODY_JT1:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[CMP_C1:%.*]] = icmp slt i32 [[COUNT5]], 50
-; CHECK-NEXT:    [[CMP2_C1:%.*]] = icmp slt i32 [[COUNT5]], 100
-; CHECK-NEXT:    br i1 [[CMP2_C1]], label [[SI_UNFOLD_TRUE:%.*]], label [[FOR_INC_JT3]]
+; CHECK-NEXT:    [[COUNT6:%.*]] = phi i32 [ [[COUNT_JT1]], [[FOR_BODY_JT1:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
+; CHECK-NEXT:    [[CMP_C1:%.*]] = icmp slt i32 [[COUNT6]], 50
+; CHECK-NEXT:    [[CMP2_C1:%.*]] = icmp slt i32 [[COUNT6]], 100
+; CHECK-NEXT:    br i1 [[CMP2_C1]], label [[STATE1_1_SI_UNFOLD_TRUE_JT1:%.*]], label [[STATE1_2_SI_UNFOLD_FALSE_JT3:%.*]]
 ; CHECK:       case2:
-; CHECK-NEXT:    [[COUNT3:%.*]] = phi i32 [ [[COUNT_JT2]], [[FOR_BODY_JT2:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[CMP_C2:%.*]] = icmp slt i32 [[COUNT3]], 50
-; CHECK-NEXT:    [[CMP2_C2:%.*]] = icmp sgt i32 [[COUNT3]], 100
-; CHECK-NEXT:    br i1 [[CMP2_C2]], label [[FOR_INC_JT3]], label [[SI_UNFOLD_FALSE:%.*]]
-; CHECK:       si.unfold.false:
-; CHECK-NEXT:    br i1 [[CMP_C2]], label [[FOR_INC_JT1]], label [[SI_UNFOLD_FALSE1:%.*]]
-; CHECK:       si.unfold.false1:
+; CHECK-NEXT:    [[CMP_C2:%.*]] = icmp slt i32 [[COUNT]], 50
+; CHECK-NEXT:    [[CMP2_C2:%.*]] = icmp sgt i32 [[COUNT]], 100
+; CHECK-NEXT:    br i1 [[CMP2_C2]], label [[FOR_INC_JT3]], label [[STATE2_1_SI_UNFOLD_TRUE_JT1:%.*]]
+; CHECK:       state2.1.si.unfold.true:
+; CHECK-NEXT:    br i1 [[CMP_C2]], label [[STATE2_2_SI_UNFOLD_FALSE:%.*]], label [[STATE2_1_SI_UNFOLD_FALSE_JT2:%.*]]
+; CHECK:       state2.1.si.unfold.true.jt1:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI_JT1:%.*]] = phi i32 [ 1, [[CASE2]] ]
+; CHECK-NEXT:    br i1 [[CMP_C2]], label [[STATE2_2_SI_UNFOLD_FALSE_JT1:%.*]], label [[STATE2_1_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       state2.1.si.unfold.false:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI1:%.*]] = phi i32 [ 2, [[STATE2_1_SI_UNFOLD_TRUE_JT1]] ]
+; CHECK-NEXT:    br label [[STATE2_2_SI_UNFOLD_FALSE]]
+; CHECK:       state2.1.si.unfold.false.jt2:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI1_JT2:%.*]] = phi i32 [ 2, [[STATE2_1_SI_UNFOLD_TRUE:%.*]] ]
+; CHECK-NEXT:    br label [[STATE2_2_SI_UNFOLD_FALSE_JT2:%.*]]
+; CHECK:       state2.2.si.unfold.false:
+; CHECK-NEXT:    [[STATE2_1_SI_UNFOLD_PHI:%.*]] = phi i32 [ poison, [[STATE2_1_SI_UNFOLD_TRUE]] ], [ [[DOTSI_UNFOLD_PHI1]], [[STATE2_1_SI_UNFOLD_FALSE]] ]
+; CHECK-NEXT:    br label [[FOR_INC]]
+; CHECK:       state2.2.si.unfold.false.jt2:
+; CHECK-NEXT:    [[STATE2_1_SI_UNFOLD_PHI_JT2:%.*]] = phi i32 [ [[DOTSI_UNFOLD_PHI1_JT2]], [[STATE2_1_SI_UNFOLD_FALSE_JT2]] ]
 ; CHECK-NEXT:    br label [[FOR_INC_JT2]]
-; CHECK:       si.unfold.true:
-; CHECK-NEXT:    br i1 [[CMP_C1]], label [[FOR_INC_JT1]], label [[SI_UNFOLD_FALSE2:%.*]]
-; CHECK:       si.unfold.false2:
+; CHECK:       state2.2.si.unfold.false.jt1:
+; CHECK-NEXT:    [[STATE2_1_SI_UNFOLD_PHI_JT1:%.*]] = phi i32 [ [[DOTSI_UNFOLD_PHI_JT1]], [[STATE2_1_SI_UNFOLD_TRUE_JT1]] ]
+; CHECK-NEXT:    br label [[FOR_INC_JT1]]
+; CHECK:       state1.2.si.unfold.false:
+; CHECK-NEXT:    br label [[FOR_INC]]
+; CHECK:       state1.2.si.unfold.false.jt3:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI2_JT3:%.*]] = phi i32 [ 3, [[CASE1]] ]
+; CHECK-NEXT:    br label [[FOR_INC_JT3]]
+; CHECK:       state1.1.si.unfold.true:
+; CHECK-NEXT:    br i1 [[CMP_C1]], label [[FOR_INC]], label [[STATE1_1_SI_UNFOLD_FALSE_JT2:%.*]]
+; CHECK:       state1.1.si.unfold.true.jt1:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI3_JT1:%.*]] = phi i32 [ 1, [[CASE1]] ]
+; CHECK-NEXT:    br i1 [[CMP_C1]], label [[FOR_INC_JT1]], label [[STATE1_1_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       state1.1.si.unfold.false:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI4:%.*]] = phi i32 [ 2, [[STATE1_1_SI_UNFOLD_TRUE_JT1]] ]
+; CHECK-NEXT:    br label [[FOR_INC]]
+; CHECK:       state1.1.si.unfold.false.jt2:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI4_JT2:%.*]] = phi i32 [ 2, [[STATE1_1_SI_UNFOLD_TRUE:%.*]] ]
 ; CHECK-NEXT:    br label [[FOR_INC_JT2]]
 ; CHECK:       for.inc:
-; CHECK-NEXT:    [[INC]] = add nsw i32 undef, 1
+; CHECK-NEXT:    [[COUNT5:%.*]] = phi i32 [ [[COUNT_JT3]], [[FOR_BODY_JT3:%.*]] ], [ undef, [[STATE1_1_SI_UNFOLD_TRUE]] ], [ [[COUNT6]], [[STATE1_1_SI_UNFOLD_FALSE]] ], [ undef, [[STATE1_2_SI_UNFOLD_FALSE:%.*]] ], [ [[COUNT]], [[STATE2_2_SI_UNFOLD_FALSE]] ]
+; CHECK-NEXT:    [[STATE_NEXT]] = phi i32 [ [[STATE2_1_SI_UNFOLD_PHI]], [[STATE2_2_SI_UNFOLD_FALSE]] ], [ poison, [[STATE1_2_SI_UNFOLD_FALSE]] ], [ poison, [[STATE1_1_SI_UNFOLD_TRUE]] ], [ [[DOTSI_UNFOLD_PHI4]], [[STATE1_1_SI_UNFOLD_FALSE]] ], [ 1, [[FOR_BODY_JT3]] ]
+; CHECK-NEXT:    [[INC]] = add nsw i32 [[COUNT5]], 1
 ; CHECK-NEXT:    [[CMP_EXIT:%.*]] = icmp slt i32 [[INC]], [[NUM:%.*]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT]], label [[FOR_BODY]], label [[FOR_END:%.*]]
-; CHECK:       for.inc.jt3:
-; CHECK-NEXT:    [[COUNT6:%.*]] = phi i32 [ [[COUNT3]], [[CASE2]] ], [ [[COUNT5]], [[CASE1]] ]
-; CHECK-NEXT:    [[STATE_NEXT_JT3]] = phi i32 [ 3, [[CASE1]] ], [ 3, [[CASE2]] ]
-; CHECK-NEXT:    [[INC_JT3]] = add nsw i32 [[COUNT6]], 1
-; CHECK-NEXT:    [[CMP_EXIT_JT3:%.*]] = icmp slt i32 [[INC_JT3]], [[NUM]]
-; CHECK-NEXT:    br i1 [[CMP_EXIT_JT3]], label [[FOR_BODY_JT3:%.*]], label [[FOR_END]]
 ; CHECK:       for.inc.jt2:
-; CHECK-NEXT:    [[COUNT7:%.*]] = phi i32 [ [[COUNT3]], [[SI_UNFOLD_FALSE1]] ], [ [[COUNT5]], [[SI_UNFOLD_FALSE2]] ]
-; CHECK-NEXT:    [[STATE_NEXT_JT2]] = phi i32 [ 2, [[SI_UNFOLD_FALSE1]] ], [ 2, [[SI_UNFOLD_FALSE2]] ]
-; CHECK-NEXT:    [[INC_JT2]] = add nsw i32 [[COUNT7]], 1
+; CHECK-NEXT:    [[STATE_NEXT_JT2]] = phi i32 [ [[STATE2_1_SI_UNFOLD_PHI_JT2]], [[STATE2_2_SI_UNFOLD_FALSE_JT2]] ], [ [[DOTSI_UNFOLD_PHI4_JT2]], [[STATE1_1_SI_UNFOLD_FALSE_JT2]] ]
+; CHECK-NEXT:    [[INC_JT2]] = add nsw i32 undef, 1
 ; CHECK-NEXT:    [[CMP_EXIT_JT2:%.*]] = icmp slt i32 [[INC_JT2]], [[NUM]]
-; CHECK-NEXT:    br i1 [[CMP_EXIT_JT2]], label [[FOR_BODY_JT2]], label [[FOR_END]]
+; CHECK-NEXT:    br i1 [[CMP_EXIT_JT2]], label [[FOR_BODY_JT2:%.*]], label [[FOR_END]]
 ; CHECK:       for.inc.jt1:
-; CHECK-NEXT:    [[COUNT4:%.*]] = phi i32 [ [[COUNT_JT3]], [[FOR_BODY_JT3]] ], [ [[COUNT3]], [[SI_UNFOLD_FALSE]] ], [ [[COUNT5]], [[SI_UNFOLD_TRUE]] ], [ [[COUNT]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[STATE_NEXT_JT1]] = phi i32 [ 1, [[FOR_BODY]] ], [ 1, [[SI_UNFOLD_FALSE]] ], [ 1, [[SI_UNFOLD_TRUE]] ], [ 1, [[FOR_BODY_JT3]] ]
-; CHECK-NEXT:    [[INC_JT1]] = add nsw i32 [[COUNT4]], 1
+; CHECK-NEXT:    [[COUNT7:%.*]] = phi i32 [ [[COUNT6]], [[STATE1_1_SI_UNFOLD_TRUE_JT1]] ], [ [[COUNT]], [[STATE2_2_SI_UNFOLD_FALSE_JT1]] ], [ [[COUNT]], [[FOR_BODY]] ]
+; CHECK-NEXT:    [[STATE_NEXT_JT1]] = phi i32 [ 1, [[FOR_BODY]] ], [ [[STATE2_1_SI_UNFOLD_PHI_JT1]], [[STATE2_2_SI_UNFOLD_FALSE_JT1]] ], [ [[DOTSI_UNFOLD_PHI3_JT1]], [[STATE1_1_SI_UNFOLD_TRUE_JT1]] ]
+; CHECK-NEXT:    [[INC_JT1]] = add nsw i32 [[COUNT7]], 1
 ; CHECK-NEXT:    [[CMP_EXIT_JT1:%.*]] = icmp slt i32 [[INC_JT1]], [[NUM]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT_JT1]], label [[FOR_BODY_JT1]], label [[FOR_END]]
+; CHECK:       for.inc.jt3:
+; CHECK-NEXT:    [[COUNT8:%.*]] = phi i32 [ [[COUNT6]], [[STATE1_2_SI_UNFOLD_FALSE_JT3]] ], [ [[COUNT]], [[CASE2]] ]
+; CHECK-NEXT:    [[STATE_NEXT_JT3]] = phi i32 [ 3, [[CASE2]] ], [ [[DOTSI_UNFOLD_PHI2_JT3]], [[STATE1_2_SI_UNFOLD_FALSE_JT3]] ]
+; CHECK-NEXT:    [[INC_JT3]] = add nsw i32 [[COUNT8]], 1
+; CHECK-NEXT:    [[CMP_EXIT_JT3:%.*]] = icmp slt i32 [[INC_JT3]], [[NUM]]
+; CHECK-NEXT:    br i1 [[CMP_EXIT_JT3]], label [[FOR_BODY_JT3]], label [[FOR_END]]
 ; CHECK:       for.end:
 ; CHECK-NEXT:    ret i32 0
 ;
@@ -190,7 +222,7 @@ define i32 @test3(i32 %num) {
 ; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
 ; CHECK:       for.body:
 ; CHECK-NEXT:    [[COUNT:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ]
-; CHECK-NEXT:    [[STATE:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ poison, [[FOR_INC]] ]
+; CHECK-NEXT:    [[STATE:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ [[STATE_NEXT:%.*]], [[FOR_INC]] ]
 ; CHECK-NEXT:    switch i32 [[STATE]], label [[FOR_INC_JT1:%.*]] [
 ; CHECK-NEXT:      i32 1, label [[CASE1:%.*]]
 ; CHECK-NEXT:      i32 2, label [[CASE2:%.*]]
@@ -212,47 +244,70 @@ define i32 @test3(i32 %num) {
 ; CHECK-NEXT:    [[STATE_JT1:%.*]] = phi i32 [ [[STATE_NEXT_JT1:%.*]], [[FOR_INC_JT1]] ]
 ; CHECK-NEXT:    br label [[CASE1]]
 ; CHECK:       case1:
-; CHECK-NEXT:    [[COUNT5:%.*]] = phi i32 [ [[COUNT_JT1]], [[FOR_BODY_JT1:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
+; CHECK-NEXT:    [[COUNT6:%.*]] = phi i32 [ [[COUNT_JT1]], [[FOR_BODY_JT1:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
 ; CHECK-NEXT:    br label [[FOR_INC_JT2]]
 ; CHECK:       case2:
-; CHECK-NEXT:    [[COUNT4:%.*]] = phi i32 [ [[COUNT_JT2]], [[FOR_BODY_JT2:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[CMP_1:%.*]] = icmp slt i32 [[COUNT4]], 50
-; CHECK-NEXT:    [[CMP_2:%.*]] = icmp slt i32 [[COUNT4]], 100
-; CHECK-NEXT:    [[TMP0:%.*]] = and i32 [[COUNT4]], 1
+; CHECK-NEXT:    [[COUNT5:%.*]] = phi i32 [ [[COUNT_JT2]], [[FOR_BODY_JT2:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
+; CHECK-NEXT:    [[CMP_1:%.*]] = icmp slt i32 [[COUNT5]], 50
+; CHECK-NEXT:    [[CMP_2:%.*]] = icmp slt i32 [[COUNT5]], 100
+; CHECK-NEXT:    [[TMP0:%.*]] = and i32 [[COUNT5]], 1
 ; CHECK-NEXT:    [[CMP_3:%.*]] = icmp eq i32 [[TMP0]], 0
-; CHECK-NEXT:    br i1 [[CMP_3]], label [[SI_UNFOLD_TRUE:%.*]], label [[SI_UNFOLD_FALSE:%.*]]
-; CHECK:       si.unfold.true:
-; CHECK-NEXT:    br i1 [[CMP_1]], label [[FOR_INC_JT1]], label [[SI_UNFOLD_FALSE2:%.*]]
-; CHECK:       si.unfold.false:
-; CHECK-NEXT:    br i1 [[CMP_2]], label [[FOR_INC_JT3]], label [[SI_UNFOLD_FALSE1:%.*]]
-; CHECK:       si.unfold.false1:
+; CHECK-NEXT:    br i1 [[CMP_3]], label [[SEL_1_SI_UNFOLD_TRUE_JT1:%.*]], label [[SEL_2_SI_UNFOLD_TRUE_JT3:%.*]]
+; CHECK:       sel.2.si.unfold.true:
+; CHECK-NEXT:    br i1 [[CMP_2]], label [[SEL_3_SI_UNFOLD_FALSE:%.*]], label [[SEL_2_SI_UNFOLD_FALSE_JT4:%.*]]
+; CHECK:       sel.2.si.unfold.true.jt3:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI_JT3:%.*]] = phi i32 [ 3, [[CASE2]] ]
+; CHECK-NEXT:    br i1 [[CMP_2]], label [[SEL_3_SI_UNFOLD_FALSE_JT3:%.*]], label [[SEL_2_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       sel.2.si.unfold.false:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI1:%.*]] = phi i32 [ 4, [[SEL_2_SI_UNFOLD_TRUE_JT3]] ]
+; CHECK-NEXT:    br label [[SEL_3_SI_UNFOLD_FALSE]]
+; CHECK:       sel.2.si.unfold.false.jt4:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI1_JT4:%.*]] = phi i32 [ 4, [[SEL_2_SI_UNFOLD_TRUE:%.*]] ]
+; CHECK-NEXT:    br label [[SEL_3_SI_UNFOLD_FALSE_JT4:%.*]]
+; CHECK:       sel.3.si.unfold.false:
+; CHECK-NEXT:    [[SEL_2_SI_UNFOLD_PHI:%.*]] = phi i32 [ poison, [[SEL_2_SI_UNFOLD_TRUE]] ], [ [[DOTSI_UNFOLD_PHI1]], [[SEL_2_SI_UNFOLD_FALSE]] ]
+; CHECK-NEXT:    br label [[FOR_INC]]
+; CHECK:       sel.3.si.unfold.false.jt4:
+; CHECK-NEXT:    [[SEL_2_SI_UNFOLD_PHI_JT4:%.*]] = phi i32 [ [[DOTSI_UNFOLD_PHI1_JT4]], [[SEL_2_SI_UNFOLD_FALSE_JT4]] ]
 ; CHECK-NEXT:    br label [[FOR_INC_JT4]]
-; CHECK:       si.unfold.false2:
+; CHECK:       sel.3.si.unfold.false.jt3:
+; CHECK-NEXT:    [[SEL_2_SI_UNFOLD_PHI_JT3:%.*]] = phi i32 [ [[DOTSI_UNFOLD_PHI_JT3]], [[SEL_2_SI_UNFOLD_TRUE_JT3]] ]
+; CHECK-NEXT:    br label [[FOR_INC_JT3]]
+; CHECK:       sel.1.si.unfold.true:
+; CHECK-NEXT:    br i1 [[CMP_1]], label [[FOR_INC]], label [[SEL_1_SI_UNFOLD_FALSE_JT2:%.*]]
+; CHECK:       sel.1.si.unfold.true.jt1:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI2_JT1:%.*]] = phi i32 [ 1, [[CASE2]] ]
+; CHECK-NEXT:    br i1 [[CMP_1]], label [[FOR_INC_JT1]], label [[SEL_1_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       sel.1.si.unfold.false:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI3:%.*]] = phi i32 [ 2, [[SEL_1_SI_UNFOLD_TRUE_JT1]] ]
+; CHECK-NEXT:    br label [[FOR_INC]]
+; CHECK:       sel.1.si.unfold.false.jt2:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI3_JT2:%.*]] = phi i32 [ 2, [[SEL_1_SI_UNFOLD_TRUE:%.*]] ]
 ; CHECK-NEXT:    br label [[FOR_INC_JT2]]
 ; CHECK:       for.inc:
-; CHECK-NEXT:    [[INC]] = add nsw i32 undef, 1
+; CHECK-NEXT:    [[STATE_NEXT]] = phi i32 [ [[SEL_2_SI_UNFOLD_PHI]], [[SEL_3_SI_UNFOLD_FALSE]] ], [ poison, [[SEL_1_SI_UNFOLD_TRUE]] ], [ [[DOTSI_UNFOLD_PHI3]], [[SEL_1_SI_UNFOLD_FALSE]] ]
+; CHECK-NEXT:    [[INC]] = add nsw i32 [[COUNT5]], 1
 ; CHECK-NEXT:    [[CMP_EXIT:%.*]] = icmp slt i32 [[INC]], [[NUM:%.*]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT]], label [[FOR_BODY]], label [[FOR_END:%.*]]
 ; CHECK:       for.inc.jt4:
-; CHECK-NEXT:    [[STATE_NEXT_JT4]] = phi i32 [ 4, [[SI_UNFOLD_FALSE1]] ]
-; CHECK-NEXT:    [[INC_JT4]] = add nsw i32 [[COUNT4]], 1
+; CHECK-NEXT:    [[STATE_NEXT_JT4]] = phi i32 [ [[SEL_2_SI_UNFOLD_PHI_JT4]], [[SEL_3_SI_UNFOLD_FALSE_JT4]] ]
+; CHECK-NEXT:    [[INC_JT4]] = add nsw i32 undef, 1
 ; CHECK-NEXT:    [[CMP_EXIT_JT4:%.*]] = icmp slt i32 [[INC_JT4]], [[NUM]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT_JT4]], label [[FOR_BODY_JT4:%.*]], label [[FOR_END]]
 ; CHECK:       for.inc.jt3:
-; CHECK-NEXT:    [[STATE_NEXT_JT3]] = phi i32 [ 3, [[SI_UNFOLD_FALSE]] ]
-; CHECK-NEXT:    [[INC_JT3]] = add nsw i32 [[COUNT4]], 1
+; CHECK-NEXT:    [[STATE_NEXT_JT3]] = phi i32 [ [[SEL_2_SI_UNFOLD_PHI_JT3]], [[SEL_3_SI_UNFOLD_FALSE_JT3]] ]
+; CHECK-NEXT:    [[INC_JT3]] = add nsw i32 [[COUNT5]], 1
 ; CHECK-NEXT:    [[CMP_EXIT_JT3:%.*]] = icmp slt i32 [[INC_JT3]], [[NUM]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT_JT3]], label [[FOR_BODY_JT3:%.*]], label [[FOR_END]]
 ; CHECK:       for.inc.jt2:
-; CHECK-NEXT:    [[COUNT6:%.*]] = phi i32 [ [[COUNT4]], [[SI_UNFOLD_FALSE2]] ], [ [[COUNT5]], [[CASE1]] ]
-; CHECK-NEXT:    [[STATE_NEXT_JT2]] = phi i32 [ 2, [[CASE1]] ], [ 2, [[SI_UNFOLD_FALSE2]] ]
+; CHECK-NEXT:    [[STATE_NEXT_JT2]] = phi i32 [ 2, [[CASE1]] ], [ [[DOTSI_UNFOLD_PHI3_JT2]], [[SEL_1_SI_UNFOLD_FALSE_JT2]] ]
 ; CHECK-NEXT:    [[INC_JT2]] = add nsw i32 [[COUNT6]], 1
 ; CHECK-NEXT:    [[CMP_EXIT_JT2:%.*]] = icmp slt i32 [[INC_JT2]], [[NUM]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT_JT2]], label [[FOR_BODY_JT2]], label [[FOR_END]]
 ; CHECK:       for.inc.jt1:
-; CHECK-NEXT:    [[COUNT3:%.*]] = phi i32 [ [[COUNT_JT4]], [[FOR_BODY_JT4]] ], [ [[COUNT_JT3]], [[FOR_BODY_JT3]] ], [ [[COUNT4]], [[SI_UNFOLD_TRUE]] ], [ [[COUNT]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[STATE_NEXT_JT1]] = phi i32 [ 1, [[FOR_BODY]] ], [ 1, [[SI_UNFOLD_TRUE]] ], [ 1, [[FOR_BODY_JT3]] ], [ 1, [[FOR_BODY_JT4]] ]
-; CHECK-NEXT:    [[INC_JT1]] = add nsw i32 [[COUNT3]], 1
+; CHECK-NEXT:    [[COUNT4:%.*]] = phi i32 [ [[COUNT_JT4]], [[FOR_BODY_JT4]] ], [ [[COUNT_JT3]], [[FOR_BODY_JT3]] ], [ [[COUNT5]], [[SEL_1_SI_UNFOLD_TRUE_JT1]] ], [ [[COUNT]], [[FOR_BODY]] ]
+; CHECK-NEXT:    [[STATE_NEXT_JT1]] = phi i32 [ 1, [[FOR_BODY]] ], [ 1, [[FOR_BODY_JT3]] ], [ 1, [[FOR_BODY_JT4]] ], [ [[DOTSI_UNFOLD_PHI2_JT1]], [[SEL_1_SI_UNFOLD_TRUE_JT1]] ]
+; CHECK-NEXT:    [[INC_JT1]] = add nsw i32 [[COUNT4]], 1
 ; CHECK-NEXT:    [[CMP_EXIT_JT1:%.*]] = icmp slt i32 [[INC_JT1]], [[NUM]]
 ; CHECK-NEXT:    br i1 [[CMP_EXIT_JT1]], label [[FOR_BODY_JT1]], label [[FOR_END]]
 ; CHECK:       for.end:
@@ -324,18 +379,25 @@ define void @pr65222(i32 %flags, i1 %cmp, i1 %tobool.not) {
 ; CHECK:       while.cond:
 ; CHECK-NEXT:    br i1 [[CMP:%.*]], label [[THEN:%.*]], label [[IF_END:%.*]]
 ; CHECK:       then:
-; CHECK-NEXT:    br i1 [[TOBOOL_NOT:%.*]], label [[SI_UNFOLD_TRUE:%.*]], label [[SI_UNFOLD_FALSE:%.*]]
-; CHECK:       si.unfold.true:
-; CHECK-NEXT:    br i1 [[CMP]], label [[IF_END]], label [[SI_UNFOLD_FALSE2:%.*]]
-; CHECK:       si.unfold.false:
-; CHECK-NEXT:    br i1 [[CMP]], label [[IF_END]], label [[SI_UNFOLD_FALSE1:%.*]]
-; CHECK:       si.unfold.false1:
+; CHECK-NEXT:    br i1 [[TOBOOL_NOT:%.*]], label [[COND1_SI_UNFOLD_TRUE:%.*]], label [[COND_SI_UNFOLD_TRUE:%.*]]
+; CHECK:       cond.si.unfold.true:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI:%.*]] = phi i32 [ 2, [[THEN]] ]
+; CHECK-NEXT:    br i1 [[CMP]], label [[TOUNFOLD_SI_UNFOLD_FALSE:%.*]], label [[COND_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       cond.si.unfold.false:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI1:%.*]] = phi i32 [ 0, [[COND_SI_UNFOLD_TRUE]] ]
+; CHECK-NEXT:    br label [[TOUNFOLD_SI_UNFOLD_FALSE]]
+; CHECK:       tounfold.si.unfold.false:
+; CHECK-NEXT:    [[COND_SI_UNFOLD_PHI:%.*]] = phi i32 [ [[DOTSI_UNFOLD_PHI]], [[COND_SI_UNFOLD_TRUE]] ], [ [[DOTSI_UNFOLD_PHI1]], [[COND_SI_UNFOLD_FALSE]] ]
 ; CHECK-NEXT:    br label [[IF_END]]
-; CHECK:       si.unfold.false2:
+; CHECK:       cond1.si.unfold.true:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI2:%.*]] = phi i32 [ 3, [[THEN]] ]
+; CHECK-NEXT:    br i1 [[CMP]], label [[IF_END]], label [[COND1_SI_UNFOLD_FALSE:%.*]]
+; CHECK:       cond1.si.unfold.false:
+; CHECK-NEXT:    [[DOTSI_UNFOLD_PHI3:%.*]] = phi i32 [ 1, [[COND1_SI_UNFOLD_TRUE]] ]
 ; CHECK-NEXT:    br label [[IF_END]]
 ; CHECK:       if.end:
-; CHECK-NEXT:    [[UNFOLDED:%.*]] = phi i32 [ [[FLAGS:%.*]], [[WHILE_COND]] ], [ 3, [[SI_UNFOLD_TRUE]] ], [ 2, [[SI_UNFOLD_FALSE]] ], [ 0, [[SI_UNFOLD_FALSE1]] ], [ 1, [[SI_UNFOLD_FALSE2]] ]
-; CHECK-NEXT:    [[OTHER:%.*]] = phi i32 [ [[FLAGS]], [[WHILE_COND]] ], [ 0, [[SI_UNFOLD_TRUE]] ], [ 0, [[SI_UNFOLD_FALSE]] ], [ 0, [[SI_UNFOLD_FALSE1]] ], [ 0, [[SI_UNFOLD_FALSE2]] ]
+; CHECK-NEXT:    [[UNFOLDED:%.*]] = phi i32 [ [[FLAGS:%.*]], [[WHILE_COND]] ], [ [[COND_SI_UNFOLD_PHI]], [[TOUNFOLD_SI_UNFOLD_FALSE]] ], [ [[DOTSI_UNFOLD_PHI2]], [[COND1_SI_UNFOLD_TRUE]] ], [ [[DOTSI_UNFOLD_PHI3]], [[COND1_SI_UNFOLD_FALSE]] ]
+; CHECK-NEXT:    [[OTHER:%.*]] = phi i32 [ [[FLAGS]], [[WHILE_COND]] ], [ 0, [[TOUNFOLD_SI_UNFOLD_FALSE]] ], [ 0, [[COND1_SI_UNFOLD_TRUE]] ], [ 0, [[COND1_SI_UNFOLD_FALSE]] ]
 ; CHECK-NEXT:    switch i32 [[UNFOLDED]], label [[UNREACHABLE:%.*]] [
 ; CHECK-NEXT:      i32 0, label [[SW_BB:%.*]]
 ; CHECK-NEXT:    ]
diff --git a/llvm/test/Transforms/DFAJumpThreading/max-path-length.ll b/llvm/test/Transforms/DFAJumpThreading/max-path-length.ll
index 5205ca2cabd32..92747629f638f 100644
--- a/llvm/test/Transforms/DFAJumpThreading/max-path-length.ll
+++ b/llvm/test/Transforms/DFAJumpThreading/max-path-length.ll
@@ -1,64 +1,18 @@
-; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt -S -passes=dfa-jump-threading -dfa-max-path-length=6 %s | FileCheck %s
+; REQUIRES: asserts
+; RUN: opt -S -passes=dfa-jump-threading -dfa-max-path-length=3 %s \
+; RUN:  2>&1 -disable-output -debug-only=dfa-jump-threading | FileCheck %s
+; RUN: opt -S -passes=dfa-jump-threading -dfa-max-num-visited-paths=3 %s \
+; RUN:  2>&1 -disable-output -debug-only=dfa-jump-threading | FileCheck %s
 
 ; Make the path
-;   <%for.body %case1 %case1.1 %case1.2 %case1.3 %case1.4 %for.inc %for.end>
+;   < case1 case1.1 case1.2 case1.3 case1.4 for.inc for.body > [ 3, case1 ]
 ; too long so that it is not jump-threaded.
 define i32 @max_path_length(i32 %num) {
-; CHECK-LABEL: @max_path_length(
-; CHECK-NEXT:  entry:
-; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
-; CHECK:       for.body:
-; CHECK-NEXT:    [[COUNT:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[INC:%.*]], [[FOR_INC:%.*]] ]
-; CHECK-NEXT:    [[STATE:%.*]] = phi i32 [ 1, [[ENTRY]] ], [ [[STATE_NEXT:%.*]], [[FOR_INC]] ]
-; CHECK-NEXT:    switch i32 [[STATE]], label [[FOR_INC_JT1:%.*]] [
-; CHECK-NEXT:    i32 1, label [[CASE1:%.*]]
-; CHECK-NEXT:    i32 2, label [[CASE2:%.*]]
-; CHECK-NEXT:    ]
-; CHECK:       for.body.jt2:
-; CHECK-NEXT:    [[COUNT_JT2:%.*]] = phi i32 [ [[INC_JT2:%.*]], [[FOR_INC_JT2:%.*]] ]
-; CHECK-NEXT:    [[STATE_JT2:%.*]] = phi i32 [ [[STATE_NEXT_JT2:%.*]], [[FOR_INC_JT2]] ]
-; CHECK-NEXT:    br label [[CASE2]]
-; CHECK:       for.body.jt1:
-; CHECK-NEXT:    [[COUNT_JT1:%.*]] = phi i32 [ [[INC_JT1:%.*]], [[FOR_INC_JT1]] ]
-; CHECK-NEXT:    [[STATE_JT1:%.*]] = phi i32 [ [[STATE_NEXT_JT1:%.*]], [[FOR_INC_JT1]] ]
-; CHECK-NEXT:    br label [[CASE1]]
-; CHECK:       case1:
-; CHECK-NEXT:    [[COUNT2:%.*]] = phi i32 [ [[COUNT_JT1]], [[FOR_BODY_JT1:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
-; CHECK-NEXT:    br label [[CASE1_1:%.*]]
-; CHECK:       case1.1:
-; CHECK-NEXT:    br label [[CASE1_2:%.*]]
-; CHECK:       case1.2:
-; CHECK-NEXT:    br label [[CASE1_3:%.*]]
-; CHECK:       case1.3:
-; CHECK-NEXT:    br label [[CASE1_4:%.*]]
-; CHECK:       case1.4:
-; CHECK-NEXT:    br label [[FOR_INC]]
-; CHECK:       case2:
-; CHECK-NEXT:    [[COUNT1:%.*]] = phi i32 [ [[COUNT_JT2]], [[FOR_BODY_JT2:%.*]] ], [ [[COUNT]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[CMP:%.*]] = icmp eq i32 [[COUNT1]], 50
-; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_INC_JT1]], label [[SI_UNFOLD_FALSE:%.*]]
-; CHECK:       si.unfold.false:
-; CHECK-NEXT:    br label [[FOR_INC_JT2]]
-; CHECK:       for.inc:
-; CHECK-NEXT:    [[STATE_NEXT]] = phi i32 [ 2, [[CASE1_4]] ]
-; CHECK-NEXT:    [[INC]] = add nsw i32 [[COUNT2]], 1
-; CHECK-NEXT:    [[CMP_EXIT:%.*]] = icmp slt i32 [[INC]], [[NUM:%.*]]
-; CHECK-NEXT:    br i1 [[CMP_EXIT]], label [[FOR_BODY]], label [[FOR_END:%.*]]
-; CHECK:       for.inc.jt2:
-; CHECK-NEXT:    [[STATE_NEXT_JT2]] = phi i32 [ 2, [[SI_UNFOLD_FALSE]] ]
-; CHECK-NEXT:    [[INC_JT2]] = add nsw i32 [[COUNT1]], 1
-; CHECK-NEXT:    [[CMP_EXIT_JT2:%.*]] = icmp slt i32 [[INC_JT2]], [[NUM]]
-; CHECK-NEXT:    br i1 [[CMP_EXIT_JT2]], label [[FOR_BODY_JT2]], label [[FOR_END]]
-; CHECK:       for.inc.jt1:
-; CHECK-NEXT:    [[COUNT3:%.*]] = phi i32 [ [[COUNT1]], [[CASE2]] ], [ [[COUNT]], [[FOR_BODY]] ]
-; CHECK-NEXT:    [[STATE_NEXT_JT1]] = phi i32 [ 1, [[CASE2]] ], [ 1, [[FOR_BODY]] ]
-; CHECK-NEXT:    [[INC_JT1]] = add nsw i32 [[COUNT3]], 1
-; CHECK-NEXT:    [[CMP_EXIT_JT1:%.*]] = icmp slt i32 [[INC_JT1]], [[NUM]]
-; CHECK-NEXT:    br i1 [[CMP_EXIT_JT1]], label [[FOR_BODY_JT1]], label [[FOR_END]]
-; CHECK:       for.end:
-; CHECK-NEXT:    ret i32 0
-;
+; CHECK-NOT: 3, case1
+; CHECK: < case2 for.inc for.body > [ 1, for.inc ]
+; CHECK-NEXT: < for.inc for.body > [ 1, for.inc ]
+; CHECK-NEXT: < case2 sel.si.unfold.false for.inc for.body > [ 2, sel.si.unfold.false ]
+; CHECK-NEXT: DFA-JT: Renaming non-local uses of: 
 entry:
   br label %for.body
 
@@ -71,6 +25,7 @@ for.body:
   ]
 
 case1:
+  %case1.state.next = phi i32 [ 3, %for.body ]
   br label %case1.1
 
 case1.1:
@@ -91,7 +46,7 @@ case2:
   br label %for.inc
 
 for.inc:
-  %state.next = phi i32 [ %sel, %case2 ], [ 1, %for.body ], [ 2, %case1.4 ]
+  %state.next = phi i32 [ %sel, %case2 ], [ 1, %for.body ], [ %case1.state.next, %case1.4 ]
   %inc = add nsw i32 %count, 1
   %cmp.exit = icmp slt i32 %inc, %num
   br i1 %cmp.exit, label %for.body, label %for.end


