[llvm] r335553 - [PM/LoopUnswitch] Teach the new unswitch to handle nontrivial
Chandler Carruth via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 9 02:51:22 PDT 2018
FYI for folks testing out this functionality, there are collection of
serious bugs in this commit. I've got a fix and just need to add some
testing. Will have it landed tomorrow. Just wanted to send a heads up.
+Mikael Holmén <mikael.holmen at ericsson.com> +Fedor Sergeev
<fedor.sergeev at azul.com>
On Mon, Jun 25, 2018 at 4:37 PM Chandler Carruth via llvm-commits <
llvm-commits at lists.llvm.org> wrote:
> Author: chandlerc
> Date: Mon Jun 25 16:32:54 2018
> New Revision: 335553
>
> URL: http://llvm.org/viewvc/llvm-project?rev=335553&view=rev
> Log:
> [PM/LoopUnswitch] Teach the new unswitch to handle nontrivial
> unswitching of switches.
>
> This works much like trivial unswitching of switches in that it reliably
> moves the switch out of the loop. Here we potentially clone the entire
> loop into each successor of the switch and re-point the cases at these
> clones.
>
> Due to the complexity of actually doing nontrivial unswitching, this
> patch doesn't create a dedicated routine for handling switches -- it
> would duplicate far too much code. Instead, it generalizes the existing
> routine to handle both branches and switches as it largely reduces to
> looping in a few places instead of doing something once. This actually
> improves the results in some cases with branches due to being much more
> careful about how dead regions of code are managed. With branches,
> because exactly one clone is created and there are exactly two edges
> considered, somewhat sloppy handling of the dead regions of code was
> sufficient in most cases. But with switches, there are much more
> complicated patterns of dead code and so I've had to move to a more
> robust model generally. We still do as much pruning of the dead code
> early as possible because that allows us to avoid even cloning the code.
>
> This also surfaced another problem with nontrivial unswitching before
> which is that we weren't as precise in reconstructing loops as we could
> have been. This seems to have been mostly harmless, but resulted in
> pointless LCSSA PHI nodes and other unnecessary cruft. With switches, we
> have to get this *right*, and everything benefits from it.
>
> While the testing may seem a bit light here because we only have two
> real cases with actual switches, they do a surprisingly good job of
> exercising numerous edge cases. Also, because we share the logic with
> branches, most of the changes in this patch are reasonably well covered
> by existing tests.
>
> The new unswitch now has all of the same fundamental power as the old
> one with the exception of the single unsound case of *partial* switch
> unswitching -- that really is just loop specialization and not
> unswitching at all. It doesn't fit into the canonicalization model in
> any way. We can add a loop specialization pass that runs late based on
> profile data if important test cases ever come up here.
>
> Differential Revision: https://reviews.llvm.org/D47683
>
> Modified:
> llvm/trunk/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h
> llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
> llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll
>
> Modified: llvm/trunk/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h?rev=335553&r1=335552&r2=335553&view=diff
>
> ==============================================================================
> --- llvm/trunk/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h
> (original)
> +++ llvm/trunk/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h Mon Jun
> 25 16:32:54 2018
> @@ -17,9 +17,9 @@
>
> namespace llvm {
>
> -/// This pass transforms loops that contain branches on loop-invariant
> -/// conditions to have multiple loops. For example, it turns the left
> into the
> -/// right code:
> +/// This pass transforms loops that contain branches or switches on loop-
> +/// invariant conditions to have multiple loops. For example, it turns
> the left
> +/// into the right code:
> ///
> /// for (...) if (lic)
> /// A for (...)
> @@ -35,6 +35,31 @@ namespace llvm {
> /// This pass expects LICM to be run before it to hoist invariant
> conditions out
> /// of the loop, to make the unswitching opportunity obvious.
> ///
> +/// There is a taxonomy of unswitching that we use to classify different
> forms
> +/// of this transformaiton:
> +///
> +/// - Trival unswitching: this is when the condition can be unswitched
> without
> +/// cloning any code from inside the loop. A non-trivial unswitch
> requires
> +/// code duplication.
> +///
> +/// - Full unswitching: this is when the branch or switch is completely
> moved
> +/// from inside the loop to outside the loop. Partial unswitching
> removes the
> +/// branch from the clone of the loop but must leave a (somewhat
> simplified)
> +/// branch in the original loop. While theoretically partial
> unswitching can
> +/// be done for switches, the requirements are extreme - we need the
> loop
> +/// invariant input to the switch to be sufficient to collapse to a
> single
> +/// successor in each clone.
> +///
> +/// This pass always does trivial, full unswitching for both branches and
> +/// switches. For branches, it also always does trivial, partial
> unswitching.
> +///
> +/// If enabled (via the constructor's `NonTrivial` parameter), this pass
> will
> +/// additionally do non-trivial, full unswitching for branches and
> switches, and
> +/// will do non-trivial, partial unswitching for branches.
> +///
> +/// Because partial unswitching of switches is extremely unlikely to be
> possible
> +/// in practice and significantly complicates the implementation, this
> pass does
> +/// not currently implement that in any mode.
> class SimpleLoopUnswitchPass : public
> PassInfoMixin<SimpleLoopUnswitchPass> {
> bool NonTrivial;
>
>
> Modified: llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp?rev=335553&r1=335552&r2=335553&view=diff
>
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp (original)
> +++ llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp Mon Jun 25
> 16:32:54 2018
> @@ -715,8 +715,12 @@ static bool unswitchAllTrivialConditions
> ///
> /// This routine handles cloning all of the necessary loop blocks and exit
> /// blocks including rewriting their instructions and the relevant PHI
> nodes.
> -/// It skips loop and exit blocks that are not necessary based on the
> provided
> -/// set. It also correctly creates the unconditional branch in the cloned
> +/// Any loop blocks or exit blocks which are dominated by a different
> successor
> +/// than the one for this clone of the loop blocks can be trivially
> skipped. We
> +/// use the `DominatingSucc` map to determine whether a block satisfies
> that
> +/// property with a simple map lookup.
> +///
> +/// It also correctly creates the unconditional branch in the cloned
> /// unswitched parent block to only point at the unswitched successor.
> ///
> /// This does not handle most of the necessary updates to `LoopInfo`.
> Only exit
> @@ -730,7 +734,7 @@ static BasicBlock *buildClonedLoopBlocks
> Loop &L, BasicBlock *LoopPH, BasicBlock *SplitBB,
> ArrayRef<BasicBlock *> ExitBlocks, BasicBlock *ParentBB,
> BasicBlock *UnswitchedSuccBB, BasicBlock *ContinueSuccBB,
> - const SmallPtrSetImpl<BasicBlock *> &SkippedLoopAndExitBlocks,
> + const SmallDenseMap<BasicBlock *, BasicBlock *, 16> &DominatingSucc,
> ValueToValueMapTy &VMap,
> SmallVectorImpl<DominatorTree::UpdateType> &DTUpdates,
> AssumptionCache &AC,
> DominatorTree &DT, LoopInfo &LI) {
> @@ -751,19 +755,26 @@ static BasicBlock *buildClonedLoopBlocks
> return NewBB;
> };
>
> + // We skip cloning blocks when they have a dominating succ that is not
> the
> + // succ we are cloning for.
> + auto SkipBlock = [&](BasicBlock *BB) {
> + auto It = DominatingSucc.find(BB);
> + return It != DominatingSucc.end() && It->second != UnswitchedSuccBB;
> + };
> +
> // First, clone the preheader.
> auto *ClonedPH = CloneBlock(LoopPH);
>
> // Then clone all the loop blocks, skipping the ones that aren't
> necessary.
> for (auto *LoopBB : L.blocks())
> - if (!SkippedLoopAndExitBlocks.count(LoopBB))
> + if (!SkipBlock(LoopBB))
> CloneBlock(LoopBB);
>
> // Split all the loop exit edges so that when we clone the exit blocks,
> if
> // any of the exit blocks are *also* a preheader for some other loop, we
> // don't create multiple predecessors entering the loop header.
> for (auto *ExitBB : ExitBlocks) {
> - if (SkippedLoopAndExitBlocks.count(ExitBB))
> + if (SkipBlock(ExitBB))
> continue;
>
> // When we are going to clone an exit, we don't need to clone all the
> @@ -841,7 +852,7 @@ static BasicBlock *buildClonedLoopBlocks
> // Update any PHI nodes in the cloned successors of the skipped blocks
> to not
> // have spurious incoming values.
> for (auto *LoopBB : L.blocks())
> - if (SkippedLoopAndExitBlocks.count(LoopBB))
> + if (SkipBlock(LoopBB))
> for (auto *SuccBB : successors(LoopBB))
> if (auto *ClonedSuccBB =
> cast_or_null<BasicBlock>(VMap.lookup(SuccBB)))
> for (PHINode &PN : ClonedSuccBB->phis())
> @@ -1175,10 +1186,41 @@ static void buildClonedLoops(Loop &OrigL
> }
>
> static void
> +deleteDeadClonedBlocks(Loop &L, ArrayRef<BasicBlock *> ExitBlocks,
> + ArrayRef<std::unique_ptr<ValueToValueMapTy>> VMaps,
> + DominatorTree &DT) {
> + // Find all the dead clones, and remove them from their successors.
> + SmallVector<BasicBlock *, 16> DeadBlocks;
> + for (BasicBlock *BB : llvm::concat<BasicBlock *const>(L.blocks(),
> ExitBlocks))
> + for (auto &VMap : VMaps)
> + if (BasicBlock *ClonedBB =
> cast_or_null<BasicBlock>(VMap->lookup(BB)))
> + if (!DT.isReachableFromEntry(ClonedBB)) {
> + for (BasicBlock *SuccBB : successors(ClonedBB))
> + SuccBB->removePredecessor(ClonedBB);
> + DeadBlocks.push_back(ClonedBB);
> + }
> +
> + // Drop any remaining references to break cycles.
> + for (BasicBlock *BB : DeadBlocks)
> + BB->dropAllReferences();
> + // Erase them from the IR.
> + for (BasicBlock *BB : DeadBlocks)
> + BB->eraseFromParent();
> +}
> +
> +static void
> deleteDeadBlocksFromLoop(Loop &L,
> - const SmallVectorImpl<BasicBlock *> &DeadBlocks,
> SmallVectorImpl<BasicBlock *> &ExitBlocks,
> DominatorTree &DT, LoopInfo &LI) {
> + // Find all the dead blocks, and remove them from their successors.
> + SmallVector<BasicBlock *, 16> DeadBlocks;
> + for (BasicBlock *BB : llvm::concat<BasicBlock *const>(L.blocks(),
> ExitBlocks))
> + if (!DT.isReachableFromEntry(BB)) {
> + for (BasicBlock *SuccBB : successors(BB))
> + SuccBB->removePredecessor(BB);
> + DeadBlocks.push_back(BB);
> + }
> +
> SmallPtrSet<BasicBlock *, 16> DeadBlockSet(DeadBlocks.begin(),
> DeadBlocks.end());
>
> @@ -1187,11 +1229,6 @@ deleteDeadBlocksFromLoop(Loop &L,
> llvm::erase_if(ExitBlocks,
> [&](BasicBlock *BB) { return DeadBlockSet.count(BB); });
>
> - // Remove these blocks from their successors.
> - for (auto *BB : DeadBlocks)
> - for (BasicBlock *SuccBB : successors(BB))
> - SuccBB->removePredecessor(BB, /*DontDeleteUselessPHIs*/ true);
> -
> // Walk from this loop up through its parents removing all of the dead
> blocks.
> for (Loop *ParentL = &L; ParentL; ParentL = ParentL->getParentLoop()) {
> for (auto *BB : DeadBlocks)
> @@ -1582,31 +1619,24 @@ void visitDomSubTree(DominatorTree &DT,
> } while (!DomWorklist.empty());
> }
>
> -/// Take an invariant branch that has been determined to be safe and
> worthwhile
> -/// to unswitch despite being non-trivial to do so and perform the
> unswitch.
> -///
> -/// This directly updates the CFG to hoist the predicate out of the loop,
> and
> -/// clone the necessary parts of the loop to maintain behavior.
> -///
> -/// It also updates both dominator tree and loopinfo based on the
> unswitching.
> -///
> -/// Once unswitching has been performed it runs the provided callback to
> report
> -/// the new loops and no-longer valid loops to the caller.
> -static bool unswitchInvariantBranch(
> - Loop &L, BranchInst &BI, ArrayRef<Value *> Invariants, DominatorTree
> &DT,
> - LoopInfo &LI, AssumptionCache &AC,
> +static bool unswitchNontrivialInvariants(
> + Loop &L, TerminatorInst &TI, ArrayRef<Value *> Invariants,
> + DominatorTree &DT, LoopInfo &LI, AssumptionCache &AC,
> function_ref<void(bool, ArrayRef<Loop *>)> UnswitchCB) {
> - auto *ParentBB = BI.getParent();
> -
> - // We can only unswitch conditional branches with an invariant
> condition or
> - // combining invariant conditions with an instruction.
> - assert(BI.isConditional() && "Can only unswitch a conditional branch!");
> - bool FullUnswitch = BI.getCondition() == Invariants[0];
> + auto *ParentBB = TI.getParent();
> + BranchInst *BI = dyn_cast<BranchInst>(&TI);
> + SwitchInst *SI = BI ? nullptr : cast<SwitchInst>(&TI);
> +
> + // We can only unswitch switches, conditional branches with an invariant
> + // condition, or combining invariant conditions with an instruction.
> + assert((SI || BI->isConditional()) &&
> + "Can only unswitch switches and conditional branch!");
> + bool FullUnswitch = SI || BI->getCondition() == Invariants[0];
> if (FullUnswitch)
> assert(Invariants.size() == 1 &&
> "Cannot have other invariants with full unswitching!");
> else
> - assert(isa<Instruction>(BI.getCondition()) &&
> + assert(isa<Instruction>(BI->getCondition()) &&
> "Partial unswitching requires an instruction as the
> condition!");
>
> // Constant and BBs tracking the cloned and continuing successor. When
> we are
> @@ -1618,18 +1648,27 @@ static bool unswitchInvariantBranch(
> bool Direction = true;
> int ClonedSucc = 0;
> if (!FullUnswitch) {
> - if (cast<Instruction>(BI.getCondition())->getOpcode() !=
> Instruction::Or) {
> - assert(cast<Instruction>(BI.getCondition())->getOpcode() ==
> Instruction::And &&
> - "Only `or` and `and` instructions can combine invariants being
> unswitched.");
> + if (cast<Instruction>(BI->getCondition())->getOpcode() !=
> Instruction::Or) {
> + assert(cast<Instruction>(BI->getCondition())->getOpcode() ==
> + Instruction::And &&
> + "Only `or` and `and` instructions can combine invariants
> being "
> + "unswitched.");
> Direction = false;
> ClonedSucc = 1;
> }
> }
> - auto *UnswitchedSuccBB = BI.getSuccessor(ClonedSucc);
> - auto *ContinueSuccBB = BI.getSuccessor(1 - ClonedSucc);
>
> - assert(UnswitchedSuccBB != ContinueSuccBB &&
> - "Should not unswitch a branch that always goes to the same
> place!");
> + BasicBlock *RetainedSuccBB =
> + BI ? BI->getSuccessor(1 - ClonedSucc) : SI->getDefaultDest();
> + SmallSetVector<BasicBlock *, 4> UnswitchedSuccBBs;
> + if (BI)
> + UnswitchedSuccBBs.insert(BI->getSuccessor(ClonedSucc));
> + else
> + for (auto Case : SI->cases())
> + UnswitchedSuccBBs.insert(Case.getCaseSuccessor());
> +
> + assert(!UnswitchedSuccBBs.count(RetainedSuccBB) &&
> + "Should not unswitch the same successor we are retaining!");
>
> // The branch should be in this exact loop. Any inner loop's invariant
> branch
> // should be handled by unswitching that inner loop. The caller of this
> @@ -1648,9 +1687,6 @@ static bool unswitchInvariantBranch(
> if (isa<CleanupPadInst>(ExitBB->getFirstNonPHI()))
> return false;
>
> - SmallPtrSet<BasicBlock *, 4> ExitBlockSet(ExitBlocks.begin(),
> - ExitBlocks.end());
> -
> // Compute the parent loop now before we start hacking on things.
> Loop *ParentL = L.getParentLoop();
>
> @@ -1669,30 +1705,22 @@ static bool unswitchInvariantBranch(
> OuterExitL = NewOuterExitL;
> }
>
> - // If the edge we *aren't* cloning in the unswitch (the continuing edge)
> - // dominates its target, we can skip cloning the dominated region of
> the loop
> - // and its exits. We compute this as a set of nodes to be skipped.
> - SmallPtrSet<BasicBlock *, 4> SkippedLoopAndExitBlocks;
> - if (ContinueSuccBB->getUniquePredecessor() ||
> - llvm::all_of(predecessors(ContinueSuccBB), [&](BasicBlock *PredBB) {
> - return PredBB == ParentBB || DT.dominates(ContinueSuccBB, PredBB);
> - })) {
> - visitDomSubTree(DT, ContinueSuccBB, [&](BasicBlock *BB) {
> - SkippedLoopAndExitBlocks.insert(BB);
> - return true;
> - });
> - }
> - // If we are doing full unswitching, then similarly to the above, the
> edge we
> - // *are* cloning in the unswitch (the unswitched edge) dominates its
> target,
> - // we will end up with dead nodes in the original loop and its exits
> that will
> - // need to be deleted. Here, we just retain that the property holds and
> will
> - // compute the deleted set later.
> - bool DeleteUnswitchedSucc =
> - FullUnswitch &&
> - (UnswitchedSuccBB->getUniquePredecessor() ||
> - llvm::all_of(predecessors(UnswitchedSuccBB), [&](BasicBlock
> *PredBB) {
> - return PredBB == ParentBB || DT.dominates(UnswitchedSuccBB,
> PredBB);
> - }));
> + // If the edge from this terminator to a successor dominates that
> successor,
> + // store a map from each block in its dominator subtree to it. This
> lets us
> + // tell when cloning for a particular successor if a block is dominated
> by
> + // some *other* successor with a single data structure. We use this to
> + // significantly reduce cloning.
> + SmallDenseMap<BasicBlock *, BasicBlock *, 16> DominatingSucc;
> + for (auto *SuccBB : llvm::concat<BasicBlock *const>(
> + makeArrayRef(RetainedSuccBB), UnswitchedSuccBBs))
> + if (SuccBB->getUniquePredecessor() ||
> + llvm::all_of(predecessors(SuccBB), [&](BasicBlock *PredBB) {
> + return PredBB == ParentBB || DT.dominates(SuccBB, PredBB);
> + }))
> + visitDomSubTree(DT, SuccBB, [&](BasicBlock *BB) {
> + DominatingSucc[BB] = SuccBB;
> + return true;
> + });
>
> // Split the preheader, so that we know that there is a safe place to
> insert
> // the conditional branch. We will change the preheader to have a
> conditional
> @@ -1702,84 +1730,93 @@ static bool unswitchInvariantBranch(
> BasicBlock *SplitBB = L.getLoopPreheader();
> BasicBlock *LoopPH = SplitEdge(SplitBB, L.getHeader(), &DT, &LI);
>
> - // Keep a mapping for the cloned values.
> - ValueToValueMapTy VMap;
> -
> // Keep track of the dominator tree updates needed.
> SmallVector<DominatorTree::UpdateType, 4> DTUpdates;
>
> - // Build the cloned blocks from the loop.
> - auto *ClonedPH = buildClonedLoopBlocks(
> - L, LoopPH, SplitBB, ExitBlocks, ParentBB, UnswitchedSuccBB,
> - ContinueSuccBB, SkippedLoopAndExitBlocks, VMap, DTUpdates, AC, DT,
> LI);
> + // Clone the loop for each unswitched successor.
> + SmallVector<std::unique_ptr<ValueToValueMapTy>, 4> VMaps;
> + VMaps.reserve(UnswitchedSuccBBs.size());
> + SmallDenseMap<BasicBlock *, BasicBlock *, 4> ClonedPHs;
> + for (auto *SuccBB : UnswitchedSuccBBs) {
> + VMaps.emplace_back(new ValueToValueMapTy());
> + ClonedPHs[SuccBB] = buildClonedLoopBlocks(
> + L, LoopPH, SplitBB, ExitBlocks, ParentBB, SuccBB, RetainedSuccBB,
> + DominatingSucc, *VMaps.back(), DTUpdates, AC, DT, LI);
> + }
>
> // The stitching of the branched code back together depends on whether
> we're
> // doing full unswitching or not with the exception that we always want
> to
> // nuke the initial terminator placed in the split block.
> SplitBB->getTerminator()->eraseFromParent();
> if (FullUnswitch) {
> - // Remove the parent as a predecessor of the
> - // unswitched successor.
> - UnswitchedSuccBB->removePredecessor(ParentBB,
> - /*DontDeleteUselessPHIs*/ true);
> - DTUpdates.push_back({DominatorTree::Delete, ParentBB,
> UnswitchedSuccBB});
> -
> - // Now splice the branch from the original loop and use it to select
> between
> - // the two loops.
> - SplitBB->getInstList().splice(SplitBB->end(),
> ParentBB->getInstList(), BI);
> - BI.setSuccessor(ClonedSucc, ClonedPH);
> - BI.setSuccessor(1 - ClonedSucc, LoopPH);
> + for (BasicBlock *SuccBB : UnswitchedSuccBBs) {
> + // Remove the parent as a predecessor of the unswitched successor.
> + SuccBB->removePredecessor(ParentBB,
> + /*DontDeleteUselessPHIs*/ true);
> + DTUpdates.push_back({DominatorTree::Delete, ParentBB, SuccBB});
> + }
> +
> + // Now splice the terminator from the original loop and rewrite its
> + // successors.
> + SplitBB->getInstList().splice(SplitBB->end(),
> ParentBB->getInstList(), TI);
> + if (BI) {
> + assert(UnswitchedSuccBBs.size() == 1 &&
> + "Only one possible unswitched block for a branch!");
> + BasicBlock *ClonedPH = ClonedPHs.begin()->second;
> + BI->setSuccessor(ClonedSucc, ClonedPH);
> + BI->setSuccessor(1 - ClonedSucc, LoopPH);
> + DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});
> + } else {
> + assert(SI && "Must either be a branch or switch!");
> +
> + // Walk the cases and directly update their successors.
> + for (auto &Case : SI->cases())
> +
> Case.setSuccessor(ClonedPHs.find(Case.getCaseSuccessor())->second);
> + // We need to use the set to populate domtree updates as even when
> there
> + // are multiple cases pointing at the same successor we only want to
> + // insert one edge in the domtree.
> + for (BasicBlock *SuccBB : UnswitchedSuccBBs)
> + DTUpdates.push_back(
> + {DominatorTree::Insert, SplitBB,
> ClonedPHs.find(SuccBB)->second});
> +
> + SI->setDefaultDest(LoopPH);
> + }
>
> // Create a new unconditional branch to the continuing block (as
> opposed to
> // the one cloned).
> - BranchInst::Create(ContinueSuccBB, ParentBB);
> + BranchInst::Create(RetainedSuccBB, ParentBB);
> } else {
> + assert(BI && "Only branches have partial unswitching.");
> + assert(UnswitchedSuccBBs.size() == 1 &&
> + "Only one possible unswitched block for a branch!");
> + BasicBlock *ClonedPH = ClonedPHs.begin()->second;
> // When doing a partial unswitch, we have to do a bit more work to
> build up
> // the branch in the split block.
> buildPartialUnswitchConditionalBranch(*SplitBB, Invariants, Direction,
> *ClonedPH, *LoopPH);
> + DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});
> }
>
> - // Before we update the dominator tree, collect the dead blocks if
> we're going
> - // to end up deleting the unswitched successor.
> - SmallVector<BasicBlock *, 16> DeadBlocks;
> - if (DeleteUnswitchedSucc) {
> - DeadBlocks.push_back(UnswitchedSuccBB);
> - for (int i = 0; i < (int)DeadBlocks.size(); ++i) {
> - // If we reach an exit block, stop recursing as the unswitched loop
> will
> - // end up reaching the merge block which we make the successor of
> the
> - // exit.
> - if (ExitBlockSet.count(DeadBlocks[i]))
> - continue;
> -
> - // Insert the children that are within the loop or exit block set.
> Other
> - // children may reach out of the loop. While we don't expect these
> to be
> - // dead (as the unswitched clone should reach them) we don't try to
> prove
> - // that here.
> - for (DomTreeNode *ChildN : *DT[DeadBlocks[i]])
> - if (L.contains(ChildN->getBlock()) ||
> - ExitBlockSet.count(ChildN->getBlock()))
> - DeadBlocks.push_back(ChildN->getBlock());
> - }
> - }
> -
> - // Add the remaining edge to our updates and apply them to get an
> up-to-date
> - // dominator tree. Note that this will cause the dead blocks above to be
> - // unreachable and no longer in the dominator tree.
> - DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});
> + // Apply the updates accumulated above to get an up-to-date dominator
> tree.
> DT.applyUpdates(DTUpdates);
>
> + // Now that we have an accurate dominator tree, first delete the dead
> cloned
> + // blocks so that we can accurately build any cloned loops. It is
> important to
> + // not delete the blocks from the original loop yet because we still
> want to
> + // reference the original loop to understand the cloned loop's
> structure.
> + deleteDeadClonedBlocks(L, ExitBlocks, VMaps, DT);
> +
> // Build the cloned loop structure itself. This may be substantially
> // different from the original structure due to the simplified CFG.
> This also
> // handles inserting all the cloned blocks into the correct loops.
> SmallVector<Loop *, 4> NonChildClonedLoops;
> - buildClonedLoops(L, ExitBlocks, VMap, LI, NonChildClonedLoops);
> -
> - // Delete anything that was made dead in the original loop due to
> - // unswitching.
> - if (!DeadBlocks.empty())
> - deleteDeadBlocksFromLoop(L, DeadBlocks, ExitBlocks, DT, LI);
> + for (std::unique_ptr<ValueToValueMapTy> &VMap : VMaps)
> + buildClonedLoops(L, ExitBlocks, *VMap, LI, NonChildClonedLoops);
>
> + // Now that our cloned loops have been built, we can update the
> original loop.
> + // First we delete the dead blocks from it and then we rebuild the loop
> + // structure taking these deletions into account.
> + deleteDeadBlocksFromLoop(L, ExitBlocks, DT, LI);
> SmallVector<Loop *, 4> HoistedLoops;
> bool IsStillLoop = rebuildLoopAfterUnswitch(L, ExitBlocks, LI,
> HoistedLoops);
>
> @@ -1790,31 +1827,37 @@ static bool unswitchInvariantBranch(
> // verification steps.
> assert(DT.verify(DominatorTree::VerificationLevel::Fast));
>
> - // Now we want to replace all the uses of the invariants within both the
> - // original and cloned blocks. We do this here so that we can use the
> now
> - // updated dominator tree to identify which side the users are on.
> - ConstantInt *UnswitchedReplacement =
> - Direction ? ConstantInt::getTrue(BI.getContext())
> - : ConstantInt::getFalse(BI.getContext());
> - ConstantInt *ContinueReplacement =
> - Direction ? ConstantInt::getFalse(BI.getContext())
> - : ConstantInt::getTrue(BI.getContext());
> - for (Value *Invariant : Invariants)
> - for (auto UI = Invariant->use_begin(), UE = Invariant->use_end();
> - UI != UE;) {
> - // Grab the use and walk past it so we can clobber it in the use
> list.
> - Use *U = &*UI++;
> - Instruction *UserI = dyn_cast<Instruction>(U->getUser());
> - if (!UserI)
> - continue;
> + if (BI) {
> + // If we unswitched a branch which collapses the condition to a known
> + // constant we want to replace all the uses of the invariants within
> both
> + // the original and cloned blocks. We do this here so that we can use
> the
> + // now updated dominator tree to identify which side the users are on.
> + assert(UnswitchedSuccBBs.size() == 1 &&
> + "Only one possible unswitched block for a branch!");
> + BasicBlock *ClonedPH = ClonedPHs.begin()->second;
> + ConstantInt *UnswitchedReplacement =
> + Direction ? ConstantInt::getTrue(BI->getContext())
> + : ConstantInt::getFalse(BI->getContext());
> + ConstantInt *ContinueReplacement =
> + Direction ? ConstantInt::getFalse(BI->getContext())
> + : ConstantInt::getTrue(BI->getContext());
> + for (Value *Invariant : Invariants)
> + for (auto UI = Invariant->use_begin(), UE = Invariant->use_end();
> + UI != UE;) {
> + // Grab the use and walk past it so we can clobber it in the use
> list.
> + Use *U = &*UI++;
> + Instruction *UserI = dyn_cast<Instruction>(U->getUser());
> + if (!UserI)
> + continue;
>
> - // Replace it with the 'continue' side if in the main loop body,
> and the
> - // unswitched if in the cloned blocks.
> - if (DT.dominates(LoopPH, UserI->getParent()))
> - U->set(ContinueReplacement);
> - else if (DT.dominates(ClonedPH, UserI->getParent()))
> - U->set(UnswitchedReplacement);
> - }
> + // Replace it with the 'continue' side if in the main loop body,
> and the
> + // unswitched if in the cloned blocks.
> + if (DT.dominates(LoopPH, UserI->getParent()))
> + U->set(ContinueReplacement);
> + else if (DT.dominates(ClonedPH, UserI->getParent()))
> + U->set(UnswitchedReplacement);
> + }
> + }
>
> // We can change which blocks are exit blocks of all the cloned sibling
> // loops, the current loop, and any parent loops which shared exit
> blocks
> @@ -1937,8 +1980,16 @@ static bool unswitchBestCondition(
> if (LI.getLoopFor(BB) != &L)
> continue;
>
> + if (auto *SI = dyn_cast<SwitchInst>(BB->getTerminator())) {
> + // We can only consider fully loop-invariant switch conditions as
> we need
> + // to completely eliminate the switch after unswitching.
> + if (!isa<Constant>(SI->getCondition()) &&
> + L.isLoopInvariant(SI->getCondition()))
> + UnswitchCandidates.push_back({SI, {SI->getCondition()}});
> + continue;
> + }
> +
> auto *BI = dyn_cast<BranchInst>(BB->getTerminator());
> - // FIXME: Handle switches here!
> if (!BI || !BI->isConditional() || isa<Constant>(BI->getCondition())
> ||
> BI->getSuccessor(0) == BI->getSuccessor(1))
> continue;
> @@ -2091,9 +2142,9 @@ static bool unswitchBestCondition(
> TerminatorInst &TI = *TerminatorAndInvariants.first;
> ArrayRef<Value *> Invariants = TerminatorAndInvariants.second;
> BranchInst *BI = dyn_cast<BranchInst>(&TI);
> - int CandidateCost =
> - ComputeUnswitchedCost(TI, /*FullUnswitch*/ Invariants.size() == 1
> && BI &&
> - Invariants[0] ==
> BI->getCondition());
> + int CandidateCost = ComputeUnswitchedCost(
> + TI, /*FullUnswitch*/ !BI || (Invariants.size() == 1 &&
> + Invariants[0] ==
> BI->getCondition()));
> LLVM_DEBUG(dbgs() << " Computed cost of " << CandidateCost
> << " for unswitch candidate: " << TI << "\n");
> if (!BestUnswitchTI || CandidateCost < BestUnswitchCost) {
> @@ -2109,17 +2160,11 @@ static bool unswitchBestCondition(
> return false;
> }
>
> - auto *UnswitchBI = dyn_cast<BranchInst>(BestUnswitchTI);
> - if (!UnswitchBI) {
> - // FIXME: Add support for unswitching a switch here!
> - LLVM_DEBUG(dbgs() << "Cannot unswitch anything but a branch!\n");
> - return false;
> - }
> -
> LLVM_DEBUG(dbgs() << " Trying to unswitch non-trivial (cost = "
> - << BestUnswitchCost << ") branch: " << *UnswitchBI <<
> "\n");
> - return unswitchInvariantBranch(L, *UnswitchBI, BestUnswitchInvariants,
> DT, LI,
> - AC, UnswitchCB);
> + << BestUnswitchCost << ") terminator: " <<
> *BestUnswitchTI
> + << "\n");
> + return unswitchNontrivialInvariants(
> + L, *BestUnswitchTI, BestUnswitchInvariants, DT, LI, AC, UnswitchCB);
> }
>
> /// Unswitch control flow predicated on loop invariant conditions.
>
> Modified:
> llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll
> URL:
> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll?rev=335553&r1=335552&r2=335553&view=diff
>
> ==============================================================================
> --- llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll
> (original)
> +++ llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll
> Mon Jun 25 16:32:54 2018
> @@ -387,7 +387,7 @@ loop_begin:
> loop_b:
> %b = load i32, i32* %b.ptr
> br i1 %v, label %loop_begin, label %loop_exit
> -; The 'loop_b' unswitched loop.
> +; The original loop, now non-looping due to unswitching..
> ;
> ; CHECK: entry.split:
> ; CHECK-NEXT: br label %loop_begin
> @@ -398,14 +398,13 @@ loop_b:
> ; CHECK-NEXT: br label %loop_exit.split
> ;
> ; CHECK: loop_exit.split:
> -; CHECK-NEXT: %[[A_LCSSA:.*]] = phi i32 [ %[[A]], %loop_begin ]
> ; CHECK-NEXT: br label %loop_exit
>
> loop_exit:
> %ab.phi = phi i32 [ %b, %loop_b ], [ %a, %loop_begin ]
> ret i32 %ab.phi
> ; CHECK: loop_exit:
> -; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A_LCSSA]],
> %loop_exit.split ], [ %[[B_LCSSA]], %loop_exit.split.us ]
> +; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A]], %loop_exit.split ], [
> %[[B_LCSSA]], %loop_exit.split.us ]
> ; CHECK-NEXT: ret i32 %[[AB_PHI]]
> }
>
> @@ -458,8 +457,7 @@ loop_exit1:
> call void @sink1(i32 %a.phi)
> ret void
> ; CHECK: loop_exit1:
> -; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %
> loop_exit1.split.us ]
> -; CHECK-NEXT: call void @sink1(i32 %[[A_PHI]])
> +; CHECK-NEXT: call void @sink1(i32 %[[A_LCSSA]])
> ; CHECK-NEXT: ret void
>
> loop_exit2:
> @@ -467,8 +465,8 @@ loop_exit2:
> call void @sink2(i32 %b.phi)
> ret void
> ; CHECK: loop_exit2:
> -; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B]], %loop_b ]
> -; CHECK-NEXT: call void @sink2(i32 %[[B_PHI]])
> +; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B]], %loop_b ]
> +; CHECK-NEXT: call void @sink2(i32 %[[B_LCSSA]])
> ; CHECK-NEXT: ret void
> }
>
> @@ -531,8 +529,7 @@ loop_exit2:
> call void @sink2(i32 %b.phi)
> ret void
> ; CHECK: loop_exit2:
> -; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B_LCSSA]], %
> loop_exit2.split.us ]
> -; CHECK-NEXT: call void @sink2(i32 %[[B_PHI]])
> +; CHECK-NEXT: call void @sink2(i32 %[[B_LCSSA]])
> ; CHECK-NEXT: ret void
> }
>
> @@ -587,8 +584,7 @@ loop_exit1:
> call void @sink1(i32 %a.phi)
> br label %exit
> ; CHECK: loop_exit1:
> -; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %
> loop_exit1.split.us ]
> -; CHECK-NEXT: call void @sink1(i32 %[[A_PHI]])
> +; CHECK-NEXT: call void @sink1(i32 %[[A_LCSSA]])
> ; CHECK-NEXT: br label %exit
>
> loop_exit2:
> @@ -596,8 +592,8 @@ loop_exit2:
> call void @sink2(i32 %b.phi)
> br label %exit
> ; CHECK: loop_exit2:
> -; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B]], %loop_b ]
> -; CHECK-NEXT: call void @sink2(i32 %[[B_PHI]])
> +; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B]], %loop_b ]
> +; CHECK-NEXT: call void @sink2(i32 %[[B_LCSSA]])
> ; CHECK-NEXT: br label %exit
>
> exit:
> @@ -663,7 +659,7 @@ loop_latch:
> %v2 = load i1, i1* %ptr
> br i1 %v2, label %loop_begin, label %loop_exit
> ; CHECK: loop_latch:
> -; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B]], %inner_loop_b ]
> +; CHECK-NEXT: %[[B_INNER_LCSSA:.*]] = phi i32 [ %[[B]], %inner_loop_b ]
> ; CHECK-NEXT: %[[V2:.*]] = load i1, i1* %ptr
> ; CHECK-NEXT: br i1 %[[V2]], label %loop_begin, label
> %loop_exit.loopexit1
>
> @@ -671,15 +667,14 @@ loop_exit:
> %ab.phi = phi i32 [ %a, %inner_loop_begin ], [ %b.phi, %loop_latch ]
> ret i32 %ab.phi
> ; CHECK: loop_exit.loopexit:
> -; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %
> loop_exit.loopexit.split.us ]
> ; CHECK-NEXT: br label %loop_exit
> ;
> ; CHECK: loop_exit.loopexit1:
> -; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B_LCSSA]], %loop_latch ]
> +; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B_INNER_LCSSA]],
> %loop_latch ]
> ; CHECK-NEXT: br label %loop_exit
> ;
> ; CHECK: loop_exit:
> -; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A_PHI]],
> %loop_exit.loopexit ], [ %[[B_PHI]], %loop_exit.loopexit1 ]
> +; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A_LCSSA]],
> %loop_exit.loopexit ], [ %[[B_LCSSA]], %loop_exit.loopexit1 ]
> ; CHECK-NEXT: ret i32 %[[AB_PHI]]
> }
>
> @@ -773,11 +768,10 @@ latch:
> ; CHECK-NEXT: br label %latch
> ;
> ; CHECK: latch:
> -; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B_INNER_LCSSA]],
> %loop_b_inner_exit ]
> ; CHECK-NEXT: br i1 %[[V]], label %loop_begin, label %loop_exit.split
> ;
> ; CHECK: loop_exit.split:
> -; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B_PHI]], %latch ]
> +; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B_INNER_LCSSA]], %latch ]
> ; CHECK-NEXT: br label %loop_exit
>
> loop_exit:
> @@ -1466,7 +1460,6 @@ inner_loop_exit:
> %v = load i1, i1* %ptr
> br i1 %v, label %loop_begin, label %loop_exit
> ; CHECK: inner_loop_exit:
> -; CHECK-NEXT: %[[A_INNER_LCSSA:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]],
> %inner_loop_exit.split.us ]
> ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
> ; CHECK-NEXT: br i1 %[[V]], label %loop_begin, label %loop_exit
>
> @@ -1474,7 +1467,7 @@ loop_exit:
> %a.lcssa = phi i32 [ %a.inner_lcssa, %inner_loop_exit ]
> ret i32 %a.lcssa
> ; CHECK: loop_exit:
> -; CHECK-NEXT: %[[A_LCSSA:.*]] = phi i32 [ %[[A_INNER_LCSSA]],
> %inner_loop_exit ]
> +; CHECK-NEXT: %[[A_LCSSA:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]],
> %inner_loop_exit ]
> ; CHECK-NEXT: ret i32 %[[A_LCSSA]]
> }
>
> @@ -1555,7 +1548,7 @@ loop_exit:
> ret i32 %a.lcssa
> ; CHECK: loop_exit:
> ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %loop_exit.split
> ], [ %[[A_PHI_US]], %loop_exit.split.us ]
> -; CHECK-NEXT: ret i32 %[[AB_PHI]]
> +; CHECK-NEXT: ret i32 %[[A_PHI]]
> }
>
> ; Test that requires re-forming dedicated exits for the original loop.
> @@ -1635,7 +1628,7 @@ loop_exit:
> ret i32 %a.lcssa
> ; CHECK: loop_exit:
> ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_PHI_SPLIT]],
> %loop_exit.split ], [ %[[A_LCSSA_US]], %loop_exit.split.us ]
> -; CHECK-NEXT: ret i32 %[[AB_PHI]]
> +; CHECK-NEXT: ret i32 %[[A_PHI]]
> }
>
> ; Check that if a cloned inner loop after unswitching doesn't loop and
> directly
> @@ -1721,7 +1714,6 @@ loop_exit:
> %a.lcssa = phi i32 [ %a, %inner_loop_begin ], [ %a.inner_lcssa,
> %inner_loop_exit ]
> ret i32 %a.lcssa
> ; CHECK: loop_exit.loopexit:
> -; CHECK-NEXT: %[[A_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %
> loop_exit.loopexit.split.us ]
> ; CHECK-NEXT: br label %loop_exit
> ;
> ; CHECK: loop_exit.loopexit1:
> @@ -1729,7 +1721,7 @@ loop_exit:
> ; CHECK-NEXT: br label %loop_exit
> ;
> ; CHECK: loop_exit:
> -; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA_US]],
> %loop_exit.loopexit ], [ %[[A_LCSSA]], %loop_exit.loopexit1 ]
> +; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]],
> %loop_exit.loopexit ], [ %[[A_LCSSA]], %loop_exit.loopexit1 ]
> ; CHECK-NEXT: ret i32 %[[A_PHI]]
> }
>
> @@ -1802,7 +1794,6 @@ inner_loop_exit:
> %v3 = load i1, i1* %ptr
> br i1 %v3, label %loop_latch, label %loop_exit
> ; CHECK: inner_loop_exit:
> -; CHECK-NEXT: %[[A_INNER_PHI:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %
> inner_loop_exit.split.us ]
> ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
> ; CHECK-NEXT: br i1 %[[V]], label %loop_latch, label
> %loop_exit.loopexit1
>
> @@ -1819,7 +1810,7 @@ loop_exit:
> ; CHECK-NEXT: br label %loop_exit
> ;
> ; CHECK: loop_exit.loopexit1:
> -; CHECK-NEXT: %[[A_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_PHI]],
> %inner_loop_exit ]
> +; CHECK-NEXT: %[[A_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]],
> %inner_loop_exit ]
> ; CHECK-NEXT: br label %loop_exit
> ;
> ; CHECK: loop_exit:
> @@ -1916,7 +1907,6 @@ inner_loop_exit:
> %v4 = load i1, i1* %ptr
> br i1 %v4, label %loop_begin, label %loop_exit
> ; CHECK: inner_loop_exit.loopexit:
> -; CHECK-NEXT: %[[A_INNER_LCSSA_US:.*]] = phi i32 [
> %[[A_INNER_INNER_LCSSA_US]], %inner_loop_exit.loopexit.split.us ]
> ; CHECK-NEXT: br label %inner_loop_exit
> ;
> ; CHECK: inner_loop_exit.loopexit1:
> @@ -1924,7 +1914,7 @@ inner_loop_exit:
> ; CHECK-NEXT: br label %inner_loop_exit
> ;
> ; CHECK: inner_loop_exit:
> -; CHECK-NEXT: %[[A_INNER_PHI:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]],
> %inner_loop_exit.loopexit ], [ %[[A_INNER_LCSSA]],
> %inner_loop_exit.loopexit1 ]
> +; CHECK-NEXT: %[[A_INNER_PHI:.*]] = phi i32 [
> %[[A_INNER_INNER_LCSSA_US]], %inner_loop_exit.loopexit ], [
> %[[A_INNER_LCSSA]], %inner_loop_exit.loopexit1 ]
> ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
> ; CHECK-NEXT: br i1 %[[V]], label %loop_begin, label %loop_exit
>
> @@ -2010,7 +2000,6 @@ inner_inner_loop_exit:
> %v3 = load i1, i1* %ptr
> br i1 %v3, label %inner_loop_latch, label %inner_loop_exit
> ; CHECK: inner_inner_loop_exit:
> -; CHECK-NEXT: %[[A_INNER_INNER_PHI:.*]] = phi i32 [
> %[[A_INNER_INNER_LCSSA_US]], %inner_inner_loop_exit.split.us ]
> ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
> ; CHECK-NEXT: br i1 %[[V]], label %inner_loop_latch, label
> %inner_loop_exit.loopexit1
>
> @@ -2028,7 +2017,7 @@ inner_loop_exit:
> ; CHECK-NEXT: br label %inner_loop_exit
> ;
> ; CHECK: inner_loop_exit.loopexit1:
> -; CHECK-NEXT: %[[A_INNER_LCSSA_US:.*]] = phi i32 [
> %[[A_INNER_INNER_PHI]], %inner_inner_loop_exit ]
> +; CHECK-NEXT: %[[A_INNER_LCSSA_US:.*]] = phi i32 [
> %[[A_INNER_INNER_LCSSA_US]], %inner_inner_loop_exit ]
> ; CHECK-NEXT: br label %inner_loop_exit
> ;
> ; CHECK: inner_loop_exit:
> @@ -2296,56 +2285,96 @@ define i32 @test20(i32* %var, i32 %cond1
> entry:
> br label %loop_begin
> ; CHECK-NEXT: entry:
> -; CHECK-NEXT: br label %loop_begin
> +; CHECK-NEXT: switch i32 %cond2, label %[[ENTRY_SPLIT_EXIT:.*]] [
> +; CHECK-NEXT: i32 0, label %[[ENTRY_SPLIT_A:.*]]
> +; CHECK-NEXT: i32 1, label %[[ENTRY_SPLIT_A]]
> +; CHECK-NEXT: i32 13, label %[[ENTRY_SPLIT_B:.*]]
> +; CHECK-NEXT: i32 2, label %[[ENTRY_SPLIT_A]]
> +; CHECK-NEXT: i32 42, label %[[ENTRY_SPLIT_C:.*]]
> +; CHECK-NEXT: ]
>
> loop_begin:
> %var_val = load i32, i32* %var
> - switch i32 %cond2, label %loop_a [
> - i32 0, label %loop_b
> - i32 1, label %loop_b
> - i32 13, label %loop_c
> - i32 2, label %loop_b
> - i32 42, label %loop_exit
> + switch i32 %cond2, label %loop_exit [
> + i32 0, label %loop_a
> + i32 1, label %loop_a
> + i32 13, label %loop_b
> + i32 2, label %loop_a
> + i32 42, label %loop_c
> ]
> -; CHECK: loop_begin:
> -; CHECK-NEXT: %[[V:.*]] = load i32, i32* %var
> -; CHECK-NEXT: switch i32 %cond2, label %loop_a [
> -; CHECK-NEXT: i32 0, label %loop_b
> -; CHECK-NEXT: i32 1, label %loop_b
> -; CHECK-NEXT: i32 13, label %loop_c
> -; CHECK-NEXT: i32 2, label %loop_b
> -; CHECK-NEXT: i32 42, label %loop_exit
> -; CHECK-NEXT: ]
>
> loop_a:
> call void @a()
> br label %loop_latch
> -; CHECK: loop_a:
> +; Unswitched 'a' loop.
> +;
> +; CHECK: [[ENTRY_SPLIT_A]]:
> +; CHECK-NEXT: br label %[[LOOP_BEGIN_A:.*]]
> +;
> +; CHECK: [[LOOP_BEGIN_A]]:
> +; CHECK-NEXT: %{{.*}} = load i32, i32* %var
> +; CHECK-NEXT: br label %[[LOOP_A:.*]]
> +;
> +; CHECK: [[LOOP_A]]:
> ; CHECK-NEXT: call void @a()
> -; CHECK-NEXT: br label %loop_latch
> +; CHECK-NEXT: br label %[[LOOP_LATCH_A:.*]]
> +;
> +; CHECK: [[LOOP_LATCH_A]]:
> +; CHECK: br label %[[LOOP_BEGIN_A]]
>
> loop_b:
> call void @b()
> br label %loop_latch
> -; CHECK: loop_b:
> +; Unswitched 'b' loop.
> +;
> +; CHECK: [[ENTRY_SPLIT_B]]:
> +; CHECK-NEXT: br label %[[LOOP_BEGIN_B:.*]]
> +;
> +; CHECK: [[LOOP_BEGIN_B]]:
> +; CHECK-NEXT: %{{.*}} = load i32, i32* %var
> +; CHECK-NEXT: br label %[[LOOP_B:.*]]
> +;
> +; CHECK: [[LOOP_B]]:
> ; CHECK-NEXT: call void @b()
> -; CHECK-NEXT: br label %loop_latch
> +; CHECK-NEXT: br label %[[LOOP_LATCH_B:.*]]
> +;
> +; CHECK: [[LOOP_LATCH_B]]:
> +; CHECK: br label %[[LOOP_BEGIN_B]]
>
> loop_c:
> call void @c() noreturn nounwind
> br label %loop_latch
> -; CHECK: loop_c:
> +; Unswitched 'c' loop.
> +;
> +; CHECK: [[ENTRY_SPLIT_C]]:
> +; CHECK-NEXT: br label %[[LOOP_BEGIN_C:.*]]
> +;
> +; CHECK: [[LOOP_BEGIN_C]]:
> +; CHECK-NEXT: %{{.*}} = load i32, i32* %var
> +; CHECK-NEXT: br label %[[LOOP_C:.*]]
> +;
> +; CHECK: [[LOOP_C]]:
> ; CHECK-NEXT: call void @c()
> -; CHECK-NEXT: br label %loop_latch
> +; CHECK-NEXT: br label %[[LOOP_LATCH_C:.*]]
> +;
> +; CHECK: [[LOOP_LATCH_C]]:
> +; CHECK: br label %[[LOOP_BEGIN_C]]
>
> loop_latch:
> br label %loop_begin
> -; CHECK: loop_latch:
> -; CHECK-NEXT: br label %loop_begin
>
> loop_exit:
> %lcssa = phi i32 [ %var_val, %loop_begin ]
> ret i32 %lcssa
> +; Unswitched exit edge (no longer a loop).
> +;
> +; CHECK: [[ENTRY_SPLIT_EXIT]]:
> +; CHECK-NEXT: br label %loop_begin
> +;
> +; CHECK: loop_begin:
> +; CHECK-NEXT: %[[V:.*]] = load i32, i32* %var
> +; CHECK-NEXT: br label %loop_exit
> +;
> ; CHECK: loop_exit:
> ; CHECK-NEXT: %[[LCSSA:.*]] = phi i32 [ %[[V]], %loop_begin ]
> ; CHECK-NEXT: ret i32 %[[LCSSA]]
> @@ -2824,3 +2853,112 @@ loop_exit:
> ; CHECK: loop_exit:
> ; CHECK-NEXT: ret
> }
> +
> +; Non-trivial unswitching of a switch.
> +define i32 @test27(i1* %ptr, i32 %cond) {
> +; CHECK-LABEL: @test27(
> +entry:
> + br label %loop_begin
> +; CHECK-NEXT: entry:
> +; CHECK-NEXT: switch i32 %cond, label %[[ENTRY_SPLIT_LATCH:.*]] [
> +; CHECK-NEXT: i32 0, label %[[ENTRY_SPLIT_A:.*]]
> +; CHECK-NEXT: i32 1, label %[[ENTRY_SPLIT_B:.*]]
> +; CHECK-NEXT: i32 2, label %[[ENTRY_SPLIT_C:.*]]
> +; CHECK-NEXT: ]
> +
> +loop_begin:
> + switch i32 %cond, label %latch [
> + i32 0, label %loop_a
> + i32 1, label %loop_b
> + i32 2, label %loop_c
> + ]
> +
> +loop_a:
> + call void @a()
> + br label %latch
> +; Unswitched 'a' loop.
> +;
> +; CHECK: [[ENTRY_SPLIT_A]]:
> +; CHECK-NEXT: br label %[[LOOP_BEGIN_A:.*]]
> +;
> +; CHECK: [[LOOP_BEGIN_A]]:
> +; CHECK-NEXT: br label %[[LOOP_A:.*]]
> +;
> +; CHECK: [[LOOP_A]]:
> +; CHECK-NEXT: call void @a()
> +; CHECK-NEXT: br label %[[LOOP_LATCH_A:.*]]
> +;
> +; CHECK: [[LOOP_LATCH_A]]:
> +; CHECK-NEXT: %[[V_A:.*]] = load i1, i1* %ptr
> +; CHECK: br i1 %[[V_A]], label %[[LOOP_BEGIN_A]], label
> %[[LOOP_EXIT_A:.*]]
> +;
> +; CHECK: [[LOOP_EXIT_A]]:
> +; CHECK-NEXT: br label %loop_exit
> +
> +loop_b:
> + call void @b()
> + br label %latch
> +; Unswitched 'b' loop.
> +;
> +; CHECK: [[ENTRY_SPLIT_B]]:
> +; CHECK-NEXT: br label %[[LOOP_BEGIN_B:.*]]
> +;
> +; CHECK: [[LOOP_BEGIN_B]]:
> +; CHECK-NEXT: br label %[[LOOP_B:.*]]
> +;
> +; CHECK: [[LOOP_B]]:
> +; CHECK-NEXT: call void @b()
> +; CHECK-NEXT: br label %[[LOOP_LATCH_B:.*]]
> +;
> +; CHECK: [[LOOP_LATCH_B]]:
> +; CHECK-NEXT: %[[V_B:.*]] = load i1, i1* %ptr
> +; CHECK: br i1 %[[V_B]], label %[[LOOP_BEGIN_B]], label
> %[[LOOP_EXIT_B:.*]]
> +;
> +; CHECK: [[LOOP_EXIT_B]]:
> +; CHECK-NEXT: br label %loop_exit
> +
> +loop_c:
> + call void @c()
> + br label %latch
> +; Unswitched 'c' loop.
> +;
> +; CHECK: [[ENTRY_SPLIT_C]]:
> +; CHECK-NEXT: br label %[[LOOP_BEGIN_C:.*]]
> +;
> +; CHECK: [[LOOP_BEGIN_C]]:
> +; CHECK-NEXT: br label %[[LOOP_C:.*]]
> +;
> +; CHECK: [[LOOP_C]]:
> +; CHECK-NEXT: call void @c()
> +; CHECK-NEXT: br label %[[LOOP_LATCH_C:.*]]
> +;
> +; CHECK: [[LOOP_LATCH_C]]:
> +; CHECK-NEXT: %[[V_C:.*]] = load i1, i1* %ptr
> +; CHECK: br i1 %[[V_C]], label %[[LOOP_BEGIN_C]], label
> %[[LOOP_EXIT_C:.*]]
> +;
> +; CHECK: [[LOOP_EXIT_C]]:
> +; CHECK-NEXT: br label %loop_exit
> +
> +latch:
> + %v = load i1, i1* %ptr
> + br i1 %v, label %loop_begin, label %loop_exit
> +; Unswitched the 'latch' only loop.
> +;
> +; CHECK: [[ENTRY_SPLIT_LATCH]]:
> +; CHECK-NEXT: br label %[[LOOP_BEGIN_LATCH:.*]]
> +;
> +; CHECK: [[LOOP_BEGIN_LATCH]]:
> +; CHECK-NEXT: br label %[[LOOP_LATCH_LATCH:.*]]
> +;
> +; CHECK: [[LOOP_LATCH_LATCH]]:
> +; CHECK-NEXT: %[[V_LATCH:.*]] = load i1, i1* %ptr
> +; CHECK: br i1 %[[V_LATCH]], label %[[LOOP_BEGIN_LATCH]], label
> %[[LOOP_EXIT_LATCH:.*]]
> +;
> +; CHECK: [[LOOP_EXIT_LATCH]]:
> +; CHECK-NEXT: br label %loop_exit
> +
> +loop_exit:
> + ret i32 0
> +; CHECK: loop_exit:
> +; CHECK-NEXT: ret i32 0
> +}
> \ No newline at end of file
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180709/e47b410c/attachment.html>
More information about the llvm-commits
mailing list