[llvm] r335553 - [PM/LoopUnswitch] Teach the new unswitch to handle nontrivial
Chandler Carruth via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 9 02:51:37 PDT 2018
Also +Alina Sbirlea <asbirlea at google.com> as she's been hacking on top of
this...
On Mon, Jul 9, 2018 at 2:51 AM Chandler Carruth <chandlerc at google.com>
wrote:
> FYI for folks testing out this functionality, there is a collection of
> serious bugs in this commit. I've got a fix and just need to add some
> testing. Will have it landed tomorrow. Just wanted to send a heads up.
>
> +Mikael Holmén <mikael.holmen at ericsson.com> +Fedor Sergeev
> <fedor.sergeev at azul.com>
>
> On Mon, Jun 25, 2018 at 4:37 PM Chandler Carruth via llvm-commits <
> llvm-commits at lists.llvm.org> wrote:
>
>> Author: chandlerc
>> Date: Mon Jun 25 16:32:54 2018
>> New Revision: 335553
>>
>> URL: http://llvm.org/viewvc/llvm-project?rev=335553&view=rev
>> Log:
>> [PM/LoopUnswitch] Teach the new unswitch to handle nontrivial
>> unswitching of switches.
>>
>> This works much like trivial unswitching of switches in that it reliably
>> moves the switch out of the loop. Here we potentially clone the entire
>> loop into each successor of the switch and re-point the cases at these
>> clones.
>>
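To make the shape of the transform concrete, here is a source-level sketch of what nontrivial unswitching of a switch amounts to. The pass itself works on LLVM IR, not C++; the function names and the three-way switch below are invented purely for illustration and do not come from the patch.

```cpp
#include <cassert>
#include <vector>

// Before: the switch on the loop-invariant `mode` is evaluated on every
// iteration of the loop.
int sumOriginal(const std::vector<int> &v, int mode) {
  int acc = 0;
  for (int x : v) {
    switch (mode) {
    case 0:  acc += x;     break;
    case 1:  acc -= x;     break;
    default: acc += 2 * x; break;
    }
  }
  return acc;
}

// After: the switch is hoisted out of the loop. Each case points at its own
// clone of the loop, and the default destination keeps the retained loop.
int sumUnswitched(const std::vector<int> &v, int mode) {
  int acc = 0;
  switch (mode) {
  case 0:  for (int x : v) acc += x;     break; // clone for case 0
  case 1:  for (int x : v) acc -= x;     break; // clone for case 1
  default: for (int x : v) acc += 2 * x; break; // retained loop
  }
  return acc;
}

// Hypothetical helper: both forms must agree for every value of `mode`.
void checkSameResult(const std::vector<int> &v) {
  for (int mode = 0; mode < 3; ++mode)
    assert(sumOriginal(v, mode) == sumUnswitched(v, mode));
}
```

The cost is code size: one loop body per distinct case successor, which is why this form is gated on a cost computation rather than always applied.
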
>> Due to the complexity of actually doing nontrivial unswitching, this
>> patch doesn't create a dedicated routine for handling switches -- it
>> would duplicate far too much code. Instead, it generalizes the existing
>> routine to handle both branches and switches as it largely reduces to
>> looping in a few places instead of doing something once. This actually
>> improves the results in some cases with branches due to being much more
>> careful about how dead regions of code are managed. With branches,
>> because exactly one clone is created and there are exactly two edges
>> considered, somewhat sloppy handling of the dead regions of code was
>> sufficient in most cases. But with switches, there are much more
>> complicated patterns of dead code and so I've had to move to a more
>> robust model generally. We still do as much pruning of the dead code
>> early as possible because that allows us to avoid even cloning the code.
>>
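The early pruning boils down to a single map lookup per block. The following is a toy model of that lookup, assuming blocks are identified by name; `Block`, `DominatingSuccMap`, `skipBlock` and the block names are stand-ins made up for this sketch, with `std::unordered_map` in place of the `SmallDenseMap` the patch uses.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

using Block = std::string; // stand-in for a BasicBlock pointer
// Maps a block to the successor of the unswitched terminator whose edge
// dominates it. Blocks dominated by no single successor are absent.
using DominatingSuccMap = std::unordered_map<Block, Block>;

// A block can be skipped when cloning for `unswitchedSucc` if it is dominated
// by some *other* successor: that clone can never reach it.
bool skipBlock(const DominatingSuccMap &dominatingSucc, const Block &bb,
               const Block &unswitchedSucc) {
  auto it = dominatingSucc.find(bb);
  return it != dominatingSucc.end() && it->second != unswitchedSucc;
}

// Hypothetical loop: `a_body` is only reachable through `case_a`, while
// `latch` is shared by all paths and so has no entry.
DominatingSuccMap exampleMap() {
  return {{"case_a", "case_a"}, {"a_body", "case_a"}, {"case_b", "case_b"}};
}
```

Blocks with no entry in the map are always cloned, which keeps the check conservative.
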
>> This also surfaced another existing problem with nontrivial unswitching:
>> we weren't as precise in reconstructing loops as we could have been.
>> This seems to have been mostly harmless, but resulted in
>> pointless LCSSA PHI nodes and other unnecessary cruft. With switches, we
>> have to get this *right*, and everything benefits from it.
>>
>> While the testing may seem a bit light here because we only have two
>> real cases with actual switches, they do a surprisingly good job of
>> exercising numerous edge cases. Also, because we share the logic with
>> branches, most of the changes in this patch are reasonably well covered
>> by existing tests.
>>
>> The new unswitch now has all of the same fundamental power as the old
>> one with the exception of the single unsound case of *partial* switch
>> unswitching -- that really is just loop specialization and not
>> unswitching at all. It doesn't fit into the canonicalization model in
>> any way. We can add a loop specialization pass that runs late based on
>> profile data if important test cases ever come up here.
>>
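For contrast with the partial switch case excluded above, here is a rough source-level picture of partial unswitching of a branch, which the pass does support. It assumes a condition of the form `inv || var` with `inv` loop-invariant; the function names are illustrative only and the real transform operates on IR.

```cpp
#include <cassert>
#include <vector>

// Before: the branch combines a loop-invariant flag with a per-element test.
int countOriginal(const std::vector<int> &v, bool inv) {
  int n = 0;
  for (int x : v)
    if (inv || x > 0)
      ++n;
  return n;
}

// After: only the invariant part is hoisted. The clone taken when `inv` is
// true has no branch left; the original loop keeps a simplified branch.
int countPartiallyUnswitched(const std::vector<int> &v, bool inv) {
  int n = 0;
  if (inv) {
    for (int x : v) { // cloned loop: condition is known true
      (void)x;
      ++n;
    }
  } else {
    for (int x : v) // original loop: `inv || x > 0` reduces to `x > 0`
      if (x > 0)
        ++n;
  }
  return n;
}
```
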
>> Differential Revision: https://reviews.llvm.org/D47683
>>
>> Modified:
>> llvm/trunk/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h
>> llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
>> llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll
>>
>> Modified: llvm/trunk/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h?rev=335553&r1=335552&r2=335553&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h (original)
>> +++ llvm/trunk/include/llvm/Transforms/Scalar/SimpleLoopUnswitch.h Mon Jun 25 16:32:54 2018
>> @@ -17,9 +17,9 @@
>>
>> namespace llvm {
>>
>> -/// This pass transforms loops that contain branches on loop-invariant
>> -/// conditions to have multiple loops. For example, it turns the left into the
>> -/// right code:
>> +/// This pass transforms loops that contain branches or switches on loop-
>> +/// invariant conditions to have multiple loops. For example, it turns the left
>> +/// into the right code:
>> ///
>> /// for (...) if (lic)
>> /// A for (...)
>> @@ -35,6 +35,31 @@ namespace llvm {
>> /// This pass expects LICM to be run before it to hoist invariant conditions out
>> /// of the loop, to make the unswitching opportunity obvious.
>> ///
>> +/// There is a taxonomy of unswitching that we use to classify different forms
>> +/// of this transformation:
>> +///
>> +/// - Trivial unswitching: this is when the condition can be unswitched without
>> +/// cloning any code from inside the loop. A non-trivial unswitch requires
>> +/// code duplication.
>> +///
>> +/// - Full unswitching: this is when the branch or switch is completely moved
>> +/// from inside the loop to outside the loop. Partial unswitching removes the
>> +/// branch from the clone of the loop but must leave a (somewhat simplified)
>> +/// branch in the original loop. While theoretically partial unswitching can
>> +/// be done for switches, the requirements are extreme - we need the loop
>> +/// invariant input to the switch to be sufficient to collapse to a single
>> +/// successor in each clone.
>> +///
>> +/// This pass always does trivial, full unswitching for both branches and
>> +/// switches. For branches, it also always does trivial, partial unswitching.
>> +///
>> +/// If enabled (via the constructor's `NonTrivial` parameter), this pass will
>> +/// additionally do non-trivial, full unswitching for branches and switches, and
>> +/// will do non-trivial, partial unswitching for branches.
>> +///
>> +/// Because partial unswitching of switches is extremely unlikely to be possible
>> +/// in practice and significantly complicates the implementation, this pass does
>> +/// not currently implement that in any mode.
>> class SimpleLoopUnswitchPass : public PassInfoMixin<SimpleLoopUnswitchPass> {
>> bool NonTrivial;
>>
>>
>> Modified: llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp?rev=335553&r1=335552&r2=335553&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp (original)
>> +++ llvm/trunk/lib/Transforms/Scalar/SimpleLoopUnswitch.cpp Mon Jun 25 16:32:54 2018
>> @@ -715,8 +715,12 @@ static bool unswitchAllTrivialConditions
>> ///
>> /// This routine handles cloning all of the necessary loop blocks and exit
>> /// blocks including rewriting their instructions and the relevant PHI nodes.
>> -/// It skips loop and exit blocks that are not necessary based on the provided
>> -/// set. It also correctly creates the unconditional branch in the cloned
>> +/// Any loop blocks or exit blocks which are dominated by a different successor
>> +/// than the one for this clone of the loop blocks can be trivially skipped. We
>> +/// use the `DominatingSucc` map to determine whether a block satisfies that
>> +/// property with a simple map lookup.
>> +///
>> +/// It also correctly creates the unconditional branch in the cloned
>> /// unswitched parent block to only point at the unswitched successor.
>> ///
>> /// This does not handle most of the necessary updates to `LoopInfo`. Only exit
>> @@ -730,7 +734,7 @@ static BasicBlock *buildClonedLoopBlocks
>> Loop &L, BasicBlock *LoopPH, BasicBlock *SplitBB,
>> ArrayRef<BasicBlock *> ExitBlocks, BasicBlock *ParentBB,
>> BasicBlock *UnswitchedSuccBB, BasicBlock *ContinueSuccBB,
>> - const SmallPtrSetImpl<BasicBlock *> &SkippedLoopAndExitBlocks,
>> + const SmallDenseMap<BasicBlock *, BasicBlock *, 16> &DominatingSucc,
>> ValueToValueMapTy &VMap,
>> SmallVectorImpl<DominatorTree::UpdateType> &DTUpdates,
>> AssumptionCache &AC,
>> DominatorTree &DT, LoopInfo &LI) {
>> @@ -751,19 +755,26 @@ static BasicBlock *buildClonedLoopBlocks
>> return NewBB;
>> };
>>
>> + // We skip cloning blocks when they have a dominating succ that is not the
>> + // succ we are cloning for.
>> + auto SkipBlock = [&](BasicBlock *BB) {
>> + auto It = DominatingSucc.find(BB);
>> + return It != DominatingSucc.end() && It->second != UnswitchedSuccBB;
>> + };
>> +
>> // First, clone the preheader.
>> auto *ClonedPH = CloneBlock(LoopPH);
>>
>> // Then clone all the loop blocks, skipping the ones that aren't
>> necessary.
>> for (auto *LoopBB : L.blocks())
>> - if (!SkippedLoopAndExitBlocks.count(LoopBB))
>> + if (!SkipBlock(LoopBB))
>> CloneBlock(LoopBB);
>>
>> // Split all the loop exit edges so that when we clone the exit
>> blocks, if
>> // any of the exit blocks are *also* a preheader for some other loop,
>> we
>> // don't create multiple predecessors entering the loop header.
>> for (auto *ExitBB : ExitBlocks) {
>> - if (SkippedLoopAndExitBlocks.count(ExitBB))
>> + if (SkipBlock(ExitBB))
>> continue;
>>
>> // When we are going to clone an exit, we don't need to clone all the
>> @@ -841,7 +852,7 @@ static BasicBlock *buildClonedLoopBlocks
>> // Update any PHI nodes in the cloned successors of the skipped blocks
>> to not
>> // have spurious incoming values.
>> for (auto *LoopBB : L.blocks())
>> - if (SkippedLoopAndExitBlocks.count(LoopBB))
>> + if (SkipBlock(LoopBB))
>> for (auto *SuccBB : successors(LoopBB))
>> if (auto *ClonedSuccBB =
>> cast_or_null<BasicBlock>(VMap.lookup(SuccBB)))
>> for (PHINode &PN : ClonedSuccBB->phis())
>> @@ -1175,10 +1186,41 @@ static void buildClonedLoops(Loop &OrigL
>> }
>>
>> static void
>> +deleteDeadClonedBlocks(Loop &L, ArrayRef<BasicBlock *> ExitBlocks,
>> + ArrayRef<std::unique_ptr<ValueToValueMapTy>> VMaps,
>> + DominatorTree &DT) {
>> + // Find all the dead clones, and remove them from their successors.
>> + SmallVector<BasicBlock *, 16> DeadBlocks;
>> + for (BasicBlock *BB : llvm::concat<BasicBlock *const>(L.blocks(), ExitBlocks))
>> + for (auto &VMap : VMaps)
>> + if (BasicBlock *ClonedBB = cast_or_null<BasicBlock>(VMap->lookup(BB)))
>> + if (!DT.isReachableFromEntry(ClonedBB)) {
>> + for (BasicBlock *SuccBB : successors(ClonedBB))
>> + SuccBB->removePredecessor(ClonedBB);
>> + DeadBlocks.push_back(ClonedBB);
>> + }
>> +
>> + // Drop any remaining references to break cycles.
>> + for (BasicBlock *BB : DeadBlocks)
>> + BB->dropAllReferences();
>> + // Erase them from the IR.
>> + for (BasicBlock *BB : DeadBlocks)
>> + BB->eraseFromParent();
>> +}
>> +
>> +static void
>> deleteDeadBlocksFromLoop(Loop &L,
>> - const SmallVectorImpl<BasicBlock *> &DeadBlocks,
>> SmallVectorImpl<BasicBlock *> &ExitBlocks,
>> DominatorTree &DT, LoopInfo &LI) {
>> + // Find all the dead blocks, and remove them from their successors.
>> + SmallVector<BasicBlock *, 16> DeadBlocks;
>> + for (BasicBlock *BB : llvm::concat<BasicBlock *const>(L.blocks(),
>> ExitBlocks))
>> + if (!DT.isReachableFromEntry(BB)) {
>> + for (BasicBlock *SuccBB : successors(BB))
>> + SuccBB->removePredecessor(BB);
>> + DeadBlocks.push_back(BB);
>> + }
>> +
>> SmallPtrSet<BasicBlock *, 16> DeadBlockSet(DeadBlocks.begin(),
>> DeadBlocks.end());
>>
>> @@ -1187,11 +1229,6 @@ deleteDeadBlocksFromLoop(Loop &L,
>> llvm::erase_if(ExitBlocks,
>> [&](BasicBlock *BB) { return DeadBlockSet.count(BB); });
>>
>> - // Remove these blocks from their successors.
>> - for (auto *BB : DeadBlocks)
>> - for (BasicBlock *SuccBB : successors(BB))
>> - SuccBB->removePredecessor(BB, /*DontDeleteUselessPHIs*/ true);
>> -
>> // Walk from this loop up through its parents removing all of the dead
>> blocks.
>> for (Loop *ParentL = &L; ParentL; ParentL = ParentL->getParentLoop()) {
>> for (auto *BB : DeadBlocks)
>> @@ -1582,31 +1619,24 @@ void visitDomSubTree(DominatorTree &DT,
>> } while (!DomWorklist.empty());
>> }
>>
>> -/// Take an invariant branch that has been determined to be safe and
>> worthwhile
>> -/// to unswitch despite being non-trivial to do so and perform the
>> unswitch.
>> -///
>> -/// This directly updates the CFG to hoist the predicate out of the
>> loop, and
>> -/// clone the necessary parts of the loop to maintain behavior.
>> -///
>> -/// It also updates both dominator tree and loopinfo based on the
>> unswitching.
>> -///
>> -/// Once unswitching has been performed it runs the provided callback to
>> report
>> -/// the new loops and no-longer valid loops to the caller.
>> -static bool unswitchInvariantBranch(
>> - Loop &L, BranchInst &BI, ArrayRef<Value *> Invariants, DominatorTree
>> &DT,
>> - LoopInfo &LI, AssumptionCache &AC,
>> +static bool unswitchNontrivialInvariants(
>> + Loop &L, TerminatorInst &TI, ArrayRef<Value *> Invariants,
>> + DominatorTree &DT, LoopInfo &LI, AssumptionCache &AC,
>> function_ref<void(bool, ArrayRef<Loop *>)> UnswitchCB) {
>> - auto *ParentBB = BI.getParent();
>> -
>> - // We can only unswitch conditional branches with an invariant
>> condition or
>> - // combining invariant conditions with an instruction.
>> - assert(BI.isConditional() && "Can only unswitch a conditional
>> branch!");
>> - bool FullUnswitch = BI.getCondition() == Invariants[0];
>> + auto *ParentBB = TI.getParent();
>> + BranchInst *BI = dyn_cast<BranchInst>(&TI);
>> + SwitchInst *SI = BI ? nullptr : cast<SwitchInst>(&TI);
>> +
>> + // We can only unswitch switches, conditional branches with an
>> invariant
>> + // condition, or combining invariant conditions with an instruction.
>> + assert((SI || BI->isConditional()) &&
>> + "Can only unswitch switches and conditional branch!");
>> + bool FullUnswitch = SI || BI->getCondition() == Invariants[0];
>> if (FullUnswitch)
>> assert(Invariants.size() == 1 &&
>> "Cannot have other invariants with full unswitching!");
>> else
>> - assert(isa<Instruction>(BI.getCondition()) &&
>> + assert(isa<Instruction>(BI->getCondition()) &&
>> "Partial unswitching requires an instruction as the
>> condition!");
>>
>> // Constant and BBs tracking the cloned and continuing successor. When
>> we are
>> @@ -1618,18 +1648,27 @@ static bool unswitchInvariantBranch(
>> bool Direction = true;
>> int ClonedSucc = 0;
>> if (!FullUnswitch) {
>> - if (cast<Instruction>(BI.getCondition())->getOpcode() !=
>> Instruction::Or) {
>> - assert(cast<Instruction>(BI.getCondition())->getOpcode() ==
>> Instruction::And &&
>> - "Only `or` and `and` instructions can combine invariants being
>> unswitched.");
>> + if (cast<Instruction>(BI->getCondition())->getOpcode() !=
>> Instruction::Or) {
>> + assert(cast<Instruction>(BI->getCondition())->getOpcode() ==
>> + Instruction::And &&
>> + "Only `or` and `and` instructions can combine invariants
>> being "
>> + "unswitched.");
>> Direction = false;
>> ClonedSucc = 1;
>> }
>> }
>> - auto *UnswitchedSuccBB = BI.getSuccessor(ClonedSucc);
>> - auto *ContinueSuccBB = BI.getSuccessor(1 - ClonedSucc);
>>
>> - assert(UnswitchedSuccBB != ContinueSuccBB &&
>> - "Should not unswitch a branch that always goes to the same
>> place!");
>> + BasicBlock *RetainedSuccBB =
>> + BI ? BI->getSuccessor(1 - ClonedSucc) : SI->getDefaultDest();
>> + SmallSetVector<BasicBlock *, 4> UnswitchedSuccBBs;
>> + if (BI)
>> + UnswitchedSuccBBs.insert(BI->getSuccessor(ClonedSucc));
>> + else
>> + for (auto Case : SI->cases())
>> + UnswitchedSuccBBs.insert(Case.getCaseSuccessor());
>> +
>> + assert(!UnswitchedSuccBBs.count(RetainedSuccBB) &&
>> + "Should not unswitch the same successor we are retaining!");
>>
>> // The branch should be in this exact loop. Any inner loop's invariant
>> branch
>> // should be handled by unswitching that inner loop. The caller of this
>> @@ -1648,9 +1687,6 @@ static bool unswitchInvariantBranch(
>> if (isa<CleanupPadInst>(ExitBB->getFirstNonPHI()))
>> return false;
>>
>> - SmallPtrSet<BasicBlock *, 4> ExitBlockSet(ExitBlocks.begin(),
>> - ExitBlocks.end());
>> -
>> // Compute the parent loop now before we start hacking on things.
>> Loop *ParentL = L.getParentLoop();
>>
>> @@ -1669,30 +1705,22 @@ static bool unswitchInvariantBranch(
>> OuterExitL = NewOuterExitL;
>> }
>>
>> - // If the edge we *aren't* cloning in the unswitch (the continuing
>> edge)
>> - // dominates its target, we can skip cloning the dominated region of
>> the loop
>> - // and its exits. We compute this as a set of nodes to be skipped.
>> - SmallPtrSet<BasicBlock *, 4> SkippedLoopAndExitBlocks;
>> - if (ContinueSuccBB->getUniquePredecessor() ||
>> - llvm::all_of(predecessors(ContinueSuccBB), [&](BasicBlock *PredBB)
>> {
>> - return PredBB == ParentBB || DT.dominates(ContinueSuccBB,
>> PredBB);
>> - })) {
>> - visitDomSubTree(DT, ContinueSuccBB, [&](BasicBlock *BB) {
>> - SkippedLoopAndExitBlocks.insert(BB);
>> - return true;
>> - });
>> - }
>> - // If we are doing full unswitching, then similarly to the above, the
>> edge we
>> - // *are* cloning in the unswitch (the unswitched edge) dominates its
>> target,
>> - // we will end up with dead nodes in the original loop and its exits
>> that will
>> - // need to be deleted. Here, we just retain that the property holds
>> and will
>> - // compute the deleted set later.
>> - bool DeleteUnswitchedSucc =
>> - FullUnswitch &&
>> - (UnswitchedSuccBB->getUniquePredecessor() ||
>> - llvm::all_of(predecessors(UnswitchedSuccBB), [&](BasicBlock
>> *PredBB) {
>> - return PredBB == ParentBB || DT.dominates(UnswitchedSuccBB,
>> PredBB);
>> - }));
>> + // If the edge from this terminator to a successor dominates that successor,
>> + // store a map from each block in its dominator subtree to it. This lets us
>> + // tell when cloning for a particular successor if a block is dominated by
>> + // some *other* successor with a single data structure. We use this to
>> + // significantly reduce cloning.
>> + SmallDenseMap<BasicBlock *, BasicBlock *, 16> DominatingSucc;
>> + for (auto *SuccBB : llvm::concat<BasicBlock *const>(
>> + makeArrayRef(RetainedSuccBB), UnswitchedSuccBBs))
>> + if (SuccBB->getUniquePredecessor() ||
>> + llvm::all_of(predecessors(SuccBB), [&](BasicBlock *PredBB) {
>> + return PredBB == ParentBB || DT.dominates(SuccBB, PredBB);
>> + }))
>> + visitDomSubTree(DT, SuccBB, [&](BasicBlock *BB) {
>> + DominatingSucc[BB] = SuccBB;
>> + return true;
>> + });
>>
>> // Split the preheader, so that we know that there is a safe place to
>> insert
>> // the conditional branch. We will change the preheader to have a
>> conditional
>> @@ -1702,84 +1730,93 @@ static bool unswitchInvariantBranch(
>> BasicBlock *SplitBB = L.getLoopPreheader();
>> BasicBlock *LoopPH = SplitEdge(SplitBB, L.getHeader(), &DT, &LI);
>>
>> - // Keep a mapping for the cloned values.
>> - ValueToValueMapTy VMap;
>> -
>> // Keep track of the dominator tree updates needed.
>> SmallVector<DominatorTree::UpdateType, 4> DTUpdates;
>>
>> - // Build the cloned blocks from the loop.
>> - auto *ClonedPH = buildClonedLoopBlocks(
>> - L, LoopPH, SplitBB, ExitBlocks, ParentBB, UnswitchedSuccBB,
>> - ContinueSuccBB, SkippedLoopAndExitBlocks, VMap, DTUpdates, AC, DT,
>> LI);
>> + // Clone the loop for each unswitched successor.
>> + SmallVector<std::unique_ptr<ValueToValueMapTy>, 4> VMaps;
>> + VMaps.reserve(UnswitchedSuccBBs.size());
>> + SmallDenseMap<BasicBlock *, BasicBlock *, 4> ClonedPHs;
>> + for (auto *SuccBB : UnswitchedSuccBBs) {
>> + VMaps.emplace_back(new ValueToValueMapTy());
>> + ClonedPHs[SuccBB] = buildClonedLoopBlocks(
>> + L, LoopPH, SplitBB, ExitBlocks, ParentBB, SuccBB, RetainedSuccBB,
>> + DominatingSucc, *VMaps.back(), DTUpdates, AC, DT, LI);
>> + }
>>
>> // The stitching of the branched code back together depends on whether
>> we're
>> // doing full unswitching or not with the exception that we always
>> want to
>> // nuke the initial terminator placed in the split block.
>> SplitBB->getTerminator()->eraseFromParent();
>> if (FullUnswitch) {
>> - // Remove the parent as a predecessor of the
>> - // unswitched successor.
>> - UnswitchedSuccBB->removePredecessor(ParentBB,
>> - /*DontDeleteUselessPHIs*/ true);
>> - DTUpdates.push_back({DominatorTree::Delete, ParentBB,
>> UnswitchedSuccBB});
>> -
>> - // Now splice the branch from the original loop and use it to select
>> between
>> - // the two loops.
>> - SplitBB->getInstList().splice(SplitBB->end(),
>> ParentBB->getInstList(), BI);
>> - BI.setSuccessor(ClonedSucc, ClonedPH);
>> - BI.setSuccessor(1 - ClonedSucc, LoopPH);
>> + for (BasicBlock *SuccBB : UnswitchedSuccBBs) {
>> + // Remove the parent as a predecessor of the unswitched successor.
>> + SuccBB->removePredecessor(ParentBB,
>> + /*DontDeleteUselessPHIs*/ true);
>> + DTUpdates.push_back({DominatorTree::Delete, ParentBB, SuccBB});
>> + }
>> +
>> + // Now splice the terminator from the original loop and rewrite its
>> + // successors.
>> + SplitBB->getInstList().splice(SplitBB->end(),
>> ParentBB->getInstList(), TI);
>> + if (BI) {
>> + assert(UnswitchedSuccBBs.size() == 1 &&
>> + "Only one possible unswitched block for a branch!");
>> + BasicBlock *ClonedPH = ClonedPHs.begin()->second;
>> + BI->setSuccessor(ClonedSucc, ClonedPH);
>> + BI->setSuccessor(1 - ClonedSucc, LoopPH);
>> + DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});
>> + } else {
>> + assert(SI && "Must either be a branch or switch!");
>> +
>> + // Walk the cases and directly update their successors.
>> + for (auto &Case : SI->cases())
>> + Case.setSuccessor(ClonedPHs.find(Case.getCaseSuccessor())->second);
>> + // We need to use the set to populate domtree updates as even when there
>> + // are multiple cases pointing at the same successor we only want to
>> + // insert one edge in the domtree.
>> + for (BasicBlock *SuccBB : UnswitchedSuccBBs)
>> + DTUpdates.push_back(
>> + {DominatorTree::Insert, SplitBB,
>> ClonedPHs.find(SuccBB)->second});
>> +
>> + SI->setDefaultDest(LoopPH);
>> + }
>>
>> // Create a new unconditional branch to the continuing block (as
>> opposed to
>> // the one cloned).
>> - BranchInst::Create(ContinueSuccBB, ParentBB);
>> + BranchInst::Create(RetainedSuccBB, ParentBB);
>> } else {
>> + assert(BI && "Only branches have partial unswitching.");
>> + assert(UnswitchedSuccBBs.size() == 1 &&
>> + "Only one possible unswitched block for a branch!");
>> + BasicBlock *ClonedPH = ClonedPHs.begin()->second;
>> // When doing a partial unswitch, we have to do a bit more work to
>> build up
>> // the branch in the split block.
>> buildPartialUnswitchConditionalBranch(*SplitBB, Invariants,
>> Direction,
>> *ClonedPH, *LoopPH);
>> + DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});
>> }
>>
>> - // Before we update the dominator tree, collect the dead blocks if
>> we're going
>> - // to end up deleting the unswitched successor.
>> - SmallVector<BasicBlock *, 16> DeadBlocks;
>> - if (DeleteUnswitchedSucc) {
>> - DeadBlocks.push_back(UnswitchedSuccBB);
>> - for (int i = 0; i < (int)DeadBlocks.size(); ++i) {
>> - // If we reach an exit block, stop recursing as the unswitched
>> loop will
>> - // end up reaching the merge block which we make the successor of
>> the
>> - // exit.
>> - if (ExitBlockSet.count(DeadBlocks[i]))
>> - continue;
>> -
>> - // Insert the children that are within the loop or exit block set.
>> Other
>> - // children may reach out of the loop. While we don't expect these
>> to be
>> - // dead (as the unswitched clone should reach them) we don't try
>> to prove
>> - // that here.
>> - for (DomTreeNode *ChildN : *DT[DeadBlocks[i]])
>> - if (L.contains(ChildN->getBlock()) ||
>> - ExitBlockSet.count(ChildN->getBlock()))
>> - DeadBlocks.push_back(ChildN->getBlock());
>> - }
>> - }
>> -
>> - // Add the remaining edge to our updates and apply them to get an
>> up-to-date
>> - // dominator tree. Note that this will cause the dead blocks above to
>> be
>> - // unreachable and no longer in the dominator tree.
>> - DTUpdates.push_back({DominatorTree::Insert, SplitBB, ClonedPH});
>> + // Apply the updates accumulated above to get an up-to-date dominator tree.
>> DT.applyUpdates(DTUpdates);
>>
>> + // Now that we have an accurate dominator tree, first delete the dead cloned
>> + // blocks so that we can accurately build any cloned loops. It is important to
>> + // not delete the blocks from the original loop yet because we still want to
>> + // reference the original loop to understand the cloned loop's structure.
>> + deleteDeadClonedBlocks(L, ExitBlocks, VMaps, DT);
>> +
>> // Build the cloned loop structure itself. This may be substantially
>> // different from the original structure due to the simplified CFG.
>> This also
>> // handles inserting all the cloned blocks into the correct loops.
>> SmallVector<Loop *, 4> NonChildClonedLoops;
>> - buildClonedLoops(L, ExitBlocks, VMap, LI, NonChildClonedLoops);
>> -
>> - // Delete anything that was made dead in the original loop due to
>> - // unswitching.
>> - if (!DeadBlocks.empty())
>> - deleteDeadBlocksFromLoop(L, DeadBlocks, ExitBlocks, DT, LI);
>> + for (std::unique_ptr<ValueToValueMapTy> &VMap : VMaps)
>> + buildClonedLoops(L, ExitBlocks, *VMap, LI, NonChildClonedLoops);
>>
>> + // Now that our cloned loops have been built, we can update the original loop.
>> + // First we delete the dead blocks from it and then we rebuild the loop
>> + // structure taking these deletions into account.
>> + deleteDeadBlocksFromLoop(L, ExitBlocks, DT, LI);
>> SmallVector<Loop *, 4> HoistedLoops;
>> bool IsStillLoop = rebuildLoopAfterUnswitch(L, ExitBlocks, LI,
>> HoistedLoops);
>>
>> @@ -1790,31 +1827,37 @@ static bool unswitchInvariantBranch(
>> // verification steps.
>> assert(DT.verify(DominatorTree::VerificationLevel::Fast));
>>
>> - // Now we want to replace all the uses of the invariants within both
>> the
>> - // original and cloned blocks. We do this here so that we can use the
>> now
>> - // updated dominator tree to identify which side the users are on.
>> - ConstantInt *UnswitchedReplacement =
>> - Direction ? ConstantInt::getTrue(BI.getContext())
>> - : ConstantInt::getFalse(BI.getContext());
>> - ConstantInt *ContinueReplacement =
>> - Direction ? ConstantInt::getFalse(BI.getContext())
>> - : ConstantInt::getTrue(BI.getContext());
>> - for (Value *Invariant : Invariants)
>> - for (auto UI = Invariant->use_begin(), UE = Invariant->use_end();
>> - UI != UE;) {
>> - // Grab the use and walk past it so we can clobber it in the use
>> list.
>> - Use *U = &*UI++;
>> - Instruction *UserI = dyn_cast<Instruction>(U->getUser());
>> - if (!UserI)
>> - continue;
>> + if (BI) {
>> + // If we unswitched a branch which collapses the condition to a known
>> + // constant we want to replace all the uses of the invariants within
>> both
>> + // the original and cloned blocks. We do this here so that we can
>> use the
>> + // now updated dominator tree to identify which side the users are
>> on.
>> + assert(UnswitchedSuccBBs.size() == 1 &&
>> + "Only one possible unswitched block for a branch!");
>> + BasicBlock *ClonedPH = ClonedPHs.begin()->second;
>> + ConstantInt *UnswitchedReplacement =
>> + Direction ? ConstantInt::getTrue(BI->getContext())
>> + : ConstantInt::getFalse(BI->getContext());
>> + ConstantInt *ContinueReplacement =
>> + Direction ? ConstantInt::getFalse(BI->getContext())
>> + : ConstantInt::getTrue(BI->getContext());
>> + for (Value *Invariant : Invariants)
>> + for (auto UI = Invariant->use_begin(), UE = Invariant->use_end();
>> + UI != UE;) {
>> + // Grab the use and walk past it so we can clobber it in the use
>> list.
>> + Use *U = &*UI++;
>> + Instruction *UserI = dyn_cast<Instruction>(U->getUser());
>> + if (!UserI)
>> + continue;
>>
>> - // Replace it with the 'continue' side if in the main loop body,
>> and the
>> - // unswitched if in the cloned blocks.
>> - if (DT.dominates(LoopPH, UserI->getParent()))
>> - U->set(ContinueReplacement);
>> - else if (DT.dominates(ClonedPH, UserI->getParent()))
>> - U->set(UnswitchedReplacement);
>> - }
>> + // Replace it with the 'continue' side if in the main loop body,
>> and the
>> + // unswitched if in the cloned blocks.
>> + if (DT.dominates(LoopPH, UserI->getParent()))
>> + U->set(ContinueReplacement);
>> + else if (DT.dominates(ClonedPH, UserI->getParent()))
>> + U->set(UnswitchedReplacement);
>> + }
>> + }
>>
>> // We can change which blocks are exit blocks of all the cloned sibling
>> // loops, the current loop, and any parent loops which shared exit
>> blocks
>> @@ -1937,8 +1980,16 @@ static bool unswitchBestCondition(
>> if (LI.getLoopFor(BB) != &L)
>> continue;
>>
>> + if (auto *SI = dyn_cast<SwitchInst>(BB->getTerminator())) {
>> + // We can only consider fully loop-invariant switch conditions as we need
>> + // to completely eliminate the switch after unswitching.
>> + if (!isa<Constant>(SI->getCondition()) &&
>> + L.isLoopInvariant(SI->getCondition()))
>> + UnswitchCandidates.push_back({SI, {SI->getCondition()}});
>> + continue;
>> + }
>> +
>> auto *BI = dyn_cast<BranchInst>(BB->getTerminator());
>> - // FIXME: Handle switches here!
>> if (!BI || !BI->isConditional() || isa<Constant>(BI->getCondition())
>> ||
>> BI->getSuccessor(0) == BI->getSuccessor(1))
>> continue;
>> @@ -2091,9 +2142,9 @@ static bool unswitchBestCondition(
>> TerminatorInst &TI = *TerminatorAndInvariants.first;
>> ArrayRef<Value *> Invariants = TerminatorAndInvariants.second;
>> BranchInst *BI = dyn_cast<BranchInst>(&TI);
>> - int CandidateCost =
>> - ComputeUnswitchedCost(TI, /*FullUnswitch*/ Invariants.size() ==
>> 1 && BI &&
>> - Invariants[0] ==
>> BI->getCondition());
>> + int CandidateCost = ComputeUnswitchedCost(
>> + TI, /*FullUnswitch*/ !BI || (Invariants.size() == 1 &&
>> + Invariants[0] ==
>> BI->getCondition()));
>> LLVM_DEBUG(dbgs() << " Computed cost of " << CandidateCost
>> << " for unswitch candidate: " << TI << "\n");
>> if (!BestUnswitchTI || CandidateCost < BestUnswitchCost) {
>> @@ -2109,17 +2160,11 @@ static bool unswitchBestCondition(
>> return false;
>> }
>>
>> - auto *UnswitchBI = dyn_cast<BranchInst>(BestUnswitchTI);
>> - if (!UnswitchBI) {
>> - // FIXME: Add support for unswitching a switch here!
>> - LLVM_DEBUG(dbgs() << "Cannot unswitch anything but a branch!\n");
>> - return false;
>> - }
>> -
>> LLVM_DEBUG(dbgs() << " Trying to unswitch non-trivial (cost = "
>> - << BestUnswitchCost << ") branch: " << *UnswitchBI
>> << "\n");
>> - return unswitchInvariantBranch(L, *UnswitchBI, BestUnswitchInvariants,
>> DT, LI,
>> - AC, UnswitchCB);
>> + << BestUnswitchCost << ") terminator: " <<
>> *BestUnswitchTI
>> + << "\n");
>> + return unswitchNontrivialInvariants(
>> + L, *BestUnswitchTI, BestUnswitchInvariants, DT, LI, AC,
>> UnswitchCB);
>> }
>>
>> /// Unswitch control flow predicated on loop invariant conditions.
>>
>> Modified:
>> llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll
>> URL:
>> http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll?rev=335553&r1=335552&r2=335553&view=diff
>>
>> ==============================================================================
>> --- llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll (original)
>> +++ llvm/trunk/test/Transforms/SimpleLoopUnswitch/nontrivial-unswitch.ll Mon Jun 25 16:32:54 2018
>> @@ -387,7 +387,7 @@ loop_begin:
>> loop_b:
>> %b = load i32, i32* %b.ptr
>> br i1 %v, label %loop_begin, label %loop_exit
>> -; The 'loop_b' unswitched loop.
>> +; The original loop, now non-looping due to unswitching.
>> ;
>> ; CHECK: entry.split:
>> ; CHECK-NEXT: br label %loop_begin
>> @@ -398,14 +398,13 @@ loop_b:
>> ; CHECK-NEXT: br label %loop_exit.split
>> ;
>> ; CHECK: loop_exit.split:
>> -; CHECK-NEXT: %[[A_LCSSA:.*]] = phi i32 [ %[[A]], %loop_begin ]
>> ; CHECK-NEXT: br label %loop_exit
>>
>> loop_exit:
>> %ab.phi = phi i32 [ %b, %loop_b ], [ %a, %loop_begin ]
>> ret i32 %ab.phi
>> ; CHECK: loop_exit:
>> -; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A_LCSSA]],
>> %loop_exit.split ], [ %[[B_LCSSA]], %loop_exit.split.us ]
>> +; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A]], %loop_exit.split ],
>> [ %[[B_LCSSA]], %loop_exit.split.us ]
>> ; CHECK-NEXT: ret i32 %[[AB_PHI]]
>> }
>>
>> @@ -458,8 +457,7 @@ loop_exit1:
>> call void @sink1(i32 %a.phi)
>> ret void
>> ; CHECK: loop_exit1:
>> -; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %loop_exit1.split.us ]
>> -; CHECK-NEXT: call void @sink1(i32 %[[A_PHI]])
>> +; CHECK-NEXT: call void @sink1(i32 %[[A_LCSSA]])
>> ; CHECK-NEXT: ret void
>>
>> loop_exit2:
>> @@ -467,8 +465,8 @@ loop_exit2:
>> call void @sink2(i32 %b.phi)
>> ret void
>> ; CHECK: loop_exit2:
>> -; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B]], %loop_b ]
>> -; CHECK-NEXT: call void @sink2(i32 %[[B_PHI]])
>> +; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B]], %loop_b ]
>> +; CHECK-NEXT: call void @sink2(i32 %[[B_LCSSA]])
>> ; CHECK-NEXT: ret void
>> }
>>
>> @@ -531,8 +529,7 @@ loop_exit2:
>> call void @sink2(i32 %b.phi)
>> ret void
>> ; CHECK: loop_exit2:
>> -; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B_LCSSA]], %loop_exit2.split.us ]
>> -; CHECK-NEXT: call void @sink2(i32 %[[B_PHI]])
>> +; CHECK-NEXT: call void @sink2(i32 %[[B_LCSSA]])
>> ; CHECK-NEXT: ret void
>> }
>>
>> @@ -587,8 +584,7 @@ loop_exit1:
>> call void @sink1(i32 %a.phi)
>> br label %exit
>> ; CHECK: loop_exit1:
>> -; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %loop_exit1.split.us ]
>> -; CHECK-NEXT: call void @sink1(i32 %[[A_PHI]])
>> +; CHECK-NEXT: call void @sink1(i32 %[[A_LCSSA]])
>> ; CHECK-NEXT: br label %exit
>>
>> loop_exit2:
>> @@ -596,8 +592,8 @@ loop_exit2:
>> call void @sink2(i32 %b.phi)
>> br label %exit
>> ; CHECK: loop_exit2:
>> -; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B]], %loop_b ]
>> -; CHECK-NEXT: call void @sink2(i32 %[[B_PHI]])
>> +; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B]], %loop_b ]
>> +; CHECK-NEXT: call void @sink2(i32 %[[B_LCSSA]])
>> ; CHECK-NEXT: br label %exit
>>
>> exit:
>> @@ -663,7 +659,7 @@ loop_latch:
>> %v2 = load i1, i1* %ptr
>> br i1 %v2, label %loop_begin, label %loop_exit
>> ; CHECK: loop_latch:
>> -; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B]], %inner_loop_b ]
>> +; CHECK-NEXT: %[[B_INNER_LCSSA:.*]] = phi i32 [ %[[B]], %inner_loop_b
>> ]
>> ; CHECK-NEXT: %[[V2:.*]] = load i1, i1* %ptr
>> ; CHECK-NEXT: br i1 %[[V2]], label %loop_begin, label
>> %loop_exit.loopexit1
>>
>> @@ -671,15 +667,14 @@ loop_exit:
>> %ab.phi = phi i32 [ %a, %inner_loop_begin ], [ %b.phi, %loop_latch ]
>> ret i32 %ab.phi
>> ; CHECK: loop_exit.loopexit:
>> -; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]], %loop_exit.loopexit.split.us ]
>> ; CHECK-NEXT: br label %loop_exit
>> ;
>> ; CHECK: loop_exit.loopexit1:
>> -; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B_LCSSA]], %loop_latch ]
>> +; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B_INNER_LCSSA]],
>> %loop_latch ]
>> ; CHECK-NEXT: br label %loop_exit
>> ;
>> ; CHECK: loop_exit:
>> -; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A_PHI]],
>> %loop_exit.loopexit ], [ %[[B_PHI]], %loop_exit.loopexit1 ]
>> +; CHECK-NEXT: %[[AB_PHI:.*]] = phi i32 [ %[[A_LCSSA]],
>> %loop_exit.loopexit ], [ %[[B_LCSSA]], %loop_exit.loopexit1 ]
>> ; CHECK-NEXT: ret i32 %[[AB_PHI]]
>> }
>>
>> @@ -773,11 +768,10 @@ latch:
>> ; CHECK-NEXT: br label %latch
>> ;
>> ; CHECK: latch:
>> -; CHECK-NEXT: %[[B_PHI:.*]] = phi i32 [ %[[B_INNER_LCSSA]],
>> %loop_b_inner_exit ]
>> ; CHECK-NEXT: br i1 %[[V]], label %loop_begin, label %loop_exit.split
>> ;
>> ; CHECK: loop_exit.split:
>> -; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B_PHI]], %latch ]
>> +; CHECK-NEXT: %[[B_LCSSA:.*]] = phi i32 [ %[[B_INNER_LCSSA]], %latch ]
>> ; CHECK-NEXT: br label %loop_exit
>>
>> loop_exit:
>> @@ -1466,7 +1460,6 @@ inner_loop_exit:
>> %v = load i1, i1* %ptr
>> br i1 %v, label %loop_begin, label %loop_exit
>> ; CHECK: inner_loop_exit:
>> -; CHECK-NEXT: %[[A_INNER_LCSSA:.*]] = phi i32 [
>> %[[A_INNER_LCSSA_US]], %inner_loop_exit.split.us ]
>> ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
>> ; CHECK-NEXT: br i1 %[[V]], label %loop_begin, label %loop_exit
>>
>> @@ -1474,7 +1467,7 @@ loop_exit:
>> %a.lcssa = phi i32 [ %a.inner_lcssa, %inner_loop_exit ]
>> ret i32 %a.lcssa
>> ; CHECK: loop_exit:
>> -; CHECK-NEXT: %[[A_LCSSA:.*]] = phi i32 [ %[[A_INNER_LCSSA]],
>> %inner_loop_exit ]
>> +; CHECK-NEXT: %[[A_LCSSA:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]],
>> %inner_loop_exit ]
>> ; CHECK-NEXT: ret i32 %[[A_LCSSA]]
>> }
>>
>> @@ -1555,7 +1548,7 @@ loop_exit:
>> ret i32 %a.lcssa
>> ; CHECK: loop_exit:
>> ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA]],
>> %loop_exit.split ], [ %[[A_PHI_US]], %loop_exit.split.us ]
>> -; CHECK-NEXT: ret i32 %[[AB_PHI]]
>> +; CHECK-NEXT: ret i32 %[[A_PHI]]
>> }
>>
>> ; Test that requires re-forming dedicated exits for the original loop.
>> @@ -1635,7 +1628,7 @@ loop_exit:
>> ret i32 %a.lcssa
>> ; CHECK: loop_exit:
>> ; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_PHI_SPLIT]],
>> %loop_exit.split ], [ %[[A_LCSSA_US]], %loop_exit.split.us ]
>> -; CHECK-NEXT: ret i32 %[[AB_PHI]]
>> +; CHECK-NEXT: ret i32 %[[A_PHI]]
>> }
>>
>> ; Check that if a cloned inner loop after unswitching doesn't loop and
>> directly
>> @@ -1721,7 +1714,6 @@ loop_exit:
>> %a.lcssa = phi i32 [ %a, %inner_loop_begin ], [ %a.inner_lcssa,
>> %inner_loop_exit ]
>> ret i32 %a.lcssa
>> ; CHECK: loop_exit.loopexit:
>> -; CHECK-NEXT: %[[A_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %loop_exit.loopexit.split.us ]
>> ; CHECK-NEXT: br label %loop_exit
>> ;
>> ; CHECK: loop_exit.loopexit1:
>> @@ -1729,7 +1721,7 @@ loop_exit:
>> ; CHECK-NEXT: br label %loop_exit
>> ;
>> ; CHECK: loop_exit:
>> -; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_LCSSA_US]],
>> %loop_exit.loopexit ], [ %[[A_LCSSA]], %loop_exit.loopexit1 ]
>> +; CHECK-NEXT: %[[A_PHI:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]],
>> %loop_exit.loopexit ], [ %[[A_LCSSA]], %loop_exit.loopexit1 ]
>> ; CHECK-NEXT: ret i32 %[[A_PHI]]
>> }
>>
>> @@ -1802,7 +1794,6 @@ inner_loop_exit:
>> %v3 = load i1, i1* %ptr
>> br i1 %v3, label %loop_latch, label %loop_exit
>> ; CHECK: inner_loop_exit:
>> -; CHECK-NEXT: %[[A_INNER_PHI:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]], %inner_loop_exit.split.us ]
>> ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
>> ; CHECK-NEXT: br i1 %[[V]], label %loop_latch, label
>> %loop_exit.loopexit1
>>
>> @@ -1819,7 +1810,7 @@ loop_exit:
>> ; CHECK-NEXT: br label %loop_exit
>> ;
>> ; CHECK: loop_exit.loopexit1:
>> -; CHECK-NEXT: %[[A_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_PHI]],
>> %inner_loop_exit ]
>> +; CHECK-NEXT: %[[A_LCSSA_US:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]],
>> %inner_loop_exit ]
>> ; CHECK-NEXT: br label %loop_exit
>> ;
>> ; CHECK: loop_exit:
>> @@ -1916,7 +1907,6 @@ inner_loop_exit:
>> %v4 = load i1, i1* %ptr
>> br i1 %v4, label %loop_begin, label %loop_exit
>> ; CHECK: inner_loop_exit.loopexit:
>> -; CHECK-NEXT: %[[A_INNER_LCSSA_US:.*]] = phi i32 [
>> %[[A_INNER_INNER_LCSSA_US]], %inner_loop_exit.loopexit.split.us ]
>> ; CHECK-NEXT: br label %inner_loop_exit
>> ;
>> ; CHECK: inner_loop_exit.loopexit1:
>> @@ -1924,7 +1914,7 @@ inner_loop_exit:
>> ; CHECK-NEXT: br label %inner_loop_exit
>> ;
>> ; CHECK: inner_loop_exit:
>> -; CHECK-NEXT: %[[A_INNER_PHI:.*]] = phi i32 [ %[[A_INNER_LCSSA_US]],
>> %inner_loop_exit.loopexit ], [ %[[A_INNER_LCSSA]],
>> %inner_loop_exit.loopexit1 ]
>> +; CHECK-NEXT: %[[A_INNER_PHI:.*]] = phi i32 [
>> %[[A_INNER_INNER_LCSSA_US]], %inner_loop_exit.loopexit ], [
>> %[[A_INNER_LCSSA]], %inner_loop_exit.loopexit1 ]
>> ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
>> ; CHECK-NEXT: br i1 %[[V]], label %loop_begin, label %loop_exit
>>
>> @@ -2010,7 +2000,6 @@ inner_inner_loop_exit:
>> %v3 = load i1, i1* %ptr
>> br i1 %v3, label %inner_loop_latch, label %inner_loop_exit
>> ; CHECK: inner_inner_loop_exit:
>> -; CHECK-NEXT: %[[A_INNER_INNER_PHI:.*]] = phi i32 [
>> %[[A_INNER_INNER_LCSSA_US]], %inner_inner_loop_exit.split.us ]
>> ; CHECK-NEXT: %[[V:.*]] = load i1, i1* %ptr
>> ; CHECK-NEXT: br i1 %[[V]], label %inner_loop_latch, label
>> %inner_loop_exit.loopexit1
>>
>> @@ -2028,7 +2017,7 @@ inner_loop_exit:
>> ; CHECK-NEXT: br label %inner_loop_exit
>> ;
>> ; CHECK: inner_loop_exit.loopexit1:
>> -; CHECK-NEXT: %[[A_INNER_LCSSA_US:.*]] = phi i32 [
>> %[[A_INNER_INNER_PHI]], %inner_inner_loop_exit ]
>> +; CHECK-NEXT: %[[A_INNER_LCSSA_US:.*]] = phi i32 [
>> %[[A_INNER_INNER_LCSSA_US]], %inner_inner_loop_exit ]
>> ; CHECK-NEXT: br label %inner_loop_exit
>> ;
>> ; CHECK: inner_loop_exit:
>> @@ -2296,56 +2285,96 @@ define i32 @test20(i32* %var, i32 %cond1
>> entry:
>> br label %loop_begin
>> ; CHECK-NEXT: entry:
>> -; CHECK-NEXT: br label %loop_begin
>> +; CHECK-NEXT: switch i32 %cond2, label %[[ENTRY_SPLIT_EXIT:.*]] [
>> +; CHECK-NEXT: i32 0, label %[[ENTRY_SPLIT_A:.*]]
>> +; CHECK-NEXT: i32 1, label %[[ENTRY_SPLIT_A]]
>> +; CHECK-NEXT: i32 13, label %[[ENTRY_SPLIT_B:.*]]
>> +; CHECK-NEXT: i32 2, label %[[ENTRY_SPLIT_A]]
>> +; CHECK-NEXT: i32 42, label %[[ENTRY_SPLIT_C:.*]]
>> +; CHECK-NEXT: ]
>>
>> loop_begin:
>> %var_val = load i32, i32* %var
>> - switch i32 %cond2, label %loop_a [
>> - i32 0, label %loop_b
>> - i32 1, label %loop_b
>> - i32 13, label %loop_c
>> - i32 2, label %loop_b
>> - i32 42, label %loop_exit
>> + switch i32 %cond2, label %loop_exit [
>> + i32 0, label %loop_a
>> + i32 1, label %loop_a
>> + i32 13, label %loop_b
>> + i32 2, label %loop_a
>> + i32 42, label %loop_c
>> ]
>> -; CHECK: loop_begin:
>> -; CHECK-NEXT: %[[V:.*]] = load i32, i32* %var
>> -; CHECK-NEXT: switch i32 %cond2, label %loop_a [
>> -; CHECK-NEXT: i32 0, label %loop_b
>> -; CHECK-NEXT: i32 1, label %loop_b
>> -; CHECK-NEXT: i32 13, label %loop_c
>> -; CHECK-NEXT: i32 2, label %loop_b
>> -; CHECK-NEXT: i32 42, label %loop_exit
>> -; CHECK-NEXT: ]
>>
>> loop_a:
>> call void @a()
>> br label %loop_latch
>> -; CHECK: loop_a:
>> +; Unswitched 'a' loop.
>> +;
>> +; CHECK: [[ENTRY_SPLIT_A]]:
>> +; CHECK-NEXT: br label %[[LOOP_BEGIN_A:.*]]
>> +;
>> +; CHECK: [[LOOP_BEGIN_A]]:
>> +; CHECK-NEXT: %{{.*}} = load i32, i32* %var
>> +; CHECK-NEXT: br label %[[LOOP_A:.*]]
>> +;
>> +; CHECK: [[LOOP_A]]:
>> ; CHECK-NEXT: call void @a()
>> -; CHECK-NEXT: br label %loop_latch
>> +; CHECK-NEXT: br label %[[LOOP_LATCH_A:.*]]
>> +;
>> +; CHECK: [[LOOP_LATCH_A]]:
>> +; CHECK: br label %[[LOOP_BEGIN_A]]
>>
>> loop_b:
>> call void @b()
>> br label %loop_latch
>> -; CHECK: loop_b:
>> +; Unswitched 'b' loop.
>> +;
>> +; CHECK: [[ENTRY_SPLIT_B]]:
>> +; CHECK-NEXT: br label %[[LOOP_BEGIN_B:.*]]
>> +;
>> +; CHECK: [[LOOP_BEGIN_B]]:
>> +; CHECK-NEXT: %{{.*}} = load i32, i32* %var
>> +; CHECK-NEXT: br label %[[LOOP_B:.*]]
>> +;
>> +; CHECK: [[LOOP_B]]:
>> ; CHECK-NEXT: call void @b()
>> -; CHECK-NEXT: br label %loop_latch
>> +; CHECK-NEXT: br label %[[LOOP_LATCH_B:.*]]
>> +;
>> +; CHECK: [[LOOP_LATCH_B]]:
>> +; CHECK: br label %[[LOOP_BEGIN_B]]
>>
>> loop_c:
>> call void @c() noreturn nounwind
>> br label %loop_latch
>> -; CHECK: loop_c:
>> +; Unswitched 'c' loop.
>> +;
>> +; CHECK: [[ENTRY_SPLIT_C]]:
>> +; CHECK-NEXT: br label %[[LOOP_BEGIN_C:.*]]
>> +;
>> +; CHECK: [[LOOP_BEGIN_C]]:
>> +; CHECK-NEXT: %{{.*}} = load i32, i32* %var
>> +; CHECK-NEXT: br label %[[LOOP_C:.*]]
>> +;
>> +; CHECK: [[LOOP_C]]:
>> ; CHECK-NEXT: call void @c()
>> -; CHECK-NEXT: br label %loop_latch
>> +; CHECK-NEXT: br label %[[LOOP_LATCH_C:.*]]
>> +;
>> +; CHECK: [[LOOP_LATCH_C]]:
>> +; CHECK: br label %[[LOOP_BEGIN_C]]
>>
>> loop_latch:
>> br label %loop_begin
>> -; CHECK: loop_latch:
>> -; CHECK-NEXT: br label %loop_begin
>>
>> loop_exit:
>> %lcssa = phi i32 [ %var_val, %loop_begin ]
>> ret i32 %lcssa
>> +; Unswitched exit edge (no longer a loop).
>> +;
>> +; CHECK: [[ENTRY_SPLIT_EXIT]]:
>> +; CHECK-NEXT: br label %loop_begin
>> +;
>> +; CHECK: loop_begin:
>> +; CHECK-NEXT: %[[V:.*]] = load i32, i32* %var
>> +; CHECK-NEXT: br label %loop_exit
>> +;
>> ; CHECK: loop_exit:
>> ; CHECK-NEXT: %[[LCSSA:.*]] = phi i32 [ %[[V]], %loop_begin ]
>> ; CHECK-NEXT: ret i32 %[[LCSSA]]
>> @@ -2824,3 +2853,112 @@ loop_exit:
>> ; CHECK: loop_exit:
>> ; CHECK-NEXT: ret
>> }
>> +
>> +; Non-trivial unswitching of a switch.
>> +define i32 @test27(i1* %ptr, i32 %cond) {
>> +; CHECK-LABEL: @test27(
>> +entry:
>> + br label %loop_begin
>> +; CHECK-NEXT: entry:
>> +; CHECK-NEXT: switch i32 %cond, label %[[ENTRY_SPLIT_LATCH:.*]] [
>> +; CHECK-NEXT: i32 0, label %[[ENTRY_SPLIT_A:.*]]
>> +; CHECK-NEXT: i32 1, label %[[ENTRY_SPLIT_B:.*]]
>> +; CHECK-NEXT: i32 2, label %[[ENTRY_SPLIT_C:.*]]
>> +; CHECK-NEXT: ]
>> +
>> +loop_begin:
>> + switch i32 %cond, label %latch [
>> + i32 0, label %loop_a
>> + i32 1, label %loop_b
>> + i32 2, label %loop_c
>> + ]
>> +
>> +loop_a:
>> + call void @a()
>> + br label %latch
>> +; Unswitched 'a' loop.
>> +;
>> +; CHECK: [[ENTRY_SPLIT_A]]:
>> +; CHECK-NEXT: br label %[[LOOP_BEGIN_A:.*]]
>> +;
>> +; CHECK: [[LOOP_BEGIN_A]]:
>> +; CHECK-NEXT: br label %[[LOOP_A:.*]]
>> +;
>> +; CHECK: [[LOOP_A]]:
>> +; CHECK-NEXT: call void @a()
>> +; CHECK-NEXT: br label %[[LOOP_LATCH_A:.*]]
>> +;
>> +; CHECK: [[LOOP_LATCH_A]]:
>> +; CHECK-NEXT: %[[V_A:.*]] = load i1, i1* %ptr
>> +; CHECK: br i1 %[[V_A]], label %[[LOOP_BEGIN_A]], label
>> %[[LOOP_EXIT_A:.*]]
>> +;
>> +; CHECK: [[LOOP_EXIT_A]]:
>> +; CHECK-NEXT: br label %loop_exit
>> +
>> +loop_b:
>> + call void @b()
>> + br label %latch
>> +; Unswitched 'b' loop.
>> +;
>> +; CHECK: [[ENTRY_SPLIT_B]]:
>> +; CHECK-NEXT: br label %[[LOOP_BEGIN_B:.*]]
>> +;
>> +; CHECK: [[LOOP_BEGIN_B]]:
>> +; CHECK-NEXT: br label %[[LOOP_B:.*]]
>> +;
>> +; CHECK: [[LOOP_B]]:
>> +; CHECK-NEXT: call void @b()
>> +; CHECK-NEXT: br label %[[LOOP_LATCH_B:.*]]
>> +;
>> +; CHECK: [[LOOP_LATCH_B]]:
>> +; CHECK-NEXT: %[[V_B:.*]] = load i1, i1* %ptr
>> +; CHECK: br i1 %[[V_B]], label %[[LOOP_BEGIN_B]], label
>> %[[LOOP_EXIT_B:.*]]
>> +;
>> +; CHECK: [[LOOP_EXIT_B]]:
>> +; CHECK-NEXT: br label %loop_exit
>> +
>> +loop_c:
>> + call void @c()
>> + br label %latch
>> +; Unswitched 'c' loop.
>> +;
>> +; CHECK: [[ENTRY_SPLIT_C]]:
>> +; CHECK-NEXT: br label %[[LOOP_BEGIN_C:.*]]
>> +;
>> +; CHECK: [[LOOP_BEGIN_C]]:
>> +; CHECK-NEXT: br label %[[LOOP_C:.*]]
>> +;
>> +; CHECK: [[LOOP_C]]:
>> +; CHECK-NEXT: call void @c()
>> +; CHECK-NEXT: br label %[[LOOP_LATCH_C:.*]]
>> +;
>> +; CHECK: [[LOOP_LATCH_C]]:
>> +; CHECK-NEXT: %[[V_C:.*]] = load i1, i1* %ptr
>> +; CHECK: br i1 %[[V_C]], label %[[LOOP_BEGIN_C]], label
>> %[[LOOP_EXIT_C:.*]]
>> +;
>> +; CHECK: [[LOOP_EXIT_C]]:
>> +; CHECK-NEXT: br label %loop_exit
>> +
>> +latch:
>> + %v = load i1, i1* %ptr
>> + br i1 %v, label %loop_begin, label %loop_exit
>> +; Unswitched the 'latch' only loop.
>> +;
>> +; CHECK: [[ENTRY_SPLIT_LATCH]]:
>> +; CHECK-NEXT: br label %[[LOOP_BEGIN_LATCH:.*]]
>> +;
>> +; CHECK: [[LOOP_BEGIN_LATCH]]:
>> +; CHECK-NEXT: br label %[[LOOP_LATCH_LATCH:.*]]
>> +;
>> +; CHECK: [[LOOP_LATCH_LATCH]]:
>> +; CHECK-NEXT: %[[V_LATCH:.*]] = load i1, i1* %ptr
>> +; CHECK: br i1 %[[V_LATCH]], label %[[LOOP_BEGIN_LATCH]], label
>> %[[LOOP_EXIT_LATCH:.*]]
>> +;
>> +; CHECK: [[LOOP_EXIT_LATCH]]:
>> +; CHECK-NEXT: br label %loop_exit
>> +
>> +loop_exit:
>> + ret i32 0
>> +; CHECK: loop_exit:
>> +; CHECK-NEXT: ret i32 0
>> +}
>> \ No newline at end of file
>>