[llvm] r199884 - [LPM] Make LoopSimplify no longer a LoopPass and instead both a utility
Andrew Trick
atrick at apple.com
Thu Jan 23 10:15:09 PST 2014
On Jan 23, 2014, at 3:23 AM, Chandler Carruth <chandlerc at gmail.com> wrote:
> Author: chandlerc
> Date: Thu Jan 23 05:23:19 2014
> New Revision: 199884
>
> URL: http://llvm.org/viewvc/llvm-project?rev=199884&view=rev
> Log:
> [LPM] Make LoopSimplify no longer a LoopPass and instead both a utility
> function and a FunctionPass.
>
> This has many benefits. The motivating use case was to be able to
> compute function analysis passes *after* running LoopSimplify (to avoid
> invalidating them) and then to run other passes which require
> LoopSimplify. Specifically passes like unrolling and vectorization are
> critical to wire up to BranchProbabilityInfo and BlockFrequencyInfo so
> that they can be profile aware. For the LoopVectorize pass the only
> things in the way are LoopSimplify and LCSSA. This fixes LoopSimplify
> and LCSSA is next on my list.
>
> There are also a bunch of other benefits of doing this:
> - It is now very feasible to make more passes *preserve* LoopSimplify
> because they can simply run it after changing a loop. Because
> subsequent passes can assume LoopSimplify is preserved we can reduce
> the runs of this pass to the times when we actually mutate a loop
> structure.
> - The new pass manager should be able to more easily support loop passes
> factored in this way.
> - We can at long, long last observe that LoopSimplify is preserved
> across SCEV. This *halves* the number of times we run LoopSimplify!!!
Awesome. I am so happy to see this!
-Andy
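
For readers who want the shape of "a utility function plus a FunctionPass" spelled out, here is a minimal, self-contained sketch. It is a toy model under stated assumptions, not the code from this commit: ToyLoop, simplifyOneToyLoop, simplifyToyNest and runOnToyFunction are hypothetical names, and the per-loop step only sets a flag instead of inserting preheaders or backedge blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Toy stand-in for a loop: it only tracks nesting and a "simplified" flag.
struct ToyLoop {
  bool Simplified = false;
  std::vector<std::unique_ptr<ToyLoop>> SubLoops;
};

// Hypothetical per-loop step. Returns true if it changed anything.
static bool simplifyOneToyLoop(ToyLoop &L) {
  if (L.Simplified)
    return false;
  L.Simplified = true;
  return true;
}

// Utility entry point: any transform can call this on a loop it has mutated.
// The nest is walked with an explicit worklist rather than recursion.
bool simplifyToyNest(ToyLoop &Root) {
  bool Changed = false;
  std::vector<ToyLoop *> Worklist;
  Worklist.push_back(&Root);

  // Collect the whole nest; children are appended while we index forward.
  for (std::size_t Idx = 0; Idx != Worklist.size(); ++Idx)
    for (auto &Sub : Worklist[Idx]->SubLoops)
      Worklist.push_back(Sub.get());

  // Pop from the back, so the most recently discovered loops go first.
  while (!Worklist.empty()) {
    ToyLoop *L = Worklist.back();
    Worklist.pop_back();
    Changed |= simplifyOneToyLoop(*L);
  }
  return Changed;
}

// Function-level driver: the "FunctionPass" half just calls the utility on
// every top-level loop of the function.
bool runOnToyFunction(std::vector<std::unique_ptr<ToyLoop>> &TopLevelLoops) {
  bool Changed = false;
  for (auto &L : TopLevelLoops)
    Changed |= simplifyToyNest(*L);
  return Changed;
}
```

Calling simplifyToyNest twice on the same nest returns true the first time and false the second, which is the property that lets later passes treat the form as preserved.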
> Now, getting here wasn't trivial. First off, the interfaces used by
> LoopSimplify are all over the map regarding how analyses are updated. We
> end up with weird "pass" parameters as a consequence. I'll try to clean
> at least some of this up later -- I'll have to have it all clean for the
> new pass manager.
>
> Next up I discovered a really frustrating bug. LoopUnroll *claims* to
> preserve LoopSimplify. That's actually a lie. But the way the
> LoopPassManager ends up running the passes, it always ran LoopSimplify
> on the unrolled-into loop, rectifying this oversight before any
> verification could kick in and point out that in fact nothing was
> preserved. So I've added code to the unroller to *actually* simplify the
> surrounding loop when it succeeds at unrolling.
>
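
The unroller fix described above can be pictured with a small, self-contained toy. Everything here is an assumption made for illustration: ToyLoop, isToyLoopSimplified, simplifyToyLoop and unrollToyLoop are hypothetical names, and the "damage" done to the enclosing loop is modelled by clearing a flag rather than by real CFG edits.

```cpp
#include <cassert>

// Toy stand-in for a loop. The two flags model the properties named in the
// LoopUtils.h comment: a preheader and a single backedge.
struct ToyLoop {
  bool HasPreheader = true;
  bool HasSingleBackedge = true;
  unsigned BodyCopies = 1;
};

bool isToyLoopSimplified(const ToyLoop &L) {
  return L.HasPreheader && L.HasSingleBackedge;
}

// Hypothetical utility: restore the canonical form. Returns true on change.
bool simplifyToyLoop(ToyLoop &L) {
  if (isToyLoopSimplified(L))
    return false;
  L.HasPreheader = true;
  L.HasSingleBackedge = true;
  return true;
}

// Hypothetical unroller. ASSUMPTION of this toy: a successful unroll leaves
// the enclosing loop without a single backedge.
bool unrollToyLoop(ToyLoop &L, unsigned Count, ToyLoop *Parent) {
  if (Count < 2)
    return false; // Nothing to do, so nothing was disturbed.

  L.BodyCopies *= Count;

  if (Parent) {
    Parent->HasSingleBackedge = false; // Modelled damage to the outer loop.
    simplifyToyLoop(*Parent);          // Repair it before returning.
    assert(isToyLoopSimplified(*Parent));
  }
  return true;
}
```

The point of the pattern is that the transform which breaks the form also repairs it, so a claim to preserve the form no longer depends on some other pass happening to run next.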
> The only functional change in the test suite is that we now catch a case
> that was previously missed because SCEV and other loop transforms see
> their containing loops as simplified and thus don't miss some
> opportunities. One test case has been converted to check that we catch
> this case rather than checking that we miss it but at least don't get
> the wrong answer.
>
> Note that I have #if-ed out all of the verification logic in
> LoopSimplify! This is a temporary workaround while extracting these bits
> from the LoopPassManager. Currently, there is no way to have a pass in
> the LoopPassManager which preserves LoopSimplify along with one which
> does not. The LPM will try to verify on each loop in the nest that
> LoopSimplify holds but the now-Function-pass cannot distinguish what
> loop is being verified and so must try to verify all of them. The
> innermost loop is clearly no longer simplified as there is a pass which
> didn't even *attempt* to preserve it. =/ Once I get LCSSA out (and maybe
> LoopVectorize and some other fixes) I'll be able to re-enable this check
> and catch any places where we are still failing to preserve
> LoopSimplify. If this causes problems I can back this out and try to
> commit *all* of this at once, but so far this seems to work and allow
> much more incremental progress.
>
> Modified:
> llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h
> llvm/trunk/include/llvm/Transforms/Utils/UnrollLoop.h
> llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp
> llvm/trunk/lib/Transforms/Utils/LoopSimplify.cpp
> llvm/trunk/lib/Transforms/Utils/LoopUnroll.cpp
> llvm/trunk/test/Transforms/IndVarSimplify/lftr-reuse.ll
>
> Modified: llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h?rev=199884&r1=199883&r2=199884&view=diff
> ==============================================================================
> --- llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h (original)
> +++ llvm/trunk/include/llvm/Transforms/Utils/LoopUtils.h Thu Jan 23 05:23:19 2014
> @@ -15,12 +15,25 @@
> #define LLVM_TRANSFORMS_UTILS_LOOPUTILS_H
>
> namespace llvm {
> -
> +class AliasAnalysis;
> +class BasicBlock;
> +class DominatorTree;
> class Loop;
> +class LoopInfo;
> class Pass;
> +class ScalarEvolution;
>
> BasicBlock *InsertPreheaderForLoop(Loop *L, Pass *P);
>
> +/// \brief Simplify each loop in a loop nest recursively.
> +///
> +/// This takes a potentially un-simplified loop L (and its children) and turns
> +/// it into a simplified loop nest with preheaders and single backedges. It
> +/// will optionally update \c AliasAnalysis and \c ScalarEvolution analyses if
> +/// passed into it.
> +bool simplifyLoop(Loop *L, DominatorTree *DT, LoopInfo *LI, Pass *PP,
> + AliasAnalysis *AA = 0, ScalarEvolution *SE = 0);
> +
> }
>
> #endif
>
> Modified: llvm/trunk/include/llvm/Transforms/Utils/UnrollLoop.h
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Transforms/Utils/UnrollLoop.h?rev=199884&r1=199883&r2=199884&view=diff
> ==============================================================================
> --- llvm/trunk/include/llvm/Transforms/Utils/UnrollLoop.h (original)
> +++ llvm/trunk/include/llvm/Transforms/Utils/UnrollLoop.h Thu Jan 23 05:23:19 2014
> @@ -21,9 +21,11 @@ namespace llvm {
> class Loop;
> class LoopInfo;
> class LPPassManager;
> +class Pass;
>
> bool UnrollLoop(Loop *L, unsigned Count, unsigned TripCount, bool AllowRuntime,
> - unsigned TripMultiple, LoopInfo* LI, LPPassManager* LPM);
> + unsigned TripMultiple, LoopInfo *LI, Pass *PP,
> + LPPassManager *LPM);
>
> bool UnrollRuntimeLoopProlog(Loop *L, unsigned Count, LoopInfo *LI,
> LPPassManager* LPM);
>
> Modified: llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp?rev=199884&r1=199883&r2=199884&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp (original)
> +++ llvm/trunk/lib/Transforms/Scalar/LoopUnrollPass.cpp Thu Jan 23 05:23:19 2014
> @@ -254,7 +254,7 @@ bool LoopUnroll::runOnLoop(Loop *L, LPPa
> }
>
> // Unroll the loop.
> - if (!UnrollLoop(L, Count, TripCount, Runtime, TripMultiple, LI, &LPM))
> + if (!UnrollLoop(L, Count, TripCount, Runtime, TripMultiple, LI, this, &LPM))
> return false;
>
> return true;
>
> Modified: llvm/trunk/lib/Transforms/Utils/LoopSimplify.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/LoopSimplify.cpp?rev=199884&r1=199883&r2=199884&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Utils/LoopSimplify.cpp (original)
> +++ llvm/trunk/lib/Transforms/Utils/LoopSimplify.cpp Thu Jan 23 05:23:19 2014
> @@ -42,11 +42,12 @@
> #include "llvm/ADT/DepthFirstIterator.h"
> #include "llvm/ADT/SetOperations.h"
> #include "llvm/ADT/SetVector.h"
> +#include "llvm/ADT/SmallVector.h"
> #include "llvm/ADT/Statistic.h"
> #include "llvm/Analysis/AliasAnalysis.h"
> #include "llvm/Analysis/DependenceAnalysis.h"
> #include "llvm/Analysis/InstructionSimplify.h"
> -#include "llvm/Analysis/LoopPass.h"
> +#include "llvm/Analysis/LoopInfo.h"
> #include "llvm/Analysis/ScalarEvolution.h"
> #include "llvm/IR/Constants.h"
> #include "llvm/IR/Dominators.h"
> @@ -65,142 +66,471 @@ using namespace llvm;
> STATISTIC(NumInserted, "Number of pre-header or exit blocks inserted");
> STATISTIC(NumNested , "Number of nested loops split out");
>
> -namespace {
> - struct LoopSimplify : public LoopPass {
> - static char ID; // Pass identification, replacement for typeid
> - LoopSimplify() : LoopPass(ID) {
> - initializeLoopSimplifyPass(*PassRegistry::getPassRegistry());
> +// If the block isn't already, move the new block to right after some 'outside
> +// block' block. This prevents the preheader from being placed inside the loop
> +// body, e.g. when the loop hasn't been rotated.
> +static void placeSplitBlockCarefully(BasicBlock *NewBB,
> + SmallVectorImpl<BasicBlock *> &SplitPreds,
> + Loop *L) {
> + // Check to see if NewBB is already well placed.
> + Function::iterator BBI = NewBB; --BBI;
> + for (unsigned i = 0, e = SplitPreds.size(); i != e; ++i) {
> + if (&*BBI == SplitPreds[i])
> + return;
> + }
> +
> + // If it isn't already after an outside block, move it after one. This is
> + // always good as it makes the uncond branch from the outside block into a
> + // fall-through.
> +
> + // Figure out *which* outside block to put this after. Prefer an outside
> + // block that neighbors a BB actually in the loop.
> + BasicBlock *FoundBB = 0;
> + for (unsigned i = 0, e = SplitPreds.size(); i != e; ++i) {
> + Function::iterator BBI = SplitPreds[i];
> + if (++BBI != NewBB->getParent()->end() &&
> + L->contains(BBI)) {
> + FoundBB = SplitPreds[i];
> + break;
> }
> + }
>
> - // AA - If we have an alias analysis object to update, this is it, otherwise
> - // this is null.
> - AliasAnalysis *AA;
> - LoopInfo *LI;
> - DominatorTree *DT;
> - ScalarEvolution *SE;
> - Loop *L;
> - virtual bool runOnLoop(Loop *L, LPPassManager &LPM);
> + // If our heuristic for a *good* bb to place this after doesn't find
> + // anything, just pick something. It's likely better than leaving it within
> + // the loop.
> + if (!FoundBB)
> + FoundBB = SplitPreds[0];
> + NewBB->moveAfter(FoundBB);
> +}
>
> - virtual void getAnalysisUsage(AnalysisUsage &AU) const {
> - // We need loop information to identify the loops...
> - AU.addRequired<DominatorTreeWrapperPass>();
> - AU.addPreserved<DominatorTreeWrapperPass>();
> +/// InsertPreheaderForLoop - Once we discover that a loop doesn't have a
> +/// preheader, this method is called to insert one. This method has two phases:
> +/// preheader insertion and analysis updating.
> +///
> +BasicBlock *llvm::InsertPreheaderForLoop(Loop *L, Pass *PP) {
> + BasicBlock *Header = L->getHeader();
>
> - AU.addRequired<LoopInfo>();
> - AU.addPreserved<LoopInfo>();
> + // Compute the set of predecessors of the loop that are not in the loop.
> + SmallVector<BasicBlock*, 8> OutsideBlocks;
> + for (pred_iterator PI = pred_begin(Header), PE = pred_end(Header);
> + PI != PE; ++PI) {
> + BasicBlock *P = *PI;
> + if (!L->contains(P)) { // Coming in from outside the loop?
> + // If the loop is branched to from an indirect branch, we won't
> + // be able to fully transform the loop, because it prohibits
> + // edge splitting.
> + if (isa<IndirectBrInst>(P->getTerminator())) return 0;
>
> - AU.addPreserved<AliasAnalysis>();
> - AU.addPreserved<ScalarEvolution>();
> - AU.addPreserved<DependenceAnalysis>();
> - AU.addPreservedID(BreakCriticalEdgesID); // No critical edges added.
> + // Keep track of it.
> + OutsideBlocks.push_back(P);
> }
> + }
>
> - /// verifyAnalysis() - Verify LoopSimplifyForm's guarantees.
> - void verifyAnalysis() const;
> + // Split out the loop pre-header.
> + BasicBlock *PreheaderBB;
> + if (!Header->isLandingPad()) {
> + PreheaderBB = SplitBlockPredecessors(Header, OutsideBlocks, ".preheader",
> + PP);
> + } else {
> + SmallVector<BasicBlock*, 2> NewBBs;
> + SplitLandingPadPredecessors(Header, OutsideBlocks, ".preheader",
> + ".split-lp", PP, NewBBs);
> + PreheaderBB = NewBBs[0];
> + }
>
> - private:
> - bool ProcessLoop(Loop *L, LPPassManager &LPM);
> - BasicBlock *RewriteLoopExitBlock(Loop *L, BasicBlock *Exit);
> - Loop *SeparateNestedLoop(Loop *L, LPPassManager &LPM,
> - BasicBlock *Preheader);
> - BasicBlock *InsertUniqueBackedgeBlock(Loop *L, BasicBlock *Preheader);
> - };
> + PreheaderBB->getTerminator()->setDebugLoc(
> + Header->getFirstNonPHI()->getDebugLoc());
> + DEBUG(dbgs() << "LoopSimplify: Creating pre-header "
> + << PreheaderBB->getName() << "\n");
> +
> + // Make sure that NewBB is put someplace intelligent, which doesn't mess up
> + // code layout too horribly.
> + placeSplitBlockCarefully(PreheaderBB, OutsideBlocks, L);
> +
> + return PreheaderBB;
> }
>
> -static void PlaceSplitBlockCarefully(BasicBlock *NewBB,
> - SmallVectorImpl<BasicBlock*> &SplitPreds,
> - Loop *L);
> +/// \brief Ensure that the loop preheader dominates all exit blocks.
> +///
> +/// This method is used to split exit blocks that have predecessors outside of
> +/// the loop.
> +static BasicBlock *rewriteLoopExitBlock(Loop *L, BasicBlock *Exit, Pass *PP) {
> + SmallVector<BasicBlock*, 8> LoopBlocks;
> + for (pred_iterator I = pred_begin(Exit), E = pred_end(Exit); I != E; ++I) {
> + BasicBlock *P = *I;
> + if (L->contains(P)) {
> + // Don't do this if the loop is exited via an indirect branch.
> + if (isa<IndirectBrInst>(P->getTerminator())) return 0;
>
> -char LoopSimplify::ID = 0;
> -INITIALIZE_PASS_BEGIN(LoopSimplify, "loop-simplify",
> - "Canonicalize natural loops", true, false)
> -INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
> -INITIALIZE_PASS_DEPENDENCY(LoopInfo)
> -INITIALIZE_PASS_END(LoopSimplify, "loop-simplify",
> - "Canonicalize natural loops", true, false)
> + LoopBlocks.push_back(P);
> + }
> + }
>
> -// Publicly exposed interface to pass...
> -char &llvm::LoopSimplifyID = LoopSimplify::ID;
> -Pass *llvm::createLoopSimplifyPass() { return new LoopSimplify(); }
> + assert(!LoopBlocks.empty() && "No edges coming in from outside the loop?");
> + BasicBlock *NewExitBB = 0;
>
> -/// runOnLoop - Run down all loops in the CFG (recursively, but we could do
> -/// it in any convenient order) inserting preheaders...
> -///
> -bool LoopSimplify::runOnLoop(Loop *l, LPPassManager &LPM) {
> - L = l;
> - bool Changed = false;
> - LI = &getAnalysis<LoopInfo>();
> - AA = getAnalysisIfAvailable<AliasAnalysis>();
> - DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
> - SE = getAnalysisIfAvailable<ScalarEvolution>();
> + if (Exit->isLandingPad()) {
> + SmallVector<BasicBlock*, 2> NewBBs;
> + SplitLandingPadPredecessors(Exit, ArrayRef<BasicBlock*>(&LoopBlocks[0],
> + LoopBlocks.size()),
> + ".loopexit", ".nonloopexit",
> + PP, NewBBs);
> + NewExitBB = NewBBs[0];
> + } else {
> + NewExitBB = SplitBlockPredecessors(Exit, LoopBlocks, ".loopexit", PP);
> + }
>
> - Changed |= ProcessLoop(L, LPM);
> + DEBUG(dbgs() << "LoopSimplify: Creating dedicated exit block "
> + << NewExitBB->getName() << "\n");
> + return NewExitBB;
> +}
>
> - return Changed;
> +/// Add the specified block, and all of its predecessors, to the specified set,
> +/// if it's not already in there. Stop predecessor traversal when we reach
> +/// StopBlock.
> +static void addBlockAndPredsToSet(BasicBlock *InputBB, BasicBlock *StopBlock,
> + std::set<BasicBlock*> &Blocks) {
> + SmallVector<BasicBlock *, 8> Worklist;
> + Worklist.push_back(InputBB);
> + do {
> + BasicBlock *BB = Worklist.pop_back_val();
> + if (Blocks.insert(BB).second && BB != StopBlock)
> + // If BB is not already processed and it is not a stop block then
> + // insert its predecessor in the work list
> + for (pred_iterator I = pred_begin(BB), E = pred_end(BB); I != E; ++I) {
> + BasicBlock *WBB = *I;
> + Worklist.push_back(WBB);
> + }
> + } while (!Worklist.empty());
> }
>
> -/// ProcessLoop - Walk the loop structure in depth first order, ensuring that
> -/// all loops have preheaders.
> -///
> -bool LoopSimplify::ProcessLoop(Loop *L, LPPassManager &LPM) {
> - bool Changed = false;
> -ReprocessLoop:
> +/// \brief The first part of loop-nestification is to find a PHI node that tells
> +/// us how to partition the loops.
> +static PHINode *findPHIToPartitionLoops(Loop *L, AliasAnalysis *AA,
> + DominatorTree *DT) {
> + for (BasicBlock::iterator I = L->getHeader()->begin(); isa<PHINode>(I); ) {
> + PHINode *PN = cast<PHINode>(I);
> + ++I;
> + if (Value *V = SimplifyInstruction(PN, 0, 0, DT)) {
> + // This is a degenerate PHI already, don't modify it!
> + PN->replaceAllUsesWith(V);
> + if (AA) AA->deleteValue(PN);
> + PN->eraseFromParent();
> + continue;
> + }
>
> - // Check to see that no blocks (other than the header) in this loop have
> - // predecessors that are not in the loop. This is not valid for natural
> - // loops, but can occur if the blocks are unreachable. Since they are
> - // unreachable we can just shamelessly delete those CFG edges!
> - for (Loop::block_iterator BB = L->block_begin(), E = L->block_end();
> - BB != E; ++BB) {
> - if (*BB == L->getHeader()) continue;
> + // Scan this PHI node looking for a use of the PHI node by itself.
> + for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i)
> + if (PN->getIncomingValue(i) == PN &&
> + L->contains(PN->getIncomingBlock(i)))
> + // We found something tasty to remove.
> + return PN;
> + }
> + return 0;
> +}
>
> - SmallPtrSet<BasicBlock*, 4> BadPreds;
> - for (pred_iterator PI = pred_begin(*BB),
> - PE = pred_end(*BB); PI != PE; ++PI) {
> - BasicBlock *P = *PI;
> - if (!L->contains(P))
> - BadPreds.insert(P);
> - }
> +/// \brief If this loop has multiple backedges, try to pull one of them out into
> +/// a nested loop.
> +///
> +/// This is important for code that looks like
> +/// this:
> +///
> +/// Loop:
> +/// ...
> +/// br cond, Loop, Next
> +/// ...
> +/// br cond2, Loop, Out
> +///
> +/// To identify this common case, we look at the PHI nodes in the header of the
> +/// loop. PHI nodes with unchanging values on one backedge correspond to values
> +/// that change in the "outer" loop, but not in the "inner" loop.
> +///
> +/// If we are able to separate out a loop, return the new outer loop that was
> +/// created.
> +///
> +static Loop *separateNestedLoop(Loop *L, BasicBlock *Preheader,
> + AliasAnalysis *AA, DominatorTree *DT,
> + LoopInfo *LI, ScalarEvolution *SE, Pass *PP) {
> + // Don't try to separate loops without a preheader.
> + if (!Preheader)
> + return 0;
>
> - // Delete each unique out-of-loop (and thus dead) predecessor.
> - for (SmallPtrSet<BasicBlock*, 4>::iterator I = BadPreds.begin(),
> - E = BadPreds.end(); I != E; ++I) {
> + // The header is not a landing pad; preheader insertion should ensure this.
> + assert(!L->getHeader()->isLandingPad() &&
> + "Can't insert backedge to landing pad");
>
> - DEBUG(dbgs() << "LoopSimplify: Deleting edge from dead predecessor "
> - << (*I)->getName() << "\n");
> + PHINode *PN = findPHIToPartitionLoops(L, AA, DT);
> + if (PN == 0) return 0; // No known way to partition.
>
> - // Inform each successor of each dead pred.
> - for (succ_iterator SI = succ_begin(*I), SE = succ_end(*I); SI != SE; ++SI)
> - (*SI)->removePredecessor(*I);
> - // Zap the dead pred's terminator and replace it with unreachable.
> - TerminatorInst *TI = (*I)->getTerminator();
> - TI->replaceAllUsesWith(UndefValue::get(TI->getType()));
> - (*I)->getTerminator()->eraseFromParent();
> - new UnreachableInst((*I)->getContext(), *I);
> - Changed = true;
> + // Pull out all predecessors that have varying values in the loop. This
> + // handles the case when a PHI node has multiple instances of itself as
> + // arguments.
> + SmallVector<BasicBlock*, 8> OuterLoopPreds;
> + for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
> + if (PN->getIncomingValue(i) != PN ||
> + !L->contains(PN->getIncomingBlock(i))) {
> + // We can't split indirectbr edges.
> + if (isa<IndirectBrInst>(PN->getIncomingBlock(i)->getTerminator()))
> + return 0;
> + OuterLoopPreds.push_back(PN->getIncomingBlock(i));
> }
> }
> + DEBUG(dbgs() << "LoopSimplify: Splitting out a new outer loop\n");
>
> - // If there are exiting blocks with branches on undef, resolve the undef in
> - // the direction which will exit the loop. This will help simplify loop
> - // trip count computations.
> - SmallVector<BasicBlock*, 8> ExitingBlocks;
> - L->getExitingBlocks(ExitingBlocks);
> - for (SmallVectorImpl<BasicBlock *>::iterator I = ExitingBlocks.begin(),
> - E = ExitingBlocks.end(); I != E; ++I)
> - if (BranchInst *BI = dyn_cast<BranchInst>((*I)->getTerminator()))
> - if (BI->isConditional()) {
> - if (UndefValue *Cond = dyn_cast<UndefValue>(BI->getCondition())) {
> + // If ScalarEvolution is around and knows anything about values in
> + // this loop, tell it to forget them, because we're about to
> + // substantially change it.
> + if (SE)
> + SE->forgetLoop(L);
>
> - DEBUG(dbgs() << "LoopSimplify: Resolving \"br i1 undef\" to exit in "
> - << (*I)->getName() << "\n");
> + BasicBlock *Header = L->getHeader();
> + BasicBlock *NewBB =
> + SplitBlockPredecessors(Header, OuterLoopPreds, ".outer", PP);
>
> - BI->setCondition(ConstantInt::get(Cond->getType(),
> - !L->contains(BI->getSuccessor(0))));
> + // Make sure that NewBB is put someplace intelligent, which doesn't mess up
> + // code layout too horribly.
> + placeSplitBlockCarefully(NewBB, OuterLoopPreds, L);
>
> - // This may make the loop analyzable, force SCEV recomputation.
> - if (SE)
> - SE->forgetLoop(L);
> + // Create the new outer loop.
> + Loop *NewOuter = new Loop();
> +
> + // Change the parent loop to use the outer loop as its child now.
> + if (Loop *Parent = L->getParentLoop())
> + Parent->replaceChildLoopWith(L, NewOuter);
> + else
> + LI->changeTopLevelLoop(L, NewOuter);
> +
> + // L is now a subloop of our outer loop.
> + NewOuter->addChildLoop(L);
> +
> + for (Loop::block_iterator I = L->block_begin(), E = L->block_end();
> + I != E; ++I)
> + NewOuter->addBlockEntry(*I);
> +
> + // Now reset the header in L, which had been moved by
> + // SplitBlockPredecessors for the outer loop.
> + L->moveToHeader(Header);
> +
> + // Determine which blocks should stay in L and which should be moved out to
> + // the Outer loop now.
> + std::set<BasicBlock*> BlocksInL;
> + for (pred_iterator PI=pred_begin(Header), E = pred_end(Header); PI!=E; ++PI) {
> + BasicBlock *P = *PI;
> + if (DT->dominates(Header, P))
> + addBlockAndPredsToSet(P, Header, BlocksInL);
> + }
> +
> + // Scan all of the loop children of L, moving them to OuterLoop if they are
> + // not part of the inner loop.
> + const std::vector<Loop*> &SubLoops = L->getSubLoops();
> + for (size_t I = 0; I != SubLoops.size(); )
> + if (BlocksInL.count(SubLoops[I]->getHeader()))
> + ++I; // Loop remains in L
> + else
> + NewOuter->addChildLoop(L->removeChildLoop(SubLoops.begin() + I));
> +
> + // Now that we know which blocks are in L and which need to be moved to
> + // OuterLoop, move any blocks that need it.
> + for (unsigned i = 0; i != L->getBlocks().size(); ++i) {
> + BasicBlock *BB = L->getBlocks()[i];
> + if (!BlocksInL.count(BB)) {
> + // Move this block to the parent, updating the exit blocks sets
> + L->removeBlockFromLoop(BB);
> + if ((*LI)[BB] == L)
> + LI->changeLoopFor(BB, NewOuter);
> + --i;
> + }
> + }
> +
> + return NewOuter;
> +}
> +
> +/// \brief This method is called when the specified loop has more than one
> +/// backedge in it.
> +///
> +/// If this occurs, revector all of these backedges to target a new basic block
> +/// and have that block branch to the loop header. This ensures that loops
> +/// have exactly one backedge.
> +static BasicBlock *insertUniqueBackedgeBlock(Loop *L, BasicBlock *Preheader,
> + AliasAnalysis *AA,
> + DominatorTree *DT, LoopInfo *LI) {
> + assert(L->getNumBackEdges() > 1 && "Must have > 1 backedge!");
> +
> + // Get information about the loop
> + BasicBlock *Header = L->getHeader();
> + Function *F = Header->getParent();
> +
> + // Unique backedge insertion currently depends on having a preheader.
> + if (!Preheader)
> + return 0;
> +
> + // The header is not a landing pad; preheader insertion should ensure this.
> + assert(!Header->isLandingPad() && "Can't insert backedge to landing pad");
> +
> + // Figure out which basic blocks contain back-edges to the loop header.
> + std::vector<BasicBlock*> BackedgeBlocks;
> + for (pred_iterator I = pred_begin(Header), E = pred_end(Header); I != E; ++I){
> + BasicBlock *P = *I;
> +
> + // Indirectbr edges cannot be split, so we must fail if we find one.
> + if (isa<IndirectBrInst>(P->getTerminator()))
> + return 0;
> +
> + if (P != Preheader) BackedgeBlocks.push_back(P);
> + }
> +
> + // Create and insert the new backedge block...
> + BasicBlock *BEBlock = BasicBlock::Create(Header->getContext(),
> + Header->getName()+".backedge", F);
> + BranchInst *BETerminator = BranchInst::Create(Header, BEBlock);
> +
> + DEBUG(dbgs() << "LoopSimplify: Inserting unique backedge block "
> + << BEBlock->getName() << "\n");
> +
> + // Move the new backedge block to right after the last backedge block.
> + Function::iterator InsertPos = BackedgeBlocks.back(); ++InsertPos;
> + F->getBasicBlockList().splice(InsertPos, F->getBasicBlockList(), BEBlock);
> +
> + // Now that the block has been inserted into the function, create PHI nodes in
> + // the backedge block which correspond to any PHI nodes in the header block.
> + for (BasicBlock::iterator I = Header->begin(); isa<PHINode>(I); ++I) {
> + PHINode *PN = cast<PHINode>(I);
> + PHINode *NewPN = PHINode::Create(PN->getType(), BackedgeBlocks.size(),
> + PN->getName()+".be", BETerminator);
> + if (AA) AA->copyValue(PN, NewPN);
> +
> + // Loop over the PHI node, moving all entries except the one for the
> + // preheader over to the new PHI node.
> + unsigned PreheaderIdx = ~0U;
> + bool HasUniqueIncomingValue = true;
> + Value *UniqueValue = 0;
> + for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
> + BasicBlock *IBB = PN->getIncomingBlock(i);
> + Value *IV = PN->getIncomingValue(i);
> + if (IBB == Preheader) {
> + PreheaderIdx = i;
> + } else {
> + NewPN->addIncoming(IV, IBB);
> + if (HasUniqueIncomingValue) {
> + if (UniqueValue == 0)
> + UniqueValue = IV;
> + else if (UniqueValue != IV)
> + HasUniqueIncomingValue = false;
> + }
> + }
> + }
> +
> + // Delete all of the incoming values from the old PN except the preheader's
> + assert(PreheaderIdx != ~0U && "PHI has no preheader entry??");
> + if (PreheaderIdx != 0) {
> + PN->setIncomingValue(0, PN->getIncomingValue(PreheaderIdx));
> + PN->setIncomingBlock(0, PN->getIncomingBlock(PreheaderIdx));
> + }
> + // Nuke all entries except the zero'th.
> + for (unsigned i = 0, e = PN->getNumIncomingValues()-1; i != e; ++i)
> + PN->removeIncomingValue(e-i, false);
> +
> + // Finally, add the newly constructed PHI node as the entry for the BEBlock.
> + PN->addIncoming(NewPN, BEBlock);
> +
> + // As an optimization, if all incoming values in the new PhiNode (which is a
> + // subset of the incoming values of the old PHI node) have the same value,
> + // eliminate the PHI Node.
> + if (HasUniqueIncomingValue) {
> + NewPN->replaceAllUsesWith(UniqueValue);
> + if (AA) AA->deleteValue(NewPN);
> + BEBlock->getInstList().erase(NewPN);
> + }
> + }
> +
> + // Now that all of the PHI nodes have been inserted and adjusted, modify the
> + // backedge blocks to jump to the BEBlock instead of the header.
> + for (unsigned i = 0, e = BackedgeBlocks.size(); i != e; ++i) {
> + TerminatorInst *TI = BackedgeBlocks[i]->getTerminator();
> + for (unsigned Op = 0, e = TI->getNumSuccessors(); Op != e; ++Op)
> + if (TI->getSuccessor(Op) == Header)
> + TI->setSuccessor(Op, BEBlock);
> + }
> +
> + //===--- Update all analyses which we must preserve now -----------------===//
> +
> + // Update Loop Information - we know that this block is now in the current
> + // loop and all parent loops.
> + L->addBasicBlockToLoop(BEBlock, LI->getBase());
> +
> + // Update dominator information
> + DT->splitBlock(BEBlock);
> +
> + return BEBlock;
> +}
> +
> +/// \brief Simplify one loop and queue further loops for simplification.
> +///
> +/// FIXME: Currently this accepts both lots of analyses that it uses and a raw
> +/// Pass pointer. The Pass pointer is used by numerous utilities to update
> +/// specific analyses. Rather than a pass it would be much cleaner and more
> +/// explicit if they accepted the analysis directly and then updated it.
> +static bool simplifyOneLoop(Loop *L, SmallVectorImpl<Loop *> &Worklist,
> + AliasAnalysis *AA, DominatorTree *DT, LoopInfo *LI,
> + ScalarEvolution *SE, Pass *PP) {
> + bool Changed = false;
> +ReprocessLoop:
> +
> + // Check to see that no blocks (other than the header) in this loop have
> + // predecessors that are not in the loop. This is not valid for natural
> + // loops, but can occur if the blocks are unreachable. Since they are
> + // unreachable we can just shamelessly delete those CFG edges!
> + for (Loop::block_iterator BB = L->block_begin(), E = L->block_end();
> + BB != E; ++BB) {
> + if (*BB == L->getHeader()) continue;
> +
> + SmallPtrSet<BasicBlock*, 4> BadPreds;
> + for (pred_iterator PI = pred_begin(*BB),
> + PE = pred_end(*BB); PI != PE; ++PI) {
> + BasicBlock *P = *PI;
> + if (!L->contains(P))
> + BadPreds.insert(P);
> + }
> +
> + // Delete each unique out-of-loop (and thus dead) predecessor.
> + for (SmallPtrSet<BasicBlock*, 4>::iterator I = BadPreds.begin(),
> + E = BadPreds.end(); I != E; ++I) {
> +
> + DEBUG(dbgs() << "LoopSimplify: Deleting edge from dead predecessor "
> + << (*I)->getName() << "\n");
> +
> + // Inform each successor of each dead pred.
> + for (succ_iterator SI = succ_begin(*I), SE = succ_end(*I); SI != SE; ++SI)
> + (*SI)->removePredecessor(*I);
> + // Zap the dead pred's terminator and replace it with unreachable.
> + TerminatorInst *TI = (*I)->getTerminator();
> + TI->replaceAllUsesWith(UndefValue::get(TI->getType()));
> + (*I)->getTerminator()->eraseFromParent();
> + new UnreachableInst((*I)->getContext(), *I);
> + Changed = true;
> + }
> + }
> +
> + // If there are exiting blocks with branches on undef, resolve the undef in
> + // the direction which will exit the loop. This will help simplify loop
> + // trip count computations.
> + SmallVector<BasicBlock*, 8> ExitingBlocks;
> + L->getExitingBlocks(ExitingBlocks);
> + for (SmallVectorImpl<BasicBlock *>::iterator I = ExitingBlocks.begin(),
> + E = ExitingBlocks.end(); I != E; ++I)
> + if (BranchInst *BI = dyn_cast<BranchInst>((*I)->getTerminator()))
> + if (BI->isConditional()) {
> + if (UndefValue *Cond = dyn_cast<UndefValue>(BI->getCondition())) {
> +
> + DEBUG(dbgs() << "LoopSimplify: Resolving \"br i1 undef\" to exit in "
> + << (*I)->getName() << "\n");
> +
> + BI->setCondition(ConstantInt::get(Cond->getType(),
> + !L->contains(BI->getSuccessor(0))));
> +
> + // This may make the loop analyzable, force SCEV recomputation.
> + if (SE)
> + SE->forgetLoop(L);
>
> Changed = true;
> }
> @@ -209,7 +539,7 @@ ReprocessLoop:
> // Does the loop already have a preheader? If so, don't insert one.
> BasicBlock *Preheader = L->getLoopPreheader();
> if (!Preheader) {
> - Preheader = InsertPreheaderForLoop(L, this);
> + Preheader = InsertPreheaderForLoop(L, PP);
> if (Preheader) {
> ++NumInserted;
> Changed = true;
> @@ -233,7 +563,7 @@ ReprocessLoop:
> // Must be exactly this loop: no subloops, parent loops, or non-loop preds
> // allowed.
> if (!L->contains(*PI)) {
> - if (RewriteLoopExitBlock(L, ExitBlock)) {
> + if (rewriteLoopExitBlock(L, ExitBlock, PP)) {
> ++NumInserted;
> Changed = true;
> }
> @@ -249,11 +579,16 @@ ReprocessLoop:
> // this for loops with a giant number of backedges, just factor them into a
> // common backedge instead.
> if (L->getNumBackEdges() < 8) {
> - if (SeparateNestedLoop(L, LPM, Preheader)) {
> + if (Loop *OuterL = separateNestedLoop(L, Preheader, AA, DT, LI, SE, PP)) {
> ++NumNested;
> + // Enqueue the outer loop as it should be processed next in our
> + // depth-first nest walk.
> + Worklist.push_back(OuterL);
> +
> // This is a big restructuring change, reprocess the whole loop.
> Changed = true;
> // GCC doesn't tail recursion eliminate this.
> + // FIXME: It isn't clear we can't rely on LLVM to TRE this.
> goto ReprocessLoop;
> }
> }
> @@ -261,7 +596,7 @@ ReprocessLoop:
> // If we either couldn't, or didn't want to, identify nesting of the loops,
> // insert a new block that all backedges target, then make it jump to the
> // loop header.
> - LoopLatch = InsertUniqueBackedgeBlock(L, Preheader);
> + LoopLatch = insertUniqueBackedgeBlock(L, Preheader, AA, DT, LI);
> if (LoopLatch) {
> ++NumInserted;
> Changed = true;
> @@ -286,490 +621,192 @@ ReprocessLoop:
> // as LoopRotation, which do not support loops with multiple exits.
> // SimplifyCFG also does this (and this code uses the same utility
> // function), however this code is loop-aware, where SimplifyCFG is
> - // not. That gives it the advantage of being able to hoist
> - // loop-invariant instructions out of the way to open up more
> - // opportunities, and the disadvantage of having the responsibility
> - // to preserve dominator information.
> - bool UniqueExit = true;
> - if (!ExitBlocks.empty())
> - for (unsigned i = 1, e = ExitBlocks.size(); i != e; ++i)
> - if (ExitBlocks[i] != ExitBlocks[0]) {
> - UniqueExit = false;
> - break;
> - }
> - if (UniqueExit) {
> - for (unsigned i = 0, e = ExitingBlocks.size(); i != e; ++i) {
> - BasicBlock *ExitingBlock = ExitingBlocks[i];
> - if (!ExitingBlock->getSinglePredecessor()) continue;
> - BranchInst *BI = dyn_cast<BranchInst>(ExitingBlock->getTerminator());
> - if (!BI || !BI->isConditional()) continue;
> - CmpInst *CI = dyn_cast<CmpInst>(BI->getCondition());
> - if (!CI || CI->getParent() != ExitingBlock) continue;
> -
> - // Attempt to hoist out all instructions except for the
> - // comparison and the branch.
> - bool AllInvariant = true;
> - bool AnyInvariant = false;
> - for (BasicBlock::iterator I = ExitingBlock->begin(); &*I != BI; ) {
> - Instruction *Inst = I++;
> - // Skip debug info intrinsics.
> - if (isa<DbgInfoIntrinsic>(Inst))
> - continue;
> - if (Inst == CI)
> - continue;
> - if (!L->makeLoopInvariant(Inst, AnyInvariant,
> - Preheader ? Preheader->getTerminator() : 0)) {
> - AllInvariant = false;
> - break;
> - }
> - }
> - if (AnyInvariant) {
> - Changed = true;
> - // The loop disposition of all SCEV expressions that depend on any
> - // hoisted values have also changed.
> - if (SE)
> - SE->forgetLoopDispositions(L);
> - }
> - if (!AllInvariant) continue;
> -
> - // The block has now been cleared of all instructions except for
> - // a comparison and a conditional branch. SimplifyCFG may be able
> - // to fold it now.
> - if (!FoldBranchToCommonDest(BI)) continue;
> -
> - // Success. The block is now dead, so remove it from the loop,
> - // update the dominator tree and delete it.
> - DEBUG(dbgs() << "LoopSimplify: Eliminating exiting block "
> - << ExitingBlock->getName() << "\n");
> -
> - // Notify ScalarEvolution before deleting this block. Currently assume the
> - // parent loop doesn't change (spliting edges doesn't count). If blocks,
> - // CFG edges, or other values in the parent loop change, then we need call
> - // to forgetLoop() for the parent instead.
> - if (SE)
> - SE->forgetLoop(L);
> -
> - assert(pred_begin(ExitingBlock) == pred_end(ExitingBlock));
> - Changed = true;
> - LI->removeBlock(ExitingBlock);
> -
> - DomTreeNode *Node = DT->getNode(ExitingBlock);
> - const std::vector<DomTreeNodeBase<BasicBlock> *> &Children =
> - Node->getChildren();
> - while (!Children.empty()) {
> - DomTreeNode *Child = Children.front();
> - DT->changeImmediateDominator(Child, Node->getIDom());
> - }
> - DT->eraseNode(ExitingBlock);
> -
> - BI->getSuccessor(0)->removePredecessor(ExitingBlock);
> - BI->getSuccessor(1)->removePredecessor(ExitingBlock);
> - ExitingBlock->eraseFromParent();
> - }
> - }
> -
> - return Changed;
> -}
> -
> -/// InsertPreheaderForLoop - Once we discover that a loop doesn't have a
> -/// preheader, this method is called to insert one. This method has two phases:
> -/// preheader insertion and analysis updating.
> -///
> -BasicBlock *llvm::InsertPreheaderForLoop(Loop *L, Pass *PP) {
> - BasicBlock *Header = L->getHeader();
> -
> - // Compute the set of predecessors of the loop that are not in the loop.
> - SmallVector<BasicBlock*, 8> OutsideBlocks;
> - for (pred_iterator PI = pred_begin(Header), PE = pred_end(Header);
> - PI != PE; ++PI) {
> - BasicBlock *P = *PI;
> - if (!L->contains(P)) { // Coming in from outside the loop?
> - // If the loop is branched to from an indirect branch, we won't
> - // be able to fully transform the loop, because it prohibits
> - // edge splitting.
> - if (isa<IndirectBrInst>(P->getTerminator())) return 0;
> -
> - // Keep track of it.
> - OutsideBlocks.push_back(P);
> - }
> - }
> -
> - // Split out the loop pre-header.
> - BasicBlock *PreheaderBB;
> - if (!Header->isLandingPad()) {
> - PreheaderBB = SplitBlockPredecessors(Header, OutsideBlocks, ".preheader",
> - PP);
> - } else {
> - SmallVector<BasicBlock*, 2> NewBBs;
> - SplitLandingPadPredecessors(Header, OutsideBlocks, ".preheader",
> - ".split-lp", PP, NewBBs);
> - PreheaderBB = NewBBs[0];
> - }
> -
> - PreheaderBB->getTerminator()->setDebugLoc(
> - Header->getFirstNonPHI()->getDebugLoc());
> - DEBUG(dbgs() << "LoopSimplify: Creating pre-header "
> - << PreheaderBB->getName() << "\n");
> -
> - // Make sure that NewBB is put someplace intelligent, which doesn't mess up
> - // code layout too horribly.
> - PlaceSplitBlockCarefully(PreheaderBB, OutsideBlocks, L);
> -
> - return PreheaderBB;
> -}
> -
> -/// RewriteLoopExitBlock - Ensure that the loop preheader dominates all exit
> -/// blocks. This method is used to split exit blocks that have predecessors
> -/// outside of the loop.
> -BasicBlock *LoopSimplify::RewriteLoopExitBlock(Loop *L, BasicBlock *Exit) {
> - SmallVector<BasicBlock*, 8> LoopBlocks;
> - for (pred_iterator I = pred_begin(Exit), E = pred_end(Exit); I != E; ++I) {
> - BasicBlock *P = *I;
> - if (L->contains(P)) {
> - // Don't do this if the loop is exited via an indirect branch.
> - if (isa<IndirectBrInst>(P->getTerminator())) return 0;
> -
> - LoopBlocks.push_back(P);
> - }
> - }
> -
> - assert(!LoopBlocks.empty() && "No edges coming in from outside the loop?");
> - BasicBlock *NewExitBB = 0;
> -
> - if (Exit->isLandingPad()) {
> - SmallVector<BasicBlock*, 2> NewBBs;
> - SplitLandingPadPredecessors(Exit, ArrayRef<BasicBlock*>(&LoopBlocks[0],
> - LoopBlocks.size()),
> - ".loopexit", ".nonloopexit",
> - this, NewBBs);
> - NewExitBB = NewBBs[0];
> - } else {
> - NewExitBB = SplitBlockPredecessors(Exit, LoopBlocks, ".loopexit", this);
> - }
> -
> - DEBUG(dbgs() << "LoopSimplify: Creating dedicated exit block "
> - << NewExitBB->getName() << "\n");
> - return NewExitBB;
> -}
> -
> -/// AddBlockAndPredsToSet - Add the specified block, and all of its
> -/// predecessors, to the specified set, if it's not already in there. Stop
> -/// predecessor traversal when we reach StopBlock.
> -static void AddBlockAndPredsToSet(BasicBlock *InputBB, BasicBlock *StopBlock,
> - std::set<BasicBlock*> &Blocks) {
> - std::vector<BasicBlock *> WorkList;
> - WorkList.push_back(InputBB);
> - do {
> - BasicBlock *BB = WorkList.back(); WorkList.pop_back();
> - if (Blocks.insert(BB).second && BB != StopBlock)
> - // If BB is not already processed and it is not a stop block then
> - // insert its predecessor in the work list
> - for (pred_iterator I = pred_begin(BB), E = pred_end(BB); I != E; ++I) {
> - BasicBlock *WBB = *I;
> - WorkList.push_back(WBB);
> - }
> - } while(!WorkList.empty());
> -}
> -
> -/// FindPHIToPartitionLoops - The first part of loop-nestification is to find a
> -/// PHI node that tells us how to partition the loops.
> -static PHINode *FindPHIToPartitionLoops(Loop *L, DominatorTree *DT,
> - AliasAnalysis *AA, LoopInfo *LI) {
> - for (BasicBlock::iterator I = L->getHeader()->begin(); isa<PHINode>(I); ) {
> - PHINode *PN = cast<PHINode>(I);
> - ++I;
> - if (Value *V = SimplifyInstruction(PN, 0, 0, DT)) {
> - // This is a degenerate PHI already, don't modify it!
> - PN->replaceAllUsesWith(V);
> - if (AA) AA->deleteValue(PN);
> - PN->eraseFromParent();
> - continue;
> - }
> -
> - // Scan this PHI node looking for a use of the PHI node by itself.
> - for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i)
> - if (PN->getIncomingValue(i) == PN &&
> - L->contains(PN->getIncomingBlock(i)))
> - // We found something tasty to remove.
> - return PN;
> - }
> - return 0;
> -}
> -
> -// PlaceSplitBlockCarefully - If the block isn't already, move the new block to
> -// right after some 'outside block' block. This prevents the preheader from
> -// being placed inside the loop body, e.g. when the loop hasn't been rotated.
> -void PlaceSplitBlockCarefully(BasicBlock *NewBB,
> - SmallVectorImpl<BasicBlock*> &SplitPreds,
> - Loop *L) {
> - // Check to see if NewBB is already well placed.
> - Function::iterator BBI = NewBB; --BBI;
> - for (unsigned i = 0, e = SplitPreds.size(); i != e; ++i) {
> - if (&*BBI == SplitPreds[i])
> - return;
> - }
> -
> - // If it isn't already after an outside block, move it after one. This is
> - // always good as it makes the uncond branch from the outside block into a
> - // fall-through.
> -
> - // Figure out *which* outside block to put this after. Prefer an outside
> - // block that neighbors a BB actually in the loop.
> - BasicBlock *FoundBB = 0;
> - for (unsigned i = 0, e = SplitPreds.size(); i != e; ++i) {
> - Function::iterator BBI = SplitPreds[i];
> - if (++BBI != NewBB->getParent()->end() &&
> - L->contains(BBI)) {
> - FoundBB = SplitPreds[i];
> - break;
> - }
> - }
> -
> - // If our heuristic for a *good* bb to place this after doesn't find
> - // anything, just pick something. It's likely better than leaving it within
> - // the loop.
> - if (!FoundBB)
> - FoundBB = SplitPreds[0];
> - NewBB->moveAfter(FoundBB);
> -}
> -
> -
> -/// SeparateNestedLoop - If this loop has multiple backedges, try to pull one of
> -/// them out into a nested loop. This is important for code that looks like
> -/// this:
> -///
> -/// Loop:
> -/// ...
> -/// br cond, Loop, Next
> -/// ...
> -/// br cond2, Loop, Out
> -///
> -/// To identify this common case, we look at the PHI nodes in the header of the
> -/// loop. PHI nodes with unchanging values on one backedge correspond to values
> -/// that change in the "outer" loop, but not in the "inner" loop.
> -///
> -/// If we are able to separate out a loop, return the new outer loop that was
> -/// created.
> -///
> -Loop *LoopSimplify::SeparateNestedLoop(Loop *L, LPPassManager &LPM,
> - BasicBlock *Preheader) {
> - // Don't try to separate loops without a preheader.
> - if (!Preheader)
> - return 0;
> -
> - // The header is not a landing pad; preheader insertion should ensure this.
> - assert(!L->getHeader()->isLandingPad() &&
> - "Can't insert backedge to landing pad");
> -
> - PHINode *PN = FindPHIToPartitionLoops(L, DT, AA, LI);
> - if (PN == 0) return 0; // No known way to partition.
> -
> - // Pull out all predecessors that have varying values in the loop. This
> - // handles the case when a PHI node has multiple instances of itself as
> - // arguments.
> - SmallVector<BasicBlock*, 8> OuterLoopPreds;
> - for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
> - if (PN->getIncomingValue(i) != PN ||
> - !L->contains(PN->getIncomingBlock(i))) {
> - // We can't split indirectbr edges.
> - if (isa<IndirectBrInst>(PN->getIncomingBlock(i)->getTerminator()))
> - return 0;
> - OuterLoopPreds.push_back(PN->getIncomingBlock(i));
> - }
> - }
> - DEBUG(dbgs() << "LoopSimplify: Splitting out a new outer loop\n");
> -
> - // If ScalarEvolution is around and knows anything about values in
> - // this loop, tell it to forget them, because we're about to
> - // substantially change it.
> - if (SE)
> - SE->forgetLoop(L);
> -
> - BasicBlock *Header = L->getHeader();
> - BasicBlock *NewBB =
> - SplitBlockPredecessors(Header, OuterLoopPreds, ".outer", this);
> -
> - // Make sure that NewBB is put someplace intelligent, which doesn't mess up
> - // code layout too horribly.
> - PlaceSplitBlockCarefully(NewBB, OuterLoopPreds, L);
> -
> - // Create the new outer loop.
> - Loop *NewOuter = new Loop();
> -
> - // Change the parent loop to use the outer loop as its child now.
> - if (Loop *Parent = L->getParentLoop())
> - Parent->replaceChildLoopWith(L, NewOuter);
> - else
> - LI->changeTopLevelLoop(L, NewOuter);
> + // not. That gives it the advantage of being able to hoist
> + // loop-invariant instructions out of the way to open up more
> + // opportunities, and the disadvantage of having the responsibility
> + // to preserve dominator information.
> + bool UniqueExit = true;
> + if (!ExitBlocks.empty())
> + for (unsigned i = 1, e = ExitBlocks.size(); i != e; ++i)
> + if (ExitBlocks[i] != ExitBlocks[0]) {
> + UniqueExit = false;
> + break;
> + }
> + if (UniqueExit) {
> + for (unsigned i = 0, e = ExitingBlocks.size(); i != e; ++i) {
> + BasicBlock *ExitingBlock = ExitingBlocks[i];
> + if (!ExitingBlock->getSinglePredecessor()) continue;
> + BranchInst *BI = dyn_cast<BranchInst>(ExitingBlock->getTerminator());
> + if (!BI || !BI->isConditional()) continue;
> + CmpInst *CI = dyn_cast<CmpInst>(BI->getCondition());
> + if (!CI || CI->getParent() != ExitingBlock) continue;
>
> - // L is now a subloop of our outer loop.
> - NewOuter->addChildLoop(L);
> + // Attempt to hoist out all instructions except for the
> + // comparison and the branch.
> + bool AllInvariant = true;
> + bool AnyInvariant = false;
> + for (BasicBlock::iterator I = ExitingBlock->begin(); &*I != BI; ) {
> + Instruction *Inst = I++;
> + // Skip debug info intrinsics.
> + if (isa<DbgInfoIntrinsic>(Inst))
> + continue;
> + if (Inst == CI)
> + continue;
> + if (!L->makeLoopInvariant(Inst, AnyInvariant,
> + Preheader ? Preheader->getTerminator() : 0)) {
> + AllInvariant = false;
> + break;
> + }
> + }
> + if (AnyInvariant) {
> + Changed = true;
> + // The loop disposition of all SCEV expressions that depend on any
> + // hoisted values have also changed.
> + if (SE)
> + SE->forgetLoopDispositions(L);
> + }
> + if (!AllInvariant) continue;
>
> - // Add the new loop to the pass manager queue.
> - LPM.insertLoopIntoQueue(NewOuter);
> + // The block has now been cleared of all instructions except for
> + // a comparison and a conditional branch. SimplifyCFG may be able
> + // to fold it now.
> + if (!FoldBranchToCommonDest(BI)) continue;
>
> - for (Loop::block_iterator I = L->block_begin(), E = L->block_end();
> - I != E; ++I)
> - NewOuter->addBlockEntry(*I);
> + // Success. The block is now dead, so remove it from the loop,
> + // update the dominator tree and delete it.
> + DEBUG(dbgs() << "LoopSimplify: Eliminating exiting block "
> + << ExitingBlock->getName() << "\n");
>
> - // Now reset the header in L, which had been moved by
> - // SplitBlockPredecessors for the outer loop.
> - L->moveToHeader(Header);
> + // Notify ScalarEvolution before deleting this block. Currently assume the
> + // parent loop doesn't change (spliting edges doesn't count). If blocks,
> + // CFG edges, or other values in the parent loop change, then we need call
> + // to forgetLoop() for the parent instead.
> + if (SE)
> + SE->forgetLoop(L);
>
> - // Determine which blocks should stay in L and which should be moved out to
> - // the Outer loop now.
> - std::set<BasicBlock*> BlocksInL;
> - for (pred_iterator PI=pred_begin(Header), E = pred_end(Header); PI!=E; ++PI) {
> - BasicBlock *P = *PI;
> - if (DT->dominates(Header, P))
> - AddBlockAndPredsToSet(P, Header, BlocksInL);
> - }
> + assert(pred_begin(ExitingBlock) == pred_end(ExitingBlock));
> + Changed = true;
> + LI->removeBlock(ExitingBlock);
>
> - // Scan all of the loop children of L, moving them to OuterLoop if they are
> - // not part of the inner loop.
> - const std::vector<Loop*> &SubLoops = L->getSubLoops();
> - for (size_t I = 0; I != SubLoops.size(); )
> - if (BlocksInL.count(SubLoops[I]->getHeader()))
> - ++I; // Loop remains in L
> - else
> - NewOuter->addChildLoop(L->removeChildLoop(SubLoops.begin() + I));
> + DomTreeNode *Node = DT->getNode(ExitingBlock);
> + const std::vector<DomTreeNodeBase<BasicBlock> *> &Children =
> + Node->getChildren();
> + while (!Children.empty()) {
> + DomTreeNode *Child = Children.front();
> + DT->changeImmediateDominator(Child, Node->getIDom());
> + }
> + DT->eraseNode(ExitingBlock);
>
> - // Now that we know which blocks are in L and which need to be moved to
> - // OuterLoop, move any blocks that need it.
> - for (unsigned i = 0; i != L->getBlocks().size(); ++i) {
> - BasicBlock *BB = L->getBlocks()[i];
> - if (!BlocksInL.count(BB)) {
> - // Move this block to the parent, updating the exit blocks sets
> - L->removeBlockFromLoop(BB);
> - if ((*LI)[BB] == L)
> - LI->changeLoopFor(BB, NewOuter);
> - --i;
> + BI->getSuccessor(0)->removePredecessor(ExitingBlock);
> + BI->getSuccessor(1)->removePredecessor(ExitingBlock);
> + ExitingBlock->eraseFromParent();
> }
> }
>
> - return NewOuter;
> + return Changed;
> }
>
> +bool llvm::simplifyLoop(Loop *L, DominatorTree *DT, LoopInfo *LI, Pass *PP,
> + AliasAnalysis *AA, ScalarEvolution *SE) {
> + bool Changed = false;
>
> + // Worklist maintains our depth-first queue of loops in this nest to process.
> + SmallVector<Loop *, 4> Worklist;
> + Worklist.push_back(L);
> +
> + // Walk the worklist from front to back, pushing newly found sub loops onto
> + // the back. This will let us process loops from back to front in depth-first
> + // order. We can use this simple process because loops form a tree.
> + for (unsigned Idx = 0; Idx != Worklist.size(); ++Idx) {
> + Loop *L2 = Worklist[Idx];
> + for (Loop::iterator I = L2->begin(), E = L2->end(); I != E; ++I)
> + Worklist.push_back(*I);
> + }
>
> -/// InsertUniqueBackedgeBlock - This method is called when the specified loop
> -/// has more than one backedge in it. If this occurs, revector all of these
> -/// backedges to target a new basic block and have that block branch to the loop
> -/// header. This ensures that loops have exactly one backedge.
> -///
> -BasicBlock *
> -LoopSimplify::InsertUniqueBackedgeBlock(Loop *L, BasicBlock *Preheader) {
> - assert(L->getNumBackEdges() > 1 && "Must have > 1 backedge!");
> -
> - // Get information about the loop
> - BasicBlock *Header = L->getHeader();
> - Function *F = Header->getParent();
> -
> - // Unique backedge insertion currently depends on having a preheader.
> - if (!Preheader)
> - return 0;
> -
> - // The header is not a landing pad; preheader insertion should ensure this.
> - assert(!Header->isLandingPad() && "Can't insert backedge to landing pad");
> -
> - // Figure out which basic blocks contain back-edges to the loop header.
> - std::vector<BasicBlock*> BackedgeBlocks;
> - for (pred_iterator I = pred_begin(Header), E = pred_end(Header); I != E; ++I){
> - BasicBlock *P = *I;
> -
> - // Indirectbr edges cannot be split, so we must fail if we find one.
> - if (isa<IndirectBrInst>(P->getTerminator()))
> - return 0;
> + while (!Worklist.empty())
> + Changed |= simplifyOneLoop(Worklist.pop_back_val(), Worklist, AA, DT, LI, SE, PP);
>
> - if (P != Preheader) BackedgeBlocks.push_back(P);
> - }
> + return Changed;
> +}
>
> - // Create and insert the new backedge block...
> - BasicBlock *BEBlock = BasicBlock::Create(Header->getContext(),
> - Header->getName()+".backedge", F);
> - BranchInst *BETerminator = BranchInst::Create(Header, BEBlock);
> +namespace {
> + struct LoopSimplify : public FunctionPass {
> + static char ID; // Pass identification, replacement for typeid
> + LoopSimplify() : FunctionPass(ID) {
> + initializeLoopSimplifyPass(*PassRegistry::getPassRegistry());
> + }
>
> - DEBUG(dbgs() << "LoopSimplify: Inserting unique backedge block "
> - << BEBlock->getName() << "\n");
> + // AA - If we have an alias analysis object to update, this is it, otherwise
> + // this is null.
> + AliasAnalysis *AA;
> + DominatorTree *DT;
> + LoopInfo *LI;
> + ScalarEvolution *SE;
>
> - // Move the new backedge block to right after the last backedge block.
> - Function::iterator InsertPos = BackedgeBlocks.back(); ++InsertPos;
> - F->getBasicBlockList().splice(InsertPos, F->getBasicBlockList(), BEBlock);
> + virtual bool runOnFunction(Function &F);
>
> - // Now that the block has been inserted into the function, create PHI nodes in
> - // the backedge block which correspond to any PHI nodes in the header block.
> - for (BasicBlock::iterator I = Header->begin(); isa<PHINode>(I); ++I) {
> - PHINode *PN = cast<PHINode>(I);
> - PHINode *NewPN = PHINode::Create(PN->getType(), BackedgeBlocks.size(),
> - PN->getName()+".be", BETerminator);
> - if (AA) AA->copyValue(PN, NewPN);
> + virtual void getAnalysisUsage(AnalysisUsage &AU) const {
> + // We need loop information to identify the loops...
> + AU.addRequired<DominatorTreeWrapperPass>();
> + AU.addPreserved<DominatorTreeWrapperPass>();
>
> - // Loop over the PHI node, moving all entries except the one for the
> - // preheader over to the new PHI node.
> - unsigned PreheaderIdx = ~0U;
> - bool HasUniqueIncomingValue = true;
> - Value *UniqueValue = 0;
> - for (unsigned i = 0, e = PN->getNumIncomingValues(); i != e; ++i) {
> - BasicBlock *IBB = PN->getIncomingBlock(i);
> - Value *IV = PN->getIncomingValue(i);
> - if (IBB == Preheader) {
> - PreheaderIdx = i;
> - } else {
> - NewPN->addIncoming(IV, IBB);
> - if (HasUniqueIncomingValue) {
> - if (UniqueValue == 0)
> - UniqueValue = IV;
> - else if (UniqueValue != IV)
> - HasUniqueIncomingValue = false;
> - }
> - }
> - }
> + AU.addRequired<LoopInfo>();
> + AU.addPreserved<LoopInfo>();
>
> - // Delete all of the incoming values from the old PN except the preheader's
> - assert(PreheaderIdx != ~0U && "PHI has no preheader entry??");
> - if (PreheaderIdx != 0) {
> - PN->setIncomingValue(0, PN->getIncomingValue(PreheaderIdx));
> - PN->setIncomingBlock(0, PN->getIncomingBlock(PreheaderIdx));
> + AU.addPreserved<AliasAnalysis>();
> + AU.addPreserved<ScalarEvolution>();
> + AU.addPreserved<DependenceAnalysis>();
> + AU.addPreservedID(BreakCriticalEdgesID); // No critical edges added.
> }
> - // Nuke all entries except the zero'th.
> - for (unsigned i = 0, e = PN->getNumIncomingValues()-1; i != e; ++i)
> - PN->removeIncomingValue(e-i, false);
>
> - // Finally, add the newly constructed PHI node as the entry for the BEBlock.
> - PN->addIncoming(NewPN, BEBlock);
> + /// verifyAnalysis() - Verify LoopSimplifyForm's guarantees.
> + void verifyAnalysis() const;
>
> - // As an optimization, if all incoming values in the new PhiNode (which is a
> - // subset of the incoming values of the old PHI node) have the same value,
> - // eliminate the PHI Node.
> - if (HasUniqueIncomingValue) {
> - NewPN->replaceAllUsesWith(UniqueValue);
> - if (AA) AA->deleteValue(NewPN);
> - BEBlock->getInstList().erase(NewPN);
> - }
> - }
> + private:
> + bool ProcessLoop(Loop *L);
> + BasicBlock *RewriteLoopExitBlock(Loop *L, BasicBlock *Exit);
> + Loop *SeparateNestedLoop(Loop *L, BasicBlock *Preheader);
> + BasicBlock *InsertUniqueBackedgeBlock(Loop *L, BasicBlock *Preheader);
> + };
> +}
>
> - // Now that all of the PHI nodes have been inserted and adjusted, modify the
> - // backedge blocks to just to the BEBlock instead of the header.
> - for (unsigned i = 0, e = BackedgeBlocks.size(); i != e; ++i) {
> - TerminatorInst *TI = BackedgeBlocks[i]->getTerminator();
> - for (unsigned Op = 0, e = TI->getNumSuccessors(); Op != e; ++Op)
> - if (TI->getSuccessor(Op) == Header)
> - TI->setSuccessor(Op, BEBlock);
> - }
> +char LoopSimplify::ID = 0;
> +INITIALIZE_PASS_BEGIN(LoopSimplify, "loop-simplify",
> + "Canonicalize natural loops", true, false)
> +INITIALIZE_PASS_DEPENDENCY(DominatorTreeWrapperPass)
> +INITIALIZE_PASS_DEPENDENCY(LoopInfo)
> +INITIALIZE_PASS_END(LoopSimplify, "loop-simplify",
> + "Canonicalize natural loops", true, false)
>
> - //===--- Update all analyses which we must preserve now -----------------===//
> +// Publicly exposed interface to pass...
> +char &llvm::LoopSimplifyID = LoopSimplify::ID;
> +Pass *llvm::createLoopSimplifyPass() { return new LoopSimplify(); }
>
> - // Update Loop Information - we know that this block is now in the current
> - // loop and all parent loops.
> - L->addBasicBlockToLoop(BEBlock, LI->getBase());
> +/// runOnLoop - Run down all loops in the CFG (recursively, but we could do
> +/// it in any convenient order) inserting preheaders...
> +///
> +bool LoopSimplify::runOnFunction(Function &F) {
> + bool Changed = false;
> + AA = getAnalysisIfAvailable<AliasAnalysis>();
> + LI = &getAnalysis<LoopInfo>();
> + DT = &getAnalysis<DominatorTreeWrapperPass>().getDomTree();
> + SE = getAnalysisIfAvailable<ScalarEvolution>();
>
> - // Update dominator information
> - DT->splitBlock(BEBlock);
> + // Simplify each loop nest in the function.
> + for (LoopInfo::iterator I = LI->begin(), E = LI->end(); I != E; ++I)
> + Changed |= simplifyLoop(*I, DT, LI, this, AA, SE);
>
> - return BEBlock;
> + return Changed;
> }
>
> -void LoopSimplify::verifyAnalysis() const {
> +// FIXME: Restore this code when we re-enable verification in verifyAnalysis
> +// below.
> +#if 0
> +static void verifyLoop(Loop *L) {
> + // Verify subloops.
> + for (Loop::iterator I = L->begin(), E = L->end(); I != E; ++I)
> + verifyLoop(*I);
> +
> // It used to be possible to just assert L->isLoopSimplifyForm(), however
> // with the introduction of indirectbr, there are now cases where it's
> // not possible to transform a loop as necessary. We can at least check
> @@ -806,3 +843,15 @@ void LoopSimplify::verifyAnalysis() cons
> (void)HasIndBrExiting;
> }
> }
> +#endif
> +
> +void LoopSimplify::verifyAnalysis() const {
> + // FIXME: This routine is being called mid-way through the loop pass manager
> + // as loop passes destroy this analysis. That's actually fine, but we have no
> + // way of expressing that here. Once all of the passes that destroy this are
> + // hoisted out of the loop pass manager we can add back verification here.
> +#if 0
> + for (LoopInfo::iterator I = LI->begin(), E = LI->end(); I != E; ++I)
> + verifyLoop(*I);
> +#endif
> +}
>
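
To make the traversal order in simplifyLoop() above easier to see, here is a minimal sketch of the same worklist idea on a toy loop tree. ToyLoop and processingOrder are hypothetical names used only for illustration and are not part of LLVM; the sketch also leaves out the case where simplifying one loop pushes a newly separated outer loop back onto the worklist.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Loop: a name plus its immediate subloops.
struct ToyLoop {
  std::string Name;
  std::vector<ToyLoop *> SubLoops;
};

// Returns the loop names in the order a simplifyLoop()-style driver would
// process them: fill the worklist front to back, then drain it from the back.
std::vector<std::string> processingOrder(ToyLoop *Root) {
  std::vector<ToyLoop *> Worklist;
  Worklist.push_back(Root);

  // Append every subloop behind its parent. The loop re-reads Worklist.size()
  // on each iteration, so newly appended loops are visited too.
  for (std::size_t Idx = 0; Idx != Worklist.size(); ++Idx) {
    ToyLoop *L = Worklist[Idx];
    for (ToyLoop *Sub : L->SubLoops)
      Worklist.push_back(Sub);
  }

  // Popping from the back hands out every subloop before its parent.
  std::vector<std::string> Order;
  while (!Worklist.empty()) {
    Order.push_back(Worklist.back()->Name);
    Worklist.pop_back();
  }
  return Order;
}
```

For a nest where Outer contains A and B, and A contains A1, the worklist fills as Outer, A, B, A1 and drains as A1, B, A, Outer, so each loop is handled only after all of its subloops.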
> Modified: llvm/trunk/lib/Transforms/Utils/LoopUnroll.cpp
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Utils/LoopUnroll.cpp?rev=199884&r1=199883&r2=199884&view=diff
> ==============================================================================
> --- llvm/trunk/lib/Transforms/Utils/LoopUnroll.cpp (original)
> +++ llvm/trunk/lib/Transforms/Utils/LoopUnroll.cpp Thu Jan 23 05:23:19 2014
> @@ -30,6 +30,7 @@
> #include "llvm/Transforms/Utils/BasicBlockUtils.h"
> #include "llvm/Transforms/Utils/Cloning.h"
> #include "llvm/Transforms/Utils/Local.h"
> +#include "llvm/Transforms/Utils/LoopUtils.h"
> #include "llvm/Transforms/Utils/SimplifyIndVar.h"
> using namespace llvm;
>
> @@ -138,10 +139,10 @@ static BasicBlock *FoldBlockIntoPredeces
> /// removed from the LoopPassManager as well. LPM can also be NULL.
> ///
> /// This utility preserves LoopInfo. If DominatorTree or ScalarEvolution are
> -/// available it must also preserve those analyses.
> +/// available from the Pass it must also preserve those analyses.
> bool llvm::UnrollLoop(Loop *L, unsigned Count, unsigned TripCount,
> bool AllowRuntime, unsigned TripMultiple,
> - LoopInfo *LI, LPPassManager *LPM) {
> + LoopInfo *LI, Pass *PP, LPPassManager *LPM) {
> BasicBlock *Preheader = L->getLoopPreheader();
> if (!Preheader) {
> DEBUG(dbgs() << " Can't unroll; loop preheader-insertion failed.\n");
> @@ -209,8 +210,8 @@ bool llvm::UnrollLoop(Loop *L, unsigned
>
> // Notify ScalarEvolution that the loop will be substantially changed,
> // if not outright eliminated.
> - if (LPM) {
> - ScalarEvolution *SE = LPM->getAnalysisIfAvailable<ScalarEvolution>();
> + if (PP) {
> + ScalarEvolution *SE = PP->getAnalysisIfAvailable<ScalarEvolution>();
> if (SE)
> SE->forgetLoop(L);
> }
> @@ -410,15 +411,18 @@ bool llvm::UnrollLoop(Loop *L, unsigned
> }
> }
>
> - if (LPM) {
> + DominatorTree *DT = 0;
> + if (PP) {
> // FIXME: Reconstruct dom info, because it is not preserved properly.
> // Incrementally updating domtree after loop unrolling would be easy.
> if (DominatorTreeWrapperPass *DTWP =
> - LPM->getAnalysisIfAvailable<DominatorTreeWrapperPass>())
> - DTWP->getDomTree().recalculate(*L->getHeader()->getParent());
> + PP->getAnalysisIfAvailable<DominatorTreeWrapperPass>()) {
> + DT = &DTWP->getDomTree();
> + DT->recalculate(*L->getHeader()->getParent());
> + }
>
> // Simplify any new induction variables in the partially unrolled loop.
> - ScalarEvolution *SE = LPM->getAnalysisIfAvailable<ScalarEvolution>();
> + ScalarEvolution *SE = PP->getAnalysisIfAvailable<ScalarEvolution>();
> if (SE && !CompletelyUnroll) {
> SmallVector<WeakVH, 16> DeadInsts;
> simplifyLoopIVs(L, SE, LPM, DeadInsts);
> @@ -451,9 +455,23 @@ bool llvm::UnrollLoop(Loop *L, unsigned
>
> NumCompletelyUnrolled += CompletelyUnroll;
> ++NumUnrolled;
> +
> + Loop *OuterL = L->getParentLoop();
> // Remove the loop from the LoopPassManager if it's completely removed.
> if (CompletelyUnroll && LPM != NULL)
> LPM->deleteLoopFromQueue(L);
>
> + // If we have a pass and a DominatorTree we should re-simplify impacted loops
> + // to ensure subsequent analyses can rely on this form. We want to simplify
> + // at least one layer outside of the loop that was unrolled so that any
> + // changes to the parent loop exposed by the unrolling are considered.
> + if (PP && DT) {
> + if (!OuterL && !CompletelyUnroll)
> + OuterL = L;
> + if (OuterL)
> + simplifyLoop(OuterL, DT, LI, PP, /*AliasAnalysis*/ 0,
> + PP->getAnalysisIfAvailable<ScalarEvolution>());
> + }
> +
> return true;
> }
>
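
The choice of which loop UnrollLoop() re-simplifies can be read off the hunk above. Below is a small sketch of just that selection, assuming a toy loop type with a parent pointer. ToyLoop and loopToResimplify are hypothetical names, and the sketch omits the `PP && DT` guard that the real code checks first.

```cpp
#include <cassert>

// Hypothetical stand-in for llvm::Loop; only the parent link matters here.
struct ToyLoop {
  ToyLoop *Parent = nullptr;
};

// Returns the loop to hand to the simplifier after unrolling L, or nullptr
// when there is nothing left to simplify.
ToyLoop *loopToResimplify(ToyLoop *L, bool CompletelyUnrolled) {
  // Prefer one layer outside the unrolled loop, so that changes the
  // unrolling exposed in the parent are also considered.
  ToyLoop *OuterL = L->Parent;
  // A top-level loop that still exists (partial unroll) is simplified itself.
  if (!OuterL && !CompletelyUnrolled)
    OuterL = L;
  return OuterL;
}
```

The one case that yields no loop is a top-level loop that was completely unrolled: it has no parent, and the loop itself is gone.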
> Modified: llvm/trunk/test/Transforms/IndVarSimplify/lftr-reuse.ll
> URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/IndVarSimplify/lftr-reuse.ll?rev=199884&r1=199883&r2=199884&view=diff
> ==============================================================================
> --- llvm/trunk/test/Transforms/IndVarSimplify/lftr-reuse.ll (original)
> +++ llvm/trunk/test/Transforms/IndVarSimplify/lftr-reuse.ll Thu Jan 23 05:23:19 2014
> @@ -38,17 +38,16 @@ for.end:
> ret void
> }
>
> -; It would be nice if SCEV and any loop analysis could assume that
> -; preheaders exist. Unfortunately it is not always the case. This test
> -; checks that SCEVExpander can handle an outer loop that has not yet
> -; been simplified. As a result, the inner loop's exit test will not be
> -; rewritten.
> +; This test checks that SCEVExpander can handle an outer loop that has been
> +; simplified, and as a result the inner loop's exit test will be rewritten.
> define void @expandOuterRecurrence(i32 %arg) nounwind {
> entry:
> %sub1 = sub nsw i32 %arg, 1
> %cmp1 = icmp slt i32 0, %sub1
> br i1 %cmp1, label %outer, label %exit
>
> +; CHECK: outer:
> +; CHECK: icmp slt
> outer:
> %i = phi i32 [ 0, %entry ], [ %i.inc, %outer.inc ]
> %sub2 = sub nsw i32 %arg, %i
> @@ -60,7 +59,6 @@ inner.ph:
> br label %inner
>
> ; CHECK: inner:
> -; CHECK: icmp slt
> ; CHECK: br i1
> inner:
> %j = phi i32 [ 0, %inner.ph ], [ %j.inc, %inner ]
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits