<div style="font-family: arial, helvetica, sans-serif"><font size="2"><div class="gmail_quote">On Tue, Jun 19, 2012 at 10:46 PM, Andrew Trick <span dir="ltr"><<a href="mailto:atrick@apple.com" target="_blank">atrick@apple.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><br><div><div class="im"><div>On Jun 19, 2012, at 10:41 PM, Chandler Carruth <<a href="mailto:chandlerc@google.com" target="_blank">chandlerc@google.com</a>> wrote:</div>
<br><blockquote type="cite"><div style="font-family:arial,helvetica,sans-serif"><font size="2"><div class="gmail_quote">On Tue, Jun 19, 2012 at 10:38 PM, Andrew Trick <span dir="ltr"><<a href="mailto:atrick@apple.com" target="_blank">atrick@apple.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Please review and try to break this. I plan to enable it by default shortly.<br>
-Andy<br></blockquote><div><br></div><div>Is there a specific clang commandline flag that you'd like tested? If so, we can throw some serious testing at it very easily.</div></div></font></div></blockquote><div><br></div>
</div>I don't expect any performance impact other than accidental (which I have seen plenty of). LoopInfo compile time is hopefully a bit faster now on average but it's hardly measurable (I saw +1%).</div><div><br>
</div><div>However, if you want to do some correctness testing just do your usual thing with:</div><div><br></div><div>-mllvm -stable-loops -mllvm -stable-machine-loops</div></div></blockquote><div><br></div><div>Yea, this is mostly for correctness testing. I can do a before/after test run with these flags across our entire codebase, and all of our automated tests tomorrow. I'll let you know if anything unusual shows up.</div>
<div><br></div><div>-Chandler</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="word-wrap:break-word"><div><br></div><div>Thanks!</div><div>
-Andy</div><div><div class="h5"><div><br><blockquote type="cite"><div style="font-family:arial,helvetica,sans-serif"><font size="2"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div><div><br>
On Jun 19, 2012, at 10:23 PM, Andrew Trick <<a href="mailto:atrick@apple.com" target="_blank">atrick@apple.com</a>> wrote:<br>
<br>
> Author: atrick<br>
> Date: Wed Jun 20 00:23:33 2012<br>
> New Revision: 158790<br>
><br>
> URL: <a href="http://llvm.org/viewvc/llvm-project?rev=158790&view=rev" target="_blank">http://llvm.org/viewvc/llvm-project?rev=158790&view=rev</a><br>
> Log:<br>
> A new algorithm for computing LoopInfo. Temporarily disabled.<br>
><br>
> -stable-loops enables a new algorithm for generating the Loop<br>
> forest. It differs from the original algorithm in a few respects:<br>
><br>
> - Not determined by use-list order.<br>
> - Initially guarantees RPO order of block and subloops.<br>
> - Linear in the number of CFG edges.<br>
> - Nonrecursive.<br>
><br>
> I didn't want to change the LoopInfo API yet, so the block lists are<br>
> still inclusive. This seems strange to me, and it means that building<br>
> LoopInfo is not strictly linear, but it may not be a problem in<br>
> practice. At least the block lists start out in RPO order now. In the<br>
> future we may add an attribute or wrapper analysis that allows other<br>
> passes to assume RPO order.<br>
><br>
> The primary motivation of this work was not to optimize LoopInfo, but<br>
> to allow reproducing performance issues by decomposing the compilation<br>
> stages. I'm often unable to do this with the current LoopInfo, because<br>
> the loop tree order determines Loop pass order. Serializing the IR<br>
> tends to invert the order, which reverses the optimization order. This<br>
> makes it nearly impossible to debug interdependent loop optimizations<br>
> such as LSR.<br>
><br>
> I also believe this will provide more stable performance results across time.<br>
><br>
> Modified:<br>
> llvm/trunk/include/llvm/Analysis/LoopInfo.h<br>
> llvm/trunk/include/llvm/Analysis/LoopInfoImpl.h<br>
> llvm/trunk/lib/Analysis/LoopInfo.cpp<br>
> llvm/trunk/lib/CodeGen/MachineLoopInfo.cpp<br>
><br>
> Modified: llvm/trunk/include/llvm/Analysis/LoopInfo.h<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/LoopInfo.h?rev=158790&r1=158789&r2=158790&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/LoopInfo.h?rev=158790&r1=158789&r2=158790&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/include/llvm/Analysis/LoopInfo.h (original)<br>
> +++ llvm/trunk/include/llvm/Analysis/LoopInfo.h Wed Jun 20 00:23:33 2012<br>
> @@ -97,6 +97,9 @@<br>
> BlockT *getHeader() const { return Blocks.front(); }<br>
> LoopT *getParentLoop() const { return ParentLoop; }<br>
><br>
> + /// setParentLoop is a raw interface for bypassing addChildLoop.<br>
> + void setParentLoop(LoopT *L) { ParentLoop = L; }<br>
> +<br>
> /// contains - Return true if the specified loop is contained within in<br>
> /// this loop.<br>
> ///<br>
> @@ -122,6 +125,7 @@<br>
> /// iterator/begin/end - Return the loops contained entirely within this loop.<br>
> ///<br>
> const std::vector<LoopT *> &getSubLoops() const { return SubLoops; }<br>
> + std::vector<LoopT *> &getSubLoopsVector() { return SubLoops; }<br>
> typedef typename std::vector<LoopT *>::const_iterator iterator;<br>
> iterator begin() const { return SubLoops.begin(); }<br>
> iterator end() const { return SubLoops.end(); }<br>
> @@ -130,6 +134,7 @@<br>
> /// getBlocks - Get a list of the basic blocks which make up this loop.<br>
> ///<br>
> const std::vector<BlockT*> &getBlocks() const { return Blocks; }<br>
> + std::vector<BlockT*> &getBlocksVector() { return Blocks; }<br>
> typedef typename std::vector<BlockT*>::const_iterator block_iterator;<br>
> block_iterator block_begin() const { return Blocks.begin(); }<br>
> block_iterator block_end() const { return Blocks.end(); }<br>
> @@ -528,6 +533,9 @@<br>
> /// inserted into L instead.<br>
> void InsertLoopInto(LoopT *L, LoopT *Parent);<br>
><br>
> + /// Create the loop forest using a stable algorithm.<br>
> + void Analyze(DominatorTreeBase<BlockT> &DomTree);<br>
> +<br>
> // Debugging<br>
><br>
> void print(raw_ostream &OS) const;<br>
><br>
> Modified: llvm/trunk/include/llvm/Analysis/LoopInfoImpl.h<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/LoopInfoImpl.h?rev=158790&r1=158789&r2=158790&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/Analysis/LoopInfoImpl.h?rev=158790&r1=158789&r2=158790&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/include/llvm/Analysis/LoopInfoImpl.h (original)<br>
> +++ llvm/trunk/include/llvm/Analysis/LoopInfoImpl.h Wed Jun 20 00:23:33 2012<br>
> @@ -16,6 +16,7 @@<br>
> #define LLVM_ANALYSIS_LOOP_INFO_IMPL_H<br>
><br>
> #include "llvm/Analysis/LoopInfo.h"<br>
> +#include "llvm/ADT/PostOrderIterator.h"<br>
><br>
> namespace llvm {<br>
><br>
> @@ -531,6 +532,204 @@<br>
> L->ParentLoop = Parent;<br>
> }<br>
><br>
> +//===----------------------------------------------------------------------===//<br>
> +/// Stable LoopInfo Analysis - Build a loop tree using stable iterators so the<br>
> +/// result does / not depend on use list (block predecessor) order.<br>
> +///<br>
> +<br>
> +/// Discover a subloop with the specified backedges such that: All blocks within<br>
> +/// this loop are mapped to this loop or a subloop. And all subloops within this<br>
> +/// loop have their parent loop set to this loop or a subloop.<br>
> +template<class BlockT, class LoopT><br>
> +static void discoverAndMapSubloop(LoopT *L, ArrayRef<BlockT*> Backedges,<br>
> + LoopInfoBase<BlockT, LoopT> *LI,<br>
> + DominatorTreeBase<BlockT> &DomTree) {<br>
> + typedef GraphTraits<Inverse<BlockT*> > InvBlockTraits;<br>
> +<br>
> + unsigned NumBlocks = 0;<br>
> + unsigned NumSubloops = 0;<br>
> +<br>
> + // Perform a backward CFG traversal using a worklist.<br>
> + std::vector<BlockT *> ReverseCFGWorklist(Backedges.begin(), Backedges.end());<br>
> + while (!ReverseCFGWorklist.empty()) {<br>
> + BlockT *PredBB = ReverseCFGWorklist.back();<br>
> + ReverseCFGWorklist.pop_back();<br>
> +<br>
> + LoopT *Subloop = LI->getLoopFor(PredBB);<br>
> + if (!Subloop) {<br>
> + if (!DomTree.isReachableFromEntry(PredBB))<br>
> + continue;<br>
> +<br>
> + // This is an undiscovered block. Map it to the current loop.<br>
> + LI->changeLoopFor(PredBB, L);<br>
> + ++NumBlocks;<br>
> + if (PredBB == L->getHeader())<br>
> + continue;<br>
> + // Push all block predecessors on the worklist.<br>
> + ReverseCFGWorklist.insert(ReverseCFGWorklist.end(),<br>
> + InvBlockTraits::child_begin(PredBB),<br>
> + InvBlockTraits::child_end(PredBB));<br>
> + }<br>
> + else {<br>
> + // This is a discovered block. Find its outermost discovered loop.<br>
> + while (LoopT *Parent = Subloop->getParentLoop())<br>
> + Subloop = Parent;<br>
> +<br>
> + // If it is already discovered to be a subloop of this loop, continue.<br>
> + if (Subloop == L)<br>
> + continue;<br>
> +<br>
> + // Discover a subloop of this loop.<br>
> + Subloop->setParentLoop(L);<br>
> + ++NumSubloops;<br>
> + NumBlocks += Subloop->getBlocks().capacity();<br>
> + PredBB = Subloop->getHeader();<br>
> + // Continue traversal along predecessors that are not loop-back edges from<br>
> + // within this subloop tree itself. Note that a predecessor may directly<br>
> + // reach another subloop that is not yet discovered to be a subloop of<br>
> + // this loop, which we must traverse.<br>
> + for (typename InvBlockTraits::ChildIteratorType PI =<br>
> + InvBlockTraits::child_begin(PredBB),<br>
> + PE = InvBlockTraits::child_end(PredBB); PI != PE; ++PI) {<br>
> + if (LI->getLoopFor(*PI) != Subloop)<br>
> + ReverseCFGWorklist.push_back(*PI);<br>
> + }<br>
> + }<br>
> + }<br>
> + L->getSubLoopsVector().reserve(NumSubloops);<br>
> + L->getBlocksVector().reserve(NumBlocks);<br>
> +}<br>
> +<br>
> +namespace {<br>
> +/// Populate all loop data in a stable order during a single forward DFS.<br>
> +template<class BlockT, class LoopT><br>
> +class PopulateLoopsDFS {<br>
> + typedef GraphTraits<BlockT*> BlockTraits;<br>
> + typedef typename BlockTraits::ChildIteratorType SuccIterTy;<br>
> +<br>
> + LoopInfoBase<BlockT, LoopT> *LI;<br>
> + DenseSet<const BlockT *> VisitedBlocks;<br>
> + std::vector<std::pair<BlockT*, SuccIterTy> > DFSStack;<br>
> +<br>
> +public:<br>
> + PopulateLoopsDFS(LoopInfoBase<BlockT, LoopT> *li):<br>
> + LI(li) {}<br>
> +<br>
> + void traverse(BlockT *EntryBlock);<br>
> +<br>
> +protected:<br>
> + void reverseInsertIntoLoop(BlockT *Block);<br>
> +<br>
> + BlockT *dfsSource() { return DFSStack.back().first; }<br>
> + SuccIterTy &dfsSucc() { return DFSStack.back().second; }<br>
> + SuccIterTy dfsSuccEnd() { return BlockTraits::child_end(dfsSource()); }<br>
> +<br>
> + void pushBlock(BlockT *Block) {<br>
> + DFSStack.push_back(std::make_pair(Block, BlockTraits::child_begin(Block)));<br>
> + }<br>
> +};<br>
> +} // anonymous<br>
> +<br>
> +/// Top-level driver for the forward DFS within the loop.<br>
> +template<class BlockT, class LoopT><br>
> +void PopulateLoopsDFS<BlockT, LoopT>::traverse(BlockT *EntryBlock) {<br>
> + pushBlock(EntryBlock);<br>
> + VisitedBlocks.insert(EntryBlock);<br>
> + while (!DFSStack.empty()) {<br>
> + // Traverse the leftmost path as far as possible.<br>
> + while (dfsSucc() != dfsSuccEnd()) {<br>
> + BlockT *BB = *dfsSucc();<br>
> + ++dfsSucc();<br>
> + if (!VisitedBlocks.insert(BB).second)<br>
> + continue;<br>
> +<br>
> + // Push the next DFS successor onto the stack.<br>
> + pushBlock(BB);<br>
> + }<br>
> + // Visit the top of the stack in postorder and backtrack.<br>
> + reverseInsertIntoLoop(dfsSource());<br>
> + DFSStack.pop_back();<br>
> + }<br>
> +}<br>
> +<br>
> +/// Add a single Block to its ancestor loops in PostOrder. If the block is a<br>
> +/// subloop header, add the subloop to its parent in PostOrder, then reverse the<br>
> +/// Block and Subloop vectors of the now complete subloop to achieve RPO.<br>
> +template<class BlockT, class LoopT><br>
> +void PopulateLoopsDFS<BlockT, LoopT>::reverseInsertIntoLoop(BlockT *Block) {<br>
> + for (LoopT *Subloop = LI->getLoopFor(Block);<br>
> + Subloop; Subloop = Subloop->getParentLoop()) {<br>
> +<br>
> + if (Block != Subloop->getHeader()) {<br>
> + Subloop->getBlocksVector().push_back(Block);<br>
> + continue;<br>
> + }<br>
> + if (Subloop->getParentLoop())<br>
> + Subloop->getParentLoop()->getSubLoopsVector().push_back(Subloop);<br>
> + else<br>
> + LI->addTopLevelLoop(Subloop);<br>
> +<br>
> + // For convenience, Blocks and Subloops are inserted in postorder. Reverse<br>
> + // the lists, except for the loop header, which is always at the beginning.<br>
> + std::reverse(Subloop->getBlocksVector().begin()+1,<br>
> + Subloop->getBlocksVector().end());<br>
> + std::reverse(Subloop->getSubLoopsVector().begin(),<br>
> + Subloop->getSubLoopsVector().end());<br>
> + }<br>
> +}<br>
> +<br>
> +/// Analyze LoopInfo discovers loops during a postorder DominatorTree traversal<br>
> +/// interleaved with backward CFG traversals within each subloop<br>
> +/// (discoverAndMapSubloop). The backward traversal skips inner subloops, so<br>
> +/// this part of the algorithm is linear in the number of CFG edges. Subloop and<br>
> +/// Block vectors are then populated during a single forward CFG traversal<br>
> +/// (PopulateLoopDFS).<br>
> +///<br>
> +/// During the two CFG traversals each block is seen three times:<br>
> +/// 1) Discovered and mapped by a reverse CFG traversal.<br>
> +/// 2) Visited during a forward DFS CFG traversal.<br>
> +/// 3) Reverse-inserted in the loop in postorder following forward DFS.<br>
> +///<br>
> +/// The Block vectors are inclusive, so step 3 requires loop-depth number of<br>
> +/// insertions per block.<br>
> +template<class BlockT, class LoopT><br>
> +void LoopInfoBase<BlockT, LoopT>::<br>
> +Analyze(DominatorTreeBase<BlockT> &DomTree) {<br>
> +<br>
> + // Postorder traversal of the dominator tree.<br>
> + DomTreeNodeBase<BlockT>* DomRoot = DomTree.getRootNode();<br>
> + for (po_iterator<DomTreeNodeBase<BlockT>*> DomIter = po_begin(DomRoot),<br>
> + DomEnd = po_end(DomRoot); DomIter != DomEnd; ++DomIter) {<br>
> +<br>
> + BlockT *Header = DomIter->getBlock();<br>
> + SmallVector<BlockT *, 4> Backedges;<br>
> +<br>
> + // Check each predecessor of the potential loop header.<br>
> + typedef GraphTraits<Inverse<BlockT*> > InvBlockTraits;<br>
> + for (typename InvBlockTraits::ChildIteratorType PI =<br>
> + InvBlockTraits::child_begin(Header),<br>
> + PE = InvBlockTraits::child_end(Header); PI != PE; ++PI) {<br>
> +<br>
> + BlockT *Backedge = *PI;<br>
> +<br>
> + // If Header dominates predBB, this is a new loop. Collect the backedges.<br>
> + if (DomTree.dominates(Header, Backedge)<br>
> + && DomTree.isReachableFromEntry(Backedge)) {<br>
> + Backedges.push_back(Backedge);<br>
> + }<br>
> + }<br>
> + // Perform a backward CFG traversal to discover and map blocks in this loop.<br>
> + if (!Backedges.empty()) {<br>
> + LoopT *L = new LoopT(Header);<br>
> + discoverAndMapSubloop(L, ArrayRef<BlockT*>(Backedges), this, DomTree);<br>
> + }<br>
> + }<br>
> + // Perform a single forward CFG traversal to populate block and subloop<br>
> + // vectors for all loops.<br>
> + PopulateLoopsDFS<BlockT, LoopT> DFS(this);<br>
> + DFS.traverse(DomRoot->getBlock());<br>
> +}<br>
> +<br>
> // Debugging<br>
> template<class BlockT, class LoopT><br>
> void LoopInfoBase<BlockT, LoopT>::print(raw_ostream &OS) const {<br>
><br>
> Modified: llvm/trunk/lib/Analysis/LoopInfo.cpp<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/LoopInfo.cpp?rev=158790&r1=158789&r2=158790&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Analysis/LoopInfo.cpp?rev=158790&r1=158789&r2=158790&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/lib/Analysis/LoopInfo.cpp (original)<br>
> +++ llvm/trunk/lib/Analysis/LoopInfo.cpp Wed Jun 20 00:23:33 2012<br>
> @@ -44,6 +44,10 @@<br>
> VerifyLoopInfoX("verify-loop-info", cl::location(VerifyLoopInfo),<br>
> cl::desc("Verify loop info (time consuming)"));<br>
><br>
> +static cl::opt<bool><br>
> +StableLoopInfo("stable-loops", cl::Hidden, cl::init(false),<br>
> + cl::desc("Compute a stable loop tree."));<br>
> +<br>
> char LoopInfo::ID = 0;<br>
> INITIALIZE_PASS_BEGIN(LoopInfo, "loops", "Natural Loop Information", true, true)<br>
> INITIALIZE_PASS_DEPENDENCY(DominatorTree)<br>
> @@ -512,7 +516,10 @@<br>
> //<br>
> bool LoopInfo::runOnFunction(Function &) {<br>
> releaseMemory();<br>
> - LI.Calculate(getAnalysis<DominatorTree>().getBase()); // Update<br>
> + if (StableLoopInfo)<br>
> + LI.Analyze(getAnalysis<DominatorTree>().getBase());<br>
> + else<br>
> + LI.Calculate(getAnalysis<DominatorTree>().getBase()); // Update<br>
> return false;<br>
> }<br>
><br>
><br>
> Modified: llvm/trunk/lib/CodeGen/MachineLoopInfo.cpp<br>
> URL: <a href="http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineLoopInfo.cpp?rev=158790&r1=158789&r2=158790&view=diff" target="_blank">http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/MachineLoopInfo.cpp?rev=158790&r1=158789&r2=158790&view=diff</a><br>
> ==============================================================================<br>
> --- llvm/trunk/lib/CodeGen/MachineLoopInfo.cpp (original)<br>
> +++ llvm/trunk/lib/CodeGen/MachineLoopInfo.cpp Wed Jun 20 00:23:33 2012<br>
> @@ -18,6 +18,7 @@<br>
> #include "llvm/CodeGen/MachineDominators.h"<br>
> #include "llvm/CodeGen/Passes.h"<br>
> #include "llvm/Analysis/LoopInfoImpl.h"<br>
> +#include "llvm/Support/CommandLine.h"<br>
> #include "llvm/Support/Debug.h"<br>
> using namespace llvm;<br>
><br>
> @@ -25,6 +26,10 @@<br>
> template class llvm::LoopBase<MachineBasicBlock, MachineLoop>;<br>
> template class llvm::LoopInfoBase<MachineBasicBlock, MachineLoop>;<br>
><br>
> +static cl::opt<bool><br>
> +StableLoopInfo("stable-machine-loops", cl::Hidden, cl::init(false),<br>
> + cl::desc("Compute a stable loop tree."));<br>
> +<br>
> char MachineLoopInfo::ID = 0;<br>
> INITIALIZE_PASS_BEGIN(MachineLoopInfo, "machine-loops",<br>
> "Machine Natural Loop Construction", true, true)<br>
> @@ -36,7 +41,10 @@<br>
><br>
> bool MachineLoopInfo::runOnMachineFunction(MachineFunction &) {<br>
> releaseMemory();<br>
> - LI.Calculate(getAnalysis<MachineDominatorTree>().getBase()); // Update<br>
> + if (StableLoopInfo)<br>
> + LI.Analyze(getAnalysis<MachineDominatorTree>().getBase());<br>
> + else<br>
> + LI.Calculate(getAnalysis<MachineDominatorTree>().getBase()); // Update<br>
> return false;<br>
> }<br>
><br>
><br>
><br>
> _______________________________________________<br>
> llvm-commits mailing list<br>
> <a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
> <a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
</div></div></blockquote></div><br></font></div>
</blockquote></div><br></div></div></div>
</blockquote></div><br></font></div>