<div dir="ltr"><div>Tobias,<br><br>Thanks for the comments and taking a look at my fork.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"> <span style="font-size:12.8px">-  It would be convenient to allow .ll files to be read.</span></blockquote><div>Sure, I can add it tomorrow.</div><div><br></div><div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span style="font-size:12.8px">- You do not use the BB names in the output, right? This would certainly</span><br style="font-size:12.8px"><span style="font-size:12.8px">   improve readability!</span></blockquote><div>You actually don't have to convert you bitcode files to 'graph' files (as long as they contain a single function) -- then names should be preserved, but you don't get to play with updates. The format I'm using isn't very good, but it's compatible with some other implementation by Loucas -- the author of the algorithm. Do you think that having a format like that in LLVM would make sense? Danny and I though about in the context of quickly writing and modifying tests for dominators and things like the NewGVN.</div></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span style="font-size:12.8px">- My output prints "New Dominator Tree" to times in a row. What is the</span><br style="font-size:12.8px"><span style="font-size:12.8px">difference? What is the difference to Inorder Dominator Tree?</span><br></blockquote><div> When you graph files have updates (i numFrom numTo and d numFrom numTo) then the first one is the original one and the second one is the one after applying all the updates. Inorder Dominator Tree is the existing DomTree -- I used to use it just for visual comparison.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><span style="font-size:12.8px">- How do I get the post-dominator tree with your tool? Do you already<br></span><span style="font-size:12.8px">have test cases? In fact, normal IR test cases would be really cool. We<br></span><span style="font-size:12.8px">do not have a lot of test coverage for all the dominance stuff.</span></blockquote><div> I haven't really played with postdominators yet. Do you have any specific ideas on how to test it in mind? And I definitely agree, dominators need to be tested more thoroughly. I think that because of the manual updates (of questionable correctness) and frequent recalculations there must be many undiscovered bugs that we just haven't had a chance to observe yet.<br><br>Best,<br>Kuba    <br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 13, 2017 at 12:05 AM, Tobias Grosser <span dir="ltr"><<a href="mailto:tobias.grosser@inf.ethz.ch" target="_blank">tobias.grosser@inf.ethz.ch</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

<br>

On Tue, Jun 13, 2017, at 08:47 AM, Jakub (Kuba) Kuderski via llvm-dev<br>

wrote:<br>

<span class="">> Hi Tobias,<br>

><br>

> 1) Daniel and Chandler have for a long time been talking about computing<br>

> > dominance and post-dominance in one shot to reduce the cost of<br>

> > post-dominance and make it (freely) available everywhere.  Is this<br>

> > covered by your current (or planned) work?<br>

><br>

> I'm not sure what you exactly mean by one shot; I'll ask around tomorrow.<br>

<br>

</span>Not sure what algorithm Daniel had in mind, but he was talking about<br>

some approach that basically gives post-dominance "for free" when<br>

computing dominance. Not sure how exactly this would work.<br>

<span class=""><br>

>  I wanted to play a little bit with your work, but run in some issues:<br>

> > > bin/llvm-as ../test/Analysis/Dominators/<wbr>2007-07-11-SplitBlock.ll<br>

> > > bin/dominators -to-graph ../test/Analysis/Dominators/<wbr>2007-07-11-SplitBlock.bc<br>

> > | tee graph<br>

> > p 5 5 1 1<br>

> > a 1 3<br>

> > a 1 5<br>

> > a 2 3<br>

> > a 3 2<br>

> > a 3 4<br>

> > e<br>

> > > bin/dominators graph<br>

> > Unknown file format for graph<br>

> > Invalid input graph<br>

> > I must do something obvious wrong. Any idea what?<br>

><br>

><br>

> You almost got it right -- 'graph' files must end with ".txt"... :P<br>

<br>

</span>I got it. Thank you! I also realized I can directly read in .bc files.<br>

<br>

I have a couple of immediate (but rather minor) comments:<br>

<br>

-  It would be convenient to allow .ll files to be read.<br>

- You do not use the BB names in the output, right? This would certainly<br>

   improve readability!<br>

- My output prints "New Dominator Tree" to times in a row. What is the<br>

difference? What is the difference to Inorder Dominator Tree?<br>

- How do I get the post-dominator tree with your tool? Do you already<br>

have test cases? In fact, normal IR test cases would be really cool. We<br>

do not have a lot of test coverage for all the dominance stuff.<br>

<br>

Best,<br>

Tobias<br>

<div class="HOEnZb"><div class="h5"><br>

><br>

> Best,<br>

> Kuba<br>

><br>

> On Mon, Jun 12, 2017 at 11:24 PM, Tobias Grosser<br>

> <<a href="mailto:tobias.grosser@inf.ethz.ch">tobias.grosser@inf.ethz.ch</a><br>

> > wrote:<br>

><br>

> > Hi  Jakub,<br>

> ><br>

> > thanks for the update. I was already eagerly waiting to hear what your<br>

> > internship on dominance brings. I think the numbers are already very<br>

> > encouraging and I believe moving incrementally over to the new algorithm<br>

> > is exactly the right approach.<br>

> ><br>

> > I have a couple of questions:<br>

> ><br>

> > 1) Daniel and Chandler have for a long time been talking about computing<br>

> > dominance and post-dominance in one shot to reduce the cost of<br>

> > post-dominance and make it (freely) available everywhere.  Is this<br>

> > covered by your current (or planned) work?<br>

> ><br>

> > 2) I tried to use tools/dominators<br>

> ><br>

> > I wanted to play a little bit with your work, but run in some issues:<br>

> ><br>

> > > bin/llvm-as ../test/Analysis/Dominators/<wbr>2007-07-11-SplitBlock.ll<br>

> > > bin/dominators -to-graph ../test/Analysis/Dominators/<wbr>2007-07-11-SplitBlock.bc<br>

> > | tee graph<br>

> > p 5 5 1 1<br>

> > a 1 3<br>

> > a 1 5<br>

> > a 2 3<br>

> > a 3 2<br>

> > a 3 4<br>

> > e<br>

> > > bin/dominators graph<br>

> > Unknown file format for graph<br>

> > Invalid input graph<br>

> ><br>

> > I must do something obvious wrong. Any idea what?<br>

> ><br>

> > Best,<br>

> > Tobias<br>

> ><br>

> > On Tue, Jun 13, 2017, at 02:12 AM, Jakub (Kuba) Kuderski via llvm-dev<br>

> > wrote:<br>

> > > Hi folks,<br>

> > ><br>

> > > This summer I'm working on improving dominators during my internship at<br>

> > > Google. Below is an RFC on switching to dynamic dominators, which you can<br>

> > > also read as a Google Doc<br>

> > > <<a href="https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId" rel="noreferrer" target="_blank">https://docs.google.com/<wbr>document/d/<wbr>1wPYeWykeO51YDPLYQEg4KNTlDIGId</a><br>

> > yF65OTfhSMaNHQ/edit?usp=<wbr>sharing><br>

> > > if you prefer so. Please let us know what you think.<br>

> > ><br>

> > > ~Kuba<br>

> > ><br>

> > > ==============================<wbr>==============================<wbr>===========<br>

> > ><br>

> > > *1. Context*<br>

> > ><br>

> > > Currently LLVM uses the Simple Lengauer-Tarjan algorithm to build the<br>

> > > (post)dominator tree. The only way to update it is by manually setting<br>

> > > IDoms which is not obvious in many cases and can be extremely<br>

> > > error-prone.<br>

> > > And because updating it manually is so difficult, programmers tend to<br>

> > > just<br>

> > > recompute it after changing the CFG (by not AU.addPreserved()'ing the<br>

> > > DomTree). This causes DomTree calculation to fire very often even if only<br>

> > > a<br>

> > > very small portion of it gets really affected by the changes in CFG. As<br>

> > > an<br>

> > > example, when optimizing a Full LTO clang bitcode,<br>

> > > DominatorTreeWrapperPass<br>

> > > alone calls DT.recalculate over 6.5 million times, which takes 20s on my<br>

> > > machine.<br>

> > ><br>

> > > Using an incremental algorithm it would be much easier to keep an<br>

> > > up-to-date DomTree without custom update logic, which will save us the<br>

> > > time<br>

> > > currently spent during DomTree recalculations and reduce the number of<br>

> > > bugs<br>

> > > caused by manual updates. It would also make it feasible to maintain<br>

> > > postdominators and use them more broadly, which currently can be too<br>

> > > complicated and expensive.<br>

> > ><br>

> > > *2. Proposal*<br>

> > ><br>

> > > The proposal is to make dominators use the incremental algorithm that<br>

> > > allows to keep (post)dominator tree up to date as CFG changes. To achieve<br>

> > > that, we would implement the dynamic depth based search algorithm (DBS)<br>

> > > described in this paper [1] <<a href="https://arxiv.org/pdf/1604.02711.pdf" rel="noreferrer" target="_blank">https://arxiv.org/pdf/1604.<wbr>02711.pdf</a>> and<br>

> > > expose 2 main new functions: insertArc(From, To) and deleteArc(From, To).<br>

> > > The algorithm uses SemiNCA under the hood which would replace<br>

> > > Lengauer-Tarjan implementation currently used.<br>

> > ><br>

> > > The second part of the proposal is to gradually deprecate and remove the<br>

> > > existing API for manually manipulating dominator tree<br>

> > > (changeImmediateDominator, addNewBlock) and replace its use within LLVM<br>

> > > with the new incremental API.<br>

> > ><br>

> > > *3. Proof of concept*<br>

> > ><br>

> > > The prototype implementation can be found in my LLVM fork [2]<br>

> > > <<a href="https://github.com/kuhar/llvm-dominators" rel="noreferrer" target="_blank">https://github.com/kuhar/<wbr>llvm-dominators</a>>. It comes with several layers<br>

> > > of<br>

> > > verification and was tested on clang, llvm test suite and a few open<br>

> > > source<br>

> > > libraries.<br>

> > > The code provides the basic functionality and tries be ‘API-equivalent’<br>

> > > with the current DomTree implementation. The NewDomTree is also able to<br>

> > > convert itself to the current one for testing and verification purposes.<br>

> > > Dynamic dominators are hacked into 3 existing passes (DomTreeWrapperPass,<br>

> > > LoopDeletion, LoopSimplifyCFG) to test correctness and experiment with<br>

> > > the<br>

> > > use of the new API.<br>

> > ><br>

> > > The prototype also comes with a bunch of testing utilities and a simple<br>

> > > benchmarking tool.<br>

> > ><br>

> > > *4. Performance*<br>

> > ><br>

> > > The real life performance of full dynamic DBS-based DomTree recalculation<br>

> > > is between 20% and 2% better on a machine with two Xeon E5-2680v2 CPUs<br>

> > > than<br>

> > > the existing SLT algorithm, which is consistent with the findings from<br>

> > > this<br>

> > > thesis [3] <<a href="ftp://ftp.cs.princeton.edu/reports/2005/737.pdf" rel="noreferrer" target="_blank">ftp://ftp.cs.princeton.edu/<wbr>reports/2005/737.pdf</a>>. The<br>

> > > advantage<br>

> > > is not that visible on debug builds, where the DBS-algorithm can be up to<br>

> > > 8% slower. It is most like possible to speed up debug builds, but I<br>

> > > haven’t<br>

> > > looked into that yet.<br>

> > > The benchmark performed [4]<br>

> > > <<a href="https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId" rel="noreferrer" target="_blank">https://docs.google.com/<wbr>document/d/<wbr>1wPYeWykeO51YDPLYQEg4KNTlDIGId</a><br>

> > yF65OTfhSMaNHQ/edit?usp=<wbr>sharing><br>

> > > loads of a (full LTO) bitcode file and builds DomTress of all functions<br>

> > > 15<br>

> > > times.<br>

> > ><br>

> > > The current DomTree updates DFS in-out numbers eagerly upon construction,<br>

> > > while the new one postpones is until they are actually needed. To make<br>

> > > the<br>

> > > benchmark fair, numbers were collected for both eager and lazy strategy<br>

> > > for<br>

> > > the new DomTree.<br>

> > ><br>

> > > The main advantage of the incremental algorithm comes from the fact that<br>

> > > it<br>

> > > allows incremental updates without rebuilding the whole tree, not from<br>

> > > the<br>

> > > slightly faster construction.<br>

> > > What’s more, the insertArc / deleteArc API doesn’t force updates to<br>

> > > happen<br>

> > > immediately -- they can be queued behind the scenes and happen in batches<br>

> > > if we decide to pursue that design later.<br>

> > ><br>

> > > *5. Transition plan*<br>

> > ><br>

> > > There are several possibilities when it comes to transition. The biggest<br>

> > > problem is that the current DomTree exposes an interface for manual<br>

> > > updates<br>

> > > (setIDom, changeImmediateDominator), and for manual construction<br>

> > > (addNewBlock). Because of the additional data stored in the incremental<br>

> > > algorithm (relative dominators, preorder parents, levels) it is not<br>

> > > really<br>

> > > possible to use the old API while keeping this data up-to-date. The<br>

> > > incremental algorithm relies on this information when performing fast arc<br>

> > > deletions; It is still able to perform them without it -- deletions are<br>

> > > then just slower in some cases.<br>

> > > The most straightforward solutions are:<br>

> > ><br>

> > > a) Keep the existing DomTree and gradually replace its uses with the new<br>

> > > one. It is possible to convert the DBS-based dominators to the old ones.<br>

> > > b) Replace the existing DomTree with the new, dynamic dominators. Nuke<br>

> > > all<br>

> > > of the old update functions and replace their uses with the new<br>

> > > incremental<br>

> > > API in one commit.<br>

> > > c) Replace the existing DomTree with the new one, but keep the old API<br>

> > > around and mark it as deprecated. If someone invokes one of the old<br>

> > > update<br>

> > > functions, mark the the additional information as invalid for dynamic<br>

> > > deletions. Follow the pessimistic but correct dynamic deletion path if<br>

> > > the<br>

> > > additional information is marked as invalidated. Gradually modernize the<br>

> > > projects to use the new API instead and then remove the old update<br>

> > > functions.<br>

> > ><br>

> > > Maintaining the old DT and the new one simultaneously seems like a waste<br>

> > > of<br>

> > > time as the DBS offers better or similar performance to the existing<br>

> > > SLT-based implementation.<br>

> > > The problem with replacing the old API with the new one is that it’s used<br>

> > > in many places (~100) across the project and I believe that doing that<br>

> > > all<br>

> > > at once would be tricky to get correct. Gradual update seems much easier<br>

> > > to<br>

> > > ensure correctness and the transitional API (invalid additional data<br>

> > > check)<br>

> > > is trivial to implement. Because of that, I propose to follow the option<br>

> > > (c).<br>

> > ><br>

> > ><br>

> > ><br>

> > > [1] Georgiadis et al., An Experimental Study of Dynamic Dominators,<br>

> > > <a href="https://arxiv.org/pdf/1604.02711.pdf" rel="noreferrer" target="_blank">https://arxiv.org/pdf/1604.<wbr>02711.pdf</a><br>

> > > [2] llvm-dominators LLVM fork on Github,<br>

> > > <a href="https://github.com/kuhar/llvm-dominators" rel="noreferrer" target="_blank">https://github.com/kuhar/llvm-<wbr>dominators</a><br>

> > > [3] L. Georgiadis, Linear-Time Algorithms for Dominators and Related<br>

> > > Problems, <a href="ftp://ftp.cs.princeton.edu/reports/2005/737.pdf" rel="noreferrer" target="_blank">ftp://ftp.cs.princeton.edu/<wbr>reports/2005/737.pdf</a> p. 38<br>

> > > [4] Google Doc with the performance numbers,<br>

> > > <a href="https://docs.google.com/document/d/1wPYeWykeO51YDPLYQEg4KNTlDIGId" rel="noreferrer" target="_blank">https://docs.google.com/<wbr>document/d/<wbr>1wPYeWykeO51YDPLYQEg4KNTlDIGId</a><br>

> > yF65OTfhSMaNHQ/edit?usp=<wbr>sharing<br>

> > > ______________________________<wbr>_________________<br>

> > > LLVM Developers mailing list<br>

> > > <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

> > > <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

> ><br>

><br>

><br>

><br>

> --<br>

> Jakub Kuderski<br>

> ______________________________<wbr>_________________<br>

> LLVM Developers mailing list<br>

> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><div>Jakub Kuderski</div></div>

</div>