<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><div>On Jan 6, 2012, at 8:27 PM, Daniel Berlin wrote:</div><br><blockquote type="cite"><div>Note: GCC takes exactly the same approach as LLVM here, for exactly<br>the reason chris specifies.<br>In fact, until we started local SSA updating (which is now many years<br>ago, but ...), dominance frontier calculation for ssa updating was in<br>the top 10 profile functions for GCC compiles of large source files.<br>I had tried a number of the linear time phi placement algorithms<br>instead, but sadly, none were fast enough in practice (I believe<br>someone later published a paper comparing them all and finding the<br>same thing).<br></div></blockquote><div><br></div><div>I wrote the current implementation of the up-front SSA construction in LLVM, found in Utils/PromoteMemoryToRegister.cpp. It uses a modified version of the Sreedhar-Gao algorithm from "A linear time algorithm for placing phi-nodes". The most interesting modification is that it uses per-variable liveness information to compute pruned SSA by pruning the entire IDF computation, rather than merely pruning the DF -> IDF step like is usually done. I was a bit surprised that it worked so well, given the reputation of the "linear time" algorithms, but it performed the best out of everything I attempted.</div><br><blockquote type="cite"><div>As a completely random aside, the algorithm given in the original<br>paper you cited is actually not quite the best way in practice.  It<br>performs more comparisons than strictly necessary. When I moved us to<br>the algorithm found in <a href="http://www.cs.rice.edu/~keith/Embed/dom.pdf">http://www.cs.rice.edu/~keith/Embed/dom.pdf</a>,<br>Figure 5, it was ~30% faster (AFAIK, Jeanne Ferrante is somewhat<br>responsible for both algorithms :P). It is also a lot simpler to<br>explain, IMHO.<br></div></blockquote></div><br><div>There is a followup paper from other authors (including Tarjan himself) that discusses some implementation tricks for the Lengauer-Tarjan algorithm:</div><div><br></div><div><a href="http://www.cs.princeton.edu/~rwerneck/docs/GWTTA04.pdf">http://www.cs.princeton.edu/~rwerneck/docs/GWTTA04.pdf</a></div><div><br></div><div>They cite personal communication from Keith Cooper that a later more careful implementation of Lengauer-Tarjan led to different benchmark results than what they published. I added those tricks and some others to LLVM's implementation of Lengauer-Tarjan (the simple O(n log n) version) for a significant performance improvement. I should do a comparison with a careful implementation of the Cooper-Harvey-Kennedy algorithm some day.</div><div><br></div><div>Cameron</div></body></html>