[PATCH] D29381: [PM/LCG] Remove the lazy RefSCC formation from the LazyCallGraph during iteration.
Daniel Berlin via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sun Feb 5 12:14:20 PST 2017
dberlin added inline comments.
Comment at: lib/Analysis/LazyCallGraph.cpp:1690
// node onto the stack.
> dberlin wrote:
> > Just to note, you don't actually have to push the stack here.
> > This will push every node onto the stack.
> > It's possible to push only the root nodes onto the stack.
> > This is more memory efficient, and about 25% faster.
> > This is Nuutila's variant, which is easy to implement:
> > http://www.cse.tkk.fi/~enu/ps/ipl-scc.ps
> > or
> > http://www.cse.tkk.fi/~enu/ps/tko-b94-scc.ps
> > or
> > the Tarjan SCC part of https://reviews.llvm.org/D28934
> > (For the papers, you want algorithm 1, not 2)
> My understanding of Nuutila is that it's faster if the graph is sparse, and when I spoke to Chandler and proposed this, he mentioned the graph may be not that sparse (for some definition of sparse). I have never found the SCC computation to be a bottleneck (even for large programs), but I like Nuutila's variant, so I wouldn't be opposed to it (and I think it's worth a try).
It's either faster or the same in every case, and given it's switching like two lines, IMHO, I'd do it, but ¯\_(ツ)_/¯
The real speedup is unrelated to whether the graph is sparse or not. What it's really related to is what happens on trees/DAGs.
Tarjan's algorithm always pushes.
Nuutila's variant pushes a node only if it's actually part of a cycle.
So nodes that are part of trees/DAGs will not end up on the stack.
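
To make the push behaviour concrete, here is a minimal sketch of the idea, written from my reading of algorithm 1 in the papers linked above. It is not the LazyCallGraph code and not the code in D28934: the names (SCCResult, NuutilaSCC, computeSCCs) and the adjacency-list graph are made up for illustration, and it recurses for brevity where a real implementation would use an explicit DFS stack. NumPushes exists only so the stack traffic can be observed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative names only; nothing here is taken from LazyCallGraph.cpp.
struct SCCResult {
  std::vector<int> Component; // Component[V] = id of the SCC containing V
  int NumComponents = 0;
  std::size_t NumPushes = 0;  // total nodes ever pushed onto the SCC stack
};

struct NuutilaSCC {
  const std::vector<std::vector<int>> &Adj;
  std::vector<int> Index;        // DFS visit number, -1 while unvisited
  std::vector<int> Root;         // lowest DFS number of a candidate root
  std::vector<bool> InComponent; // true once the node's SCC is finalized
  std::vector<int> Stack;        // holds only non-root members of cycles
  int NextIndex = 0;
  SCCResult Result;

  explicit NuutilaSCC(const std::vector<std::vector<int>> &G)
      : Adj(G), Index(G.size(), -1), Root(G.size(), -1),
        InComponent(G.size(), false) {
    Result.Component.assign(G.size(), -1);
  }

  void visit(int V) {
    Index[V] = Root[V] = NextIndex++;
    for (int W : Adj[V]) {
      assert(W >= 0 && static_cast<std::size_t>(W) < Adj.size());
      if (Index[W] < 0)
        visit(W);
      // Successors in an already finalized SCC cannot lower our root.
      if (!InComponent[W])
        Root[V] = std::min(Root[V], Root[W]);
    }
    if (Root[V] != Index[V]) {
      // V is a non-root member of a cycle: the only case that pushes.
      Stack.push_back(V);
      ++Result.NumPushes;
      return;
    }
    // V is the root of its SCC. V itself never goes onto the stack.
    int Id = Result.NumComponents++;
    InComponent[V] = true;
    Result.Component[V] = Id;
    // Everything on the stack visited after V belongs to V's SCC.
    while (!Stack.empty() && Index[Stack.back()] > Index[V]) {
      int W = Stack.back();
      Stack.pop_back();
      InComponent[W] = true;
      Result.Component[W] = Id;
    }
  }
};

SCCResult computeSCCs(const std::vector<std::vector<int>> &Adj) {
  NuutilaSCC S(Adj);
  for (int V = 0, E = static_cast<int>(Adj.size()); V != E; ++V)
    if (S.Index[V] < 0)
      S.visit(V);
  return S.Result;
}
```

On a graph that is a pure tree or DAG, every node is its own root, so NumPushes stays at zero; plain Tarjan would have pushed and popped each of those nodes once.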