[PATCH] D29381: [PM/LCG] Remove the lazy RefSCC formation from the LazyCallGraph during iteration.

Sun Feb 5 12:14:20 PST 2017

dberlin added inline comments.

================
Comment at: lib/Analysis/LazyCallGraph.cpp:1690
           // node onto the stack.
           DFSStack.push_back({N, I});

----------------
davide wrote:
> dberlin wrote:
> > Just to note, you don't actually have to push the stack here.
> > 
> > This will push every node onto the stack.
> > it's possible to push only the root nodes onto the stack.
> > 
> > This is more memory efficient, and about 25% faster.
> > 
> > This is Nuutila's variant, which is easy to implement:
> > 
> > http://www.cse.tkk.fi/~enu/ps/ipl-scc.ps
> > or
> > http://www.cse.tkk.fi/~enu/ps/tko-b94-scc.ps
> > or 
> > the tarjan scc part of https://reviews.llvm.org/D28934
> > 
> > (For the papers, you want algorithm 1, not 2)
> > 
> My understanding of Nuutila is that it's faster if the graph is sparse, and when I spoke and proposed this to Chandler he mentioned the graph may be not-that-sparse (for some definition of). I never found the SCC computation being a bottleneck (even for large programs), but I like Nuutila's variant so I wouldn't be opposed to that (and I think it's worth a try).
It's either faster or the same in every case, and given that it's a change of about two lines, IMHO I'd do it, but ¯\_(ツ)_/¯

The real speedup is unrelated to whether the graph is sparse or not. What it's really related to is what happens on trees/DAGs.

Tarjan's algorithm always pushes every node it visits.
Nuutila's variant only pushes a node if it's actually part of a cycle.
So nodes that are part of trees/DAGs will never end up on the stack.
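
To make the difference concrete, here is a rough standalone sketch of how I read algorithm 1 from the papers. It is not the LazyCallGraph code: it is recursive rather than using an explicit DFSStack, and the names (NuutilaSCC, SCCResult, MaxStackDepth) are made up for illustration. A node is parked on the stack only once it turns out not to be the root of its component, so singleton components (everything in a tree/DAG) never touch the stack.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch; not an LLVM type.
struct SCCResult {
  std::vector<int> Component;    // Component[V] = id of V's SCC, -1 if open
  int NumComponents = 0;
  std::size_t MaxStackDepth = 0; // peak size of the component stack
};

// Sketch of Nuutila's first variant of Tarjan's SCC algorithm.
// Graph[V] lists the successors of node V.
class NuutilaSCC {
public:
  explicit NuutilaSCC(const std::vector<std::vector<int>> &Graph)
      : Adj(Graph), Index(Graph.size(), -1), Root(Graph.size(), -1) {
    Result.Component.assign(Graph.size(), -1);
  }

  SCCResult run() {
    for (int V = 0; V < static_cast<int>(Adj.size()); ++V)
      if (Index[V] < 0)
        visit(V);
    return Result;
  }

private:
  void visit(int V) {
    // Root[V] holds the smallest DFS index of a candidate root for V.
    Index[V] = Root[V] = NextIndex++;
    for (int W : Adj[V]) {
      if (Index[W] < 0)
        visit(W);
      // Successors already assigned to a finished component are ignored.
      if (Result.Component[W] < 0)
        Root[V] = std::min(Root[V], Root[W]);
    }

    if (Root[V] != Index[V]) {
      // V is not a root, so it sits on a cycle: park it until the root
      // of its component finishes.
      Stack.push_back(V);
      Result.MaxStackDepth = std::max(Result.MaxStackDepth, Stack.size());
      return;
    }

    // V is a root and is never pushed. Every parked node with a larger
    // DFS index belongs to V's component.
    int Id = Result.NumComponents++;
    Result.Component[V] = Id;
    while (!Stack.empty() && Index[Stack.back()] > Index[V]) {
      Result.Component[Stack.back()] = Id;
      Stack.pop_back();
    }
  }

  const std::vector<std::vector<int>> &Adj;
  std::vector<int> Index; // DFS number, -1 if unvisited
  std::vector<int> Root;
  std::vector<int> Stack; // only non-root nodes of open components
  SCCResult Result;
  int NextIndex = 0;
};
```

On a pure DAG the stack in this sketch stays empty for the whole walk, whereas plain Tarjan would push and pop every node once. On a graph with a 3-node cycle feeding a chain, only the two non-root cycle members are ever parked.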

https://reviews.llvm.org/D29381