[PATCH] D29381: [PM/LCG] Remove the lazy RefSCC formation from the LazyCallGraph during iteration.

Sun Feb 5 13:28:17 PST 2017

davide added inline comments.

================
Comment at: lib/Analysis/LazyCallGraph.cpp:1690
           // node onto the stack.
           DFSStack.push_back({N, I});

----------------
dberlin wrote:
> davide wrote:
> > dberlin wrote:
> > > Just to note, you don't actually have to push the stack here.
> > > 
> > > This will push every node onto the stack.
> > > It's possible to push only the root nodes onto the stack.
> > > 
> > > This is more memory efficient, and about 25% faster.
> > > 
> > > This is Nuutila's variant, which is easy to implement:
> > > 
> > > http://www.cse.tkk.fi/~enu/ps/ipl-scc.ps
> > > or
> > > http://www.cse.tkk.fi/~enu/ps/tko-b94-scc.ps
> > > or 
> > > the Tarjan SCC part of https://reviews.llvm.org/D28934
> > > 
> > > (For the papers, you want algorithm 1, not 2)
> > > 
> > My understanding of Nuutila's variant is that it's faster if the graph is sparse, and when I proposed this to Chandler he mentioned the graph may be not that sparse (for some definition of sparse). I have never found the SCC computation to be a bottleneck (even for large programs), but I like Nuutila's variant, so I wouldn't be opposed to it (and I think it's worth a try).
> It's either faster or the same in every case, and given it's switching like two lines, IMHO, I'd do it, but ¯\_(ツ)_/¯
> 
> The real speed difference is unrelated to whether the graph is sparse or not. What it's really related to is what happens on trees/DAGs.
> 
> Tarjan's algorithm always pushes.
> Nuutila's variant only pushes if the node is actually part of a cycle.
> So nodes that are part of trees/DAGs will not end up on the stack.
> 
SGTM =)
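
For reference, here is a minimal standalone sketch of the idea as I understand it from Algorithm 1 in the papers linked above. It is not the LazyCallGraph code: the names (NuutilaSCC, SCCResult, Pushes) are made up for illustration, the graph is a plain adjacency list, and the DFS is recursive for brevity rather than using an explicit DFSStack. A node goes onto the component stack only when it finishes without being the root of its own component, so nodes in trees/DAGs are never pushed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical result type for this sketch (not an LLVM type).
struct SCCResult {
  std::vector<int> Component; // component id per node, -1 until assigned
  int NumComponents = 0;
  std::size_t Pushes = 0;     // how many nodes were ever pushed on the stack
};

// Sketch of Nuutila's first variant of Tarjan's SCC algorithm.
class NuutilaSCC {
public:
  explicit NuutilaSCC(const std::vector<std::vector<int>> &Adj)
      : Adj(Adj), Index(Adj.size(), -1), Root(Adj.size(), -1) {
    Result.Component.assign(Adj.size(), -1);
  }

  SCCResult run() {
    for (int V = 0; V < static_cast<int>(Adj.size()); ++V)
      if (Index[V] < 0)
        visit(V);
    return Result;
  }

private:
  void visit(int V) {
    Index[V] = NextIndex++;
    Root[V] = V; // candidate root: V itself
    for (int W : Adj[V]) {
      if (Index[W] < 0)
        visit(W);
      // Only successors not yet assigned to a component can pull V's
      // candidate root back to an earlier node in DFS order.
      if (Result.Component[W] < 0 && Index[Root[W]] < Index[Root[V]])
        Root[V] = Root[W];
    }
    if (Root[V] == V) {
      // V is the root of a component. Everything on the stack visited
      // after V belongs to it. For a node in a tree/DAG the loop body
      // never runs and nothing was pushed on its behalf.
      int Id = Result.NumComponents++;
      Result.Component[V] = Id;
      while (!Stack.empty() && Index[Stack.back()] > Index[V]) {
        Result.Component[Stack.back()] = Id;
        Stack.pop_back();
      }
    } else {
      // V is a non-root member of some cycle: only now is it pushed.
      Stack.push_back(V);
      ++Result.Pushes;
    }
  }

  const std::vector<std::vector<int>> &Adj;
  std::vector<int> Index; // DFS visit order, -1 if unvisited
  std::vector<int> Root;  // current candidate root per node
  std::vector<int> Stack; // pending non-root nodes
  int NextIndex = 0;
  SCCResult Result;
};
```

With the adjacency list {{1}, {2}, {0, 3}, {}} (a 3-cycle plus one leaf), NuutilaSCC(Adj).run() reports two components and two pushes. With the DAG {{1, 2}, {2}, {3}, {}} it reports four components and zero pushes, whereas plain Tarjan would push all four nodes.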

https://reviews.llvm.org/D29381