[PATCH] D21464: [PM] WIP: Introduce basic update capabilities to the new PM's CGSCC pass manager, including both plumbing and logic to handle function pass updates.

Daniel Berlin via llvm-commits llvm-commits at lists.llvm.org
Fri Jul 1 00:44:40 PDT 2016


>> whereas if you do it the other way around, you don't have to do anything
>> special other than control the visitation order and actually do collapsing
>> as you discover things  - it's otherwise the standard SCC algorithm
>>
>
> Right -- this will be more efficient.
>
>
Yes, you can even prove it to be correct pretty easily.

It's easy to see that if you run the SCC algorithm on the non-ref edges,
form a condensation graph from those SCCs that carries both the non-ref and
ref edges, and then run the SCC algorithm on that condensation graph
visiting both edge kinds, you will have a graph of maximal RefSCCs with
maximal non-ref SCCs embedded in them.

If you stare at Tarjan's algorithm (
https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm),
you can see that at the point where you pop nodes off the stack, you are done
with that SCC; you will never visit it again. You are, at that point,
guaranteed to have a maximal non-ref SCC.
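To make that concrete, here's a rough Python sketch of the two-pass version (toy adjacency-dict graphs and names of my own invention, not LLVM's LazyCallGraph; the pop point in `tarjan_sccs` is exactly the point above where a maximal SCC is finished):

```python
# Hedged sketch: two-pass RefSCC construction over a toy graph
# representation (dicts mapping node -> set of successors), not LLVM's
# actual LazyCallGraph.

def tarjan_sccs(nodes, succ):
    """Tarjan's algorithm; returns the list of maximal SCCs."""
    index, lowlink, on_stack = {}, {}, set()
    stack, sccs, counter = [], [], [0]

    def dfs(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in succ(v):
            if w not in index:
                dfs(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            # Pop point: everything from v up is one finished, maximal SCC;
            # none of these nodes is ever visited again.
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(frozenset(scc))

    for v in nodes:
        if v not in index:
            dfs(v)
    return sccs

def ref_sccs(nodes, call_edges, ref_edges):
    # Pass 1: maximal SCCs over the non-ref (call) edges only.
    call_sccs = tarjan_sccs(nodes, lambda v: call_edges.get(v, ()))
    comp = {v: s for s in call_sccs for v in s}
    # Pass 2: SCCs of the condensation, following both edge kinds.
    def cond_succ(s):
        return {comp[w]
                for v in s
                for w in set(call_edges.get(v, ())) | set(ref_edges.get(v, ()))
                if comp[w] is not s}
    return tarjan_sccs(call_sccs, cond_succ)
```

E.g. with call cycles a<->b and c<->d plus ref edges b->c and d->a, the two call SCCs end up embedded in a single RefSCC.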

So start by running that algorithm on the non-ref edges, and, after you
form a given SCC, condense those nodes into a single node carrying all
non-self-pointing ref and non-ref edges (it's easier to think of it this
way than to reason about all the nodes separately), push that new
node as a root onto a ref-stack, and go visit the ref-edges for that
node (with the same set of rules as we used for non-ref edges).

Since it still always processes non-ref edges before ref-edges, and is
depth-first, it will always form maximal non-ref SCCs *before* putting them
onto the ref-stack.

If you maintain a duplicate ref-stack, ref-lowlink, etc., you can see it will
still do the right thing.
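For concreteness, here's a hedged sketch of that duplicated-bookkeeping variant, again over toy adjacency dicts rather than LLVM's LazyCallGraph: call SCCs are formed lazily on demand, and a second Tarjan instance with its own ref-index/ref-lowlink/ref-stack runs over the condensed nodes, following both edge kinds:

```python
# Hedged sketch of the "duplicate bookkeeping" variant. All names are mine.
# Inner Tarjan forms call SCCs lazily; outer Tarjan keeps its own
# (duplicated) index/lowlink/stack over the condensed nodes.

def build_ref_sccs(nodes, call_succ, ref_succ):
    # --- inner Tarjan over call (non-ref) edges, run lazily ---
    comp = {}                      # original node -> its call SCC
    c_index, c_low, c_on = {}, {}, set()
    c_stack, counter = [], [0]

    def call_dfs(v):
        c_index[v] = c_low[v] = counter[0]; counter[0] += 1
        c_stack.append(v); c_on.add(v)
        for w in call_succ.get(v, ()):
            if w not in c_index:
                call_dfs(w)
                c_low[v] = min(c_low[v], c_low[w])
            elif w in c_on:
                c_low[v] = min(c_low[v], c_index[w])
        if c_low[v] == c_index[v]:
            scc = []
            while True:
                w = c_stack.pop(); c_on.discard(w); scc.append(w)
                if w == v:
                    break
            s = frozenset(scc)
            for w in s:
                comp[w] = s

    def call_scc_of(v):            # form v's call SCC on first demand
        if v not in comp:
            call_dfs(v)
        return comp[v]

    # --- outer Tarjan with duplicate ref_index/ref_lowlink/ref_stack ---
    r_index, r_low, r_on = {}, {}, set()
    r_stack, out, r_count = [], [], [0]

    def succs(s):
        # Condensed successors, following both call and ref edges.
        targets = set()
        for v in s:
            for w in list(call_succ.get(v, ())) + list(ref_succ.get(v, ())):
                t = call_scc_of(w)
                if t is not s:
                    targets.add(t)
        return targets

    def ref_dfs(s):
        r_index[s] = r_low[s] = r_count[0]; r_count[0] += 1
        r_stack.append(s); r_on.add(s)
        for t in succs(s):
            if t not in r_index:
                ref_dfs(t)
                r_low[s] = min(r_low[s], r_low[t])
            elif t in r_on:
                r_low[s] = min(r_low[s], r_index[t])
        if r_low[s] == r_index[s]:
            # Pop point of the *ref* stack: one maximal RefSCC is done.
            group = []
            while True:
                t = r_stack.pop(); r_on.discard(t); group.append(t)
                if t == s:
                    break
            out.append(frozenset(group))

    for v in nodes:
        s = call_scc_of(v)
        if s not in r_index:
            ref_dfs(s)
    return out
```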

The tricky part is trying to do it without maintaining a duplicate lowlink,
etc.

I'm too lazy to prove it's possible in this message, but trust me, it is
possible to not duplicate most of it :)

Note that there are better algorithms nowadays.



>
>> Perhaps i am missing something, however.  I will stare at the patch.
>>
>
>
> Looks like what complicates the matter is that the SCC formation is done
> on the fly while the graph is traversed/iterated (lazily formed).
>


This is standard; Tarjan's algorithm works just fine in that case.


> In order to ensure the visit order, the RefSCCs are formed first.
>

In the algorithm above, the RefSCCs also come out in bottom-up topological
order, because a given RefSCC is only finished after all of its successor
RefSCCs have been formed.

You just have to track the relative orders.
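A quick sanity check of that ordering claim, using a plain recursive Tarjan (a toy sketch, not LLVM's implementation): SCCs pop off the stack successors-first, i.e., callees before callers, which is exactly the bottom-up order a CGSCC manager wants.

```python
# Toy check that Tarjan emits SCCs bottom-up: each SCC is popped only after
# every SCC it points into has already been popped.

def tarjan_sccs(succ):
    index, lowlink, on_stack = {}, {}, set()
    stack, sccs, counter = [], [], [0]

    def dfs(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in succ.get(v, ()):
            if w not in index:
                dfs(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(frozenset(scc))

    for v in list(succ):
        if v not in index:
            dfs(v)
    return sccs

# Chain of SCCs: {a,b} -> {c,d} -> {e}; emission order is leaf-first.
edges = {'a': ['b'], 'b': ['a', 'c'], 'c': ['d'], 'd': ['c', 'e'], 'e': []}
order = tarjan_sccs(edges)
# order == [{'e'}, {'c','d'}, {'a','b'}]
```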

Note that you can even do it multi-core in quasi-linear time and get
serious speedups.
http://conf.researchr.org/home/PPoPP-2016

(Search for "Multi-core on-the-fly SCC decomposition". The conference is
open access, so you will have access to the paper, but if I direct-link it,
it doesn't work.)




> Thanks for the reference! I have not read it in detail, but just browsing
> through the algorithm, if it is employed to do dynamic cycle update and
> node reordering -- there does not seem to be need for Ref edges and Ref SCC
> at all.  Looking at the example in Figure 2. Reverse top-order of the
> callgraph is first formed. The SCC pass manager processes 'j', and then
> node 'v'. After processing 'v', a new call edge is discovered to node 'w',
> incremental update is applied, and 'v' 's analysis result is invalidated
> and will be processed later. The pass manager will then process 'h', 'i',
> 'f'
>
>
Be careful: all of these incremental algorithms are O(N^2).

