[PATCH] D21464: [PM] WIP: Introduce basic update capabilities to the new PM's CGSCC pass manager, including both plumbing and logic to handle function pass updates.

Thu Jun 30 22:40:14 PDT 2016

(clicked sent by accident)

On Thu, Jun 30, 2016 at 10:38 PM, Xinliang David Li <davidxl at google.com>
wrote:

>
>
> On Thu, Jun 30, 2016 at 9:43 PM, Daniel Berlin <dberlin at dberlin.org>
> wrote:
>
>>
>>
>> On Thu, Jun 30, 2016 at 9:13 PM, Xinliang David Li <davidxl at google.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Jun 30, 2016 at 6:36 PM, Daniel Berlin <dberlin at dberlin.org>
>>> wrote:
>>>
>>>>
>>>>> 4. The patch introduces a new SCC formation algorithm (double layer
>>>>> with support for CG mutation on the fly), however design document on how
>>>>> this works is missing.  It needs to document
>>>>>     * What exactly is expected (in terms of vistation order) when a)
>>>>> an edge is added; b) an edge is deleted; c) when a new node is introduced
>>>>> and connected.
>>>>>     * How ref-edges are formed, how indirect callsites are handled etc.
>>>>>     * If it is a modification of the classic SCC formation algorithm,
>>>>> describe the change and prove that the algorithm works as expected.
>>>>>
>>>>>
>>>> FWIW: I agree that for algorithms this complex, where we have no other
>>>> reference for how they are supposed to work, and no way to know what
>>>> invariants make them correct or not (other than reading code), we should
>>>> have some design doc.
>>>>
>>>> In this case, it should not be hard to prove that you can validly form
>>>> both ref and non-ref SCC's at once:
>>>>
>>>> If you were to always follow non-ref edges first, and eagerly collapse
>>>> non-ref SCC's, you can easily prove it will discover the maximal set of
>>>> non-ref sccs (because it's depth first, if there was a cycle that could be
>>>> formed with non-ref edges, it will form it), and each will collapse to a
>>>> node in the ref SCC graph which will then form ref-scc's on top of it by
>>>> following ref-edges.
>>>>
>>>> If you do not follow non-ref edges first, you will get into situations
>>>> where it could have formed a non-ref SCC but formed a ref SCC instead, and
>>>> the non-ref SCC can be non-maximal.
>>>>
>>>
>>> Chandler's RefSCC formation considers both ref and non-ref edges,
>>>
>>
>> Yes, that's a RefSCC, i'm aware :)
>>
>> If you want maximal non-ref SCC's embedded in the ref-SCC graph, it seems
>> possible but a lot more complicated to discover the larger Ref-SCC and then
>> try to discover the smaller non-ref SCC's embedded in it.  It's actually
>> very natural to do the reverse, and discover the non-ref scc's and then the
>> larger ref-scc's
>>
>> a Ref SCC's definition is actually maximal SCC with 'mixed' edges.
>>>
>>
>> Yes, i'm aware.
>>
>>
>>> non-ref SCCs are then formed from a RefSCC.
>>>
>>
>> This seems .... non-optimal, and complicated, IMHO.  But i haven't looked
>> at the patch.  It's definitely harder to prove correct. Among other things,
>> while you are guaranteed the nodes forming a regular SCC are a subset of
>> your refscc, it would seem very tricky to do it without re-exploring those
>> nodes in the ref-scc to find the regular sccs embedded in it,
>>
>
> Right.
>
>
>> whereas if you do it the other way around, you don't have to do anything
>> special other than control the visitation order and actually do collapsing
>> as you discover things  - it's otherwise the standard SCC algorithm
>>
>
> Right -- this will be more efficient.
>
>
>
>> Perhaps i am missing something, however.  I will stare at the patch.
>>
>
>
> Looks like what complicates the matter is that the SCC formation is done
> on the fly while the graph is traversed/iterated (lazily formed). In order
> to ensure the visit order, the RefSCCs are formed first.
>
> However there will no such constraints if the visitation order is
> pre-calculated and cached -- what you suggested will be great for it.
>
>
>>
>>
>>> This part seems ok, I think. However when the CG is mutated, what is the
>>> definition of 'current' SCC and current RefSCC? What is the incremental
>>> update algorithm?
>>>
>>
>> The incremental update algorithm for the way i described it can be done
>> in a variety of ways, and is pretty easy to understand.
>>
>> I'm not sure about incremental update algorithms that try to discover the
>> ref-changes then the non-ref changes inside of it.
>> As I said, that seems more complicated as it's not a simple graph
>> embedding.
>>
>> It can be done in O(m^1/2) time per arc addition, or O(m^5/2) for a batch
>> of arc additions
>>
>> (basically, O(N^2))
>>
>> see, e.g., http://www.cs.princeton.edu/~sssix/papers/dto-journal.pdf
>>
>>
> Thanks for the reference! I have not read it in details, but just browsing
> through the algorithm, if it is employed to do dynamic cycle update and
> node reordering -- there does not seem to be need for Ref edges and Ref SCC
> at all.  Looking at the example in Figure 2. Reverse top-order of the
> callgraph is first formed. The SCC pass manager processes 'j', and then
> node 'v'. After processing 'v', a new call edge is discovered to node 'w',
> incremental update is applied, and 'v' 's analysis result is invalidated
> and will be processed later. The pass manager will then process 'h', 'i',
> 'f'
>

'f', 'c', 'w', and then 'v' is revisited again ...

It looks like it will work nicely.

David

>
>
>
>
>>
>>
>>
>>> Just some random thoughts: can we assume ref-edges never gets deleted or
>>> added? Can we assume any newly created/exposed call (non-ref) edges always
>>> have corresponding ref edges?  If the above conditions are true, then the
>>> refSCC DAG (DAG with collapsed refSCCs as nodes) can not be mutated. There
>>> is another condition that needs to be met in order for refSCC DAG become
>>> 'immutable': a ref edge needs to be introduced for any back edge in CG (if
>>> it is not already a ref edge). This means no new refSCC can appear via
>>> splitting or merging when CG is changed.
>>>
>>> If we can guarantee RefSCC DAG is non-mutable,  the incremental update
>>> may be simplified: the SCC update is now guaranteed to be 'intra'-RefSCC
>>> (the current one). Rebuilding SCC within a refSCC could be cheap enough
>>> (assuming refSCCs are not large).
>>>
>>> David
>>>
>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160630/1ae7f9d1/attachment.html>