[llvm-dev] Intended behavior of CGSCC pass manager.

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Fri Jun 17 00:49:22 PDT 2016


On Fri, Jun 17, 2016 at 12:27 AM, Sean Silva <chisophugis at gmail.com> wrote:

>
>
> On Fri, Jun 17, 2016 at 12:10 AM, Xinliang David Li <davidxl at google.com>
> wrote:
>
>>
>>
>> On Thu, Jun 16, 2016 at 11:51 PM, Sean Silva <chisophugis at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Jun 16, 2016 at 11:07 PM, Xinliang David Li <davidxl at google.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Jun 16, 2016 at 10:47 PM, Sean Silva <chisophugis at gmail.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Thu, Jun 16, 2016 at 9:53 PM, Xinliang David Li <davidxl at google.com
>>>>> > wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 16, 2016 at 11:12 AM, Sanjoy Das <
>>>>>> sanjoy at playingwithpointers.com> wrote:
>>>>>>
>>>>>>> Hi Sean,
>>>>>>>
>>>>>>> On Thu, Jun 16, 2016 at 4:48 AM, Sean Silva via llvm-dev
>>>>>>> <llvm-dev at lists.llvm.org> wrote:
>>>>>>> > One question is what invariants we want to provide for the
>>>>>>> visitation.
>>>>>>> >
>>>>>>> > For example, should a CGSCC pass be able to assume that all
>>>>>>> "child" SCC's
>>>>>>> > (SCC's it can reach via direct calls emanating from the SCC being
>>>>>>> visited)
>>>>>>> > have already been visited? Currently I don't think it can, and
>>>>>>> IIRC from the
>>>>>>> > discussion at the social this is one thing that Chandler is hoping
>>>>>>> to fix.
>>>>>>> > The "ref edge" notion in LazyCallGraph ensures that we cannot
>>>>>>> create a call
>>>>>>> > edge (devirtualizing a ref edge) that will point at an SCC that
>>>>>>> has not yet
>>>>>>> > been visited.
>>>>>>> >
>>>>>>> > E.g. consider this graph:
>>>>>>> >
>>>>>>> > digraph G {
>>>>>>> >   A -> B; B -> A; // SCC {A,B}
>>>>>>> >   S -> T; T -> S; // SCC {S,T}
>>>>>>> >   X -> Y; Y -> X; // SCC {X,Y}
>>>>>>> >
>>>>>>> >   B -> X;
>>>>>>> >   B -> S;
>>>>>>> >   T -> Y [label="Ref edge that is devirtualized\nwhen visiting SCC
>>>>>>> > {S,T}",constraint=false,style=dashed]
>>>>>>> > }
>>>>>>> >
>>>>>>> > (can visualize conveniently at http://www.webgraphviz.com/ or I
>>>>>>> have put an
>>>>>>> > image at http://reviews.llvm.org/F2073104)
>>>>>>> >
>>>>>>> > If we do not consider the ref graph, then it is possible for SCC
>>>>>>> {S,T} to be
>>>>>>>
>>>>>>> I'm not sure why you wouldn't consider the ref graph?  I think the
>>>>>>> general idea is to visit RefSCCs in bottom up order, and when
>>>>>>> visiting
>>>>>>> a RefSCC, visiting the SCC's inside the RefSCC in bottom up order.
>>>>>>>
>>>>>>> So in your example, given the edges you've shown, we will visit {X,Y}
>>>>>>> before visiting {S,T}.
>>>>>>>
>>>>>>> > A more complicated case is when SCC {S,T} and SCC {X,Y} both call
>>>>>>> into each
>>>>>>> > other via function pointers. So eventually after devirtualizing
>>>>>>> the calls in
>>>>>>> > both directions there will be a single SCC {S,T,X,Y}.
>>>>>>> >
>>>>>>> > digraph G {
>>>>>>> >   A -> B; B -> A; // SCC {A,B}
>>>>>>> >   S -> T; T -> S; // SCC {S,T}
>>>>>>> >   X -> Y; Y -> X; // SCC {X,Y}
>>>>>>> >
>>>>>>> >   B -> X;
>>>>>>> >   B -> S;
>>>>>>> >   T -> Y [label="Ref edge that is devirtualized\nwhen visiting SCC
>>>>>>> > {S,T}",constraint=false,style=dashed]
>>>>>>> >   X -> S [label="Ref edge that is devirtualized\nwhen visiting SCC
>>>>>>> > {X,Y}",constraint=false,style=dashed]
>>>>>>> > }
>>>>>>> >
>>>>>>> > (rendering at: http://reviews.llvm.org/F2073479)
>>>>>>> >
>>>>>>> > Due to the cyclic dependence there is no SCC visitation order that
>>>>>>> can
>>>>>>> > directly provide the invariant above. Indeed, the problem of
>>>>>>> maintaining the
>>>>>>>
>>>>>>> I think the workflow in the above will (roughly) be:
>>>>>>>
>>>>>>>  Visit the RefSCC {X,Y,S,T}
>>>>>>>
>>>>>>
>>>>>> Are we sure RefSCC has ref edges between {X, Y} and {S, T} in this
>>>>>> case?  I may miss the code handling it.
>>>>>>
>>>>>
>>>>> The ref graph is conservative and so it would have the appropriate
>>>>> edges.
>>>>>
>>>>> https://github.com/llvm-project/llvm-project/blob/master/llvm/lib/Analysis/LazyCallGraph.cpp#L90
>>>>>
>>>>
>>>>
>>>> That is where the confusion is -- the findReferences's function body
>>>> only handles constant operands which may be function addresses.  Did I miss
>>>> something obvious?
>>>>
>>>
>>> It is somewhat subtle. There are 3 potential meanings of "call graph" in
>>> this thread:
>>> 1. The graph of *direct calls* in the current module. (this is mutated
>>> during optimization)
>>> 2. A supergraph of 1., conservatively chosen such 1. remains a subgraph
>>> under arbitrary semantics-preserving function transformations (note: 2. and
>>> 1. must be updated in sync during deletion of functions).
>>> 3. The conservative graph representing all edges which may exist *at
>>> runtime*.
>>>
>>> LazyCallGraph models 1. (the "call graph") and 2. (the "ref graph"). It
>>> does not model 3.
>>> The search for constants in findReferences guarantees that we find (a
>>> conserative superset of) all call destinations addresses that
>>> transformations may add direct calls (and hence update 1.).
>>>
>>> The existing CallGraph data structure (used by e.g. the old PM CGSCC
>>> visitation) only models 1.
>>>
>>>
>>>
>>>>   Another point is that it may not be practical to model edges to
>>>> indirect targets. For virtual calls, without CHA, each virtual callsite
>>>> will end up referencing all potential address taken functions.
>>>>
>>>
>>> In this statement, you are thinking in terms of 3.
>>> Note that in both 1. and 2. the indirect calls are modeled as calling an
>>> external dummy node. This dummy node is *distinct* from a second external
>>> dummy node that calls all address-taken functions. (hence the indirect
>>> calls do not end up forming a RefSCC with all address taken functions).
>>>
>>
>>
>> For C++ programs, most ref edges are probably from ctors and dtors that
>> reference vtable -- they will have large fan-out of ref edges to virtual
>> member functions they are unlikely to call.  In other words, such ref edges
>> mostly do not represent real caller->callee relationship, so I wonder what
>> is the advantage of forming refSCC from SCCs?
>>
>
>
> I believe it is primarily used for ordering the visitation of CallSCC's
> (i.e. SCC's in the "call graph").
>

This is what it can do -- but what benefit does it provide?

David



>
> -- Sean Silva
>
>
>>
>> David
>>
>>
>>
>>> Note that `opt -dot-callgraph` can produce a graphviz file for the old
>>> PM callgraph. I'm not sure if we have one for LazyCallGraph (which has both
>>> ref graph and call graph); I'll post a patch adding one if not.
>>>
>>> -- Sean Silva
>>>
>>>
>>>
>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>    Visit the SCC {X,Y} // arbitrary
>>>>>>>      Optimize({X,Y})
>>>>>>>      // Now there's an edge to {S,T}, invalidate
>>>>>>>      // the analyses cached for {X,Y} and visit {S,T}
>>>>>>>
>>>>>>
>>>>>> I am not sure if this makes sense.   If dynamically, the call edge
>>>>>> from {X, Y} to {S, T} does exist, but not discovered by the analysis, then
>>>>>> the cached {X, Y} will still be invalid, but who is going to invalidate it?
>>>>>>
>>>>>
>>>>> I assume that if dynamically there was a call from {X,Y} to {S,T},
>>>>> then the analysis would have observed an indirect call and would have
>>>>> behaved conservatively.
>>>>>
>>>>
>>>> See above, I did not see how indirect call is handled. Also if the
>>>> result for {X, Y} is conservatively correct before the direct call edge is
>>>> discovered, why bother invalidating its analysis when the call edge is
>>>> exposed?
>>>>
>>>> David
>>>>
>>>>
>>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>
>>>>>>
>>>>>> David
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>    Visit the SCC {S,T}
>>>>>>>      Optimize({S,T})
>>>>>>>      // Now {X,Y,S,T} collapses to form a single SCC
>>>>>>>    Visit the SCC {S,T,X,Y}
>>>>>>>      Optimize({S,T,X,Y})
>>>>>>>
>>>>>>> The difficult bit is to make the inner "// Now.*" bits work well.
>>>>>>>
>>>>>>> -- Sanjoy
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160617/3fe66e8d/attachment.html>


More information about the llvm-dev mailing list