<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 16, 2016 at 11:12 AM, Sanjoy Das <span dir="ltr"><<a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpointers.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hi Sean,<br>

<span class=""><br>

On Thu, Jun 16, 2016 at 4:48 AM, Sean Silva via llvm-dev<br>

<<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>

> One question is what invariants we want to provide for the visitation.<br>

><br>

> For example, should a CGSCC pass be able to assume that all "child" SCC's<br>

> (SCC's it can reach via direct calls emanating from the SCC being visited)<br>

> have already been visited? Currently I don't think it can, and IIRC from the<br>

> discussion at the social this is one thing that Chandler is hoping to fix.<br>

> The "ref edge" notion in LazyCallGraph ensures that we cannot create a call<br>

> edge (devirtualizing a ref edge) that will point at an SCC that has not yet<br>

> been visited.<br>

><br>

> E.g. consider this graph:<br>

><br>

> digraph G {<br>

>   A -> B; B -> A; // SCC {A,B}<br>

>   S -> T; T -> S; // SCC {S,T}<br>

>   X -> Y; Y -> X; // SCC {X,Y}<br>

><br>

>   B -> X;<br>

>   B -> S;<br>

>   T -> Y [label="Ref edge that is devirtualized\nwhen visiting SCC<br>

> {S,T}",constraint=false,style=dashed]<br>

> }<br>

><br>

> (can visualize conveniently at <a href="http://www.webgraphviz.com/" rel="noreferrer" target="_blank">http://www.webgraphviz.com/</a> or I have put an<br>

> image at <a href="http://reviews.llvm.org/F2073104" rel="noreferrer" target="_blank">http://reviews.llvm.org/F2073104</a>)<br>

><br>

> If we do not consider the ref graph, then it is possible for SCC {S,T} to be<br>

<br>

</span>I'm not sure why you wouldn't consider the ref graph?</blockquote><div><br></div><div>The simple answer is that this is the current state of things. The SCC visitation logic in the old PM does not consider the ref graph.</div><div><br></div><div>So in some sense the question is why *should* we consider the ref graph? What is it buying us? Presumably this will take the form of some invariant on the `run(SCC &)` calls. But I have yet to see any explicit statement of an invariant that it gives us.</div><div><br></div><div>For example, the examples I gave show that without bailing out in the middle of a cgscc pass manager (e.g. after the `<span style="font-size:12.8px">function(...</span><span style="font-size:12.8px">simplifications that can devirtualize...)`) then we cannot even guarantee that the thing passed to the </span>`run(SCC &)`<span style="font-size:12.8px"> function is actually an SCC.</span></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">  I think the<br>

general idea is to visit RefSCCs in bottom up order, and when visiting<br>

a RefSCC, visiting the SCC's inside the RefSCC in bottom up order.<br>

<br>

So in your example, given the edges you've shown, we will visit {X,Y}<br>

before visiting {S,T}.<br>

<span class=""><br>

> A more complicated case is when SCC {S,T} and SCC {X,Y} both call into each<br>

> other via function pointers. So eventually after devirtualizing the calls in<br>

> both directions there will be a single SCC {S,T,X,Y}.<br>

><br>

> digraph G {<br>

>   A -> B; B -> A; // SCC {A,B}<br>

>   S -> T; T -> S; // SCC {S,T}<br>

>   X -> Y; Y -> X; // SCC {X,Y}<br>

><br>

>   B -> X;<br>

>   B -> S;<br>

>   T -> Y [label="Ref edge that is devirtualized\nwhen visiting SCC<br>

> {S,T}",constraint=false,style=dashed]<br>

>   X -> S [label="Ref edge that is devirtualized\nwhen visiting SCC<br>

> {X,Y}",constraint=false,style=dashed]<br>

> }<br>

><br>

> (rendering at: <a href="http://reviews.llvm.org/F2073479" rel="noreferrer" target="_blank">http://reviews.llvm.org/F2073479</a>)<br>

><br>

> Due to the cyclic dependence there is no SCC visitation order that can<br>

> directly provide the invariant above. Indeed, the problem of maintaining the<br>

<br>

</span>I think the workflow in the above will (roughly) be:<br>

<br>

 Visit the RefSCC {X,Y,S,T}<br>

   Visit the SCC {X,Y} // arbitrary<br>

     Optimize({X,Y})<br>

     // Now there's an edge to {S,T}, invalidate<br>

     // the analyses cached for {X,Y} and visit {S,T}<br>

   Visit the SCC {S,T}<br>

     Optimize({S,T})<br>

     // Now {X,Y,S,T} collapses to form a single SCC<br>

   Visit the SCC {S,T,X,Y}<br>

     Optimize({S,T,X,Y})<br>

<br>

The difficult bit is to make the inner "// Now.*" bits work well.<br></blockquote><div><br></div><div>But consider that Optimize({S,T}) might be of the form: <span style="font-size:12.8px">`cgscc(function(...</span><span style="font-size:12.8px">simplifications that can devirtualize...),foo-cgscc-</span><span style="font-size:12.8px">pass)`.</span></div><div><span style="font-size:12.8px">After running `</span><span style="font-size:12.8px">function(...</span><span style="font-size:12.8px">simplifications that can devirtualize...)` we would end up running `foo-cgscc-pass` on {S,T} which is no longer an SCC anymore.</span></div><div><span style="font-size:12.8px">What is the invariant here? What do we actually guarantee for the `run(SCC &)` function?</span></div><div><span style="font-size:12.8px"><br></span></div><div><span style="font-size:12.8px">-- Sean Silva</span></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

<span class=""><font color="#888888"><br>

-- Sanjoy<br>

</font></span></blockquote></div><br></div></div>