<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jun 8, 2016 at 11:14 AM, Sanjoy Das <span dir="ltr"><<a href="mailto:sanjoy@playingwithpointers.com" target="_blank">sanjoy@playingwithpointers.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hi Sean,<span class=""><br>

<br>

Sean Silva wrote:<br>

> Hi Chandler, Philip, Mehdi, (and llvm-dev,)<br>

><br>

> (this is partially a summary of some discussions that happened at the last LLVM bay area social, and partially a<br>

> discussion about the direction of the CGSCC pass manager)<br>

<br></span>

Thanks for writing this up!  This sort of thing is very helpful for<br>

people who don't attend the social, or had to leave early. :)<span class=""><br>

<br>

> Chandler described that he had a major breakthrough in that the CGSCC pass manager only had to deal with 3 classes of<br>

> modifications that can occur:<br>

> - a pass may e.g. propagate a load of a function pointer into an indirect call, turning it into an direct call. This<br>

> requires adding an edge in the CG but not in the ref graph.<br>

> - a pass may take a direct call and turn it into an indirect call. This requires removing an edge from the CG, but not<br>

> in the ref graph.<br>

> - a pass may delete a direct call. This removes an edge in the CG and also in the ref graph.<br>

<br></span>

At what granularity are we modeling these things?  E.g. if SimplifyCFG<br>

deletes a basic block, will we remove call edges that start from that<br>

block?<br>

<br>

Note there is fourth kind of modification that isn't modeled here:<br>

devirtualization, in which we transform IR like<br>

<br>

  %fnptr = compute_fnptr(%receiver, i32 <method id>)<br>

  call %fptr(%receiver)<br>

<br>

to (via a pass that understands the semantics of compute_fnptr).<br>

<br>

  call some.specific.Class::method(%receiver)<br>

<br>

<br>

However, at this time I don't think modeling "out of thin air"<br>

devirtualization of this sort is important, since nothing upstream<br>

does things like this (we do have ModulePasses downstream that do<br>

this).<span class=""><br>

<br>

>  From the perspective of the CGSCC pass manager, these operations can affect the SCC structure. Adding an edge might<br>

> merge SCC's and deleting an edge might split SCC's. Chandler mentioned that apparently the issues of splitting and<br>

> merging SCC's within the current infrastructure are actually quite challenging and lead to e.g. iterator invalidation<br>

> issues, and that is what he is working on.<br>

><br>

> (<br>

> The ref graph is important to guide the overall SCC visitation order because it basically represents "the largest graph<br>

> that the CG may turn into due to our static analysis of this module". I.e. no transformation we can statically make in<br>

> the CGSCC passes can ever cause us to need to merge SCC's in the ref graph.<br>

> )<br>

<br></span>

Except in the above "out of thin air" devirtualization case.<br>

<br>

I think cross-function store forwarding can also be problematic:<br>

<br>

void foo(fnptr* bar) {<br>

  if (bar)<br>

    (*bar)();<br>

}<br>

<br>

void baz() { foo(null); }<br>

<br>

void caller() {<br>

  fnptr *t = malloc();<br>

  *t = baz;<br>

  foo(t);<br>

}<br>

<br>

// RefGraph is<br>

//  caller -> baz<br>

//  caller -> foo<br>

//  baz -> foo<br>

<br>

Now the RefSCCs are {foo}, {caller}, {baz} but if we forward the store<br>

to *t in caller into the load in foo (and are smart about the null<br>

check) we get:<br>

<br>

void foo(fnptr* bar) {<br>

  if (bar)<br>

    baz();<br>

}<br>

<br>

void baz() { foo(null); }<br>

<br>

void caller() {<br>

  fnptr *t = malloc();<br>

  *t = baz;<br>

  foo(t);<br>

}<br>

<br>

// RefGraph is<br>

//  foo -> baz<br>

//  baz -> foo<br>

//  caller -> foo<br>

//  caller -> baz<br>

<br>

and now the RefSCCs are {foo, baz}, {caller}<br></blockquote><div><br></div><div>In both of these examples you gave (out-of-thin-air devirtualization and forwarding into a callee) is the contract of a CGSCC pass being violated? (I believe the contract is that a CGSCC pass cannot invalidate the analyses of a different SCC (<a href="https://github.com/llvm-mirror/llvm/blob/master/include/llvm/Analysis/CGSCCPassManager.h#L104">https://github.com/llvm-mirror/llvm/blob/master/include/llvm/Analysis/CGSCCPassManager.h#L104</a>) )</div><div><br></div><div>If not, then Chandler's analysis was missing a case (beyond the 3 cases I mentioned which he had convinced himself were the only possible ones).</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">

<br>

But again, I think this is fine since nothing upstream does this at<br>

this time; and we can cross that bridge when we come to it.<span class=""><br>

<br>

> 2. What is the intended behavior of CGSCC passes when SCC's are split or merged? E.g. a CGSCC pass runs on an SCC (e.g.<br>

> the inliner). Now we run some function passes nested inside the CGSCC pass manager (e.g. to simplify things after<br>

> inlining). Consider:<br>

><br>

> a) These function passes are e.g. now able to devirtualize a call, adding an edge to the CG, forming a larger CG SCC. Do<br>

> you re-run the CGSCC pass (say, the inliner) on this larger SCC?<br>

><br>

> b) These function passes are e.g. able to DCE a call, removing an edge from the CG. This converts, say, a CG SCC which<br>

> is a cycle graph (like a->b->c->a) into a path graph (a->b->c, with no edge back to a). The inliner had already visited<br>

> a, b, and c as a single SCC. Now does it have to re-visit c, then b, then a, as single-node SCC's?<br>

<br></span>

Okay, so this is the same question I wrote above: "At what granularity<br>

are we modeling these things?", phrased more eloquently. :)<span class=""><br>

<br>

> One way that I have found it useful to think about this is in terms of the visitation during Tarjan's SCC algorithm.<br>

> I'll reference the pseudocode in <a href="https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm" rel="noreferrer" target="_blank">https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm</a>.<br>

> Inside the "strongconnect" routine when we have identified an SCC (the true branch of `if (v.lowlink = v.index)` test )<br>

> we can visit stack[v.index:stack.size()] as an SCC. This may or may not invalidate some things on the stack (the<br>

> variable `S` in the pseudocode) and we may need to fix it up (e.g. inlining deleted a function, so we can't have an<br>

> entry on the stack). Then, we can run function passes as we pop individual functions off the stack, but it is easier to<br>

> think about IMO than merging of SCC data structures: if we add edges to the CG then we have to do more DFS on the new<br>

> edges and if we delete edges then the DFS order of the stack gives us certain guarantees.<br>

<br></span>

I'm not sure how this will be easier  E.g. consider<br>

<br>

X -> A<br>

A -> B<br>

B -> C<br>

C -> B<br>

B -> A<br>

<br>

Now you're going to pop A,B,C together, as an SCC; and you start<br>

optimizing them, and the edge from B to A falls out.  Don't you have<br>

the same problem now (i.e. we've removed an edge from an SCC we're<br>

iterating over)?  Perhaps I misunderstood your scheme?</blockquote><div><br></div><div>I wasn't really proposing a specific algorithm, but rather a different way to think about the problem and what semantics we really want (hence the thread title) and are implementable.</div><div><br></div><div>A good example of what I mean is David's example of caching the post-order traversal: thinking in terms of a raw graph algorithm can provide some insights into the actual invariants that are (or can be) maintained. Whereas that example was in terms of just a post-order CG walk, to think about the whole CGSCC pass manager visitation, thinking in terms of something like Tarjan's algorithm seems likely to provide insight about the invariants both on the function order and the SCC order.</div><div><br></div><div>(btw, <a href="http://www.webgraphviz.com/">http://www.webgraphviz.com/</a> helped me visualize your graph; handy!)</div><div><br></div><div>-- Sean Silva</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><span class=""><font color="#888888"><br>

<br>

-- Sanjoy<br>

<br>

</font></span></blockquote></div><br></div></div>