[llvm-dev] Intended behavior of CGSCC pass manager.

Wed Jun 8 11:14:17 PDT 2016

Hi Sean,

Sean Silva wrote:
 > Hi Chandler, Philip, Mehdi, (and llvm-dev,)
 >
 > (this is partially a summary of some discussions that happened at the last LLVM bay area social, and partially a
 > discussion about the direction of the CGSCC pass manager)

Thanks for writing this up!  This sort of thing is very helpful for
people who don't attend the social, or had to leave early. :)

 > Chandler described that he had a major breakthrough in that the CGSCC pass manager only had to deal with 3 classes of
 > modifications that can occur:
 > - a pass may e.g. propagate a load of a function pointer into an indirect call, turning it into an direct call. This
 > requires adding an edge in the CG but not in the ref graph.
 > - a pass may take a direct call and turn it into an indirect call. This requires removing an edge from the CG, but not
 > in the ref graph.
 > - a pass may delete a direct call. This removes an edge in the CG and also in the ref graph.

At what granularity are we modeling these things?  E.g. if SimplifyCFG
deletes a basic block, will we remove call edges that start from that
block?

Note there is fourth kind of modification that isn't modeled here:
devirtualization, in which we transform IR like

   %fnptr = compute_fnptr(%receiver, i32 <method id>)
   call %fptr(%receiver)

to (via a pass that understands the semantics of compute_fnptr).

   call some.specific.Class::method(%receiver)

However, at this time I don't think modeling "out of thin air"
devirtualization of this sort is important, since nothing upstream
does things like this (we do have ModulePasses downstream that do
this).

 >  From the perspective of the CGSCC pass manager, these operations can affect the SCC structure. Adding an edge might
 > merge SCC's and deleting an edge might split SCC's. Chandler mentioned that apparently the issues of splitting and
 > merging SCC's within the current infrastructure are actually quite challenging and lead to e.g. iterator invalidation
 > issues, and that is what he is working on.
 >
 > (
 > The ref graph is important to guide the overall SCC visitation order because it basically represents "the largest graph
 > that the CG may turn into due to our static analysis of this module". I.e. no transformation we can statically make in
 > the CGSCC passes can ever cause us to need to merge SCC's in the ref graph.
 > )

Except in the above "out of thin air" devirtualization case.

I think cross-function store forwarding can also be problematic:

void foo(fnptr* bar) {
   if (bar)
     (*bar)();
}

void baz() { foo(null); }

void caller() {
   fnptr *t = malloc();
   *t = baz;
   foo(t);
}

// RefGraph is
//  caller -> baz
//  caller -> foo
//  baz -> foo

Now the RefSCCs are {foo}, {caller}, {baz} but if we forward the store
to *t in caller into the load in foo (and are smart about the null
check) we get:

void foo(fnptr* bar) {
   if (bar)
     baz();
}

void baz() { foo(null); }

void caller() {
   fnptr *t = malloc();
   *t = baz;
   foo(t);
}

// RefGraph is
//  foo -> baz
//  baz -> foo
//  caller -> foo
//  caller -> baz

and now the RefSCCs are {foo, baz}, {caller}

But again, I think this is fine since nothing upstream does this at
this time; and we can cross that bridge when we come to it.

 > 2. What is the intended behavior of CGSCC passes when SCC's are split or merged? E.g. a CGSCC pass runs on an SCC (e.g.
 > the inliner). Now we run some function passes nested inside the CGSCC pass manager (e.g. to simplify things after
 > inlining). Consider:
 >
 > a) These function passes are e.g. now able to devirtualize a call, adding an edge to the CG, forming a larger CG SCC. Do
 > you re-run the CGSCC pass (say, the inliner) on this larger SCC?
 >
 > b) These function passes are e.g. able to DCE a call, removing an edge from the CG. This converts, say, a CG SCC which
 > is a cycle graph (like a->b->c->a) into a path graph (a->b->c, with no edge back to a). The inliner had already visited
 > a, b, and c as a single SCC. Now does it have to re-visit c, then b, then a, as single-node SCC's?

Okay, so this is the same question I wrote above: "At what granularity
are we modeling these things?", phrased more eloquently. :)

 > One way that I have found it useful to think about this is in terms of the visitation during Tarjan's SCC algorithm.
 > I'll reference the pseudocode in https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm.
 > Inside the "strongconnect" routine when we have identified an SCC (the true branch of `if (v.lowlink = v.index)` test )
 > we can visit stack[v.index:stack.size()] as an SCC. This may or may not invalidate some things on the stack (the
 > variable `S` in the pseudocode) and we may need to fix it up (e.g. inlining deleted a function, so we can't have an
 > entry on the stack). Then, we can run function passes as we pop individual functions off the stack, but it is easier to
 > think about IMO than merging of SCC data structures: if we add edges to the CG then we have to do more DFS on the new
 > edges and if we delete edges then the DFS order of the stack gives us certain guarantees.

I'm not sure how this will be easier  E.g. consider

X -> A
A -> B
B -> C
C -> B
B -> A

Now you're going to pop A,B,C together, as an SCC; and you start
optimizing them, and the edge from B to A falls out.  Don't you have
the same problem now (i.e. we've removed an edge from an SCC we're
iterating over)?  Perhaps I misunderstood your scheme?

-- Sanjoy