[llvm-dev] Intended behavior of CGSCC pass manager.

Sanjoy Das via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 8 17:54:44 PDT 2016


Hi David,

On Wed, Jun 8, 2016 at 4:20 PM, Xinliang David Li <xinliangli at gmail.com> wrote:
> Is it in the category of invalidating the iterator while iterating' which
> feels very wrong to me.  We should avoid going there and find better ways to
> solve the motivating problems (perhaps defining them more clearly first ?).

I'm not a 100% sure of what you meant by that, so I'll try to give a
general answer, and hope that it covers the points you wanted to
raise. :)

In the scheme above I'm not trying to solve iterator invalidation --
I'm trying to solve the following problem: the CGSCC pass manager ran
the inliner and a set of function passes on a function, and they did
something to invalidate the RefSCC we were iterating over[0].  How do
we _continue_ our iteration over this now non-existent RefSCC?

The solution I'm trying to propose is this: The only possibility for
invalidation is that the RefSCC was broken up into a forest of
RefSCC-DAGs[1].  This means if we had a way to get to the leaves of
this forest of RefSCCs, we could restart our iteration from there
(I've tacitly assumed we're interested in a bottom-up SCC order).
This may be difficult in general, but my idea was to "cheat" and
explicitly remember the Ref-SCC-forest a RefSCC was broken down into
when we do that invalidation.  Then once an RefSCC is split, we can
pick up the forest from the original RefSCC* (which is otherwise
useless now), gather the leaves, and re-start our iteration from those
leaves.

This leaves the question of what to do with the SCC DAG nested inside
the RefSCC.  I'm not sure what Chandler has in mind in how much
influence these should have over the iteration order, but if we wanted
to iterate over the SCC-DAG in bottom up order as well (as we iterated
over a single RefSCC), we could have the same scheme to handle
SCC-splits, and a similar scheme to handle SCC-merges (when you merge
an SCC, the SCC that gets cleaned out gets a pointer to the SCC where
all the functions went, and if the SCC you were iterating over gets
such a pointer after running the inliner/FPM you chase that pointer
(possibly multiple times, if more than one SCC was merged) and
re-start iteration over that SCC).

By "incident data structure" I meant that with these additions the
RefSCC or SCC is no longer a "pure" function of the structure of the
module, but has state that is a function of what the pass manager did
which is not ideal.  That is, in theory this isn't significantly
cleaner than the passes reaching out into and changing the CGSCC pass
manager's state, but perhaps we are okay with this kind of design for
practicality's sake?

[0]: One question I don't know the answer to -- how will we detect
that something has removed a call or ref edge?  Will we rescan
functions to verify that edges that we though existed still exist?  Or
will we have a ValueHandles-like scheme?

[1]: As Sean mentioned, by design nothing in the function pass
manaager pipeline could have invalidated the RefSCC by merging it with
other RefSCCs.

-- Sanjoy


More information about the llvm-dev mailing list