[llvm-dev] Intended behavior of CGSCC pass manager.

Mon Jun 20 13:50:14 PDT 2016

Hi David,

Xinliang David Li wrote:
 > [snip]
 >
 > However, in real applications, what I see is the following pattern (for
 > instance, LLVM's Pass)
 >
 > Caller() {
 >      Base *B =  Factory::create(...);
 >      Stash (B);  // store the object in some container to be retrieved later
 >    ...
 > }
 >
 > SomeTask() {
 >
 >     Base *B = findObject(...);
 >     B->vCall(); // do the work
 > }
 >
 > Driver() {
 >       Caller();  // create objects ...
 >       SomeTask();
 > }
 >
 > Setting aside the fact that it is usually much harder to do
 > devirtualization in this case, assume the virtual call in
 > SomeTask can be devirtualized. What we need is that the virtual
 > functions are processed before the SomeTask node, but this is not
 > guaranteed unless we also model the call edge ordering imposed by
 > control flow.

I think the thesis here is you cannot devirtualize the call in
`SomeTask` without also looking at `Caller` [0].  So the flow is:

  - Optimize Caller, SomeTask independently as much as you want
    * Caller -refs-> Factory::create which -refs-> the constructors
      which -refs-> the various implementations of virtual functions
      (based on my current understanding of how C++ vtables are
      lowered); so these implementations should have been simplified by
      the time we look at Caller.

  - Then look at Driver.  Caller and SomeTask are both maximally
    simplified.  We now (presumably) inline Caller and SomeTask,
    devirtualize the B->vCall (as you said: theoretically possible, but
    if findObject etc. are complex then practically maybe not), and now
    inline the maximally simplified devirtualized call targets.
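
To make the ordering concrete, here is a rough sketch of the post-order
walk I have in mind.  This is only a toy model, not the actual pass
manager code: `RefGraph`, `postOrder` and `exampleGraph` are names I
made up, as are `Derived::Derived` and `Derived::vCall` (the original
pseudocode does not name the concrete class).  It also assumes the
graph is acyclic, whereas the real thing collapses cycles into SCCs
first.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical reference graph: node name -> names it calls or references.
using RefGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first post-order: a node is emitted only after everything it
// references has been emitted.  Assumes an acyclic graph.
static void visit(const RefGraph &G, const std::string &N,
                  std::set<std::string> &Seen,
                  std::vector<std::string> &Order) {
  if (!Seen.insert(N).second)
    return; // already visited
  auto It = G.find(N);
  if (It != G.end())
    for (const std::string &Succ : It->second)
      visit(G, Succ, Seen, Order);
  Order.push_back(N);
}

std::vector<std::string> postOrder(const RefGraph &G,
                                   const std::string &Root) {
  std::set<std::string> Seen;
  std::vector<std::string> Order;
  visit(G, Root, Seen, Order);
  return Order;
}

// The example from this thread.  Derived::Derived and Derived::vCall
// are invented stand-ins for the constructor and virtual function
// implementation that the pseudocode leaves unnamed.
RefGraph exampleGraph() {
  return {
      {"Driver", {"Caller", "SomeTask"}},
      {"Caller", {"Factory::create", "Stash"}},
      {"Factory::create", {"Derived::Derived"}},
      {"Derived::Derived", {"Derived::vCall"}}, // ref via the vtable
      {"SomeTask", {"findObject"}},             // vCall target unknown here
  };
}
```

Walking `exampleGraph()` from "Driver" visits Derived::vCall first and
Driver last, with Caller and SomeTask both ahead of Driver, which is
the order the two bullets above rely on.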

 > However, this is enforcing virtual methods to be processed before their
 > object's creators. Are there other simpler ways to achieve the effect
 > (if we have data to justify it)?

Honestly: I'll have to think about it.  It is entirely possible that a
(much?) simpler design will catch 99% (or even better) of the
idiomatic cases, I just don't have a good mental model for what those
cases are.

At this point I'm waiting for Chandler to upload his patch so that we
can have this discussion on the review thread. :)

[0]: This breaks down when we allow "out of thin air"
devirtualizations (I'm stealing this term from memory models, but I
think it is appropriate here :) ), where you look at a call site and
"magically" (i.e. in a way not expressible in terms of "normal"
optimizations like store forwarding, PRE, GVN etc.) are able to
devirtualize the call site.  We do this all the time in Java (we'll
look at the type of the receiver object, look at the current class
hierarchy and directly mandate that a certain call site has to have a
certain target), but the RefSCC call graph does not allow for that.
These kinds of out-of-thin-air devirtualizations will have to be
modeled as ModulePasses, IIUC.
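
For concreteness, the kind of class-hierarchy query I mean looks
roughly like the sketch below.  It is a toy model under my own
assumptions, not how any particular JIT implements it: `ClassInfo`,
`Hierarchy` and `uniqueTarget` are invented names, and it tracks a
single virtual method only.  The point is that the answer depends on
the whole hierarchy rather than on anything reachable from the call
site.

```cpp
#include <map>
#include <set>
#include <string>

// Toy class hierarchy for one virtual method (all names hypothetical).
struct ClassInfo {
  std::string Parent; // empty for a root class
  bool DefinesMethod; // does this class provide its own implementation?
};
using Hierarchy = std::map<std::string, ClassInfo>;

// Is Cls equal to, or a descendant of, Base?
static bool derivesFrom(const Hierarchy &H, std::string Cls,
                        const std::string &Base) {
  while (!Cls.empty()) {
    if (Cls == Base)
      return true;
    auto It = H.find(Cls);
    if (It == H.end())
      return false;
    Cls = It->second.Parent;
  }
  return false;
}

// Which class's implementation runs for a receiver of dynamic type Cls?
// Returns an empty string if no class on the parent chain defines it.
static std::string resolve(const Hierarchy &H, std::string Cls) {
  while (!Cls.empty()) {
    auto It = H.find(Cls);
    if (It == H.end())
      return "";
    if (It->second.DefinesMethod)
      return Cls;
    Cls = It->second.Parent;
  }
  return "";
}

// If every class currently known to derive from StaticType resolves to
// the same implementation, return its defining class; otherwise "".
std::string uniqueTarget(const Hierarchy &H, const std::string &StaticType) {
  std::set<std::string> Targets;
  for (const auto &Entry : H)
    if (derivesFrom(H, Entry.first, StaticType))
      Targets.insert(resolve(H, Entry.first));
  if (Targets.size() == 1 && !Targets.begin()->empty())
    return *Targets.begin();
  return "";
}
```

Note that adding one overriding subclass to the hierarchy can
invalidate an answer that was previously unique, which is why the
result only holds for the "current" class hierarchy.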

-- Sanjoy