[LLVMdev] Implementing devirtualization

John McCall rjmccall at apple.com
Sun Dec 11 03:18:56 PST 2011

On Dec 10, 2011, at 12:26 PM, Nick Lewycky wrote:
> John McCall wrote:
>> On Dec 8, 2011, at 10:03 PM, Nick Lewycky wrote:
>>> Noalias returns, nocapture, SCC refinement, linkonce_odr and
>>> available_externally were added with the goal of making devirtualization
>>> in LLVM happen, but as orthogonal independent optimizations. I think
>>> LLVM should continue with this design. If you want to implement a single
>>> substantial optimization pass, I would suggest FSIPSCCP as the largest
>>> thing you should write.
>> This is a lot of work that is going to be completely foiled by the presence
>> of almost any opaque call at all.
> Yes, but it's still useful.

Sure, there are generally applications for any general optimization you can suggest.  I'm just saying that FSIPSCCP is not really a very compelling way to do devirtualization.

> Also, anything based on knowing the type hierarchy could be foiled by new derivations in other translation units, or that show up with dlopen.

I am not proposing anything that requires full-program knowledge of class hierarchies.  If that's the idea, we are going to actually have to have full-program knowledge somehow, which I don't remember being one of the many, many appositions in FSIPSCCP, either. :)

Both of our proposals obviously only work when we can statically see the construction point of an object in some way.  However, using a generic memory optimization would require us to be able to see both the actual store to the vtable field and the entire intervening history of that memory to verify that there are no subsequent stores.  That analysis is likely prohibitively expensive even where possible, and it will frequently *not* be possible:
  Example #1:  I have a constructor which is not defined in this translation unit.  You are doomed.
  Example #2:  I pass the address of a mutable global variable to a function which performs a virtual call on it.  You must prove that literally no code (except possibly a global constructor) can ever store to that object's vtable field.
  Example #3:  I construct an object, call a global function foo(), and then do a virtual call on my object.  You must either prove that foo() cannot possibly have a handle to the object or hope it's defined in this translation unit.
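
To make Example #3 concrete, here is a minimal C++ sketch.  All of the names in it (Shape, Square, g_last_constructed, g_observed, foo, construct_then_call) are hypothetical and exist only for illustration.  The base constructor leaks `this` into a global, so foo() really does hold a handle to the object.  A generic memory analysis would then have to prove that foo() never stores to the vtable field, whereas a rule based on the language's object model (as I understand it) only needs to see the construction point.

```cpp
// Hypothetical illustration of Example #3; none of these names come from LLVM or clang.
struct Shape;

// Global handle through which the object escapes.
Shape *g_last_constructed = nullptr;

struct Shape {
    // The constructor publishes `this`, so any later opaque call may reach the object.
    Shape() { g_last_constructed = this; }
    virtual ~Shape() {
        // Avoid leaving a dangling handle behind once the object dies.
        if (g_last_constructed == this) g_last_constructed = nullptr;
    }
    virtual int sides() const { return 0; }
};

struct Square : Shape {
    int sides() const override { return 4; }
};

// Records what foo() saw, purely so the behaviour can be checked.
int g_observed = -1;

// Stands in for the opaque global function.  Imagine its body lives in another
// translation unit, so the optimizer sees only a declaration at the call site.
void foo() {
    if (g_last_constructed)
        g_observed = g_last_constructed->sides();
}

int construct_then_call() {
    Square s;          // the construction point is statically visible here
    foo();             // opaque call: it has a handle to `s` via the global
    return s.sides();  // candidate for devirtualization to Square::sides
}
```

Calling construct_then_call() returns 4, and foo() also observes 4 through the escaped pointer.  The point of the sketch is that the escape alone defeats a pure load/store argument about the vtable field, even though nothing in the program ever changes the dynamic type of `s`.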

Language guarantees are *really, really useful*.  I understand the desire to improve optimizations that don't require language-specific annotations, but I am not sure it is very practical.

