[llvm-dev] Proposal: virtual constant propagation

Thu Jan 28 10:17:51 PST 2016

Hans wrote:
> (and start-up time if we can drop the vtables and
> avoid the dynamic relocations).

On Thu, Jan 28, 2016 at 09:15:05AM -0800, Mehdi Amini wrote:
> Hi,
> 
> I just thought about another use case: VTable compression.
> If you know that an entry in the Vtable is never used, just remove it!
> I’d hope we could even eliminate some unused virtual functions from the final binary.

We could most likely extend the design to do both of these things, but we'd
need to be a little careful as these would both be ABI breaks, so we'd probably
need some separate way of saying "I promise not to call virtual functions
from these classes" as opposed to "I promise not to extend these classes".
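
To make the ABI concern concrete, here is a minimal sketch of what dropping
unused vtable entries might look like, assuming a toy model in which a vtable
is just a list of function names. The names CompressedVTable and compress are
hypothetical and are not part of LLVM or of the proposal.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical result of compressing one vtable.
struct CompressedVTable {
  std::vector<std::string> slots;     // surviving entries, in original order
  std::vector<std::ptrdiff_t> remap;  // old slot index -> new index, -1 if dropped
};

// Drop every slot whose index is not in `used`. Surviving slots shift down,
// so any code that still loads from an old index would read the wrong entry.
CompressedVTable compress(const std::vector<std::string> &vtable,
                          const std::set<std::size_t> &used) {
  CompressedVTable out;
  out.remap.assign(vtable.size(), -1);
  for (std::size_t i = 0; i < vtable.size(); ++i) {
    if (used.count(i) == 0)
      continue;  // never called: drop the entry
    out.remap[i] = static_cast<std::ptrdiff_t>(out.slots.size());
    out.slots.push_back(vtable[i]);
  }
  return out;
}
```

Because the slot indices change, every caller has to be rewritten through the
remap table, which is why this needs a stronger promise than "these classes
are not extended".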

> 
> — 
> Mehdi
> 
> 
> > On Jan 27, 2016, at 10:29 PM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> > 
> > Hi Peter,
> > 
> > Pete (Cooper, CC'ed) had a similar idea a few months ago about devirtualization and LTO: we can know the set of possible overrides based on knowledge of a closed type hierarchy. He worked on a prototype, and I continued to evolve it. I planned to submit something to the mailing list but couldn’t fit it into my schedule before April :)
> > Our plan was about removing virtual calls in general, or maybe in some cases turning the indirect call into a switch table.
> > 
> > In both use cases (your constant propagation and devirtualization), I think the interesting part is “how to figure out the set of possible callees for a call site” and how we encode this in the IR.
> > 
> > I haven’t worked on this for a few weeks, but I’ll try to give a rough description of where I stopped on my prototype: 
> > I was storing the full inheritance tree in metadata. There was one metadata node per entry of the VTable, and any method that overrides one from a base class has metadata attached pointing to the slot in the base class.
> > The front-end is modified to emit any load from a VTable through a new intrinsic llvm.read.vtable that takes a pointer to the VTable, the index of the slot in the VTable, and a pointer to the metadata that describes the slot for the current type. Using the metadata you can directly access any override and very easily construct the set of possible overrides. The pass that transforms the IR is quite easy to implement since you don’t need to walk the IR: you directly visit all the uses of the llvm.read.vtable intrinsic and have all the needed information in the operands.
> > I haven’t given much thought to your representation yet to know how it compares (and I’m not very familiar with how CFI works), but I’m interested in your feedback!
> > 
> > I was limited during my experiment by the lack of a black/white list to define which hierarchies are/aren’t closed, so I’d be glad if such a flag/control were added!

Thanks for sharing that design, but I think it would be best to have a single
representation of the type information in the tree. The bitset representation
has the advantage of being proven with other features like CFI, while I'm
not sure that a design that worked at the virtual function level would work
well with CFI. (It also includes the insight that you don't need to store
an inheritance graph or anything like that, just the valid address points
for each vtable).
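
As a rough illustration of that insight, the sketch below models type
information as nothing more than a set of valid address points per type, and
derives the possible callees for a virtual call from it. This is a toy model
under stated assumptions, not the actual bitset encoding: VTable,
AddressPoint, TypeInfo and possibleCallees are hypothetical names, and
function pointers are stood in for by strings.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy vtable: each slot holds the name of the function it points to.
using VTable = std::vector<std::string>;

// A valid address point: a vtable plus the offset an object's vptr may hold.
struct AddressPoint {
  const VTable *table;
  std::size_t offset;
};

// For each static type, the address points its vptr may legally take.
// Note that no inheritance graph is stored anywhere.
using TypeInfo = std::map<std::string, std::vector<AddressPoint>>;

// Collect the distinct functions reachable by loading `slot` relative to any
// valid address point of `type`.
std::vector<std::string> possibleCallees(const TypeInfo &info,
                                         const std::string &type,
                                         std::size_t slot) {
  std::vector<std::string> callees;
  auto it = info.find(type);
  if (it == info.end())
    return callees;  // unknown type: nothing can be concluded
  for (const AddressPoint &ap : it->second) {
    std::size_t index = ap.offset + slot;
    if (index >= ap.table->size())
      continue;  // slot lies outside this vtable
    const std::string &fn = (*ap.table)[index];
    if (std::find(callees.begin(), callees.end(), fn) == callees.end())
      callees.push_back(fn);
  }
  return callees;
}
```

If the result contains exactly one function, the call site is a candidate for
devirtualization. A derived class that inherits a slot unchanged simply
contributes the same function again, so the hierarchy never needs to be
consulted.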

I did think about having something like your llvm.read.vtable intrinsic to
load function pointers, but I thought it would be more orthogonal to extend
existing mechanisms to load the vtable than to have a special type of load
that only works with vtables, to avoid having to teach passes about this
new type of load. Did you find this to be an issue in practice?

Now that I think about it, I believe that this sort of intrinsic could make
ABI-breaking vtable transformations like the ones above easier to implement
(there could be some IR module flag that means "I promise not to load from
vtables except via this intrinsic" that could be used to turn on those
transformations).

I had another look at http://llvm.org/docs/BitSets.html and it occurred
to me that it doesn't clearly explain how to use bitsets to encode type
information. I can certainly take another pass over that page to see if
I can improve the documentation there as we start to use bitsets in more
places in the compiler.

Thanks,
-- 
Peter