[cfe-dev] RFC: A new ABI for virtual calls, and a change to the virtual call representation in the IR

Fri Mar 4 12:47:11 PST 2016

On Fri, Mar 4, 2016 at 9:32 AM, Mehdi Amini <mehdi.amini at apple.com> wrote:

>
> > On Feb 29, 2016, at 1:53 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> >
> > Hi all,
> >
> > I'd like to make a proposal to implement the new vtable ABI described in
> > PR26723, which I'll call the relative ABI. That bug gives more details and
> > justification for that ABI.
> >
> > The user interface for the new ABI would be that -fwhole-program-vtables
> > would take an optional value indicating which aspects of the program have
> > whole-program scope. For example, the existing implementation of
> > whole-program vcall optimization allows external code to call into
> > translation units compiled with -fwhole-program-vtables, but does not allow
> > external code to derive from classes defined in such translation units, so
> > you could request the current behaviour with
> > "-fwhole-program-vtables=derive", which means that derived classes are not
> > allowed from outside the program. To request the new ABI, you can specify
> > "-fwhole-program-vtables=call,derive", which means that calls and derived
> > classes are both not allowed from outside the program.
> > "-fwhole-program-vtables" would be short for
> > "-fwhole-program-vtables=call,derive,anythingelseweaddinfuture".
> >
> > I'll also make the observation that the new ABI does not require LTO or
> > whole-program visibility at compile time; to decide whether to use the new
> > ABI for a class, we just need to check that it and its bases are not in the
> > whole-program-vtables blacklist.
> >
> > At the same time, I'd like to change how virtual calls are represented in
> > the IR. This is for a few reasons:
> >
> > 1) Would allow whole-program virtual call optimization to work well with
> >   the relative ABI. This ABI would complicate the IR at call sites and make
> >   it harder to do matching and rewriting.
> >
> > 2) Simplifies the whole-program virtual call optimization pass. Currently
> >   we need to walk uses in the IR in order to determine the slot and callees
> >   for each call site. This can all be avoided with a simpler representation.
> >
> > 3) Would make it easier to implement dead virtual function stripping. This
> >   would involve reshaping any vtable initializers and rewriting call
> >   sites. Implementing this correctly is harder than it needs to be because
> >   of the current representation.
> >
> > My proposal is to add the following new intrinsics:
>
> Thanks, I'm really glad you're moving forward on improving the IR
> representation so fast after our previous discussion. The use of these
> intrinsics looks a lot more friendly to me! :)
> (even if I still cannot make sense of the "bitset" terminology used to
> represent the hierarchy in the metadata part)
>
> >
> > i32 @llvm.vtable.slot.offset(metadata, i32)
> >
> > This intrinsic takes a bitset name B and an offset I. It returns the byte
> > offset of the I'th virtual function pointer in each of the vtables in B.
> >
> > i8* @llvm.vtable.load(i8*, i32)
>
> Why does vtable.load take a byte offset instead of a slot index
> directly? (The IR could be simpler if it did not require a call to
> @llvm.vtable.slot.offset() for every @llvm.vtable.load().)
>

I decided to split these in order to support virtual member function
pointers correctly. In the Itanium ABI, member function pointers use a byte
offset. The idea is that llvm.vtable.slot.offset would be used to create a
member function pointer, while llvm.vtable.load would be used to call it
(see also the getmfp and callmfp examples).
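
To make the split concrete, here is a rough C++ model of it. This is only a
sketch under stated assumptions, not the actual lowering: it assumes the
classic layout of one function pointer per vtable slot (the relative ABI would
use a different stride), and it leaves out the "+1" tag and the this-adjustment
that real Itanium member function pointers carry. The names Object, MemFnPtr,
slot_offset, vtable_load, get_mfp and call_mfp are all hypothetical;
slot_offset and vtable_load merely stand in for llvm.vtable.slot.offset and
llvm.vtable.load.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model: a vtable is a plain array of function pointers.
using VFn = int (*)(void *self);

struct Object {
    const VFn *vptr;  // points at slot 0 of the object's vtable
    int value;
};

int get_value(void *self) { return static_cast<Object *>(self)->value; }
int get_double(void *self) { return 2 * static_cast<Object *>(self)->value; }

const VFn kVTable[] = {get_value, get_double};

// Stand-in for llvm.vtable.slot.offset: slot index -> byte offset.
// ASSUMPTION: one pointer-sized entry per slot.
std::int32_t slot_offset(std::int32_t slot) {
    return slot * static_cast<std::int32_t>(sizeof(VFn));
}

// Stand-in for llvm.vtable.load: read the function pointer stored at a
// byte offset from the vtable address.
VFn vtable_load(const void *vtable, std::int32_t byte_offset) {
    VFn fn;
    std::memcpy(&fn, static_cast<const char *>(vtable) + byte_offset, sizeof fn);
    return fn;
}

// Simplified virtual member function pointer: just the byte offset.
struct MemFnPtr {
    std::int32_t byte_offset;
};

// Creating the member function pointer only needs the slot offset...
MemFnPtr get_mfp(std::int32_t slot) { return MemFnPtr{slot_offset(slot)}; }

// ...while calling through it only needs the load, with the byte offset
// taken from the member function pointer at run time.
int call_mfp(Object *obj, MemFnPtr mfp) {
    return vtable_load(obj->vptr, mfp.byte_offset)(obj);
}
```

The point of the model is that call_mfp never sees a slot index: by the time
the call happens, only the byte offset stored in the member function pointer
is available, so the load has to accept a byte offset.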

Thanks,
Peter