[cfe-dev] C++11 and enhanced devirtualization

Thu Jul 16 13:40:03 PDT 2015

On Thu, Jul 16, 2015 at 12:24 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> ----- Original Message -----
> > From: "Richard Smith" <richard at metafoo.co.uk>
> > To: "John McCall" <rjmccall at apple.com>
> > Cc: "Hal Finkel" <hfinkel at anl.gov>, "cfe-dev at cs.uiuc.edu Developers" <cfe-dev at cs.uiuc.edu>, "chandlerc" <chandlerc at gmail.com>
> > Sent: Thursday, July 16, 2015 1:46:05 PM
> > Subject: Re: C++11 and enhanced devirtualization
> >
> >
> >
> >
> > On Thu, Jul 16, 2015 at 11:29 AM, John McCall < rjmccall at apple.com >
> > wrote:
> >
> >
> > > On Jul 15, 2015, at 10:11 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> > >
> > > Hi everyone,
> > >
> > > C++11 added features that allow for certain parts of the class
> > > hierarchy to be closed, specifically the 'final' keyword and the
> > > semantics of anonymous namespaces, and I think we should take
> > > advantage of these to enhance our ability to perform
> > > devirtualization. For example, given this situation:
> > >
> > > struct Base {
> > >   virtual void foo() = 0;
> > > };
> > >
> > > void external();
> > > struct Final final : Base {
> > >   void foo() {
> > >     external();
> > >   }
> > > };
> > >
> > > void dispatch(Base *B) {
> > >   B->foo();
> > > }
> > >
> > > void opportunity(Final *F) {
> > >   dispatch(F);
> > > }
> > >
> > > When we optimize this code, we do the expected thing and inline
> > > 'dispatch' into 'opportunity' but we don't devirtualize the call
> > > to foo(). The fact that we know what the vtable of F is at that
> > > callsite is not exploited. To a lesser extent, we can do similar
> > > things for final virtual methods, and derived classes in anonymous
> > > namespaces (because Clang could determine whether or not a class
> > > (or method) there is effectively final).
> > >
> > > One possibility might be to use @llvm.assume to say something about
> > > what the vtable ptr of F might be/contain, should it be needed
> > > later, when we emit the initial IR for 'opportunity' (and then
> > > teach the optimizer to use that information), but I'm not at all
> > > sure that's the best solution. Thoughts?
> >
> > The problem with any sort of @llvm.assume-encoded information about
> > memory contents is that C++ does actually allow you to replace
> > objects in memory, up to and including stuff like:
> >
> > {
> >   MyClass c;
> >
> >   // Reuse the storage temporarily. UB to access the object through ‘c’
> >   // now.
> >   c.~MyClass();
> >   auto c2 = new (&c) MyOtherClass();
> >
> >   // The storage has to contain a ‘MyClass’ when it goes out of scope.
> >   c2->~MyOtherClass();
> >   new (&c) MyClass();
> > }
> >
> > The standard frontend devirtualization optimizations are permitted
> > under a couple of different language rules, specifically that:
> > 1. If you access an object through an l-value of a type, it has to
> > dynamically be an object of that type (potentially a subobject).
> > 2. Object replacement as above only “forwards” existing formal
> > references under specific conditions, e.g. the dynamic type has to
> > be the same, ‘const’ members have to have the same value, etc. Using
> > an unforwarded reference (like the name of the local variable ‘c’
> > above) doesn’t formally refer to a valid object and thus has
> > undefined behavior.
> >
> > You can apply those rules much more broadly than the frontend does,
> > of course; but those are the language tools you get.
> >
> >
> > Right. Our current plan for modelling this is:
> >
> >
> > 1) Change the meaning of the existing !invariant.load metadata (or
> > add another parallel metadata kind) so that it allows load-load
> > forwarding (even if the memory is not known to be unmodified between
> > the loads) if:
> > a) both loads have !invariant.load metadata with the same operand,
> > and
> > b) the pointer operands are the same SSA value (being must-alias is
> > not sufficient)
>
> Why is being must-alias not sufficient? This seems scary.

Must-alias remains sufficient if the value is known not to have changed
between the two loads (but that's orthogonal to this metadata). The key
property is that, given:

  %x = load %p, !invariant.load !a
  %q = call @llvm.invariant.barrier(%p)
  call @foo(%q)
  %y = load %q, !invariant.load !a

... we cannot deduce that %x and %y load the same value unless we can see
the definition of @foo (the value could have been overwritten by @foo).
However, it may still be reasonable to deduce that %p and %q are must-alias
(it's probably not particularly important to actually deduce that, though,
so we'd probably lose little if alias analysis can't look through the
intrinsic).
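
A rough C++-level analogue of the IR above may help. This is only a sketch
under stated assumptions: the names Shape, A, B, Storage, replace_with_b,
and observe_ids are hypothetical and do not come from this thread, and the
mapping to %p, %q, and @foo is loose (here the fresh pointer is returned by
the function instead of being produced by a barrier intrinsic).

```cpp
#include <cassert>
#include <new>

// Hypothetical types for illustration only.
struct Shape {
  virtual int id() const = 0;
  virtual ~Shape() = default;
};
struct A final : Shape {
  int id() const override { return 1; }
};
struct B final : Shape {
  int id() const override { return 2; }
};

// Raw storage that is large enough and aligned for either dynamic type.
struct Storage {
  alignas(A) alignas(B)
      unsigned char bytes[sizeof(A) > sizeof(B) ? sizeof(A) : sizeof(B)];
};

// Loosely plays the role of @foo: ends the lifetime of the old object and
// creates a B in the same storage. Only the returned pointer may be used
// to reach the new object.
Shape *replace_with_b(Shape *old, Storage &s) {
  old->~Shape();
  return new (s.bytes) B();
}

struct Ids {
  int before;
  int after;
};

Ids observe_ids() {
  Storage s;
  Shape *p = new (s.bytes) A();
  int x = p->id();                  // like %x: vptr load through p
  Shape *q = replace_with_b(p, s);  // like %q and the call to @foo
  int y = q->id();                  // like %y: vptr load through q
  q->~Shape();
  return {x, y};
}
```

Calling observe_ids() gives before == 1 and after == 2. The two virtual
calls go through pointers to the same storage, yet they must not be assumed
to load the same vptr, because the object was replaced in between.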

>  -Hal
>
> > 2) Add a new intrinsic "i8* @llvm.invariant.barrier(i8*)" that
> > produces a new pointer that is different for the purpose of
> > !invariant.load. (Some other optimizations are permitted to look
> > through the barrier.)
> >
> >
> > In particular, "new (&c) MyOtherClass()" would be emitted as
> > something like this:
> >
> >
> > %1 = call @operator new(size, %c)
> > %2 = call @llvm.invariant.barrier(%1)
> > call @MyOtherClass::MyOtherClass(%2)
> > %vptr = load %2
> > %known.vptr = icmp eq %vptr, @MyOtherClass::vptr, !invariant.load !MyBaseClass.vptr
> > call @llvm.assume(%known.vptr)
> >
> >
> > -- Richard
>
> --
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
>