[cfe-dev] C++11 and enhanced devirtualization

Thu Jul 16 12:24:48 PDT 2015

----- Original Message -----
> From: "Richard Smith" <richard at metafoo.co.uk>
> To: "John McCall" <rjmccall at apple.com>
> Cc: "Hal Finkel" <hfinkel at anl.gov>, "cfe-dev at cs.uiuc.edu Developers" <cfe-dev at cs.uiuc.edu>, "chandlerc"
> <chandlerc at gmail.com>
> Sent: Thursday, July 16, 2015 1:46:05 PM
> Subject: Re: C++11 and enhanced devirtualization
> 
> 
> 
> 
> On Thu, Jul 16, 2015 at 11:29 AM, John McCall < rjmccall at apple.com >
> wrote:
> 
> 
> > On Jul 15, 2015, at 10:11 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> > 
> > Hi everyone,
> > 
> > C++11 added features that allow for certain parts of the class
> > hierarchy to be closed, specifically the 'final' keyword and the
> > semantics of anonymous namespaces, and I think we can take advantage
> > of these to enhance our ability to perform devirtualization. For
> > example, given this situation:
> > 
> > struct Base {
> >   virtual void foo() = 0;
> > };
> > 
> > void external();
> > struct Final final : Base {
> >   void foo() {
> >     external();
> >   }
> > };
> > 
> > void dispatch(Base *B) {
> >   B->foo();
> > }
> > 
> > void opportunity(Final *F) {
> >   dispatch(F);
> > }
> > 
> > When we optimize this code, we do the expected thing and inline
> > 'dispatch' into 'opportunity' but we don't devirtualize the call
> > to foo(). The fact that we know what the vtable of F is at that
> > callsite is not exploited. To a lesser extent, we can do similar
> > things for final virtual methods, and derived classes in anonymous
> > namespaces (because Clang could determine whether or not a class
> > (or method) there is effectively final).
> > 
> > One possibility might be to use @llvm.assume to say something about
> > what the vtable ptr of F might be/contain should it be needed
> > later when we emit the initial IR for 'opportunity' (and then
> > teach the optimizer to use that information), but I'm not at all
> > sure that's the best solution. Thoughts?
> 
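For concreteness, here is a minimal sketch of the transformation being asked for, written out by hand in C++. It assumes a stand-in body for external() (a hypothetical call counter, not part of the original example) and adds a virtual destructor to Base; opportunity_devirtualized and run_both are illustrative names only. The qualified call F->Final::foo() is what a devirtualized call site would amount to, since 'final' guarantees no further override can exist.

```cpp
#include <cassert>

// Hypothetical stand-in for the opaque external() in the example above:
// it only counts how often it was reached.
static int external_calls = 0;
void external() { ++external_calls; }

struct Base {
  virtual void foo() = 0;
  virtual ~Base() = default;  // added for this sketch only
};

struct Final final : Base {
  void foo() override { external(); }
};

void dispatch(Base *B) { B->foo(); }

// As written in the example: after inlining dispatch(), the call to foo()
// is still a virtual call through the vtable of *F.
void opportunity(Final *F) { dispatch(F); }

// Hand-devirtualized form: because Final is 'final', the dynamic type of
// *F must be Final, so the call can be bound statically.
void opportunity_devirtualized(Final *F) { F->Final::foo(); }

// Runs both variants on the same object and returns the total call count.
int run_both() {
  external_calls = 0;
  Final f;
  opportunity(&f);
  assert(external_calls == 1);
  opportunity_devirtualized(&f);
  return external_calls;
}
```

Both variants reach the same function; the difference is only whether the call target is loaded from the vtable or named directly.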
> The problem with any sort of @llvm.assume-encoded information about
> memory contents is that C++ does actually allow you to replace
> objects in memory, up to and including stuff like:
> 
> {
>   MyClass c;
> 
>   // Reuse the storage temporarily. UB to access the object through ‘c’ now.
>   c.~MyClass();
>   auto c2 = new (&c) MyOtherClass();
> 
>   // The storage has to contain a ‘MyClass’ when it goes out of scope.
>   c2->~MyOtherClass();
>   new (&c) MyClass();
> }
> 
> The standard frontend devirtualization optimizations are permitted
> under a couple of different language rules, specifically that:
> 1. If you access an object through an l-value of a type, it has to
> dynamically be an object of that type (potentially a subobject).
> 2. Object replacement as above only “forwards” existing formal
> references under specific conditions, e.g. the dynamic type has to
> be the same, ‘const’ members have to have the same value, etc. Using
> an unforwarded reference (like the name of the local variable ‘c’
> above) doesn’t formally refer to a valid object and thus has
> undefined behavior.
> 
> You can apply those rules much more broadly than the frontend does,
> of course; but those are the language tools you get.
> 
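A compilable version of the storage-reuse snippet above may help when experimenting with this. It is only a sketch: the class bodies, the id() method, and the reuse_storage() wrapper are assumptions made up for illustration, with MyBaseClass borrowed from the name used in the IR further down. The static_asserts record the assumption that MyOtherClass fits in the storage of a MyClass.

```cpp
#include <cassert>
#include <new>

// Assumed hierarchy: the thread names the classes but not their contents.
struct MyBaseClass {
  virtual int id() const = 0;
  virtual ~MyBaseClass() = default;
};
struct MyClass : MyBaseClass {
  int id() const override { return 1; }
};
struct MyOtherClass : MyBaseClass {
  int id() const override { return 2; }
};

// Reusing the storage is only valid if the replacement object fits.
static_assert(sizeof(MyOtherClass) <= sizeof(MyClass), "storage too small");
static_assert(alignof(MyOtherClass) <= alignof(MyClass), "storage misaligned");

// Returns a three-digit record of the dynamic type seen at each step.
int reuse_storage() {
  MyClass c;
  int seen = c.id() * 100;  // original MyClass object

  // Reuse the storage temporarily. The name 'c' must not be used to access
  // the object from here until a MyClass has been put back.
  c.~MyClass();
  MyOtherClass *c2 = new (&c) MyOtherClass();
  seen += c2->id() * 10;  // accessed only through the pointer from new

  // The storage has to contain a MyClass again when 'c' goes out of scope.
  c2->~MyOtherClass();
  MyClass *c3 = new (&c) MyClass();
  seen += c3->id();

  return seen;
}
```

The vptr stored at &c changes twice within one scope, which is why a blanket assumption about the memory contents at that address would be unsound.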
> 
> Right. Our current plan for modelling this is:
> 
> 
> 1) Change the meaning of the existing !invariant.load metadata (or
> add another parallel metadata kind) so that it allows load-load
> forwarding (even if the memory is not known to be unmodified between
> the loads) if:
> a) both loads have !invariant.load metadata with the same operand,
> and
> b) the pointer operands are the same SSA value (being must-alias is
> not sufficient)

Why is being must-alias not sufficient? This seems scary.

 -Hal

> 2) Add a new intrinsic "i8* @llvm.invariant.barrier(i8*)" that
> produces a new pointer that is different for the purpose of
> !invariant.load. (Some other optimizations are permitted to look
> through the barrier.)
> 
> 
> In particular, "new (&c) MyOtherClass()" would be emitted as
> something like this:
> 
> 
> %1 = call @operator new(size, %c)
> %2 = call @llvm.invariant.barrier(%1)
> call @MyOtherClass::MyOtherClass(%2)
> %vptr = load %2
> %known.vptr = icmp eq %vptr, @MyOtherClass::vptr, !invariant.load !MyBaseClass.vptr
> call @llvm.assume(%known.vptr)
> 
> 
> -- Richard

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory