[cfe-dev] C++11 and enhanced devirtualization

Fri Jul 17 14:49:53 PDT 2015

On Fri, Jul 17, 2015 at 2:05 PM, Philip Reames <listmail at philipreames.com>
wrote:

>
>
> On 07/16/2015 02:38 PM, Richard Smith wrote:
>
>  On Thu, Jul 16, 2015 at 2:03 PM, John McCall <rjmccall at apple.com> wrote:
>
>>    On Jul 16, 2015, at 11:46 AM, Richard Smith <richard at metafoo.co.uk>
>> wrote:
>>   On Thu, Jul 16, 2015 at 11:29 AM, John McCall <rjmccall at apple.com>
>> wrote:
>>
>>> > On Jul 15, 2015, at 10:11 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>> >
>>> > Hi everyone,
>>> >
>>> > C++11 added features that allow for certain parts of the class
>>> hierarchy to be closed, specifically the 'final' keyword and the semantics
>>> of anonymous namespaces, and I think we can take advantage of these to enhance
>>> our ability to perform devirtualization. For example, given this situation:
>>> >
>>> > struct Base {
>>> >  virtual void foo() = 0;
>>> > };
>>> >
>>> > void external();
>>> > struct Final final : Base {
>>> >  void foo() {
>>> >    external();
>>> >  }
>>> > };
>>> >
>>> > void dispatch(Base *B) {
>>> >  B->foo();
>>> > }
>>> >
>>> > void opportunity(Final *F) {
>>> >  dispatch(F);
>>> > }
>>> >
>>> > When we optimize this code, we do the expected thing and inline
>>> 'dispatch' into 'opportunity' but we don't devirtualize the call to foo().
>>> The fact that we know what the vtable of F is at that callsite is not
>>> exploited. To a lesser extent, we can do similar things for final virtual
>>> methods, and derived classes in anonymous namespaces (because Clang could
>>> determine whether or not a class (or method) there is effectively final).
>>> >
>>> > One possibility might be to use @llvm.assume to say something about what
>>> the vtable ptr of F might be/contain should it be needed later when we emit
>>> the initial IR for 'opportunity' (and then teach the optimizer to use that
>>> information), but I'm not at all sure that's the best solution. Thoughts?
>>>
>>> The problem with any sort of @llvm.assume-encoded information about
>>> memory contents is that C++ does actually allow you to replace objects in
>>> memory, up to and including stuff like:
>>>
>>> {
>>>   MyClass c;
>>>
>>>   // Reuse the storage temporarily.
>>>   // UB to access the object through ‘c’ now.
>>>   c.~MyClass();
>>>   auto c2 = new (&c) MyOtherClass();
>>>
>>>   // The storage has to contain a ‘MyClass’ when it goes out of scope.
>>>   c2->~MyOtherClass();
>>>   new (&c) MyClass();
>>> }
>>>
>>> The standard frontend devirtualization optimizations are permitted under
>>> a couple of different language rules, specifically that:
>>> 1. If you access an object through an l-value of a type, it has to
>>> dynamically be an object of that type (potentially a subobject).
>>> 2. Object replacement as above only “forwards” existing formal
>>> references under specific conditions, e.g. the dynamic type has to be the
>>> same, ‘const’ members have to have the same value, etc.  Using an
>>> unforwarded reference (like the name of the local variable ‘c’ above)
>>> doesn’t formally refer to a valid object and thus has undefined behavior.
>>>
>>> You can apply those rules much more broadly than the frontend does, of
>>> course; but those are the language tools you get.
>>
>>
>>  Right. Our current plan for modelling this is:
>>
>>  1) Change the meaning of the existing !invariant.load metadata (or add
>> another parallel metadata kind) so that it allows load-load forwarding
>> (even if the memory is not known to be unmodified between the loads) if:
>>
>>
>>   invariant.load currently allows the load to be reordered pretty
>> aggressively, so I think you need a new metadata.
>>
>
>  Our thoughts were:
> 1) The existing !invariant.load is redundant because it's exactly
> equivalent to a call to @llvm.invariant.start and a load.
> 2) The new semantics are a more strict form of the old semantics, so no
> special action is required to upgrade old IR.
> ... so changing the meaning of the existing metadata seemed preferable to
> adding a new, similar-but-not-quite-identical, form of the metadata. But
> either way seems fine.
>
> I'm going to argue pretty strongly in favour of the new form of metadata.
> We've spent a lot of time getting !invariant.load working well for use
> cases like the "length" field in a Java array and I'd really hate to give
> that up.
>
> (One way of framing this is that the current !invariant.load gives a
> guarantee that there can't be a @llvm.invariant.end call anywhere in the
> program and that any @llvm.invariant.start occurs outside the visible scope
> of the compilation unit (Module, LTO, what have you) and must have executed
> before any code contained in said module which can describe the memory
> location can execute.  FYI, that last bit of strange wording is to allow
> initialization inside a malloc like function which returns a noalias
> pointer.)
>

I had overlooked that !invariant.load also applies for loads /before/ the
invariant load. I agree that this is different both from what we're
proposing and from what you can achieve with @llvm.invariant.start. I would
expect that you can use our metadata for the length in a Java array -- it
seems like it'd be straightforward for you to arrange that all loads of the
array field have the metadata (and that you use the same operand on all of
them) -- but there's no real motivation behind reusing the existing
metadata besides simplicity and cleanliness.
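
A minimal C++ sketch of what load-load forwarding would buy for an array-length field, under stated assumptions: IntArray, opaque, twoLoads and forwarded are hypothetical names invented for illustration, and forwarded() is written by hand to show the result an optimizer could produce if both loads of the length carried the proposed metadata with the same operand. It is not taken from any existing frontend.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for an array header whose 'length' never changes
// after construction (the Java array case discussed above).
struct IntArray {
  const std::size_t length;
  int *data;
};

// Stands in for a call the optimizer cannot see through. It is defined here
// only so the sketch links; it writes to the elements, never to 'length'.
void opaque(IntArray *a) {
  assert(a != nullptr);
  if (a->length > 0)
    a->data[0] += 1;
}

// As written by the programmer: two loads of a->length through the same
// pointer value, separated by a call that may write memory.
std::size_t twoLoads(IntArray *a) {
  std::size_t first = a->length;
  opaque(a);
  std::size_t second = a->length;
  return first + second;
}

// Hand-written result of forwarding the first load to the second.
std::size_t forwarded(IntArray *a) {
  std::size_t len = a->length;
  opaque(a);
  return len + len;
}
```

Both functions return the same value for any valid array; the only difference is the number of loads performed after the call.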

> I'm definitely open to working together on a revised version of a more
> general invariant mechanism.  In particular, we don't have a good way of
> modelling Java's "final" fields* in the IR today since the initialization
> logic may be visible to the compiler.  Coming up with something which
> supports both use cases would be really useful.
>

This seems like something that our proposed mechanism may be able to
support; we intend to use it for const and reference data members in C++,
though the semantics of those are not quite the same.
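
As a hedged illustration of the const data member case: the sketch below replaces an object that has a const member by using placement new, and reads the member only through the pointer that each new-expression returns. Config and replaceAndRead are hypothetical names. The assumption, following the object-replacement rules described earlier in the thread (as they stood in C++11/14), is that a pointer to the old object is not forwarded to the new one when a const member is involved, so the fresh pointer is the one to use.

```cpp
#include <cassert>
#include <new>

// Hypothetical class with a const data member.
struct Config {
  const int limit;
  explicit Config(int l) : limit(l) {}
};

// Builds two Config objects, one after the other, in the same storage and
// reads 'limit' from each through the pointer its new-expression returned.
int replaceAndRead() {
  alignas(Config) unsigned char storage[sizeof(Config)];

  Config *first = new (storage) Config(1);
  int before = first->limit;
  first->~Config();

  // 'second' is a fresh pointer to the replacement object. Loads through
  // 'first' and loads through 'second' must not be merged with each other.
  Config *second = new (storage) Config(2);
  int after = second->limit;
  second->~Config();

  assert(before != after);
  return before * 10 + after;
}
```

Within the lifetime of one object, repeated loads of limit through the same pointer could be forwarded; across the replacement they could not, which is the distinction the proposed barrier is meant to expose.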

> * Let's ignore the fact that few Java final fields are actually final.
> That part of the problem is decidedly out of scope for LLVM.  :)
>
>>   a) both loads have !invariant.load metadata with the same
>> operand, and
>>   b) the pointer operands are the same SSA value (being must-alias is not
>> sufficient)
>> 2) Add a new intrinsic "i8* @llvm.invariant.barrier(i8*)" that produces a
>> new pointer that is different for the purpose of !invariant.load. (Some
>> other optimizations are permitted to look through the barrier.)
>>
>>
>>  In particular, "new (&c) MyOtherClass()" would be emitted as something
>> like this:
>>
>>   %1 = call @operator new(size, %c)
>>   %2 = call @llvm.invariant.barrier(%1)
>>   call @MyOtherClass::MyOtherClass(%2)
>>   %vptr = load %2
>>   %known.vptr = icmp eq %vptr, @MyOtherClass::vptr, !invariant.load
>> !MyBaseClass.vptr
>>   call @llvm.assume(%known.vptr)
>>
>>
>>  Hmm.  And all v-table loads have this invariant metadata?
>>
>
>  That's the idea (but it's not essential that they do, we just lose
> optimization power if not).
>
>
>>  I am concerned about mixing files with and without barriers.
>>
>
>  I think we'd need to always generate the barrier (even at -O0, to
> support LTO between non-optimized and optimized code). I don't think we can
> support LTO between IR using the metadata and old IR that didn't contain
> the relevant barriers. How important is that use case? We were probably
> going to put this behind a -fstrict-something flag, at least to start off
> with, so we can create a transition period where we generate the barrier by
> default but don't generate the metadata if necessary.
>
>
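
For readers following the barrier discussion, here is a small source-level C++ sketch of the situation the placement-new IR above is modelling. Shape, Triangle, Square and reuseStorage are hypothetical names. The comments mark where a frontend following the proposal would presumably emit the barrier; that placement is an assumption drawn from the description in this thread, not from any existing implementation.

```cpp
#include <cassert>
#include <new>

struct Shape {
  virtual int sides() const = 0;
  virtual ~Shape() {}
};
struct Triangle final : Shape {
  int sides() const override { return 3; }
};
struct Square final : Shape {
  int sides() const override { return 4; }
};

// Constructs a Triangle and then a Square in the same storage, making one
// virtual call on each through the pointer returned by placement new.
int reuseStorage() {
  alignas(Triangle) alignas(Square) unsigned char
      storage[sizeof(Triangle) > sizeof(Square) ? sizeof(Triangle)
                                                : sizeof(Square)];

  // Assumed barrier point: the pointer 'a' is new for vptr-load purposes.
  Shape *a = new (storage) Triangle();
  int first = a->sides();  // vptr load sees Triangle's vtable
  a->~Shape();

  // Assumed barrier point: 'b' must not inherit what was known about 'a',
  // even though both point at the same address.
  Shape *b = new (storage) Square();
  int second = b->sides();  // vptr load sees Square's vtable
  b->~Shape();

  assert(first == 3 && second == 4);
  return first * 10 + second;
}
```

The two calls to sides() load a vptr from the same address but must observe different vtables, which is why being must-alias is not sufficient for forwarding and the SSA pointer value has to be the same.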
> _______________________________________________
> cfe-dev mailing list
> cfe-dev at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev
>
>
>