<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Thu, Jul 16, 2015 at 12:24 PM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">----- Original Message -----<br>

> From: "Richard Smith" <<a href="mailto:richard@metafoo.co.uk">richard@metafoo.co.uk</a>><br>

> To: "John McCall" <<a href="mailto:rjmccall@apple.com">rjmccall@apple.com</a>><br>

> Cc: "Hal Finkel" <<a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a>>, "<a href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a> Developers" <<a href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a>>, "chandlerc"<br>

> <<a href="mailto:chandlerc@gmail.com">chandlerc@gmail.com</a>><br>

> Sent: Thursday, July 16, 2015 1:46:05 PM<br>

> Subject: Re: C++11 and enhanced devirtualization<br>

><br>

><br>

><br>

><br>

> On Thu, Jul 16, 2015 at 11:29 AM, John McCall < <a href="mailto:rjmccall@apple.com">rjmccall@apple.com</a> ><br>

> wrote:<br>

<div><div class="h5">><br>

><br>

> > On Jul 15, 2015, at 10:11 PM, Hal Finkel < <a href="mailto:hfinkel@anl.gov">hfinkel@anl.gov</a> > wrote:<br>

> ><br>

> > Hi everyone,<br>

> ><br>

> > C++11 added features that allow for certain parts of the class<br>

> > hierarchy to be closed, specifically the 'final' keyword and the<br>

> > semantics of anonymous namespaces, and I think we can take advantage<br>

> > of these to enhance our ability to perform devirtualization. For<br>

> > example, given this situation:<br>

> ><br>

> > struct Base {<br>

> > virtual void foo() = 0;<br>

> > };<br>

> ><br>

> > void external();<br>

> > struct Final final : Base {<br>

> > void foo() {<br>

> > external();<br>

> > }<br>

> > };<br>

> ><br>

> > void dispatch(Base *B) {<br>

> > B->foo();<br>

> > }<br>

> ><br>

> > void opportunity(Final *F) {<br>

> > dispatch(F);<br>

> > }<br>

> ><br>

> > When we optimize this code, we do the expected thing and inline<br>

> > 'dispatch' into 'opportunity' but we don't devirtualize the call<br>

> > to foo(). The fact that we know what the vtable of F is at that<br>

> > callsite is not exploited. To a lesser extent, we can do similar<br>

> > things for final virtual methods, and derived classes in anonymous<br>

> > namespaces (because Clang could determine whether or not a class<br>

> > (or method) there is effectively final).<br>

> ><br>

> > One possibility might be to use @llvm.assume to say something about<br>

> > what the vtable ptr of F might be/contain should it be needed<br>

> > later when we emit the initial IR for 'opportunity' (and then<br>

> > teach the optimizer to use that information), but I'm not at all<br>

> > sure that's the best solution. Thoughts?<br>

><br>

> The problem with any sort of @llvm.assume-encoded information about<br>

> memory contents is that C++ does actually allow you to replace<br>

> objects in memory, up to and including stuff like:<br>

><br>

> {<br>

> MyClass c;<br>

><br>

> // Reuse the storage temporarily. UB to access the object through ‘c’ now.<br>

> c.~MyClass();<br>

> auto c2 = new (&c) MyOtherClass();<br>

><br>

> // The storage has to contain a ‘MyClass’ when it goes out of scope.<br>

> c2->~MyOtherClass();<br>

> new (&c) MyClass();<br>

> }<br>

><br>

> The standard frontend devirtualization optimizations are permitted<br>

> under a couple of different language rules, specifically that:<br>

> 1. If you access an object through an l-value of a type, it has to<br>

> dynamically be an object of that type (potentially a subobject).<br>

> 2. Object replacement as above only “forwards” existing formal<br>

> references under specific conditions, e.g. the dynamic type has to<br>

> be the same, ‘const’ members have to have the same value, etc. Using<br>

> an unforwarded reference (like the name of the local variable ‘c’<br>

> above) doesn’t formally refer to a valid object and thus has<br>

> undefined behavior.<br>

><br>

> You can apply those rules much more broadly than the frontend does,<br>

> of course; but those are the language tools you get.<br>

><br>

><br>

> Right. Our current plan for modelling this is:<br>

><br>

><br>

> 1) Change the meaning of the existing !invariant.load metadata (or<br>

> add another parallel metadata kind) so that it allows load-load<br>

> forwarding (even if the memory is not known to be unmodified between<br>

> the loads) if:<br>

> a) both loads have !invariant.load metadata with the same operand,<br>

> and<br>

> b) the pointer operands are the same SSA value (being must-alias is<br>

> not sufficient)<br>

<br>

</div></div>Why is being must-alias not sufficient? This seems scary.</blockquote><div><br></div><div>Must-alias remains sufficient if the value is known not to have changed between the two loads (but that's orthogonal to this metadata). The key property is that given:</div><div><br></div><div>  %x = load %p, !invariant.load !a</div><div>  %q = call @llvm.invariant.barrier(%p)</div><div>  call @foo(%q)</div><div>  %y = load %q, !invariant.load !a</div><div><br></div><div>... we cannot deduce that %x and %y load the same value unless we can see the definition of @foo (the value could have been overwritten by @foo). However, it may still be reasonable to deduce that %p and %q are must-alias (it's probably not particularly important to actually deduce that, though, so we'd probably lose little if alias analysis can't look through the intrinsic).</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="HOEnZb"><font color="#888888">

 -Hal<br>

</font></span><span class="im HOEnZb"><br>

> 2) Add a new intrinsic "i8* @llvm.invariant.barrier(i8*)" that<br>

> produces a new pointer that is different for the purpose of<br>

> !invariant.load. (Some other optimizations are permitted to look<br>

> through the barrier.)<br>

><br>

><br>

> In particular, "new (&c) MyOtherClass()" would be emitted as<br>

> something like this:<br>

><br>

><br>

> %1 = call @operator new(size, %c)<br>

> %2 = call @llvm.invariant.barrier(%1)<br>

> call @MyOtherClass::MyOtherClass(%2)<br>

> %vptr = load %2, !invariant.load !MyBaseClass.vptr<br>

> %known.vptr = icmp eq %vptr, @MyOtherClass::vptr<br>

> call @llvm.assume(%known.vptr)<br>

><br>

><br>

> -- Richard<br>

<br>

</span><div class="HOEnZb"><div class="h5">--<br>

Hal Finkel<br>

Assistant Computational Scientist<br>

Leadership Computing Facility<br>

Argonne National Laboratory<br>

</div></div></blockquote></div><br></div></div>
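To make the two source-level situations discussed in this thread concrete, here is a minimal C++ sketch. It is only an illustration: the names `A`, `B`, `name()` and `reuse_storage()` are hypothetical and do not appear in the thread, and the code says nothing about how Clang actually emits IR. The first part mirrors the `dispatch`/`opportunity` example (a call through a pointer to a `final` class), and the second part shows one address holding objects with two different dynamic types over time, which is the case the proposed barrier is meant to model.

```cpp
#include <cassert>
#include <new>
#include <string>

struct Base {
    virtual std::string name() const = 0;
    virtual ~Base() = default;
};

// Both classes are final, so no further overrides of name() can exist.
struct A final : Base {
    std::string name() const override { return "A"; }
};

struct B final : Base {
    std::string name() const override { return "B"; }
};

// Virtual call through a Base pointer.
std::string dispatch(const Base *b) { return b->name(); }

// The static type is A, and A is final, so once dispatch() is inlined here
// the callee is statically known to be A::name().
std::string opportunity(const A *a) { return dispatch(a); }

// One buffer, two successive objects with different dynamic types.
// Each object is only accessed through the pointer returned by its own
// placement new, never through a pointer to the previous object.
std::string reuse_storage() {
    alignas(A) alignas(B)
        unsigned char storage[sizeof(A) > sizeof(B) ? sizeof(A) : sizeof(B)];

    Base *first = new (storage) A();
    std::string seen = first->name();   // dispatches to A::name()
    first->~Base();                     // end the lifetime of the A object

    Base *second = new (storage) B();   // same address, different vptr
    seen += second->name();             // dispatches to B::name()
    second->~Base();
    return seen;
}
```

Calling `opportunity` on an `A` object yields "A", while `reuse_storage()` observes both dynamic types at the same address. A load of the vptr through `first` and one through `second` read the same memory location yet legitimately see different values, which is why the two pointers must be treated as distinct for the purposes of the invariant-load rule.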