<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
<br>
<div class="moz-cite-prefix">On 07/16/2015 02:38 PM, Richard Smith
wrote:<br>
</div>
<blockquote
cite="mid:CAOfiQqkACnC5bLavwDFHM12xtNuKuqtus2EJsCJym0C3Gh5ipA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Thu, Jul 16, 2015 at 2:03 PM, John
McCall <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:rjmccall@apple.com" target="_blank">rjmccall@apple.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div>
<div>
<div class="h5">
<blockquote type="cite">
<div>On Jul 16, 2015, at 11:46 AM, Richard Smith
&lt;<a moz-do-not-send="true"
href="mailto:richard@metafoo.co.uk"
target="_blank">richard@metafoo.co.uk</a>&gt;
wrote:</div>
<div>
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Thu, Jul 16,
2015 at 11:29 AM, John McCall <span
dir="ltr">&lt;<a
moz-do-not-send="true"
href="mailto:rjmccall@apple.com"
target="_blank">rjmccall@apple.com</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0 0 0
.8ex;border-left:1px #ccc
solid;padding-left:1ex"><span>> On
Jul 15, 2015, at 10:11 PM, Hal
Finkel &lt;<a moz-do-not-send="true"
href="mailto:hfinkel@anl.gov"
target="_blank">hfinkel@anl.gov</a>&gt;
wrote:<br>
><br>
> Hi everyone,<br>
><br>
> C++11 added features that allow
for certain parts of the class
hierarchy to be closed, specifically
the 'final' keyword and the
semantics of anonymous namespaces,
and I think we can take advantage of
these to enhance our ability to
perform devirtualization. For
example, given this situation:<br>
><br>
> struct Base {<br>
> virtual void foo() = 0;<br>
> };<br>
><br>
> void external();<br>
> struct Final final : Base {<br>
> void foo() {<br>
> external();<br>
> }<br>
> };<br>
><br>
> void dispatch(Base *B) {<br>
> B->foo();<br>
> }<br>
><br>
> void opportunity(Final *F) {<br>
> dispatch(F);<br>
> }<br>
><br>
> When we optimize this code, we
do the expected thing and inline
'dispatch' into 'opportunity' but we
don't devirtualize the call to
foo(). The fact that we know what
the vtable of F is at that callsite
is not exploited. To a lesser
extent, we can do similar things for
final virtual methods, and derived
classes in anonymous namespaces
(because Clang could determine
whether or not a class (or method)
there is effectively final).<br>
><br>
> One possibility might be to use
@llvm.assume to say something about
what the vtable ptr of F might
be/contain should it be needed later
when we emit the initial IR for
'opportunity' (and then teach the
optimizer to use that information),
but I'm not at all sure that's the
best solution. Thoughts?<br>
<br>
</span>The problem with any sort of
@llvm.assume-encoded information about
memory contents is that C++ does
actually allow you to replace objects
in memory, up to and including stuff
like:<br>
<br>
{<br>
MyClass c;<br>
<br>
// Reuse the storage temporarily.
UB to access the object through 'c'
now.<br>
c.~MyClass();<br>
auto c2 = new (&c)
MyOtherClass();<br>
<br>
// The storage has to contain a
'MyClass' when it goes out of scope.<br>
c2->~MyOtherClass();<br>
new (&c) MyClass();<br>
}<br>
<br>
The standard frontend devirtualization
optimizations are permitted under a
couple of different language rules,
specifically that:<br>
1. If you access an object through an
l-value of a type, it has to
dynamically be an object of that type
(potentially a subobject).<br>
2. Object replacement as above only
"forwards" existing formal references
under specific conditions, e.g. the
dynamic type has to be the same,
'const' members have to have the same
value, etc. Using an unforwarded
reference (like the name of the local
variable 'c' above) doesn't formally
refer to a valid object and thus has
undefined behavior.<br>
<br>
You can apply those rules much more
broadly than the frontend does, of
course; but those are the language
tools you get.</blockquote>
<div><br>
</div>
<div>Right. Our current plan for
modelling this is:</div>
<div><br>
</div>
<div>1) Change the meaning of the
existing !invariant.load metadata (or
add another parallel metadata kind) so
that it allows load-load forwarding
(even if the memory is not known to be
unmodified between the loads) if:</div>
</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
</div>
</div>
invariant.load currently allows the load to be
reordered pretty aggressively, so I think you need a
new metadata.</div>
</div>
</blockquote>
<div><br>
</div>
<div>Our thoughts were:</div>
<div>1) The existing !invariant.load is redundant because
it's exactly equivalent to a call to @llvm.invariant.start
and a load.</div>
<div>2) The new semantics are a more strict form of the old
semantics, so no special action is required to upgrade old
IR.</div>
<div>... so changing the meaning of the existing metadata
seemed preferable to adding a new,
similar-but-not-quite-identical, form of the metadata. But
either way seems fine.</div>
</div>
</div>
</div>
</blockquote>
I'm going to argue pretty strongly in favour of the new form of
metadata. We've spent a lot of time getting !invariant.load working
well for use cases like the "length" field in a Java array and I'd
really hate to give that up.<br>
<br>
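To make that use case concrete, here is a rough C++ analogue (only a sketch: IntArray, opaque, sum_naive and sum_hoisted are names invented for this example, not anything from our frontend). The second function shows by hand the load hoisting that an invariant "length" load is meant to permit even across a call the optimizer cannot see through.<br>

```cpp
#include <cstddef>

// Hypothetical stand-in for a Java array header: 'length' is written once
// at allocation time and is assumed never to change afterwards.
struct IntArray {
    std::size_t length;
    int *data;
};

// Stand-in for a call the optimizer cannot see through. It is defined here
// only so the sketch links; imagine it lives in another module.
static int opaque_calls = 0;
void opaque() { ++opaque_calls; }

// As written by the frontend: without extra information, 'a->length' must be
// reloaded on every iteration because opaque() might have modified it.
long sum_naive(const IntArray *a) {
    long total = 0;
    for (std::size_t i = 0; i < a->length; ++i) {
        opaque();
        total += a->data[i];
    }
    return total;
}

// The transformation an invariant load is meant to allow, done by hand:
// load 'length' once and reuse the value across the opaque calls.
long sum_hoisted(const IntArray *a) {
    const std::size_t n = a->length;
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        opaque();
        total += a->data[i];
    }
    return total;
}
```

Both functions compute the same result as long as the length really is never rewritten, which is exactly the guarantee the metadata has to carry.<br>
<br>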
(One way of framing this is that the current !invariant.load guarantees
that there is no @llvm.invariant.end call anywhere in the program, and
that any @llvm.invariant.start occurs outside the visible scope of the
compilation unit (Module, LTO, what have you) and has executed before
any code in said module that can describe the memory location executes.
FYI, that last bit of strange wording is to allow initialization inside
a malloc-like function which returns a noalias pointer.)<br>
<br>
I'm definitely open to working together on a revised version of a
more general invariant mechanism. In particular, we don't have a
good way of modelling Java's "final" fields* in the IR today since
the initialization logic may be visible to the compiler. Coming up
with something which supports both use cases would be really
useful. <br>
<br>
* Let's ignore the fact that few Java final fields are actually
final. That part of the problem is decidedly out of scope for
LLVM. :)<br>
<br>
<blockquote
cite="mid:CAOfiQqkACnC5bLavwDFHM12xtNuKuqtus2EJsCJym0C3Gh5ipA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div><span class="">
<blockquote type="cite">
<div>
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> a) both loads have !invariant.load
metadata with the same operand, and</div>
<div> b) the pointer operands are the
same SSA value (being must-alias is not
sufficient)</div>
<div>2) Add a new intrinsic "i8*
@llvm.invariant.barrier(i8*)" that
produces a new pointer that is different
for the purpose of !invariant.load.
(Some other optimizations are permitted
to look through the barrier.)</div>
</div>
</div>
</div>
</div>
</blockquote>
<blockquote type="cite">
<div>
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div>In particular, "new (&c)
MyOtherClass()" would be emitted as
something like this:</div>
<div><br>
</div>
<div> %1 = call @operator new(size, %c)</div>
<div> %2 = call
@llvm.invariant.barrier(%1)</div>
<div> call
@MyOtherClass::MyOtherClass(%2)</div>
<div>  %vptr = load %2, !invariant.load
!MyBaseClass.vptr</div>
<div>  %known.vptr = icmp eq %vptr,
@MyOtherClass::vptr</div>
<div> call @llvm.assume(%known.vptr)</div>
</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
</span>Hmm. And all v-table loads have this invariant
metadata?</div>
</div>
</blockquote>
<div><br>
</div>
<div>That's the idea (but it's not essential that they do,
we just lose optimization power if not).</div>
<div>&nbsp;</div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div>I am concerned about mixing files with and without
barriers.</div>
</div>
</blockquote>
<div><br>
</div>
<div>I think we'd need to always generate the barrier (even
at -O0, to support LTO between non-optimized and optimized
code). I don't think we can support LTO between IR using
the metadata and old IR that didn't contain the relevant
barriers. How important is that use case? We were probably
going to put this behind a -fstrict-something flag, at
least to start off with, so we can create a transition
period where we generate the barrier by default but don't
generate the metadata if necessary.</div>
</div>
</div>
</div>
<br>
<pre wrap="">_______________________________________________
cfe-dev mailing list
<a class="moz-txt-link-abbreviated" href="mailto:cfe-dev@cs.uiuc.edu">cfe-dev@cs.uiuc.edu</a>
<a class="moz-txt-link-freetext" href="http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev">http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev</a>
</pre>
</blockquote>
<br>
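For anyone who wants to experiment with the storage-reuse case quoted above, here is a small self-contained C++ sketch (Base, A, B, Observed and reuse_storage are made-up names, and it uses a raw buffer rather than a named local so that every access is clearly well defined). Each placement new yields a fresh pointer, which loosely corresponds to the result of the proposed barrier; the vptr seen through one of those pointers stays fixed, while the vptr stored in the underlying memory changes between them.<br>

```cpp
#include <new>

struct Base {
    virtual int id() const { return 0; }
    virtual ~Base() {}
};
struct A final : Base { int id() const override { return 1; } };
struct B final : Base { int id() const override { return 2; } };

// Ids observed before, during and after the storage is reused.
struct Observed { int first, during, after; };

Observed reuse_storage() {
    // One buffer large and aligned enough for either dynamic type.
    alignas(A) alignas(B)
        unsigned char storage[sizeof(A) > sizeof(B) ? sizeof(A) : sizeof(B)];

    // Each pointer below comes from its own placement new. Virtual calls
    // through a given pointer always see the same vptr, but the three
    // pointers must not be treated as interchangeable.
    Base *p1 = new (storage) A();
    int first = p1->id();
    p1->~Base();                  // end the lifetime of the A object

    Base *p2 = new (storage) B(); // same address, different dynamic type
    int during = p2->id();
    p2->~Base();

    Base *p3 = new (storage) A(); // restore an A, as in the quoted example
    int after = p3->id();
    p3->~Base();

    return {first, during, after};
}
```

Calling reuse_storage() yields the ids 1, 2 and 1 in that order, so forwarding a vptr load from p1 to p2 merely because the addresses match would pick the wrong callee.<br>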
</body>
</html>