<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 13, 2015 at 1:51 PM, Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><span class="">
<br>
<div>On 04/13/2015 11:20 AM, Xinliang David
Li wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">Indirect call promotion + inlining does have very
large performance benefit. IIRC, it improves eon, h264ref,
povray, gap, ......etc. Many requires cross module inlining too.
For instance povray improves >20% (if my memory does not
betray me). Our internal programs all show large improvement
with it.</div>
</blockquote></span>
I am not questioning the benefit of profile guided devirtualization
and inlining. Getting these optimizations implemented is a clear
win if done right. <br>
<br>
I am seeking clarification on the approach being taken to implement
them and the design choices being proposed. That's a different
matter. <br></div></blockquote><div><br></div><div>Sure. However in order to demostrate the full benefit of the value profiling transformations, many others may need to be in place, for instance making inlining aware of profile data, which is currently missing.</div><div><br></div><div>For value profiling, I think it is more important to have metrics on the instrumentation side -- the low memory overhead, low runtime overhead will be obvious ones.</div><div><br></div><div>David</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div bgcolor="#FFFFFF" text="#000000"><div><div class="h5">
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>David</div>
<div><br>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mon, Apr 13, 2015 at 11:02 AM,
Philip Reames <span dir="ltr"><<a href="mailto:listmail@philipreames.com" target="_blank">listmail@philipreames.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div><br>
On 04/13/2015 10:57 AM, <a href="mailto:betulb@codeaurora.org" target="_blank">betulb@codeaurora.org</a>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On Apr 10, 2015, at 9:25 AM, <a href="mailto:betulb@codeaurora.org" target="_blank">betulb@codeaurora.org</a> wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
On 04/09/2015 11:06 AM, Betul Buyukkurt wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
In <a href="http://reviews.llvm.org/D8908#153838" target="_blank">http://reviews.llvm.org/D8908#153838</a>,
@reames wrote:<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Have the IR level construct patches made it
up for review? If so,<br>
can<br>
</blockquote>
So far I've posted two patches. These two
patches should apply cleanly<br>
to the tip, working with the present profile
infrastructure. The next<br>
set of patches will be the enabler ones: i.e.
three more patches one<br>
for<br>
each of clang, llvm and compiler-rt. Clang
patch will be up for review<br>
later today.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
you send me a link? I managed to miss them.<br>
</blockquote>
So far there is this patch and the instrinsic
instruction definitions:<br>
<a href="http://reviews.llvm.org/D8877" target="_blank">http://reviews.llvm.org/D8877</a>.
All patches are necessary for getting<br>
the<br>
IC targets and having them displayed by the
llvm-profdata.<br>
</blockquote>
Ok, I'm really not convinced that the
instrumentation code needs to be<br>
or should be an intrinsic. This seems like
something which should be<br>
emitted by the frontend and optimized like any
other code. To say this<br>
a different way, my instrumentation is going to
be entirely different<br>
than your instrumentation.<br>
<br>
Having said that, I really don't care about this
part of the proposed<br>
changes since they aren't going to impact me at
all. I'm am<br>
specifically not objecting to the changes, just
commenting. :)<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I'm assuming this will be some type of per
call site metadata?<br>
</blockquote>
We do assign metadata at the indirect call
sites. Format looks like as<br>
follows:<br>
<br>
!33 = metadata !{metadata
!"indirect_call_targets", i64<br>
<total_exec_count>, metadata
!"target_fn1†, i64<br>
<target_fn1_count>,<br>
metadata !"target_fn2†, i64
<target_fn2_count>, ….}<br>
<br>
Currently, we're recording only the top most
called five function<br>
names<br>
at each indirect call site. Following the
string literal<br>
“indirect_call_targets†are the
fields <total_exec_count><br>
i.e. a 64<br>
bit value for the total number of times the
indirect call is executed<br>
followed by the function names and execution
counts of each target.<br>
</blockquote>
This was the part I was trying to ask about. I
really want to see<br>
where<br>
you're going with this optimization wise. My
naive guess is that this<br>
is going to be slightly off for what you
actually want.<br>
<br>
Assuming you're going for profile guided
devirtualization (and thus<br>
inlining), being able to check the type of the
receiver (as opposed to<br>
the result of the virtual lookup) might be
advantageous. (Or, to say<br>
it<br>
differently, that's what I'm used to seeing.
Your approach might be<br>
completely reasonable, it's just not what I'm
used to seeing.) Have<br>
you<br>
thought about the tradeoffs here?<br>
</blockquote>
Not sure if I understood the problem here,
however, we're recording both<br>
the target address and the addresses/names of the
instrumented functions<br>
during the execution of the instrumented binary.
During profile reading<br>
these addresses are used to match the target
addresses to corresponding<br>
functions. During optimization, target function
name should be adequate<br>
to<br>
retrieve the function type information which then
can be used to compare<br>
the types of the arguments as well as the return
type w/ the call site<br>
to<br>
ensure the match.<br>
<br>
To clarify, our interest at this time is to
upstream all the<br>
infrastructure related changes. The optimizations
using the profile data<br>
will not be part of our initial set of patches. On
the other hand,<br>
agreeing on the metadata format at the call sites
would be advantageous<br>
in<br>
terms of supporting multiple profile methods.<br>
<br>
Thanks,<br>
-Betul<br>
</blockquote>
I have the same questions and concerns as Philip. If
the goal of this is<br>
to support devirtualization, this is not how I would
expect it to be done.<br>
If you really want something more general, can you
provide data to show<br>
that it is worth the cost? Recording all that
information when profiling<br>
will not be cheap, and it adds significant
complexity as well.<br>
</blockquote>
There was an earlier email of mine that went to the
llvm-dev mailing list<br>
discussing about the implementation dependencies and
gains/costs.<br>
<a href="http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084271.html" target="_blank">http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084271.html</a><br>
</blockquote>
</div>
</div>
On that thread, you claim an 8% performance boost on spec.
What optimizations did you implement to get that? Can you
outline the structure?
<div>
<div><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I also don’t think we should consider taking all
of your infrastructure<br>
changes without first seeing at least preliminary
versions of the<br>
optimization patches that demonstrate the benefits.<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Thanks.<br>
-Betul<br>
<br>
<br>
<a href="http://reviews.llvm.org/D8908" target="_blank">http://reviews.llvm.org/D8908</a>
<<a href="http://reviews.llvm.org/D8908" target="_blank">http://reviews.llvm.org/D8908</a>><br>
<br>
EMAIL PREFERENCES<br>
<a href="http://reviews.llvm.org/settings/panel/emailpreferences/" target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a><br>
<<a href="http://reviews.llvm.org/settings/panel/emailpreferences/" target="_blank">http://reviews.llvm.org/settings/panel/emailpreferences/</a>><br>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<br>
</blockquote>
<br>
<br>
</div>
</div>
<br>
_______________________________________________<br>
llvm-commits mailing list<br>
<a href="mailto:llvm-commits@cs.uiuc.edu" target="_blank">llvm-commits@cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits</a><br>
<br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br></div></div>