<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 03/24/2015 09:59 AM, Philip Reames
wrote:<br>
</div>
<blockquote cite="mid:5511980C.5080901@philipreames.com" type="cite">
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
<div class="moz-cite-prefix">On 03/10/2015 10:14 AM, Diego Novillo
wrote:<br>
</div>
<blockquote
cite="mid:CAD_=9DSzNSqpku0FjpeLyugaxshcxY9fLyyZ5EmD=i5TVDiqRA@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Thu, Mar 5, 2015 at 11:29 AM,
Bob Wilson <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:bob.wilson@apple.com" target="_blank"
class="cremed">bob.wilson@apple.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div>
<div class="h5"><br>
<div>
<blockquote type="cite">
<div>On Mar 2, 2015, at 4:19 PM, Diego Novillo
<<a moz-do-not-send="true"
href="mailto:dnovillo@google.com"
target="_blank" class="cremed">dnovillo@google.com</a>>
wrote:</div>
<br>
<div>
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Thu, Feb 26,
2015 at 6:54 PM, Diego Novillo <span
dir="ltr"><<a
moz-do-not-send="true"
href="mailto:dnovillo@google.com"
target="_blank" class="cremed">dnovillo@google.com</a>></span>
wrote:</div>
<div class="gmail_quote"><br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div dir="ltr">
<div>I've created a few bugzilla
issues with details of some of
the things I'll be looking into.
I'm not yet done wordsmithing
the overall design document.
I'll try to finish it by early
next week at the latest.</div>
</div>
</blockquote>
<div><br>
</div>
<div>The document is available at</div>
<div><br>
</div>
<div><a moz-do-not-send="true"
href="https://docs.google.com/document/d/15VNiD-TmHqqao_8P-ArIsWj1KdtU-ElLFaYPmZdrDMI/edit?usp=sharing"
target="_blank" class="cremed">https://docs.google.com/document/d/15VNiD-TmHqqao_8P-ArIsWj1KdtU-ElLFaYPmZdrDMI/edit?usp=sharing </a></div>
<div><br>
</div>
<div>There are several topics covered.
Ideally, I would prefer that we
discuss each topic separately. The
main ones I will start working on
are the ones described in the
bugzilla links we have in the doc.</div>
<div><br>
</div>
<div>This is just a starting point for
us. I am not at all concerned with
implementing exactly what is
proposed in the document. In fact,
if we can get the same value using
the existing support, all the
better.</div>
<div><br>
</div>
<div>OTOH, any other ideas that folks
may have that work better than this
are more than welcome. I don't have
really strong opinions on the
matter. I am fine with whatever
works.</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
</div>
</div>
</div>
<div>Thanks for the detailed write-up on this. Some of
the issues definitely need to be addressed. I am
concerned, though, that some of the ideas may be
leading toward a scenario where we have essentially
two completely different ways of representing
profile information in LLVM IR. It is great to have
two complementary approaches to collecting profile
data, but two representations in the IR would not
make sense.</div>
</div>
</blockquote>
<div><br>
</div>
<div>Yeah, I don't think I'll continue to push for a new
MD_count attribute. If we were to make MD_prof be a
"real" execution count, that would be enough. Note that
by re-using MD_prof we are not changing its meaning at
all. The execution count is still a weight and the ratio
is still branch probability. All that we are changing
are the absolute values of the number and increasing its
data type width to remove the 32bit limitation.</div>
</div>
</div>
</div>
</blockquote>
Independent of everything else, relaxing the 32 bit restriction is
clearly a good idea. This would make a great standalone patch. <br>
<blockquote
cite="mid:CAD_=9DSzNSqpku0FjpeLyugaxshcxY9fLyyZ5EmD=i5TVDiqRA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><span style="font-size:12.8000001907349px"><br>
</span></div>
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div><br>
</div>
<div>The first issue raised is that profile execution
counts are not represented in the IR. This was a
very intentional decision. I know it goes against
what other compilers have done in the past. It took
me a while to get used to the idea when Andy first
suggested it, so I know it seems awkward at first.
The advantage is that branch probabilities are much
easier to keep updated in the face of compiler
transformations, compared to execution counts.</div>
</div>
</blockquote>
<div><br>
</div>
<div>Sorry. I don't follow. Updating counts as the CFG is
transformed is not difficult at all. What examples do
you have in mind? The big advantage of making MD_prof
an actual <span style="font-size:12.8000001907349px">execution
count is that it is a meaningful metric wrt scaling
and transformation.</span></div>
<div>
<div><span style="font-size:12.8000001907349px"><br>
</span></div>
<div><span style="font-size:12.8000001907349px">Say, for
instance, that we have a branch instruction with two
targets with counts {100, 300} inside a function
'foo' that has entry count 2. The edge probability
for the first edge (count 100) is 100/(100+300) =
25%.</span></div>
<div><span style="font-size:12.8000001907349px"><br>
</span></div>
<div><span style="font-size:12.8000001907349px">If we
inline foo() inside another function bar() at a
callsite with profile count == 1, the cloned branch
instruction gets its counters scaled with the
callsite count. So the new branch has counts {100 *
1 / 2, 300 * 1 / 2} = {50, 150}. But the branch
probability did not change. Currently, we are
cloning the branch without changing the edge
weights.</span></div>
<div><span style="font-size:12.8000001907349px"><br>
</span></div>
<div><span style="font-size:12.8000001907349px">This
scaling is not difficult at all and can be
incrementally very quickly. We cannot afford to
recompute all frequencies on the fly because it
would be detrimental to compile time. If foo()
itself has N callees inlined into it, each inlined
callee needs to trigger a re-computation. When foo()
is inlined into bar(), the frequencies will need to
be recomputed for foo() and all N callees inlined
into foo().</span></div>
</div>
</div>
</div>
</div>
</blockquote>
It really sounds like your proposal is to essentially eagerly
compute scaling rather than lazyily compute it on demand.
Assuming perfect implementations for both (with no rounding
losses), the results should be the same. Is that a correct
restatement? I'm going to hold off on responding to why that's a
bad idea until you confirm, because I'm not sure I follow what
you're trying to say. :)<br>
<br>
Also, trusting exact entry counts is going to be somewhat
suspect. These are *highly* susceptible to racy updates,
overflow, etc... Anything which puts too much implicit trust in
these numbers is going to be problematic. <br>
<blockquote
cite="mid:CAD_=9DSzNSqpku0FjpeLyugaxshcxY9fLyyZ5EmD=i5TVDiqRA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>
<div> <br>
</div>
</div>
<div> <br style="font-size:12.8000001907349px">
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div> We are definitely missing the per-function
execution counts that are needed to be able to
compare relative “hotness” across functions, and I
think that would be a good place to start making
improvements. In the long term, we should keep our
options open to making major changes, but before we
go there, we should try to make incremental
improvements to fix the existing infrastructure.</div>
</div>
</blockquote>
<div><br>
</div>
<div>Right, and that's the core of our proposal. We don't
really want to make major infrastructure changes at this
point. The only thing I'd like to explore is making
MD_prof a real count. This will be useful for the
inliner changes and it would also make incremental
updates easier, because the scaling that needs to be
done is very straightforward and quick.</div>
<div><br>
</div>
<div>Note that this change should not modify the current
behaviour we get from profile analysis. Things that were
hot before should continue to be hot now.</div>
</div>
</div>
</div>
</blockquote>
I have no objection to adding a mechanism for expressing an entry
count. I am still very hesitant about the proposals with regards
to redefining the current MD_prof. <br>
<br>
I'd encourage you to post a patch for the entry count mechanism,
but not tie its semantics to exact execution count. (Something
like "the value provided must correctly describe the relative
hotness of this routine against others in the program annoatated
with the same metadata. It is the relative scaling that is
important, not the absolute value. In particular, the value need
not be an exact execution count.")<br>
<blockquote
cite="mid:CAD_=9DSzNSqpku0FjpeLyugaxshcxY9fLyyZ5EmD=i5TVDiqRA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="word-wrap:break-word">
<div><br>
</div>
<div>Many of the other issues you raise seem like they
could also be addressed without major changes to the
existing infrastructure. Let’s try to fix those
first.</div>
</div>
</blockquote>
</div>
<br>
</div>
<div class="gmail_extra">That's exactly the point of the
proposal. We definitely don't want to make major changes to
the infrastructure at first. My thinking is to start working
on making MD_prof a real count. One of the things that are
happening is that the combination of real profile data plus
the frequency propagation that we are currently doing is
misleading the analysis.</div>
</div>
</blockquote>
I consider this a major change. You're trying to redefine a major
part of the current system. <br>
<br>
Multiple people have spoken up and objected to this change (as
currently described). Please start somewhere else. <br>
</blockquote>
Sorry, "objected" is too strong a word here. I should have said
"have reservations about". <br>
<br>
My point here - which I'm not sure I expressed well - is that I'm
concerned we're going to bog down on a controversial change rather
than make progress on the parts we agree on. I'm not saying that we
should never redefine things in the way your proposing, but I would
like to see the parts that work under both schemes be addressed
first. We can continue this discussion in parallel. <br>
<blockquote cite="mid:5511980C.5080901@philipreames.com" type="cite">
<blockquote
cite="mid:CAD_=9DSzNSqpku0FjpeLyugaxshcxY9fLyyZ5EmD=i5TVDiqRA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">For example (thanks David for the
code and data). In the following code:</div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">
<div class="gmail_extra"><font face="monospace, monospace">int
g;</font></div>
<div class="gmail_extra"><font face="monospace, monospace">__attribute__((noinline))
void bar() {</font></div>
<div class="gmail_extra"><font face="monospace, monospace"> g++;</font></div>
<div class="gmail_extra"><font face="monospace, monospace">}</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="monospace, monospace">extern
int printf(const char*, ...);</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="monospace, monospace">int
main()</font></div>
<div class="gmail_extra"><font face="monospace, monospace">{</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
int i, j, k;</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
g = 0;</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
// Loop 1.</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
for (i = 0; i < 100; i++)</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
for (j = 0; j < 100; j++)</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
for (k = 0; k < 100; k++)</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
bar();</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
printf ("g = %d\n", g);</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
g = 0;</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
// Loop 2.</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
for (i = 0; i < 100; i++)</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
for (j = 0; j < 10000; j++)</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
bar();</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
printf ("g = %d\n", g);</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
g = 0;</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
// Loop 3.</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
for (i = 0; i < 1000000; i++)</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
bar();</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
printf ("g = %d\n", g);</font></div>
<div class="gmail_extra"><font face="monospace, monospace">
g = 0;</font></div>
<div class="gmail_extra"><font face="monospace, monospace">}</font></div>
<div class="gmail_extra"><font face="monospace, monospace"><br>
</font></div>
<div class="gmail_extra"><font face="arial, helvetica,
sans-serif">When compiled with profile instrumentation,
frequency propagation is distorting the real profile
because it gives different frequency to the calls to
bar() in the 3 different loops. All 3 loops execute
1,000,000 times, but after frequency propagation, the
first call to bar() gets a weight of 520,202 in loop #1,
210,944 in loop #2 and 4,096 in loop #3. In reality,
every call to bar() should have a weight of 1,000,000.</font></div>
</div>
</div>
</blockquote>
<font face="arial, helvetica, sans-serif">Duncan responded to
this. My conclusion from his response: this is a bug, not a
fundamental issue. Remove the max scaling factor, switch the
counts to 64 bits and everything should be fine. If you
disagree, let's discuss. </font><br>
<blockquote
cite="mid:CAD_=9DSzNSqpku0FjpeLyugaxshcxY9fLyyZ5EmD=i5TVDiqRA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_extra"><font face="arial, helvetica,
sans-serif"><br>
</font></div>
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">Thanks. Diego.</div>
</div>
</div>
<br>
<fieldset class="mimeAttachmentHeader"></fieldset>
<br>
<pre wrap="">_______________________________________________
LLVM Developers mailing list
<a moz-do-not-send="true" class="moz-txt-link-abbreviated" href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a> <a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://llvm.cs.uiuc.edu">http://llvm.cs.uiuc.edu</a>
<a moz-do-not-send="true" class="moz-txt-link-freetext" href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a>
</pre>
</blockquote>
<br>
</blockquote>
<br>
</body>
</html>