<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">On 02/24/2015 03:31 PM, Diego Novillo
wrote:<br>
</div>
<blockquote
cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div>We (Google) have started to look more closely at the
profiling infrastructure in LLVM. Internally, we have a large
dependency on PGO to get peak performance in generated code.</div>
<div><br>
</div>
<div>Some of the dependencies we have on profiling are still not
present in LLVM (e.g., the inliner) but we will still need to
incorporate changes to support our work on these
optimizations. Some of the changes may be addressed as
individual bug fixes on the existing profiling infrastructure.
Other changes may be better implemented as either new
extensions or as replacements of existing code.</div>
<div><br>
</div>
<div>I think we will try to minimize infrastructure replacement
at least in the short/medium term. After all, it doesn't make
too much sense to replace infrastructure that is broken for
code that doesn't exist yet.</div>
<div><br>
</div>
<div>David Li and I are preparing a document where we describe
the major issues that we'd like to address. The document is a
bit on the lengthy side, so it may be easier to start with an
email discussion. </div>
</div>
</blockquote>
I would personally be interested in seeing a copy of that document,
but it might be more appropriate for a blog post than a discussion
on llvm-dev. I worry that we'd end up with a very unfocused
discussion. It might be better to frame this as your plan of attack
and reserve discussion on llvm-dev for things that are being
proposed for the semi-near term. Just my 2 cents.<br>
<br>
<blockquote
cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>This is a summary of the main changes we are looking at:</div>
<div>
<ol>
<li>Need to faithfully represent the execution count taken
from dynamic profiles. Currently, <font face="monospace,
monospace">MD_prof</font> does not really represent an
execution count. This makes things like comparing hotness
across functions hard or impossible. We need a concept of
global hotness.<br>
</li>
</ol>
</div>
</div>
</blockquote>
What does MD_prof actually represent when used from Clang? I know
I've been using it for execution counters in my frontend. Am I
approaching that wrong?<br>
<br>
As a side comment: I'm a bit leery of a consistent notion of
hotness based on counters across functions. These counters are
almost always approximate in practice, and counting problems run
rampant. I'd almost rather see a consistent count inferred from
data that's assumed to be questionable than make the frontend try
to generate consistent profiling metadata. I think either approach
could be made to work; we just need to think about it carefully. <br>
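<br>
To make the concern concrete, here is a minimal Python sketch. It
assumes a toy model in which branch weights are only meaningful
relative to the other successors of the same branch. Every name in it
(edge_probabilities, infer_block_counts, the sample weights and entry
counts) is hypothetical and not an LLVM API. Two functions with
identical weights are indistinguishable until a per-function entry
count is supplied, at which point comparable counts can be inferred:<br>
<pre>
```python
def edge_probabilities(weights):
    """Normalize the weights on one branch into probabilities."""
    total = sum(weights.values())
    if total == 0:
        # No usable data: fall back to a uniform guess.
        return {succ: 1.0 / len(weights) for succ in weights}
    return {succ: w / total for succ, w in weights.items()}


def infer_block_counts(entry_count, weights):
    """Estimate successor counts from an entry count and relative weights."""
    probs = edge_probabilities(weights)
    return {succ: entry_count * p for succ, p in probs.items()}


# Hypothetical data: both functions carry the same relative weights.
weights_f = {"then": 3, "else": 1}
weights_g = {"then": 3, "else": 1}

# The weights alone cannot say which function is hotter.
same_shape = edge_probabilities(weights_f) == edge_probabilities(weights_g)

# With (assumed) entry counts, the inferred counts become comparable.
counts_f = infer_block_counts(1000, weights_f)
counts_g = infer_block_counts(8, weights_g)
```
</pre>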
<blockquote
cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<ol>
<li>When the CFG or callgraph changes, there needs to exist an
API for incrementally updating/scaling counts. For
instance, when a function is inlined or partially inlined,
when the CFG is modified, etc. These counts need to be
updated incrementally (or perhaps re-computed as a first
step in that direction).</li>
</ol>
</div>
</div>
</blockquote>
Agreed. Do you have a sense of how much of an issue this is in
practice? I haven't seen it kick in much, but it's also not something
I've been looking for. <br>
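<br>
For reference, the kind of update I understand this to mean can be
sketched as follows. This is a toy Python model, not an existing LLVM
API: the function name inline_call_site, the dictionary-of-counts
representation, and the sample numbers are all assumptions made for
illustration. It scales the callee's block counts by the call site's
share of the callee's entry count and leaves the rest on the
out-of-line copy:<br>
<pre>
```python
from fractions import Fraction


def inline_call_site(callee_counts, callee_entry, call_site_count):
    """Split a callee's block counts between an inlined copy and the
    out-of-line remainder, in proportion to the call site's share."""
    if callee_entry == 0:
        # Never-executed callee: nothing to distribute.
        zeros = {block: 0 for block in callee_counts}
        return zeros, dict(zeros)
    # Clamp so that inconsistent data cannot produce negative counts.
    share = min(Fraction(call_site_count, callee_entry), Fraction(1))
    inlined = {b: int(c * share) for b, c in callee_counts.items()}
    remainder = {b: c - inlined[b] for b, c in callee_counts.items()}
    return inlined, remainder


# Hypothetical callee entered 1000 times, 250 of them from this call site.
callee = {"entry": 1000, "loop": 4000, "exit": 1000}
inlined, remainder = inline_call_site(callee, 1000, 250)
```
</pre>
Exact rational arithmetic keeps the inlined and remaining counts
summing to the original, which is one property an incremental update
would presumably want to preserve.<br>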
<blockquote
cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<ol>
<li>The inliner (and other optimizations) needs to use
profile information and update it accordingly. This is
predicated on Chandler's work on the pass manager, of
course.<br>
</li>
</ol>
</div>
</div>
</blockquote>
It's worth noting that the inliner work can be done independently of
the pass manager work. We can always explicitly recompute the relevant
analyses in the inliner if needed. This will cost compile time, so
we might need to make this an off-by-default option. (Maybe -O3
only?) Being able to work on the inliner independently of the pass
management structure is valuable enough that we should probably
consider doing this.<br>
<br>
PGO inlining is an area I'm very interested in. I'd really
encourage you to work incrementally in tree. I'm likely to start
putting non-trivial amounts of time into this topic in the next few
weeks. I just need to clear a few things off my plate first. <br>
<br>
Other than the inliner, can you list the passes you think are
profitable to teach about profiling data? My list so far is: PRE
(particularly of loads!), the vectorizer (i.e. duplicate work down
both a hot and cold path when it can be vectorized on the hot path),
LoopUnswitch, IRCE, and LoopUnroll (avoiding code size explosion
in cold code). I'm much more interested in sources of improved
performance than I am in simple code size reduction. (Reducing code
size can improve performance, of course.)<br>
<blockquote
cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<ol>
<li>Need to represent global profile summary data. For
example, for global hotness determination, it is useful to
compute additional global summary info, such as a
histogram of counts that can be used to determine hotness
and working set size estimates for a large percentage of
the profiled execution.</li>
</ol>
</div>
</div>
</blockquote>
Er, not clear what you're trying to say here?<br>
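<br>
If the intent is a program-wide cutoff derived from the distribution
of counts, I could imagine something like the following. This is only
my guess at a reading, sketched in Python. The function name
hotness_cutoff, the percent parameter, and the sample counts are all
hypothetical:<br>
<pre>
```python
def hotness_cutoff(counts, coverage_percent):
    """Return (cutoff, working_set): the smallest count among the hottest
    blocks that together cover `coverage_percent` of all execution, and
    how many blocks that takes."""
    total = sum(counts)
    running = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        running += count
        # Integer arithmetic avoids floating-point rounding at the boundary.
        if total > 0 and running * 100 >= coverage_percent * total:
            return count, rank
    return 0, 0


# Hypothetical per-block execution counts gathered across the whole program.
block_counts = [900, 50, 30, 10, 5, 3, 2]
cutoff_99, working_set_99 = hotness_cutoff(block_counts, 99)
cutoff_90, working_set_90 = hotness_cutoff(block_counts, 90)
```
</pre>
Under that reading, any block at or above the cutoff would be treated
as hot, and the number of such blocks would serve as a rough working
set size.<br>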
<blockquote
cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>
<div>There are other changes that we will need to incorporate.
David, Teresa, Chandler, please add anything large that I
missed.</div>
<div><br>
</div>
<div>My main question at the moment is what would be the best
way of addressing them. Some seem to require new concepts to
be implemented (e.g., execution counts). Others could be
addressed as simple bugs to be fixed in the current
framework.</div>
</div>
<div><br>
</div>
<div>Would it make sense to present everything in a unified
document and discuss that? I've got some reservations about
that approach because we will end up discussing everything at
once and it may not lead to concrete progress. Another
approach would be to present each issue individually either as
patches or RFCs or bugs.</div>
</div>
</blockquote>
See my comment above about the document versus discussion on llvm-dev. <br>
<blockquote
cite="mid:CAD_=9DRzhohJkCy9VMZsSfGrGcZokFvfbLtEmbXDCvLE+FkKnQ@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><br>
</div>
<div>I will be taking on the implementation of several of these
issues. Some of them involve the SamplePGO harness that I
added last year. I would also like to know what other bugs or
problems people have in mind that I could also roll into this
work.</div>
<div><br>
</div>
<div><br>
</div>
<div>Thanks. Diego.<br>
</div>
</div>
</blockquote>
<br>
</body>
</html>