[LLVMdev] RFC - Making SamplePGO a module pass

Diego Novillo dnovillo at google.com
Fri Jul 31 13:15:17 PDT 2015


Dehao and I have been discussing changes we need to make to SamplePGO to
make it more effective.

Currently, SamplePGO is a scalar pass that limits itself to adding branch
weight annotations.  It runs pretty early in the pipeline, so this is fine
for other scalar passes that want to use profile data (block layout and
regalloc).

However, it does nothing to help module passes. Notably, the inliner. What
Dehao has found in his experience with GCC is that in order to help the
inliner, SamplePGO needs to become a module pass.

Mainly, it needs to be able to affect inlining decisions.  If a branch into
a call site has many samples, we want to tell the inliner about it so it
increases the inlining score for that call site.
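
To make the idea concrete, here is a rough sketch of how sample counts could
feed into an inlining score.  It is only an illustration under assumed names:
adjustedInlineThreshold, kMaxHotBonus and the linear scaling are all
hypothetical and are not existing LLVM APIs or the actual heuristic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical cap on the bonus a hot call site can receive.
constexpr int kMaxHotBonus = 100;

// Returns an inline threshold raised in proportion to how hot the call
// site is relative to the hottest sample count seen in the profile.
// All names and the scaling rule are assumptions for illustration.
int adjustedInlineThreshold(int BaseThreshold, uint64_t CallSiteSamples,
                            uint64_t MaxSamples) {
  if (MaxSamples == 0)
    return BaseThreshold; // No profile data: leave the threshold alone.
  assert(CallSiteSamples <= MaxSamples && "call site hotter than maximum");
  uint64_t Bonus = kMaxHotBonus * CallSiteSamples / MaxSamples;
  return BaseThreshold + static_cast<int>(Bonus);
}
```

With a base threshold of 200, a call site holding half of the maximum sample
count would get a threshold of 250 under this scheme, and a call site with no
samples would keep 200.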

Additionally, SamplePGO may need to actually perform some inlining before
the inliner runs.  This is needed to better match the samples obtained from
optimized binaries.  For example, suppose the binary had 3 functions A(),
B() and C() all calling function foo().  When the code is executed, assume
that A() has many samples (i.e., it's hot) while B() and C() have no
samples.

Also assume that foo() was originally inlined in A(), B() and C().  When
SamplePGO is analyzing function A(), it will find samples for the inlined
copy of foo().

At that point, SamplePGO may want to perform the inline of foo() into A()'s
call site so that it can better match the samples it gets from the
profile.  At the same time, since B() and C() had few or no samples, it
wants to mark their call sites cold so the inliner doesn't bother with
them.
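
The A()/B()/C() scenario can be sketched as a per-call-site decision.  This
is a toy model, not the pass itself: the CallSite struct, the CallSiteAction
enum, classify() and both thresholds are hypothetical names and values chosen
only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// What the profile-driven pass might decide for one call site.
enum class CallSiteAction { InlineEarly, MarkCold, LeaveToInliner };

// Hypothetical record: samples the profile attributes to the copy of
// Callee that was inlined into Caller in the profiled binary.
struct CallSite {
  std::string Caller;
  std::string Callee;
  uint64_t InlinedCalleeSamples;
};

// Hot call sites are inlined up front so the IR matches the profile;
// call sites with few or no samples are marked cold; everything in
// between is left to the regular inliner.
CallSiteAction classify(const CallSite &CS, uint64_t HotThreshold,
                        uint64_t ColdThreshold) {
  assert(ColdThreshold < HotThreshold && "thresholds must not overlap");
  if (CS.InlinedCalleeSamples >= HotThreshold)
    return CallSiteAction::InlineEarly;
  if (CS.InlinedCalleeSamples <= ColdThreshold)
    return CallSiteAction::MarkCold;
  return CallSiteAction::LeaveToInliner;
}
```

With assumed thresholds of 1000 (hot) and 10 (cold), a call to foo() in A()
carrying 5000 samples would be inlined early, while the calls in B() and C()
with zero samples would be marked cold.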

Chandler, is this something we can realistically do?  I believe the first
step would be to make SamplePGO a module pass, make sure it runs before the
inliner and then we can see how we can implement the above behaviour, or
some variant of it that provides the same benefit (e.g., cloning).

Something similar will be needed for devirtualization and indirect calls.
Sampling exposes actual devirtualization and indirect call opportunities.


Thanks.  Diego.