[LLVMdev] RFC - Improvements to PGO profile support

Diego Novillo dnovillo at google.com
Mon Mar 16 07:20:40 PDT 2015


On Thu, Mar 12, 2015 at 5:42 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

> There are two things going on here.
>
> Firstly, the loop scales are being capped at 4096.  I propagated this
> approximation from the previous version of BFI.  If it's causing a
> problem (which it looks like it is), we should drop it and fix what
> breaks.  You can play around with this by commenting out the `if`
> statement at the end of `computeLoopScale()` in
> BlockFrequencyInfoImpl.cpp.
>
> For example, without that logic this testcase gives:
>
>     Printing analysis 'Block Frequency Analysis' for function 'main':
>     block-frequency-info: main
>      - entry: float = 1.0, int = 8
>      - for.cond: float = 51.5, int = 411
>      - for.body: float = 50.5, int = 403
>      - for.cond1: float = 5051.0, int = 40407
>      - for.body3: float = 5000.5, int = 40003
>      - for.cond4: float = 505001.0, int = 4040007
>      - for.body6: float = 500000.5, int = 4000003
>      - for.inc: float = 500000.5, int = 4000003
>      - for.end: float = 5000.5, int = 40003
>      - for.inc7: float = 5000.5, int = 40003
>      - for.end9: float = 50.5, int = 403
>      - for.inc10: float = 50.5, int = 403
>      - for.end12: float = 1.0, int = 8
>      - for.cond13: float = 51.5, int = 411
>      - for.body15: float = 50.5, int = 403
>      - for.cond16: float = 500051.0, int = 4000407
>      - for.body18: float = 500000.5, int = 4000003
>      - for.inc19: float = 500000.5, int = 4000003
>      - for.end21: float = 50.5, int = 403
>      - for.inc22: float = 50.5, int = 403
>      - for.end24: float = 1.0, int = 8
>      - for.cond26: float = 500001.5, int = 4000011
>      - for.body28: float = 500000.5, int = 4000003
>      - for.inc29: float = 500000.5, int = 4000003
>      - for.end31: float = 1.0, int = 8
>
> (Now we get 500000.5 for all the inner loop bodies.)
>
> Secondly, instrumentation-based profiling intentionally fuzzes the
> profile data in the frontend using Laplace's Rule of Succession (look at
> `scaleBranchWeight()` in CodeGenPGO.cpp).
>
> For example, "loop 1" (which isn't affected by the 4096 cap) should give
> a loop scale of 500000.5, not 1000000.  (The profile data says
> 1000000/10000 for the inner loop, 10000/100 for the middle, and 100/1
> for the outer loop.  Laplace says that we should fuzz these branch
> weights to 1000001/10001, 10001/101, and 101/2, which works out to
> 1000001/2 == 500000.5 total.)
>
>
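The arithmetic in the quoted "loop 1" example can be checked with a small sketch. It only models the fuzzing as "add one to each count", which is what the quoted numbers show (1000000/10000 becoming 1000001/10001); it is not the code in `scaleBranchWeight()`, and the names `LoopCounts`, `fuzzedScale`, `nestScale` and `loop1` are hypothetical helpers made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Raw profile counts for one loop level: how often the body ran versus how
// often the loop was entered (e.g. 1000000 / 10000 for the inner loop).
struct LoopCounts {
  uint64_t BodyCount;
  uint64_t EntryCount;
};

// Assumption: the fuzzing is modeled as "add one to each count", matching
// the quoted numbers (1000000/10000 -> 1000001/10001). Not LLVM's own code.
double fuzzedScale(const LoopCounts &L) {
  assert(L.EntryCount + 1 != 0 && "entry count would wrap to zero");
  return static_cast<double>(L.BodyCount + 1) /
         static_cast<double>(L.EntryCount + 1);
}

// Scale of a whole nest: the product of the per-level fuzzed scales.
double nestScale(const std::vector<LoopCounts> &Nest) {
  double Scale = 1.0;
  for (const LoopCounts &L : Nest)
    Scale *= fuzzedScale(L);
  return Scale;
}

// The three levels of "loop 1" from the quoted text: inner, middle, outer.
std::vector<LoopCounts> loop1() {
  return {{1000000, 10000}, {10000, 100}, {100, 1}};
}
```

Under that assumption, `nestScale(loop1())` comes out at roughly 500000.5 rather than 1000000, and each per-level scale (about 99.99, 99.02 and 50.5) stays well below 4096, which is consistent with the remark that loop 1 is not affected by the cap.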
Thanks, Duncan. I've started working on this as a first step. Long term,
the capping done during propagation is not much of a concern because we
would probably not be using propagation if real profile information is
available (at least that's the initial thinking).

Thanks for all the feedback, folks. I'll start sending out patches soon
that address specific issues. We can continue the discussion there.


Diego.