[LLVMdev] RFC - Improvements to PGO profile support

Xinliang David Li xinliangli at gmail.com
Tue Mar 24 13:24:20 PDT 2015

On Tue, Mar 24, 2015 at 1:07 PM, Chandler Carruth <chandlerc at google.com>
wrote:

> On Tue, Mar 24, 2015 at 1:03 PM, Xinliang David Li <xinliangli at gmail.com>
> wrote:
>> The design is basically to augment the existing frequency data with one
>> 64-bit value: the global hotness of the function entry (e.g. it could be
>> the entry execution count). With the execution count available, the BB
>> count (or global hotness, if you will) is simply:
>>   count(BB) = freq(BB) * count(ENTRY) / freq(ENTRY)
>> You can view count(ENTRY) as an extension to the current 'hot'/'cold'
>> attribute.
> Yes, this is what I'm saying the current design was aiming for. Is this
> what you're now suggesting, or are you suggesting something different?

I am not sure which current design you are referring to.  What I am saying
is that we need to represent 'count(ENTRY)' for IPA purposes, and that is
currently missing.
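
To make the formula quoted above concrete, here is a minimal sketch. It
assumes block frequencies are plain positive integers; the helper name
block_count is hypothetical and is not an LLVM API.

```python
def block_count(freq_bb, freq_entry, count_entry):
    """Return count(BB) = freq(BB) * count(ENTRY) / freq(ENTRY).

    All inputs are assumed to be non-negative integers; the result is
    rounded to the nearest integer (an illustrative choice).
    """
    if freq_entry <= 0:
        raise ValueError("freq(ENTRY) must be positive")
    return (freq_bb * count_entry + freq_entry // 2) // freq_entry


# Hypothetical numbers: the entry block has frequency 8 and was
# executed 1000 times according to the profile.
hot_loop_body = block_count(80, 8, 1000)   # block 10x hotter than entry
cold_block = block_count(1, 8, 1000)       # block 8x colder than entry
```

With only freq(BB) available, the two blocks can be ranked within this
function but not against blocks of another function; count(ENTRY) is what
makes the result comparable across functions.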

>> Note that for IPA, callsite count is obtained from enclosing BB's count.
> Yes, there should clearly be a relationship here.

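Because a 'count' is by definition an absolute, program-wide value, it can
be compared across functions and so used for IPA. This is the key
difference from 'Freq', which is only meaningful relative to other blocks
in the same function.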

> One way I would like to see this tested is to look at the relative error
> at each level -- if we compute the global "count" as you describe for a
> callsite's BB, and aggregate it with all other callsite BBs, we should get
> something close to function's global value. If we get wildly different
> results, there is probably some flaw in the algorithm used.

We don't need to 'compute' the global count -- with PGO, it is already
there. We just need to keep it instead of dropping it.
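
For reference, the consistency check suggested in the quoted text could be
sketched roughly as below. The function name and the numbers are
hypothetical, and the sketch assumes every caller of the function is
visible; indirect or external callers would show up as error.

```python
def callsite_relative_error(callee_entry_count, callsite_counts):
    """Compare the sum of callsite BB counts against the callee's entry count.

    Returns |sum(callsites) - entry| / entry, or infinity if the callee
    was never entered but some callsite claims to have executed.
    """
    total = sum(callsite_counts)
    if callee_entry_count == 0:
        return 0.0 if total == 0 else float("inf")
    return abs(total - callee_entry_count) / callee_entry_count


# Hypothetical callee entered 1000 times, called from two callsites whose
# enclosing BB counts were derived as 600 and 390.
error = callsite_relative_error(1000, [600, 390])
```

A small relative error would suggest the derived BB counts are consistent
with the recorded entry count; a large one would point at lost precision
somewhere in the flow.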

With PGO, the current block frequency computation flow is roughly like this:

1) Read the raw profile counts from the profile data file.
2) Compute the true/false branch target weights from the raw profile data
and create MD_prof. Note that with capping and other adjustments, the data
associated with MD_prof can only be used as branch probabilities.
3) Invoke the BranchProbabilityInfo analysis pass, which converts MD_prof
into CFG-level branch probability data (weights) with more capping.
4) Invoke the BlockFrequencyInfo analysis pass -- this pass relies solely on
branch probabilities and assumes Freq(ENTRY) = 1 (as a fractional frequency).
5) (Roughly) scale up the Freq(BB) data from 4) so that the integer
frequency of the BB with the lowest fractional frequency is 1.

With the proposed change, we just need to add one more step:
2.1) Record the count(ENTRY) for the function as metadata.
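
The flow above, including the extra step, can be sketched on a toy example.
This is only an illustration under stated assumptions: the CFG is a
hypothetical loop-free diamond, capping is omitted, and none of the
function names correspond to real LLVM passes or APIs.

```python
from fractions import Fraction

# Hypothetical diamond CFG: entry -> {then, else} -> exit.
EDGES = {"entry": ["then", "else"], "then": ["exit"],
         "else": ["exit"], "exit": []}
TOPO_ORDER = ["entry", "then", "else", "exit"]

# Step 1: raw edge counts as read from a (made-up) profile.
RAW = {("entry", "then"): 900, ("entry", "else"): 100,
       ("then", "exit"): 900, ("else", "exit"): 100}

# Step 2.1 (the proposed addition): keep the entry count around.
ENTRY_COUNT = 1000


def branch_probabilities(edges, raw):
    """Steps 2-3: turn raw counts into per-edge probabilities (no capping)."""
    probs = {}
    for src, succs in edges.items():
        total = sum(raw[(src, dst)] for dst in succs)
        for dst in succs:
            probs[(src, dst)] = (Fraction(raw[(src, dst)], total) if total
                                 else Fraction(1, len(succs)))
    return probs


def block_frequencies(edges, probs, order):
    """Step 4: propagate fractional frequency with Freq(ENTRY) = 1."""
    freq = {bb: Fraction(0) for bb in edges}
    freq[order[0]] = Fraction(1)
    for bb in order:
        for dst in edges[bb]:
            freq[dst] += freq[bb] * probs[(bb, dst)]
    return freq


def scale_to_integers(freq):
    """Step 5: scale so the lowest non-zero frequency becomes 1."""
    lowest = min(f for f in freq.values() if f > 0)
    return {bb: round(f / lowest) for bb, f in freq.items()}


def recover_counts(int_freq, entry, entry_count):
    """count(BB) = freq(BB) * count(ENTRY) / freq(ENTRY)."""
    return {bb: f * entry_count // int_freq[entry]
            for bb, f in int_freq.items()}


probs = branch_probabilities(EDGES, RAW)
int_freq = scale_to_integers(block_frequencies(EDGES, probs, TOPO_ORDER))
counts = recover_counts(int_freq, "entry", ENTRY_COUNT)
```

Without ENTRY_COUNT, the integer frequencies alone only say that 'then' is
nine times hotter than 'else'; they say nothing about how hot this function
is compared with any other.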
