[LLVMdev] RFC - Improvements to PGO profile support

Xinliang David Li xinliangli at gmail.com
Tue Mar 24 13:24:20 PDT 2015


On Tue, Mar 24, 2015 at 1:07 PM, Chandler Carruth <chandlerc at google.com>
wrote:

>
> On Tue, Mar 24, 2015 at 1:03 PM, Xinliang David Li <xinliangli at gmail.com>
> wrote:
>
>> The design is basically to augment the existing frequency data with one
>> 64-bit value, which is the global hotness of the function entry (e.g. it
>> can be the entry execution count). With the execution count available,
>> the BB count (or global hotness, if you will) is simply:
>>
>>   count(BB)  = freq (BB) * count(ENTRY)/freq(ENTRY)
>>
>> You can view count(ENTRY) as an extension to the current 'hot'/'cold'
>> attribute
>>
>
> Yes, this is what I'm saying the current design was aiming for. Is this
> what you're now suggesting, or are you suggesting something different?
>
>

I am not sure which current design you are referring to.  What I am saying
is that we need to represent 'count(ENTRY)' for IPA purposes, and that is
currently missing.
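
To make the formula quoted above concrete, here is a minimal sketch of the
scaling. blockCount is a hypothetical helper written for this mail, not an
existing LLVM API, and it assumes the frequencies are the integer values that
BlockFrequencyInfo reports for the same function. It goes through long double
to avoid overflowing a 64-bit product, which can lose precision for very
large counts.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API):
//   count(BB) = freq(BB) * count(ENTRY) / freq(ENTRY)
// freqBB and freqEntry are function-local block frequencies; countEntry is
// the profiled execution count of the function entry.
uint64_t blockCount(uint64_t freqBB, uint64_t freqEntry, uint64_t countEntry) {
  assert(freqEntry != 0 && "entry frequency must be nonzero");
  // Multiply in floating point so freqBB * countEntry cannot wrap around.
  long double scaled =
      static_cast<long double>(freqBB) * countEntry / freqEntry;
  // Round to the nearest integer count.
  return static_cast<uint64_t>(scaled + 0.5L);
}
```

For example, a block with half the entry frequency in a function entered
1000 times gets a count of 500, and that number is comparable across
functions, which is what IPA needs.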


>
>> Note that for IPA, callsite count is obtained from enclosing BB's count.
>>
>
> Yes, there should clearly be a relationship here.
>

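By the definition of 'count', it can be used for IPA: it is a global
execution count, comparable across functions. This is the key difference
from 'Freq', which is only meaningful relative to other blocks in the same
function.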


> One way I would like to see this tested is to look at the relative error
> at each level -- if we compute the global "count" as you describe for a
> callsite's BB, and aggregate it with all other callsite BBs, we should get
> something close to function's global value. If we get wildly different
> results, there is probably some flaw in the algorithm used.
>

We don't need to 'compute' the global count -- with PGO, it is already there.
We just need to keep it instead of dropping it.
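
That said, the consistency check suggested in the quoted text is easy to
state. A rough sketch, assuming the per-callsite counts have already been
derived from their enclosing BBs: sum the callsite counts for one callee and
compare the sum against the callee's entry count. relativeError is a
hypothetical name used only here, and the check ignores calls that are not
visible in the module (indirect or external callers).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical check (not an LLVM API): how far the sum of the callsite
// counts for one callee is from that callee's recorded entry count.
// Returns |sum - entry| / entry.
double relativeError(const std::vector<uint64_t> &callsiteCounts,
                     uint64_t calleeEntryCount) {
  assert(calleeEntryCount != 0 && "callee was never entered");
  long double sum = 0;
  for (uint64_t count : callsiteCounts)
    sum += count;
  long double diff = std::fabs(sum - calleeEntryCount);
  return static_cast<double>(diff / calleeEntryCount);
}
```

A result near 0 means the callsite counts account for the callee's entries; a
large value would point at either missing callers or a scaling problem.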

With PGO, the current block frequency computation flow is roughly like this:

1) Read the raw profile counts from the profile data file.
2) Compute the true/false branch target weights from the raw profile data
and create MD_prof. Note that with capping and other adjustments, the data
associated with MD_prof can only be used as branch probability.
3) Invoke the BranchProbabilityInfo analysis pass. This converts MD_prof into
CFG-level branch probability data (weights), with more capping.
4) Invoke the BlockFrequencyInfo analysis pass -- this pass relies solely on
branch probability and assumes Freq(ENTRY) = 1 (for the fractional
representation).
5) (Roughly) scale up the Freq(BB) data from 4) so that the BB with the
lowest fractional frequency gets an integer frequency of 1.


With the proposed change, we just need to add one more step:
2.1) Record the count(ENTRY) for the function as metadata.
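
A toy illustration of why step 2.1 matters, for a single conditional branch
in the entry block. This is a sketch under simplifying assumptions, not the
actual pass code: it skips the capping in steps 2 and 3 and the integer
scaling in step 5, and the names Branch, takenProbability, takenBlockFreq and
takenBlockCount are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Raw profile counts for the two outgoing edges of one branch (step 1).
struct Branch {
  uint64_t taken;
  uint64_t notTaken;
};

// Steps 2-3 (simplified): only the ratio of the two counts survives.
double takenProbability(Branch b) {
  uint64_t total = b.taken + b.notTaken;
  assert(total != 0 && "branch was never executed");
  return static_cast<double>(b.taken) / static_cast<double>(total);
}

// Step 4: with Freq(ENTRY) = 1, the taken successor's frequency is just
// the branch probability.
double takenBlockFreq(Branch b) { return 1.0 * takenProbability(b); }

// With the entry count from step 2.1, the global count can be recovered:
// count(BB) = freq(BB) * count(ENTRY) / freq(ENTRY), with freq(ENTRY) = 1.
double takenBlockCount(Branch b, uint64_t entryCount) {
  return takenBlockFreq(b) * static_cast<double>(entryCount);
}
```

A branch profiled as 900000/100000 and one profiled as 9/1 both end up with
a frequency of 0.9 for the taken block, so after step 4 a hot function and a
cold one look identical. Only the recorded count(ENTRY) (1000000 vs. 10
here) tells them apart.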

David