[LLVMdev] RFC - Improvements to PGO profile support
Xinliang David Li
xinliangli at gmail.com
Tue Mar 24 13:24:20 PDT 2015
On Tue, Mar 24, 2015 at 1:07 PM, Chandler Carruth <chandlerc at google.com>
> On Tue, Mar 24, 2015 at 1:03 PM, Xinliang David Li <xinliangli at gmail.com>
>> The design is basically to augment the existing frequency data with one
>> 64-bit value, the global hotness of the function entry (e.g. it can be
>> the entry execution count). With the execution count available, the BB
>> count (or global hotness if you will) is simply:
>> count(BB) = freq(BB) * count(ENTRY) / freq(ENTRY)
>> You can view count(ENTRY) as an extension to the current 'hot'/'cold'
> Yes, this is what I'm saying the current design was aiming for. Is this
> what you're now suggesting, or are you suggesting something different?
I am not sure which current design you are referring to. What I am saying
is that we need to represent 'count(ENTRY)' for IPA purposes, which is
>> Note that for IPA, callsite count is obtained from enclosing BB's count.
> Yes, there should clearly be a relationship here.
By the definition of 'count', it can be used for IPA. This is the key
difference from 'Freq'.
> One way I would like to see this tested is to look at the relative error
> at each level -- if we compute the global "count" as you describe for a
> callsite's BB, and aggregate it with all other callsite BBs, we should get
> something close to the function's global value. If we get wildly different
> results, there is probably some flaw in the algorithm used.
We don't need to 'compute' global count -- with PGO, it is already there.
We just need to keep it without dropping it.
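
To make the check discussed above concrete, here is a minimal sketch in Python (not LLVM code). It estimates each callsite count from the enclosing BB's frequency, sums the estimates over all callsites of one callee, and compares the sum with the callee's recorded entry count. The function names and all numbers are made up for illustration.

```python
def callsite_count(bb_freq, entry_freq, entry_count):
    """count(BB) = freq(BB) * count(ENTRY) / freq(ENTRY) for the callsite's BB."""
    return bb_freq * entry_count / entry_freq


def relative_error(estimated, recorded):
    """Relative difference between the aggregated estimate and the recorded count."""
    return abs(estimated - recorded) / recorded


# Hypothetical callsites of one callee, one tuple per caller:
# (freq of the callsite's BB, freq(ENTRY) of the caller, count(ENTRY) of the caller)
callsites = [(40, 8, 1000), (3, 12, 400), (16, 16, 250)]

# Sum of the per-callsite estimates: 5000 + 100 + 250
estimated = sum(callsite_count(*c) for c in callsites)

# Hypothetical entry count recorded for the callee in the profile.
recorded_entry_count = 5400

err = relative_error(estimated, recorded_entry_count)
print(f"estimated={estimated:.0f} recorded={recorded_entry_count} error={err:.2%}")
```

A small error would suggest the frequency data and the entry counts agree; a large one would point at precision lost to capping or scaling somewhere in the flow.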
With PGO, the current block frequency computation flow is roughly like this:
1) Read the raw profile count from the profile data file.
2) Compute the true/false branch target weight from the raw profile data,
and create MD_prof. Note that with capping and other things, the data
associated with MD_prof can only be used as branch probability.
3) Invoke the BranchProbabilityInfo analysis pass, converting MD_prof into CFG
level branch probability data (weight) with more capping.
4) Invoke the BlockFrequencyInfo analysis pass -- this pass relies solely on
branch probability and assumes Freq(ENTRY) = 1 (for fraction
5) (roughly) Scale up the Freq(BB) data in 4) by making the integer freq of
the BB with the lowest fractional frequency equal to 1.
With the proposed change, we just need to add one more step:
2.1) Record the count(ENTRY) for the function as metadata.
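
The scaling in step 5 and the count formula from earlier in the thread can be sketched as follows. This is a small Python illustration, not the LLVM implementation; the block names, frequencies, and entry count are hypothetical.

```python
from fractions import Fraction


def scale_to_integers(frac_freq):
    """Step 5 (roughly): scale so the block with the lowest fractional
    frequency gets integer frequency 1."""
    lowest = min(frac_freq.values())
    return {bb: round(f / lowest) for bb, f in frac_freq.items()}


def block_counts(freq, entry, entry_count):
    """count(BB) = freq(BB) * count(ENTRY) / freq(ENTRY)."""
    scale = Fraction(entry_count) / Fraction(freq[entry])
    return {bb: round(Fraction(f) * scale) for bb, f in freq.items()}


# Hypothetical fractional frequencies from step 4, with Freq(ENTRY) = 1.
frac_freq = {
    "entry": Fraction(1),
    "loop": Fraction(10),
    "cold": Fraction(1, 10),
    "exit": Fraction(1),
}

int_freq = scale_to_integers(frac_freq)

# Hypothetical count(ENTRY) kept from the profile (the proposed step 2.1).
counts = block_counts(int_freq, "entry", 5000)

for bb in int_freq:
    print(bb, int_freq[bb], counts[bb])
```

The integer frequencies alone only rank blocks within one function. Once count(ENTRY) is kept, the same frequencies yield counts that can be compared across functions.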