[LLVMdev] RFC - Improvements to PGO profile support

Ivan Baev ibaev at codeaurora.org
Wed Mar 25 17:54:18 PDT 2015


Hi David,
The plan makes sense. A question on:

>> 4) implement frequency/profile update interfaces for Inline
>> transformations -- i.e., allowing importing callee's profile info into
>> caller with scaling factor computed from callsite count and callee
>> entry count

Do you envision the following implementation for it? Say we've decided to
inline callee B at callsite C in caller A. In addition to the current
inlining update work, we need to:

1. get block frequency BF of the basic block containing the callsite C by
invoking BlockFrequency on function A;
2. scale BF with the function entry count of A, to get scaledBF
3. decrease function entry count for B with scaledBF
4. compute and scale block frequencies of the basic blocks of B inlined
into A, or re-invoke BlockFrequency on the modified function A.

Thanks,
Ivan

>
> Date: Tue, 24 Mar 2015 13:57:30 -0700
> From: Philip Reames <listmail at philipreames.com>
> To: Xinliang David Li <davidxl at google.com>
> Cc: LLVM Developers Mailing List <llvmdev at cs.uiuc.edu>
> Subject: Re: [LLVMdev] RFC - Improvements to PGO profile support
> Message-ID: <5511CFBA.6060308 at philipreames.com>
> Content-Type: text/plain; charset="utf-8"; format=flowed
>
>
> On 03/24/2015 10:27 AM, Xinliang David Li wrote:
>> Diego and I have discussed this according to the feedback received. We
>> have revised plan for this (see Diego's last reply).  Here is a more
>> detailed re-cap:
>>
>> 1) keep MD_prof definition as it is today; also keep using the
>> frequency propagation as it is (assuming programs with irreducible
>> loops are not common and not important. If it turns out to be
>> otherwise, we will revisit this).
>> 2) fix all problems that lead to wrong 'frequency/count' computed from
>> the frequency propagation algorithm
>>     2.1) relax 32bit limit
>>     2.2) remove 4096 iteration count cap
>>     2.3) remove the 'laplace rule of succession' which can be very
>> harmful
>>     2.4) Fix the bug in  'BranchProbabilityInfo::calcMetadataWeights'
>> that does capping without scaling
>>     2.5) Fix a similar bug in
>> 'MachineBranchProbabilityInfo::getSumForBlock('
>>      .. other bugs found
>> 3) introduce a new meta data to represent function entry global
>> hotness (the implementation can choose to use entry execution count)
>> 4) implement frequency/profile update interfaces for Inline
>> transformations -- i.e., allowing importing callee's profile info into
>> caller with scaling factor computed from callsite count and callee
>> entry count
>> 5) Implement mechanism to compute, record and use profile program
>> summary data (This will be discussed separately).
> This sounds like a very workable approach to me.  I think 2.3 needs a
> bit more discussion, but given it doesn't really directly effect me, I'm
> probably going to stay out of it for now.
>
> Thanks for working on this!
>>
>>
>> thanks,
>>
>> David





More information about the llvm-dev mailing list