[PATCH] D81981: [PGO] Supplement PGO profile with Sample profile

Xinliang David Li via llvm-commits llvm-commits at lists.llvm.org
Wed Jul 22 21:20:29 PDT 2020


On Wed, Jul 22, 2020 at 5:49 PM Wei Mi <wmi at google.com> wrote:

>
>
> On Wed, Jul 22, 2020 at 3:57 PM David Li via Phabricator <
> reviews at reviews.llvm.org> wrote:
>
>> davidxl added a comment.
>>
>> Inlining won't be helped unless there is a hot callsite to the all-zero
>> count function -- but this should not exist.
>
>
> Inlining may be helped because of the hot callee heuristic. The threshold
> will be higher (325), though not as high as for a hot callsite (3000).
>


The cold callsite heuristic takes precedence, so the hot callee heuristic
won't kick in.
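
A rough sketch of the ordering I mean, as a simplified model rather than the
inliner's actual code. The 3000 and 325 values are the ones quoted above; the
cold callsite and default thresholds, and the function name, are assumptions
for illustration only.

```python
# Simplified model of inline threshold selection; not LLVM's real code.
HOT_CALLSITE_THRESHOLD = 3000  # value quoted in this thread
HOT_CALLEE_THRESHOLD = 325     # value quoted in this thread
COLD_CALLSITE_THRESHOLD = 45   # assumed value, not from this thread
DEFAULT_THRESHOLD = 225        # assumed value, not from this thread


def pick_inline_threshold(callsite_hot, callsite_cold, callee_hot):
    """Return the threshold for one callsite (hypothetical helper)."""
    if callsite_hot:
        return HOT_CALLSITE_THRESHOLD
    if callsite_cold:
        # Checked before callee hotness, so a hot callee cannot raise it.
        return min(DEFAULT_THRESHOLD, COLD_CALLSITE_THRESHOLD)
    if callee_hot:
        return max(DEFAULT_THRESHOLD, HOT_CALLEE_THRESHOLD)
    return DEFAULT_THRESHOLD
```

With a cold callsite and a hot callee, this returns the cold callsite
threshold, which is why the hot callee bonus does not apply here.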

> And the callsites inside the function may get more inlining because of the
> hot callsite heuristic. How hot a callsite inside the function is depends
> on the entry count value of the function.
>
>
>

Since this is approximate, the scaling factor for the internal block counts
can still be determined by the relative hotness of the function in the sample
profile if that helps, but it is hard to measure such secondary effects
against the cost of wrongly placing the function in .text.unlikely.




>> I think the major performance hit comes from 1) .text.unlikely, which may
>> not be mlocked; and 2) all branches being unbiased due to zero weights. So
>> making this depend on the existence of an entry count is fine, but we still
>> need to teach PGOUse to drop the body. I think a simpler design would be
>>
>> At llvm-profdata side:
>>
>> 1. if the instrumentation cold function has enough internal counts, just
>> scale up the max internal count to be a multiple of the hot threshold
>>
>> 2. if the cold function has all-zero counts, or we believe its internal
>> counts are not trustworthy (basically skipping step 1 via an option), we
>> can simply discard the function entry completely (to signal that this
>> function is actually hot, but we don't know its internal counts)
>>
>> At PGOUse side:
>>
>> if we don't find counters for a function, set the function's entry count
>> to be above the hot threshold (a statically linked function should always
>> have counts; if there are no counts, it means the function was corrected
>> by llvm-profdata).
>>
>
> Ok, in this way we keep a valid entry count in the metadata during
> optimized build without depending on the profile format always having an
> entry count in each function.
>

yes.
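
For reference, the design quoted above could look roughly like the sketch
below. It is only an illustration under stated assumptions: the profile is
modeled as a plain dict from function name to a list of counters, the first
counter is treated as the entry count, "enough internal counts" is reduced to
"some nonzero count", and HOT_THRESHOLD, SCALE_MULTIPLE and both function
names are hypothetical.

```python
from typing import Dict, List

HOT_THRESHOLD = 1000  # hypothetical hot count threshold
SCALE_MULTIPLE = 2    # hypothetical multiple of the hot threshold


def fix_up_cold_function(profile: Dict[str, List[int]], name: str,
                         trust_counts: bool = True) -> None:
    """llvm-profdata side: fix a function that is cold in the instrumentation
    profile but hot in the sample profile."""
    counts = profile[name]
    max_count = max(counts, default=0)
    target = HOT_THRESHOLD * SCALE_MULTIPLE
    if trust_counts and max_count > 0:
        # Step 1: scale so the max internal count reaches the target.
        if max_count < target:
            profile[name] = [c * target // max_count for c in counts]
    else:
        # Step 2: all-zero or untrusted counts, so drop the record entirely.
        del profile[name]


def pgo_use_entry_count(profile: Dict[str, List[int]], name: str) -> int:
    """PGOUse side: pick the entry count for a function."""
    counts = profile.get(name)
    if counts is None:
        # No record: assume hot, with unknown internal counts.
        return HOT_THRESHOLD + 1
    # Simplification: treat the first counter as the entry count.
    return counts[0] if counts else 0
```

A function dropped in step 2 then gets an entry count just above the hot
threshold on the PGOUse side, while a scaled function keeps its relative
internal counts.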

> However, this way we won't be able to treat warm and hot functions
> differently. We will need to treat all such functions as hot even if they
> are only warm in production (I think we want to move both hot and warm
> functions out of the .text.unlikely section).
>

The number of functions like these is assumed to be small. If there is a
large number of such functions, the training data should really be fixed
instead of relying on this method.

David

>
>
>>
>>
>> Repository:
>>   rL LLVM
>>
>> CHANGES SINCE LAST ACTION
>>   https://reviews.llvm.org/D81981/new/
>>
>> https://reviews.llvm.org/D81981
>>
>>
>>
>>