[cfe-dev] [llvm-dev] [RFC] New Feature Proposal: De-Optimizing Cold Functions using PGO Info

Wed Sep 9 10:15:03 PDT 2020

Hi Tobias and Dominique,

I didn't evaluate the impact on code size initially since it was not my
primary goal. But thanks to the design of the LLVM Test Suite
benchmarking infrastructure, I can pull out those numbers right away.

(Non-LTO)
Experiment Name          Code Size Increase Percentage
DeOpt Cold Zero Count    5.2%
DeOpt Cold 25%           6.8%
DeOpt Cold 50%           7.0%
DeOpt Cold 75%           7.0%

(FullLTO)
Experiment Name          Code Size Increase Percentage
DeOpt Cold Zero Count    4.8%
DeOpt Cold 25%           6.4%
DeOpt Cold 50%           6.2%
DeOpt Cold 75%           5.3%

For non-LTO the increase caps at around 7%. For FullLTO things got a
little more interesting: code size actually decreased as we raised the
cold threshold, but I'd say it's around 6%. To dive a little deeper, the
majority of the increase came (not surprisingly) from the .text section.
The PLT section contributed a little bit, and the rest of the sections
barely changed.
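
For anyone who wants to reproduce this kind of per-section breakdown, here
is a minimal sketch of the arithmetic. It is not the test-suite's own
reporting code; the function names and the dict-of-section-sizes input are
assumptions made for illustration. Each section's growth is expressed as a
share of the total baseline size, so the per-section numbers add up to the
overall increase.

```python
def section_growth(baseline, experiment):
    """Per-section size change, as a percentage of the total baseline size.

    Both arguments are hypothetical dicts mapping a section name
    (e.g. ".text") to its size in bytes.
    """
    total = sum(baseline.values())
    if total == 0:
        raise ValueError("baseline has no bytes")
    # Cover sections that exist in only one of the two builds.
    sections = sorted(set(baseline) | set(experiment))
    return {
        name: 100.0 * (experiment.get(name, 0) - baseline.get(name, 0)) / total
        for name in sections
    }


def total_growth(baseline, experiment):
    """Overall code size increase in percent (sum of per-section shares)."""
    return sum(section_growth(baseline, experiment).values())
```

The input dicts could be filled from the per-section output of a tool such
as llvm-size for the baseline and the experiment build.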

Though the overhead on code size is higher than the target performance
overhead, I think it's still acceptable in normal cases. In addition, David
mentioned in D87337 that LLVM has used similar techniques on code size (I'm
not sure what he was referencing; my guess would be something related to
hot-cold code splitting). So I think the feature we're proposing here can
complement that one.
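
To make the experiment names above concrete, here is a rough sketch of the
selection they refer to. It is not the code in the patch: the function
name is hypothetical, and it assumes that "Cold N%" means the coldest N
percent of functions ranked by profile entry count, with never-executed
functions always included (the "Zero Count" case).

```python
import math


def select_cold_functions(entry_counts, cold_percent):
    """Return the set of function names to leave unoptimized.

    entry_counts: hypothetical dict mapping function name -> profile count.
    cold_percent: assumed to be the share (0..100) of functions, ranked
                  from coldest to hottest, that should be skipped.
    """
    if not 0 <= cold_percent <= 100:
        raise ValueError("cold_percent must be between 0 and 100")
    # Functions that never ran during profiling are always selected.
    selected = {name for name, count in entry_counts.items() if count == 0}
    # Rank by count, coldest first; break ties by name for determinism.
    ranked = sorted(entry_counts, key=lambda name: (entry_counts[name], name))
    quota = math.floor(len(ranked) * cold_percent / 100)
    selected.update(ranked[:quota])
    return selected
```

With cold_percent set to 0 only the zero-count functions are returned,
which matches the "DeOpt Cold Zero Count" configuration under the
assumption stated above.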

Finally: Tobias, thanks for evaluating the impact on Clang, I'm really
interested to see the result.

Best,
Min

On Wed, Sep 9, 2020 at 3:26 AM Tobias Hieta <tobias at plexapp.com> wrote:

> Hello,
>
> We use PGO to optimize clang itself. I can see if I have time to give this
> patch some testing. Anything special to look out for except compile
> benchmark and time to build clang, do you expect any changes in code size?
>
> On Wed, Sep 9, 2020, 10:03 Renato Golin via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On Wed, 9 Sep 2020 at 01:21, Min-Yih Hsu via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> From the above experiments we observed that compilation / link time
>>> improvement scaled linearly with the percentage of cold functions we
>>> skipped. Even if we only skipped functions that never got executed (i.e.
>>> had counter values equal to zero, which is effectively “0%”), we already
>>> had 5~10% of “free ride” on compilation / linking speed improvement and
>>> barely had any target performance penalty.
>>>
>>
>> Hi Min (Paul, Edd),
>>
>> This is great work! Small, clear patch, substantial impact, virtually no
>> downsides.
>>
>> Just looking at your test-suite numbers, not optimising functions "never
>> used" during the profile run sounds like an obvious "default PGO behaviour"
>> to me. The flag defining the percentage range is a good option for
>> development builds.
>>
>> I imagine you guys have run this on internal programs and found
>> beneficial, too, not just the LLVM test-suite (which is very small and
>> non-representative). It would be nice if other groups that already use PGO
>> could try that locally and spot any issues.
>>
>> cheers,
>> --renato
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>

-- 
Min-Yih Hsu
Ph.D Student in ICS Department, University of California, Irvine (UCI).