[cfe-dev] [llvm-dev] [RFC] New Feature Proposal: De-Optimizing Cold Functions using PGO Info

Nemanja Ivanovic via cfe-dev cfe-dev at lists.llvm.org
Wed Sep 9 06:27:26 PDT 2020


This sounds very interesting and the compile time gains in the conservative
range (say under 25%) seem quite promising.

One concern that comes to mind is whether performance could degrade
severely when a function has a hot call site (where it gets inlined) and
some non-zero number of cold call sites (where it does not get inlined).
When we decorate the function with `optnone, noinline` it will presumably
no longer be inlined into the hot call site and will furthermore be
unoptimized.
Have you considered such a case and if so, is it something that cannot
happen (i.e. inlining has already happened, etc.) or something that we can
mitigate in the future?

A more aesthetic comment I have is that personally, I would prefer a single
option with a default percentage (say 0%) rather than having to specify two
options.
Also, it might be useful to add an option to dump the names of functions
that are decorated so the user can track an execution count of such
functions when running their code. But of course, the debug messages may be
adequate for this purpose.
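
To make the single-option idea concrete, here is a rough Python sketch. It
is not the patch's implementation. The function name, the `cold_percent`
parameter, and the reading of "percentage" as "that share of the coldest
executed functions" are all my assumptions for illustration; the RFC may
define the threshold differently. With the default of 0 only never-executed
functions are selected, and the returned list doubles as the "dump the
names" output.

```python
def select_cold_functions(entry_counts, cold_percent=0.0):
    """Return the names of functions that would be decorated.

    entry_counts: hypothetical mapping of function name -> profile counter.
    cold_percent: the single knob. 0 selects only functions whose counter
        is zero; a larger value additionally selects that percentage of
        the coldest executed functions (an assumed interpretation).
    """
    if not 0.0 <= cold_percent <= 100.0:
        raise ValueError("cold_percent must be between 0 and 100")

    # Functions that never ran during the profile run are always selected.
    never_run = sorted(name for name, count in entry_counts.items()
                       if count == 0)

    # Executed functions, coldest first; ties broken by name so the
    # result is deterministic.
    executed = sorted((name for name, count in entry_counts.items()
                       if count > 0),
                      key=lambda name: (entry_counts[name], name))

    extra = int(len(executed) * cold_percent / 100.0)
    return never_run + executed[:extra]


# Made-up counters, purely for illustration.
counts = {"main": 1000, "parse": 500, "log_error": 3,
          "init": 1, "dump_state": 0, "usage": 0}

# Dump the names so execution counts can be tracked later.
for name in select_cold_functions(counts, cold_percent=50):
    print("decorated:", name)
```

A list like this could then be compared against execution counts from real
runs to spot functions that turn out to be hot in practice.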

Nemanja

On Wed, Sep 9, 2020 at 6:26 AM Tobias Hieta via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

> Hello,
>
> We use PGO to optimize clang itself. I can see if I have time to give this
> patch some testing. Is there anything special to look out for besides
> compile benchmarks and the time to build clang? Do you expect any changes
> in code size?
>
> On Wed, Sep 9, 2020, 10:03 Renato Golin via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On Wed, 9 Sep 2020 at 01:21, Min-Yih Hsu via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> From the above experiments we observed that compilation / link time
>>> improvement scaled linearly with the percentage of cold functions we
>>> skipped. Even if we only skipped functions that never got executed (i.e.
>>> had counter values equal to zero, which is effectively “0%”), we already
>>> had 5~10% of “free ride” on compilation / linking speed improvement and
>>> barely had any target performance penalty.
>>>
>>
>> Hi Min (Paul, Edd),
>>
>> This is great work! Small, clear patch, substantial impact, virtually no
>> downsides.
>>
>> Just looking at your test-suite numbers, not optimising functions "never
>> used" during the profile run sounds like an obvious "default PGO behaviour"
>> to me. The flag defining the percentage range is a good option for
>> development builds.
>>
>> I imagine you guys have run this on internal programs and found it
>> beneficial, too, not just the LLVM test-suite (which is very small and
>> non-representative). It would be nice if other groups that already use PGO
>> could try that locally and spot any issues.
>>
>> cheers,
>> --renato
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>