[LLVMdev] Capabilities of Clang's PGO (e.g. improving code density)

Teresa Johnson tejohnson at google.com
Thu May 28 11:08:32 PDT 2015


On Thu, May 28, 2015 at 9:56 AM, Philip Reames
<listmail at philipreames.com> wrote:
>
>
> On 05/27/2015 11:13 AM, Duncan P. N. Exon Smith wrote:
>>>
>>> On 2015 May 27, at 07:42, Diego Novillo <dnovillo at google.com> wrote:
>>>
>>> On Tue, May 26, 2015 at 11:47 PM, Lee Hunt <leehu at exchange.microsoft.com>
>>> wrote:
>>>
>>>> For example, from reading different pages on how Clang PGO works, it’s
>>>> unclear if it does “block reordering” (i.e. moving unexecuted code
>>>> blocks to a distant code page, leaving only ‘hot’ executed code packed
>>>> together for greater code density).  I find mention of “hot arc”
>>>> optimization (-fprofile-arcs), but I’m unclear if this is the same
>>>> thing.  Does Clang PGO do block reordering?
>>>
>>> A small clarification.  Clang itself does not implement any
>>> optimizations.  Clang limits itself to generating LLVM IR.  The
>>> annotated IR is then used by some LLVM optimizers to guide decisions.
>>> At this time, there are few optimization passes that use the profile
>>> information: block reordering and register allocation (to avoid
>>> spilling on cold paths).
>>>
>>> There are no other significant transformations that use profiling
>>> information. We are working on that.  Notably, we'd like to add
>>> profiling-based decisions to the inliner
>>
>> Just a quick note about the inliner.  Although the inliner itself
>> doesn't know how to use the profile, clang's IRGen has been modified
>> to add an 'inlinehint' attribute to hot functions and the 'cold'
>> attribute to cold functions.  Indirectly, PGO does affect the
>> inliner.  (We'll remove this once the inliner does the right thing on
>> its own.)
>
> OT: Can you give me a pointer to the clang code involved?  I wasn't aware of
> this.

It is set in CodeGenPGO::applyFunctionAttributes, in
clang/lib/CodeGen/CodeGenPGO.cpp.

Note that it uses the function entry count to determine hotness. This
means that a function that is entered infrequently but contains very hot
loops would be marked cold. Perhaps this works because the attribute is
only used for inlining and is presumably a stand-in for call edge
hotness. The MaxFunctionCount for the profile is likewise the max of all
the function entry counts (set during profile writing).
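
To make the entry-count heuristic concrete, here is a minimal standalone
sketch. It is not the Clang code: the names (InlineHotness,
classifyByEntryCount) and the two threshold fractions are assumptions
chosen for illustration. The only part taken from the description above
is that a function's entry count is compared against MaxFunctionCount,
and that counts inside the function body are never consulted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification; stands in for "add inlinehint",
// "add cold", or "add neither".
enum class InlineHotness { Hot, Neutral, Cold };

// Illustrative cutoffs (assumptions, not values stated in this thread).
constexpr double kHotFraction = 0.30;
constexpr double kColdFraction = 0.01;

// Classify a function using only its entry count relative to the
// largest entry count in the profile. Loop and block counts inside the
// function are deliberately not inputs.
InlineHotness classifyByEntryCount(std::uint64_t entryCount,
                                   std::uint64_t maxFunctionCount) {
  // MaxFunctionCount is the max over all entry counts, so no function
  // can exceed it.
  assert(entryCount <= maxFunctionCount);

  const auto hotCutoff = static_cast<std::uint64_t>(
      kHotFraction * static_cast<double>(maxFunctionCount));
  const auto coldCutoff = static_cast<std::uint64_t>(
      kColdFraction * static_cast<double>(maxFunctionCount));

  if (entryCount >= hotCutoff)
    return InlineHotness::Hot;
  if (entryCount <= coldCutoff)
    return InlineHotness::Cold;
  return InlineHotness::Neutral;
}
```

With a MaxFunctionCount of 1000000, a function entered once is
classified as cold under this sketch no matter how many iterations its
loops execute, which is the case described above.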

Teresa

>
>>
>>> , loop optimizers and the
>>> vectorizer.
>>>
>>>
>>> Diego.
>>>
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev



-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413



