[LLVMdev] RFC - Improvements to PGO profile support

Teresa Johnson tejohnson at google.com
Wed Feb 25 14:15:11 PST 2015


On Wed, Feb 25, 2015 at 2:03 PM, Philip Reames <listmail at philipreames.com>
wrote:

>
> On 02/25/2015 12:11 PM, Teresa Johnson wrote:
>
>> On Wed, Feb 25, 2015 at 10:52 AM, Philip Reames
>> <listmail at philipreames.com> wrote:
>>
>>> Other than the inliner, can you list the passes you think are profitable
>>> to
>>> teach about profiling data?  My list so far is: PRE (particularly of
>>> loads!), the vectorizer (i.e. duplicate work down both a hot and cold
>>> path
>>> when it can be vectorized on the hot path), LoopUnswitch, IRCE, &
>>> LoopUnroll
>>> (avoiding code size explosion in cold code).  I'm much more interested in
>>> sources of improved performance than in code size reduction alone.
>>> (Reducing code size can of course improve performance.)
>>>
>> Also, code layout (bb layout, function layout, function splitting).
>>
> Correct me if I'm wrong, but we already have "function layout" (i.e. basic
> block placement).  It may need improvement, but I've found it to be
> reasonably effective.
>
> What do you mean by "bb layout"?
>

By bb layout I was referring to basic block placement. I am not overly
familiar with LLVM's implementation, but I know that it typically benefits
from profile information.
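
(For illustration only, a minimal sketch of the core idea behind
profile-guided block placement, using made-up data structures rather than
LLVM's actual MachineBlockPlacement interfaces: with profile-derived edge
counts, the layout keeps falling through to the hottest not-yet-placed
successor so the common path runs straight-line.)

#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical profile data: for each block, its successors together with
// the profiled number of times each edge was taken.
using EdgeCounts =
    std::map<std::string, std::vector<std::pair<std::string, uint64_t>>>;

// Lay out one chain of blocks by always falling through to the hottest
// successor that has not been placed yet. A real pass would then append
// the remaining (colder) blocks after this hot chain.
std::vector<std::string> layoutHotChain(const std::string &Entry,
                                        const EdgeCounts &Succs) {
  std::vector<std::string> Order;
  std::set<std::string> Placed;
  std::string Cur = Entry;
  while (!Cur.empty() && Placed.insert(Cur).second) {
    Order.push_back(Cur);
    std::string Next;
    uint64_t BestCount = 0;
    auto It = Succs.find(Cur);
    if (It != Succs.end())
      for (const auto &Succ : It->second)
        if (!Placed.count(Succ.first) && Succ.second >= BestCount) {
          Next = Succ.first;
          BestCount = Succ.second;
        }
    Cur = Next;  // Empty when no unplaced successor remains.
  }
  return Order;
}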

By function layout, I meant the layout of functions within the module and
ultimately within the executable. This could be as simple as marking and
separating hot vs. cold functions, or it could be fancier: a linker plugin
could use profile data to colocate functions with strong call affinity.
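
(Again purely as a sketch of the fancier option, with hypothetical data
structures rather than any real linker-plugin API: given profiled
caller/callee call counts, a greedy walk over the hottest call edges places
functions that frequently call each other next to one another.)

#include <algorithm>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical profile data: one entry per profiled call edge.
struct CallEdge {
  std::string Caller;
  std::string Callee;
  uint64_t Count;  // number of profiled calls along this edge
};

// Order functions so that pairs connected by hot call edges end up adjacent;
// functions never seen in a profiled edge (i.e. cold) are appended at the end.
std::vector<std::string>
orderByCallAffinity(const std::vector<std::string> &AllFunctions,
                    std::vector<CallEdge> Edges) {
  std::sort(Edges.begin(), Edges.end(),
            [](const CallEdge &A, const CallEdge &B) {
              return A.Count > B.Count;  // hottest edges first
            });

  std::vector<std::string> Layout;
  std::set<std::string> Placed;
  auto Place = [&](const std::string &F) {
    if (Placed.insert(F).second)
      Layout.push_back(F);
  };
  for (const CallEdge &E : Edges) {
    Place(E.Caller);
    Place(E.Callee);
  }
  for (const std::string &F : AllFunctions)
    Place(F);  // remaining cold functions go last
  return Layout;
}

A real plugin would of course operate on sections and symbols rather than
name strings, but the grouping decision follows the same idea.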


> I'm assuming that by "function splitting" you're referring to a form of
> outlining.  This seems like a clearly good idea.


Right.

Thanks,
Teresa


>
>
>>>> Need to represent global profile summary data. For example, for global
>>>> hotness determination, it is useful to compute additional global summary
>>>> info, such as a histogram of counts that can be used to determine hotness
>>>> and working set size estimates for a large percentage of the profiled
>>>> execution.
>>>
>>> Er, not clear what you're trying to say here?
>>>
>> The idea is to derive a good global profile count threshold from an
>> application's profile, i.e. a cutoff for deciding whether a given count
>> is hot. For example: what is the minimum count that contributes to the
>> hottest 99% of the application's profiled execution?
>>
> Ah, got it.  That makes sense for a C/C++ use case.
>
> In my use case, we're only compiling the hottest methods.  As such, I care
> mostly about relative hotness within a method or a small set of methods.
>
>
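
(To make the 99% example above concrete, here is a rough sketch, using
hypothetical code rather than an actual LLVM API, of how such a cutoff could
be computed from the raw profile counts: sort the counts, walk them
hottest-first, and stop once 99% of the total execution is covered. The count
at that point is the hotness threshold, and the number of counts visited
gives a rough working set estimate.)

#include <algorithm>
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Given the profiled execution counts (e.g. one per block or edge), return
// the smallest count that still belongs to the hottest `Fraction` of all
// profiled execution. Counts at or above this value would be treated as hot.
uint64_t hotCountThreshold(std::vector<uint64_t> Counts,
                           double Fraction = 0.99) {
  if (Counts.empty())
    return 0;
  std::sort(Counts.begin(), Counts.end(), std::greater<uint64_t>());
  const uint64_t Total =
      std::accumulate(Counts.begin(), Counts.end(), uint64_t(0));
  uint64_t Covered = 0;
  for (uint64_t C : Counts) {
    Covered += C;
    if (static_cast<double>(Covered) >=
        Fraction * static_cast<double>(Total))
      return C;  // this count completes the hottest Fraction of execution
  }
  return Counts.back();  // not reached for Fraction <= 1.0
}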


-- 
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413