[llvm-dev] [RFC] Profile guided section layout

Sean Silva via llvm-dev llvm-dev at lists.llvm.org
Thu Jun 15 14:30:57 PDT 2017


On Thu, Jun 15, 2017 at 11:09 AM, Xinliang David Li via llvm-dev <
llvm-dev at lists.llvm.org> wrote:

>
>
> On Thu, Jun 15, 2017 at 10:55 AM, Michael Spencer via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> On Thu, Jun 15, 2017 at 10:08 AM, Tobias Edler von Koch <
>> tobias at codeaurora.org> wrote:
>>
>>> Hi Michael,
>>>
>>> This is cool stuff, thanks for sharing!
>>>
>>> On 06/15/2017 11:51 AM, Michael Spencer via llvm-dev wrote:
>>>
>>>> The first is a new LLVM pass which uses branch frequency info to get
>>>> counts for each call instruction and then adds a module flags metadata
>>>> table of function -> function edges along with their counts.
>>>>
>>>> The second takes the module flags metadata and writes it into a
>>>> .note.llvm.callgraph section in the object file. This currently just dumps
>>>> it as text, but could save space by reusing the string table.
>>>>
>>> Have you considered reading the profile in the linker and extracting
>>> that information directly from the profile? The profile should contain call
>>> sites and their sample counts and you could match these up with relocations
>>> (calls) in the section?
>>
>>
> The main reason is that IPO transformations such as inlining and cloning
> change the hotness of functions, so the original profile cannot be used
> directly for function layout. There is similar support in the Gold plugin
> for Google GCC.
>

Will this cause issues with ThinLTO? For example, the ThinLTO backends do
inlining of imported functions. Do we have a mechanism for those decisions
to be reflected in a global profile for the linker to look at?

In theory the ThinLTO backends can keep a history of their IPO decisions in
metadata or something and emit a section for the linker to aggregate and
reconstruct an accurate global profile, but that seems relatively invasive.
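
To make the aggregation step concrete, here is a minimal Python sketch of
what "aggregate and reconstruct" could look like on the linker side. It is
not anything lld or the proposed patch implements. It assumes each backend
emits a table of (caller, callee) -> count edges that already reflects its
own inlining decisions, and that edges seen in more than one table (for
instance because a function was emitted in several objects) can simply be
summed. The function name and the sample data are hypothetical.

```python
from collections import Counter


def merge_call_graph_profiles(per_object_edges):
    """Sum edge counts across per-object call graph tables.

    Each table maps a (caller, callee) pair to a call count.
    Summing duplicate edges is an assumption, not a documented rule.
    """
    merged = Counter()
    for edges in per_object_edges:
        for (caller, callee), count in edges.items():
            merged[(caller, callee)] += count
    return merged


# Hypothetical post-inlining edge tables from two ThinLTO backends.
backend_a = {("main", "parse"): 100, ("parse", "lex"): 900}
backend_b = {("main", "parse"): 20, ("emit", "write"): 50}

merged = merge_call_graph_profiles([backend_a, backend_b])

# Hottest edges first, which is the order a layout heuristic would want.
hottest_first = sorted(merged.items(), key=lambda item: item[1], reverse=True)
```

The merge itself is trivial; the invasive part is getting each backend to
emit edges that are accurate after its IPO decisions.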

-- Sean Silva


>
> David
>
>
>
>
>>
>> I did this using IR PGO instead of sample PGO, so the profile data can
>> only be applied at the same place in the pipeline where it is generated.
>> Even for sample-based PGO this would be complicated, as the linker would
>> actually need to generate machine basic blocks from sections to be able to
>> accurately match sample counts to relocations, as there may be cold calls
>> in hot functions.
>>
>> It may be useful however for the linker to directly accept an externally
>> generated call graph profile. The current approach can actually do this by
>> embedding it into an extra object file.
>>
>>
>>>
>>>
>>>> It doesn't currently work for LTO as the llvm pass needs to be run after
>>>> all inlining decisions have been made and LTO codegen has to be done with
>>>> -ffunction-sections.
>>>>
>>> So this is just an implementation issue, right? You can make LTO run
>>> with -ffunction-sections (by setting TargetOptions.FunctionSections=true)
>>> and insert your pass in the appropriate place in the pipeline.
>>>
>>
>> Yeah, just an implementation issue. Just need to build the pass pipeline
>> differently for LTO and add a way to do -ffunction-sections in lld.
>>
>> - Michael Spencer
>>
>>
>>>
>>> Thanks,
>>> Tobias
>>>
>>> --
>>> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>>> a Linux Foundation Collaborative Project.
>>>
>>>
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>>