[PATCH] D42212: [ThinLTO] Add call edges' relative block frequency to per-module summary.

Fri Jan 19 17:33:41 PST 2018

On Fri, Jan 19, 2018, 4:28 PM Easwaran Raman <eraman at google.com> wrote:

> On Fri, Jan 19, 2018 at 4:23 PM, Teresa Johnson <tejohnson at google.com>
> wrote:
>
>>
>>
>> On Fri, Jan 19, 2018 at 4:15 PM, Easwaran Raman <eraman at google.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Jan 19, 2018 at 4:08 PM, Teresa Johnson via Phabricator <
>>> reviews at reviews.llvm.org> wrote:
>>>
>>>> tejohnson added inline comments.
>>>>
>>>>
>>>> ================
>>>> Comment at: include/llvm/IR/ModuleSummaryIndex.h:58
>>>>    HotnessType Hotness = HotnessType::Unknown;
>>>> +  uint64_t RelBlockFreq = 0;
>>>>
>>>> ----------------
>>>> eraman wrote:
>>>> > tejohnson wrote:
>>>> > > Do we need 64 bits in practice? Especially since these are scaled
>>>> down by the entry freq. Concerned about the size increase for large indices
>>>> that have huge numbers of edges. Also, since we'll never have both the
>>>> Hotness and the RelBlockFreq, it would be great if we could union them,
>>>> although that would require a flag on the index and I suppose we might want
>>>> to support mixing modules built with and without profile data? Perhaps some
>>>> other way to combine them. I.e. use a single field and reserve the lowest 3
>>>> bits for the hotness, shifting the relative frequency into the upper n-3
>>>> bits...
>>>> > From our offline discussions I thought the use of 64 bits doesn't
>>>> matter as they get compressed. Is the concern about in-memory size? I can
>>>> reduce the size to, say, 29 bits (uint32_t with 3 bits reserved for the
>>>> enum). Another thing to note is that only the per-module summary has this
>>>> information.  Is there a way to make use of this?
>>>> Right - concerned about in-memory size. Why does only the per-module
>>>> summary contain this? Isn't that just temporary - I thought the idea was to
>>>> propagate this into the backends?
>>>>
>>> We want to propagate the synthetic function entry counts to the
>>> backends. This can be done by building a global graph whose edges are
>>> annotated with the relative BFI and propagating the counts on this graph.
>>>
>>
>> Can you clarify - to get them to the backends the updated propagated
>> counts need to be written into the index (which is serialized out for
>> distributed backends, and passed in-memory for in-process backends). So I
>> would imagine this field would be used for that purpose?
>>
> Yes, the propagated function entry counts needs to be written into the
> index, but that is  *per-function*  and not *per-edge*. On the other hand,
> relative BFI is per call edge.
>

Ah ok. But at least initially, they will need to be on the edges in the
combined index?. I.e. we will read in and merge all of the summary indices
into a combined index at the start of the thin link, and that will have the
relative frequencies on the edges, then you will do a graph propagation,
right? I'm still not sure why they would only be on the per module index in
memory.

>
>>
>>>
>>>>
>>>>
>>>> ================
>>>> Comment at: lib/Analysis/ModuleSummaryAnalysis.cpp:63
>>>> +cl::opt<bool> WriteRelBFToSummary(
>>>> +    "write-relbf-to-summary", cl::Hidden, cl::init(false),
>>>> +    cl::desc("Write relative block frequency to function summary "));
>>>> ----------------
>>>> eraman wrote:
>>>> > tejohnson wrote:
>>>> > > Suggest moving this to the bitcode writer and always adding
>>>> RelBlockFreq into the in-memory index here.
>>>> > >
>>>> > > Is there any reason to default off? Why not just default on from
>>>> the get-go?
>>>> > Until this information actually gets used (first propagate the counts
>>>> and then use the counts in the inliner), this just poses extra compile time
>>>> and increased summary size if not turned off. Alternatively, I can drop
>>>> this patch and then send it as part of a larger patch that builds the
>>>> callgraph and propagates the counts.
>>>> How long do you expect until the other patches to be sent? If not too
>>>> long I don't know that it matters too much since we are going to incur the
>>>> extra space/memory/time in the near term anyway. Better to have it tested.
>>>>
>>>>
>>>> Repository:
>>>>   rL LLVM
>>>>
>>>> https://reviews.llvm.org/D42212
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> --
>> Teresa Johnson |  Software Engineer |  tejohnson at google.com |
>> 408-460-2413 <(408)%20460-2413>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180120/9eb61dd5/attachment.html>