[PATCH] Indirect call target profiling compiler-rt changes

Mon Jun 15 20:58:00 PDT 2015

> "Betul Buyukkurt" <betulb at codeaurora.org> writes:
>>>> Index: lib/profile/InstrProfilingBuffer.c
>>>> ===================================================================
>>>> --- lib/profile/InstrProfilingBuffer.c
>>>> +++ lib/profile/InstrProfilingBuffer.c
>>>> @@ -68,11 +68,14 @@
>>>>     * Match logic in __llvm_profile_write_file().
>>>>     */
>>>>
>>>> +  __llvm_profile_value_data *ValueDataBegin = NULL;
>>>> +
>>>>    /* Calculate size of sections. */
>>>>    const uint64_t DataSize = DataEnd - DataBegin;
>>>>    const uint64_t CountersSize = CountersEnd - CountersBegin;
>>>>    const uint64_t NamesSize = NamesEnd - NamesBegin;
>>>>    const uint64_t Padding = sizeof(uint64_t) - NamesSize %
>>>> sizeof(uint64_t);
>>>> +  const uint64_t ValueDataSize = 0;
>>>
>>> I guess this is supposed to do:
>>>
>>>   __llvm_profile_gather_value_data(&ValueDataBegin);
>>
>> This is correct. However, on the buffer access API's the size of the
>> buffer is calculated so that the caller of the write API allocates the
>> necessary buffer. A pointer to the buffer is passed back to the write
>> function for the data to be dumped to this memory. In my implementation,
> I
>> kept the buffer API's unmodified - other than changing the format to
> match
>> that of version 2's. I'd like to know if a re-alloc on the passed in
> (char
>> *Buffer) be in order as a change to accommodate value profiling's
> dynamic
>> memory needs.
>
> But isn't this completely broken if someone uses value profiling and the
> buffer APIs? What happens in that case?

I'd not say it's "broken". The profile header and the version number is
updated, so the profile reading/merges should happen cleanly. The only
problem is that currently there would not be any value profiling support
when the buffer API's are used. To come up with a solution, I've been
looking forward to external input from the community on the buffer API
usage scenarios w/ these CL's and if realloc is in order with how buffer
API usage is intended.

> Can we teach __llvm_profile_get_size_for_buffer to calculate the correct
> size ahead of time so that a realloc isn't necessary?

Perhaps __llvm_profile_get_size_for_buffer can take as input how many
values to output per value site. This can be used to pre-calculate and
allocate the size of the overall profiling data in advance. Then at each
value site, only the top N values would be outputted per value site.

The collected value profiles may fail to fill the pre-allocated buffer
completely, so the buffer write API should also return the total size of
the buffer used.