[PATCH] Indirect call target profiling compiler-rt changes
Xinliang David Li
davidxl at google.com
Tue Jun 9 10:42:28 PDT 2015
Sounds good to me. Clang patches can wait after LLVM patches get in.
For LLVM patches, wait for explicit LGTM from Diego or Justin.
thanks,
David
On Tue, Jun 9, 2015 at 10:33 AM, Betul Buyukkurt <betulb at codeaurora.org> wrote:
>> thanks! IIRC, I also have some comments on the clang patch which I
>> have not seen reply.
>
> Sorry, I had a delay in responding to your and Justin's comments on the
> clang patch. I'll get back to that after posting the patches for the data
> file size reduction today. I've the following plan for up-streaming if the
> latest version of the patches that I'll post are agreed on in general. The
> finer points on wording, commenting can be addressed through the below
> patches. Please comment if this is agreeable.
>
> 1) Intrinsic instruction for value profiling.
> 2) Making changes to the data structures used by the readers/writers, and
> updating of the readers to preserve the current formats.
> 3) Indexed reader/writer changes for version 3 i.e. the one that's going
> to be used by the value profiling as well
> 4) Lowering of profiling intrinsic and generation of Data w/ value
> profiling entries. Raw reader updates for reading version 2 data.
> 5) compiler-rt changes for value profiling.
>
> (4 & 5) changes should go in together not to break the builds.
>
> 6) Clang changes for generating instrumentation and attaching of new
> profile metadata.
>
> 7) Clang flag controls for value profiling types (enabling/disabling
> instrumentation/pgo use).
>
> Thanks,
> -Betul
>
>> Can you also link all patches in the same email this round (hopefully
>> the last one)?
>>
>> David
>>
>>
>>
>> On Mon, Jun 8, 2015 at 9:57 AM, Betul Buyukkurt <betulb at codeaurora.org>
>> wrote:
>>> Hi,
>>>
>>> I've implemented the changes described below and I'll be posting my
>>> patches shortly. I've however only changed the raw output generated from
>>> the runtime and the raw reader class. I've left the structure and the
>>> design of the indexed reader the same as in my latest patches. In other
>>> words I've continued to keep the counter index in the
>>> InstrProfValueRecord. I found it easier to the merges w/ counter index
>>> present in the InstrProfValueRecord data structure.
>>>
>>> To further reduce the data size of the value profiling data generated
>>> from
>>> runtime, I'm not recording the pointer values, but rather the uint32_t
>>> value data counts per value site. So the indirection array contains how
>>> many value data entries belong to a given value site.
>>>
>>> Thanks,
>>> -Betul
>>>
>>> -----Original Message-----
>>> From: Xinliang David Li [mailto:davidxl at google.com]
>>> Sent: Tuesday, June 02, 2015 10:08 AM
>>> To: Betul Buyukkurt
>>> Cc: reviews+D9009+public+928a0f9fb6abdd51 at reviews.llvm.org; Justin
>>> Bogner;
>>> Bob Wilson; dsule at codeaurora.org; Diego Novillo; Commit Messages and
>>> Patches for LLVM
>>> Subject: Re: [PATCH] Indirect call target profiling compiler-rt changes
>>>
>>> On Tue, Jun 2, 2015 at 9:34 AM, Betul Buyukkurt <betulb at codeaurora.org>
>>> wrote:
>>>>> I think there is more room to reduce memory footprint and profile
>>>>> data
>>>> size.
>>>>> The reason that value_kind and counter index needs to be recorded and
>>>> stored is because the number of value entries per site is variant.
>>>> This problem can be solved by adding one level of indirection when
>>>> writing out
>>>>> the profile data: Writing out ValueCounters in profile data too (with
>>>> address fixup after data gathering). With that written out, the
>>>>> llvm_profile_value data can be greatly shrinked:
>>>>
>>>> Hi David,
>>>>
>>>> Here is what you're suggesting per my understanding.
>>>>
>>>> In the raw profile output file, ValueCounters array of
>>>> __llvm_profile_data should point to another array of size
>>>> NumValueSites. This array is created dynamically at runtime, but will
>>>> also be written out to the profile data file. Each element of this new
>>>> array in the output file will point to the beginning of the relevant
>>>> __llvm_profile_value_data* entries for a value_kind and a
>>>> counter_index.
>>>
>>> yes. It basically makes value profile data representation more
>>> consistent
>>> with branch profile data -- the new array basically maps to the counter
>>> array of the branch profile.
>>>
>>>
>>>>
>>>> __llvm_profile_gather_value_data routine will be responsible for
>>>> forming both the __llvm_profile_value_data* array (i.e.
>>>> ValueProfilingResults) as well as the new pointer array for the added
>>>> indirection.
>>>
>>> yes.
>>>
>>>>
>>>> One extra piece of information that needs to be stored is the number
>>>> of elements that each pointer in the indirection array points to.
>>>
>>> No need -- it can be computed by diffing the value of pointer of current
>>> counter index with the next entry.
>>>
>>>> In the
>>>> least we have to store the number of elements that the last element in
>>>> the indirection array points to. This way we may eliminate the need to
>>>> store the counter index, function index and the value_kinds in each
>>>> value data field.
>>>
>>> No need for that. What is needed is just a sentinel element in the
>>> indirection array that points to the position after the end of the value
>>> entries. Basically, the number of elements of the indirect array is N+1
>>> where N is the number of sites of that value kind.
>>>
>>>> Please confirm if anything is missing from the above. Since the raw
>>>> profile format does not have to be maintained and remain backwards
>>>> compatible (per Justin's comments), this optimization can be added
>>>> after all the patches merge in. So this should not really hold the
>>>> patches from merging in. Do you agree?
>>>
>>> True, but I prefer not going that route if possible:
>>>
>>> 1) Getting it right the first time can avoid the rework and save us a
>>> lot
>>> of time in the future;
>>> 2) You have done a great work so far ;) It will probably take you much
>>> less time to do this than others ..
>>>
>>> thanks,
>>>
>>> David
>>>
>>>>
>>>> -Betul
>>>>
>>>>> 1. in memory: it needs only 3 64bit integers instead of 4 2. in disk:
>>>>> it needs only 2 64bit integers instead of the current 4 (No
>>>> need for func index either).
>>>>> http://reviews.llvm.org/D9009
>>>>> EMAIL PREFERENCES
>>>>> http://reviews.llvm.org/settings/panel/emailpreferences/
>>>>
>>>>
>>>>
>>>>
>>>
>>
>
>
More information about the llvm-commits
mailing list