[llvm-dev] [PGO] Thoughts on adding a key-value store to profile data formats

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Tue Jan 19 11:51:36 PST 2016


On Mon, Jan 18, 2016 at 8:38 AM, Nathan Slingerland <slingn at gmail.com> wrote:
> On Fri, Jan 15, 2016 at 11:41 AM, Xinliang David Li <davidxl at google.com>
> wrote:
>>
>> Tagging profile data with such information is generally useful. My
>> thoughts are
>>
>> 1) such information probably does not need to be stored in the raw format
>> profile data -- so no runtime changes are needed -- only llvm-profdata
>> and the indexed format need to be enhanced to support this.
>> 2) A more general way is to just add an option:
>> --embed_label=<customized_label>, where the label is a string that can be
>> key/value pairs encoded in the user's favorite format. The format of the
>> key-value pairs is not specified and remains opaque to the Instr/Sample
>> Profiler.
>
>
>> ...
>>
>> I think all metadata should be custom-defined -- the profile reader
>> should not need to understand it.
>
>
>
> OK. The benefit of enforcing some structure from the start is that it gives
> us the possibility of machine parsing/round-tripping the content for
> future applications. Initially this would just impact how we encode the
> label content - the reader classes could still treat the content as opaque
> for the time being if the format were something intended to be
> human-readable like YAML. On the other hand, if the metadata content begins
> life unstructured, it would be harder to retrofit structure later.

Given that the information is mostly intended for human consumption, I
am not too worried about the 'unstructured' nature of it. In the end,
the reader can always extract it as a string and the user can use their
favorite parser (be it regexp or YAML) to process it. Until we find
more motivating examples where the metadata needs to be shared across
different tools (and therefore needs a well-defined format), we can leave
the format undefined for now.
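
To illustrate the "extract as a string, parse it yourself" point, here is a
minimal sketch of what a user-side script might look like. Everything in it
is an assumption for illustration: the `key=value; key=value` encoding, the
`parse_label` helper, and the sample label are hypothetical, and how the
string is actually obtained from the profile is left out since no such
option exists yet.

```python
import re


def parse_label(label: str) -> dict[str, str]:
    """Parse a label the user chose to encode as 'key=value' pairs.

    The encoding (pairs separated by ';' or newlines) is a hypothetical
    user convention; the profile reader would only ever see an opaque string.
    """
    pairs: dict[str, str] = {}
    for item in re.split(r"[;\n]", label):
        item = item.strip()
        if not item or "=" not in item:
            continue  # skip blank items and free-form text
        key, value = item.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs


# Hypothetical label string, as it might come back from the profile reader.
label = "build=release; host=ci-42; date=2016-01-19"
print(parse_label(label))
```

A user who prefers YAML or JSON would swap in that parser instead; nothing on
the profile side would need to change.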

>
>>
>> ...
>> >
>> > In terms of implementation, the metadata could live as a separate
>> > contiguous
>> > section in the binary profile formats. It might make sense to encode it
>> > in
>> > something like YAML so that it could also be directly embedded in the
>> > various text formats.
>> >
>>
>> A single string after the header should do.
>
>
> For the text formats I'd suggest that we delimit the label information with
> known prefix/suffix lines. That keeps it easy to parse (and skip) -
> especially since the label content can be multiple lines. The delimiters
> would only be a part of the file format and wouldn't be displayed from
> llvm-profdata.
>
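
For reference, the prefix/suffix idea quoted above could be sketched roughly
as follows. The delimiter strings and the `split_label` helper are made up
for illustration (no delimiters have been agreed on), and the sample input is
not meant to match any real text profile format exactly.

```python
# Hypothetical delimiter lines; the real ones would be decided in review.
LABEL_BEGIN = "# label-begin"
LABEL_END = "# label-end"


def split_label(text: str) -> tuple[str, str]:
    """Separate a delimited, possibly multi-line label from the profile body.

    Returns (label, body). The delimiter lines themselves are dropped, so
    they stay part of the file format and are never shown to the user.
    """
    label_lines: list[str] = []
    body_lines: list[str] = []
    in_label = False
    for line in text.splitlines():
        if not in_label and line == LABEL_BEGIN:
            in_label = True
        elif in_label and line == LABEL_END:
            in_label = False
        elif in_label:
            label_lines.append(line)
        else:
            body_lines.append(line)
    return "\n".join(label_lines), "\n".join(body_lines)


sample = "# label-begin\nbuild: release\nhost: ci\n# label-end\nmain\n10\n"
label, body = split_label(sample)
print(label)
print(body)
```

A reader that does not care about the label can skip everything between the
two delimiter lines without interpreting it.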

We can discuss this more in the patch review :)

David

