[llvm-dev] RFC: Reducing Instr PGO size overhead

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Fri Sep 4 17:42:29 PDT 2015


On Fri, Sep 4, 2015 at 5:21 PM, Sean Silva <chisophugis at gmail.com> wrote:
>
>
> On Fri, Sep 4, 2015 at 3:57 PM, Xinliang David Li <davidxl at google.com>
> wrote:
>>
>> >
>> > I think it is reasonable to simply replace the key we currently use with
>> > MD5(key) for getting a size reduction.  In practice for my use cases, I
>> > have
>> > not observed any of the issues you mentioned under "Large size of
>> > overhead
>> > can limit the usability of PGO greatly", but I can understand that some
>> > of
>> > these issues could become problems in Google's use case. I would
>> > personally
>> > prefer to keep the existing behavior as the default (see below), and
>> > have
>> > MD5(key) as an option.
>>
>> The problem is that this requires profile format changes.
>
>
> Why? AFAIK the "function name" is just an arbitrary string. Using s or
> MD5(s) shouldn't matter. Of course, the user will need to pass consistent
> flags to clang.

The raw format for 64bit target can be made 'unchanged', but not for
the 32bit raw format -- the nameptr field is only 32bit.

The indexed format can not be made the same --  The ondisk profile
record layout totally changes. The key field changes from a blob of
chars into an 64bit integer.
>
>
>>
>> It will be
>> very messy to support multiple formats in instr-codegen and
>> instr-runtime.  For compatibility concerns, the reader is taught to
>> support previous format, but the changes there are isolated (also
>> expected to be removed in the future).
>>
>> >
>> > My primary concern is that if the function name are not kept at all
>> > stages,
>> > then it becomes difficult to analyze the profile data in a standalone
>> > way.
>> > Many times, I have used `llvm-profdata show -all-functions foo.profdata`
>> > on
>> > the resulting profile data and then imported that data into Mathematica
>> > for
>> > analysis.
>>
>> This is certainly a very valid use case.
>>
>> >My understanding of your proposal is that `llvm-profdata show
>> > -all-functions foo.profdata` will not show the actual function names but
>> > instead MD5 hashes,
>>
>> Yes.
>>
>> To support your use case, there are two solutions:
>>
>> 1) user can add -fcoverage-mapping option in the build
>> 2) introduce a new option -fprofile-instr-names that force the
>> emission of the name sections in the .o file. This is similar to 1),
>> but no covmap section is needed.
>>
>> llvm-profdata tool  will be taught to read the name section and attach
>> function names to the profile records.
>
>
> Needing to pass the executable to llvm-profdata would cause deployment
> issues for my customers in practice.

Why? The deployment needs to pass the profile data anyway right?  This
is no different from llvm-cov usage model.

David



>
>>
>>
>> Note that with 1) or 2), the user can still benefit from the reduced
>> profile size.
>
>
> Let me reiterate that the size of the profile is not a problem I have
> observed in practice (nor have I heard of this being a problem in practice
> until this thread). Therefore I'm skeptical of any changes to our default
> behavior or any new requirements that are not opt-in.
>
> -- Sean Silva
>
>>
>>
>> thanks,
>>
>> David
>>
>>
>>
>>
>> >which will make it more difficult for me to do this kind
>> > of analysis (would require using nm on the original binary, hashing
>> > everything, etc.).
>> >
>> > btw, feel free to attach the patch even if it in a rough state. It can
>> > still
>> > help to clarify the proposal and be a good talking point. Fine-grained
>> > patch
>> > review for caring about the rough parts will happen on llvm-commits; the
>> > rough parts will not distract the discussion here on llvm-dev.
>> >
>> > -- Sean Silva
>> >
>> >>
>> >>
>> >> thanks,
>> >>
>> >> David
>> >> _______________________________________________
>> >> LLVM Developers mailing list
>> >> llvm-dev at lists.llvm.org
>> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>> >
>> >
>
>


More information about the llvm-dev mailing list