[llvm-dev] RFC: Reducing Instr PGO size overhead

Mon Dec 14 09:31:55 PST 2015

Sorry about the typo  -- I meant indexed format version won't be changed.

thanks,

David

On Mon, Dec 14, 2015 at 8:55 AM, Bob Wilson <bob.wilson at apple.com> wrote:
>
>> On Dec 9, 2015, at 1:12 PM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> We are now very close to push the final stage of the PGO size
>> reduction task. Here is the updated plan going forward:
>>
>> 1) In this round, the format of the indexed profile data won't be unchanged.
>
> That’s a double negative. Was that intentional? I.E., are you changing the format?
>
>> 2) there will be *no* changes in user interfaces to all profiling
>> related tools including llvm-profdata, llvm-cov -- the change will be
>> transparent in terms of PGO usability.
>> 3) The implementation will be using  compression for the function name
>> section (the compression ratio is about 5:1). As a result, the linker
>> input object size, unstripped binary size, stripped binary size,
>> process image size, and raw profile data size will all be greatly
>> reduced;
>> 4) The change will also greatly improve usability of coverage-mapping
>> due to the reduced data size in memory.
>>
>> Before the final patch, there are a few more small preparation patches
>> : 1) abstract naming reading into a class (ProfSymtab) (currently the
>> reader uses/assumes the raw/uncompressed object section. 2) add
>> compression support in ProfSymtab.
>>
>> thanks,
>>
>> David
>>
>>
>>
>>
>>
>> On Wed, Oct 14, 2015 at 12:30 AM, Xinliang David Li <davidxl at google.com> wrote:
>>> I plan to divide the patch into series of small patches. The
>>> preparation patches will mostly be refactoring changes with NFC. This
>>> will minimize the size of final patch(es) with functional changes to
>>> help discussions.
>>>
>>> thanks,
>>>
>>> David
>>>
>>> On Fri, Oct 9, 2015 at 4:06 PM, Xinliang David Li <davidxl at google.com> wrote:
>>>> On Fri, Oct 9, 2015 at 3:58 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>>>>
>>>>>
>>>>> On Wed, Oct 7, 2015 at 11:12 PM, Xinliang David Li <davidxl at google.com>
>>>>> wrote:
>>>>>>
>>>>>> There is no further response to this, so I will assume general
>>>>>> direction of solution-3 is acceptable ;)
>>>>
>>>>> No response does not mean "LGTM".
>>>>>
>>>>
>>>> What I meant is that the discussion can be moved on to the formal code
>>>> review. I have not yet submitted the final patch for review yet.
>>>> Before that is ready, continue using this thread to voice your
>>>> concerns.
>>>>
>>>> David
>>>>
>>>>
>>>>
>>>>
>>>>> -- Sean Silva
>>>>>
>>>>>>
>>>>>>
>>>>>> Solution-3 can be further improved. Instead of using static symbol
>>>>>> table (with zero size __llvm_prf_nm symbols) to store function names
>>>>>> for profile display and coverage mapping, the function names can be
>>>>>> stored compressed in a non-allocatable section. The compression ratio
>>>>>> for function name strings can be very high (~5x).  The covmapping data
>>>>>> can also be made non-allocatable.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> David
>>>>>>
>>>>>> On Tue, Sep 29, 2015 at 9:52 AM, Xinliang David Li <davidxl at google.com>
>>>>>> wrote:
>>>>>>> Sorry for the late update. I finally found time to implement a solution
>>>>>>> (Solution-3) that has the best size savings (for both PGO and coverage
>>>>>>> testing) with symbolic information available.  Here is a re-cap of what
>>>>>>> we
>>>>>>> have discussed so far:
>>>>>>>
>>>>>>> Solution-1:
>>>>>>>
>>>>>>> This is the original solution proposed. In this solution, A function
>>>>>>> name's
>>>>>>> MD5 hash is used as the primary key (combined with IR hash) for function
>>>>>>> lookup. __llvm_prf_names section won't be emitted into the binary nor
>>>>>>> dumped
>>>>>>> into the profile data unless -fcoverage-mapping is also specified.
>>>>>>>
>>>>>>>    Pros:
>>>>>>>        1. maximal reduction of instrumentation binary process image
>>>>>>> size
>>>>>>>        2. maximal reduction of object and unstripped binary size
>>>>>>>        3. maximal reduction of raw profile data size
>>>>>>>        4. maximal reduction of indexed profile data size
>>>>>>>
>>>>>>>   Cons:
>>>>>>>        1. -fcoverage-mapping use case is not handled -- the size
>>>>>>> problem
>>>>>>> still exist
>>>>>>>        2. profile dump with llvm-profdata no longer have function names
>>>>>>> associated -- user needs to use postprocessing tool to get the
>>>>>>> functionality
>>>>>>>        3. function filtering with partial function name is not
>>>>>>> supported
>>>>>>>        4. Requires incompatible format change
>>>>>>>
>>>>>>>
>>>>>>> Solution-2: (http://reviews.llvm.org/D12715)
>>>>>>>
>>>>>>> In this solution, the MD5 hash string is used to replace the raw name
>>>>>>> string
>>>>>>>      Pros:
>>>>>>>          1. Very simple to implement
>>>>>>>          2. have good reduction of all sizes for typical large C++
>>>>>>> applications
>>>>>>>          3. No profile data format change is required.
>>>>>>>
>>>>>>>      Cons:
>>>>>>>          1. Still requires 16byte overhead per-function -- can actually
>>>>>>> hurt C programs
>>>>>>>          2. -fcoverage-mapping use case is still not handled
>>>>>>>          3. The problem with llvm-profdata still exists (no symbolic
>>>>>>> info,
>>>>>>> partial filtering support)
>>>>>>>
>>>>>>>
>>>>>>> Solution-3:
>>>>>>>
>>>>>>> This is the new solution I am proposing. It is basically an enhancement
>>>>>>> of
>>>>>>> Solution-1 with most of the weakness resolved.  The difference with
>>>>>>> Solution-1 is
>>>>>>>      1. Function name symbols are emitted into the symbol table as weak
>>>>>>> externs. They don't occupy any space at runtime and can be easily
>>>>>>> stripped.
>>>>>>>      2. -fcoverage-mapping does not need special handling -- it
>>>>>>> automatically benefit from the same size saving.
>>>>>>>      3. llvm-cov is changed to read symbol info from the symtab instead
>>>>>>> of
>>>>>>> reading them from the name section data
>>>>>>>      4. llvm-profdata is enhanced to take a binary as input and dump
>>>>>>> profile with names attached. Function filtering is fully supported
>>>>>>> (option
>>>>>>> can also be introduced to force dumping names into binary and profile
>>>>>>> data,
>>>>>>> so that llvm-profdata use case is not changed at all).
>>>>>>>
>>>>>>> Pros:
>>>>>>>       1. All the pros from Solution-1
>>>>>>>       2. Size savings for coverage-mapping case
>>>>>>> Cons:
>>>>>>>       Format change is required for profile data and coverage mapping.
>>>>>>>
>>>>>>> The initial patch is here: http://reviews.llvm.org/D13251
>>>>>>>
>>>>>>> With this patch, the size of a release clang binary with coverage
>>>>>>> mapping is
>>>>>>> reduced from 986M to 569M.
>>>>>>>
>>>>>>> If there are no major concerns,  I will carve out the patch into smaller
>>>>>>> ones for review.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> David
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Sep 8, 2015 at 3:47 PM, Xinliang David Li <davidxl at google.com>
>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> yes -- it is fixed length (8byte) blob which may include null
>>>>>>>>>>>> byte
>>>>>>>>>>>> in
>>>>>>>>>>>> the middle.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> For reference, MD5 sum is 16 bytes (128-bit):
>>>>>>>>>>> https://en.wikipedia.org/wiki/MD5
>>>>>>>>>>
>>>>>>>>>> yes, LLVM's MD5 hash only takes the lower 64bit.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Or to say it another way, suppose that Itanium mangling
>>>>>>>>>>>>> required
>>>>>>>>>>>>> as a
>>>>>>>>>>>>> final
>>>>>>>>>>>>> step to replace the string with its md5 sum in hex. Therefore
>>>>>>>>>>>>> all
>>>>>>>>>>>>> symbol
>>>>>>>>>>>>> names are "small". My understanding is that this is effectively
>>>>>>>>>>>>> all
>>>>>>>>>>>>> your
>>>>>>>>>>>>> patch is doing.
>>>>>>>>>>>>
>>>>>>>>>>>> The key type before the change is StringRef, while the after the
>>>>>>>>>>>> key
>>>>>>>>>>>> type is uint64_t. Are you suggesting treating uint64_t md5 sum
>>>>>>>>>>>> key
>>>>>>>>>>>> as
>>>>>>>>>>>> a string of 8 bytes or storing md5 has in text form which will
>>>>>>>>>>>> double
>>>>>>>>>>>> the size?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> How much does this change the benefit? If most of the benefit is
>>>>>>>>>>> avoiding
>>>>>>>>>>> extraordinarily long mangled names then it may be sufficient.
>>>>>>>>>>>
>>>>>>>>>>> With IR-level instrumentation like Rong is pursuing the size may
>>>>>>>>>>> be
>>>>>>>>>>> reduced
>>>>>>>>>>> sufficiently that we do not need the optimization proposed in this
>>>>>>>>>>> thread.
>>>>>>>>>>> For example, Rong found >2x size reduction on Google's C++
>>>>>>>>>>> benchmarks,
>>>>>>>>>>> which
>>>>>>>>>>> I assume are representative of the extremely large Google binaries
>>>>>>>>>>> that
>>>>>>>>>>> are
>>>>>>>>>>> causing the problems addressed by your proposal in this thread.
>>>>>>>>>>> The
>>>>>>>>>>> measurements you mention for Clang in this thread provide similar
>>>>>>>>>>> size
>>>>>>>>>>> reductions, so Rong's approach may be sufficient (especially
>>>>>>>>>>> because
>>>>>>>>>>> functions with extremely large mangled names tend to be small
>>>>>>>>>>> inline
>>>>>>>>>>> functions in header-only template libraries).
>>>>>>>>>>
>>>>>>>>>> Late instrumentation helps many cases. In some cases (as shown in
>>>>>>>>>> SPEC), the reduction in size is not as large. Reducing PGO overhead
>>>>>>>>>> will lower the bar for its adoption.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Of the points you mention in "Large size of overhead can limit the
>>>>>>>>>>> usability
>>>>>>>>>>> of PGO greatly", many of the issues are hard limits that prevent
>>>>>>>>>>> the
>>>>>>>>>>> use
>>>>>>>>>>> of
>>>>>>>>>>> PGO. Do you have a lower bound on how much the size of the PGO
>>>>>>>>>>> data
>>>>>>>>>>> must
>>>>>>>>>>> be
>>>>>>>>>>> reduced in order to overcome the hard limits?
>>>>>>>>>>
>>>>>>>>>> This is a static view:  Think about the situation where application
>>>>>>>>>> size is ever increasing; also think about situation where we want to
>>>>>>>>>>
>>>>>>>>>> collect more types of profile data. Think about situation where user
>>>>>>>>>> want to run pgo binaries on small devices with tiny memory/storage
>>>>>>>>>> ..
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> If we want to reduce memory overhead at runtime and reduce the size
>>>>>>>>> of
>>>>>>>>> the
>>>>>>>>> raw profile data extracted from the target, there are clear
>>>>>>>>> solutions.
>>>>>>>>> Consider that debug info does not need to be loaded into the memory
>>>>>>>>> image of
>>>>>>>>> the target; why should information identifying each counter need to
>>>>>>>>> be?
>>>>>>>>> A file containing raw profile counters is a subset of a core dump; in
>>>>>>>>> most
>>>>>>>>> environments, a core dump does not need to have debug info or symbol
>>>>>>>>> names
>>>>>>>>> in it, but can be still be read in full detail in conjunction with
>>>>>>>>> the
>>>>>>>>> original binary.
>>>>>>>>
>>>>>>>> Yes -- there are many alternatives:
>>>>>>>> 1) emit the name key mapping as a side data at compile time, or
>>>>>>>> 2) emit them into nonloadable sections of the object file.
>>>>>>>>
>>>>>>>> Compared with the above, LLVM's existing design does have its own
>>>>>>>> advantage -- making it easier for tool to access 'debug' info for
>>>>>>>> counters.
>>>>>>>>
>>>>>>>> LLVM's coverage testing, on the other hand, take a hybrid approach: It
>>>>>>>> emits the coverage map as rodata, but does not pass it to the profile
>>>>>>>> dumper. I think it is better to emit covmap as a side data not
>>>>>>>> attached to target binary.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thus, as we require that the binary be passed to llvm-profdata, there
>>>>>>>>> is
>>>>>>>>> no
>>>>>>>>> fundamental reason that the memory image of the program, or the raw
>>>>>>>>> data
>>>>>>>>> extracted from the program, must have any size overhead besides the
>>>>>>>>> raw
>>>>>>>>> values of the counters themselves and any text size increase for
>>>>>>>>> incrementing them. If we are willing to impose this requirement on
>>>>>>>>> users,
>>>>>>>>> then as far as reducing memory overhead at runtime and reducing the
>>>>>>>>> size
>>>>>>>>> of
>>>>>>>>> the raw profile data extracted from the target, using hashed function
>>>>>>>>> names
>>>>>>>>> is clearly the wrong direction.
>>>>>>>>>
>>>>>>>>> *Without* imposing the requirement of passing the binary to
>>>>>>>>> llvm-profdata, I
>>>>>>>>> do like the ability to use hashed function names like you are
>>>>>>>>> proposing.
>>>>>>>>> It
>>>>>>>>> is a simple solution for reducing size overhead of function name
>>>>>>>>> strings
>>>>>>>>> with little complexity, as it is just swapping one string for
>>>>>>>>> another.
>>>>>>>>
>>>>>>>> Agree. The good news is that the overhead of hashed function names is
>>>>>>>> small enough that makes this approach attractive.
>>>>>>>>
>>>>>>>> thanks,
>>>>>>>>
>>>>>>>> David
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Obviously LLVM must be able to support the extremely large
>>>>>>>>>>> binaries
>>>>>>>>>>> in
>>>>>>>>>>> your
>>>>>>>>>>> configuration (otherwise what use is LLVM as a compiler ;) My
>>>>>>>>>>> questions
>>>>>>>>>>> are
>>>>>>>>>>> primarily aimed at establishing which tradeoffs are acceptable for
>>>>>>>>>>> supporting this (both for LLVM and for you guys).
>>>>>>>>>>
>>>>>>>>>> As I said, with the modified proposal (after getting your feedback),
>>>>>>>>>> no PGO users in LLVM land is going to lose anything/functionality.
>>>>>>>>>> The
>>>>>>>>>> end result will be net win for general users of LLVM (even though
>>>>>>>>>> your
>>>>>>>>>> customers don't care about it), not just 'us' as you have mentioned
>>>>>>>>>> many times.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Btw, for us, the issue of PGO data size is not completely
>>>>>>>>>>> immaterial
>>>>>>>>>>> but
>>>>>>>>>>> is
>>>>>>>>>>> very different from your use case. For us, the primary issue is
>>>>>>>>>>> the
>>>>>>>>>>> additional memory use at run time, since PS4 games usually use
>>>>>>>>>>> "all"
>>>>>>>>>>> available memory. We had a problem with UBSan where the large
>>>>>>>>>>> amount
>>>>>>>>>>> of
>>>>>>>>>>> memory required for storing the UBSan diagnostic data at runtime
>>>>>>>>>>> required
>>>>>>>>>>> the game programmers to manually change their memory map to make
>>>>>>>>>>> room.
>>>>>>>>>>> +Filipe, do you remember how much memory UBSan was using that
>>>>>>>>>>> caused
>>>>>>>>>>> a
>>>>>>>>>>> problem?
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> My proposal does help reducing rodata size significantly.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Yes, that is why I think that this is a useful thing to do. I just
>>>>>>>>> want
>>>>>>>>> to
>>>>>>>>> be careful about existing use cases and the relevant workflow issues.
>>>>>>>>>
>>>>>>>>> -- Sean Silva
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> David
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> -- Sean Silva
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In the raw format, md5 sum key can be an embedded field in the
>>>>>>>>>>>> prf_data variable instead of as different var referenced by
>>>>>>>>>>>> prf_data.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> If this is not the case, you should show your current patch so
>>>>>>>>>>>>> that
>>>>>>>>>>>>> we
>>>>>>>>>>>>> can
>>>>>>>>>>>>> discuss things concretely.
>>>>>>>>>>>>
>>>>>>>>>>>> It is not. See above about the difference.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It will be
>>>>>>>>>>>>>>>> very messy to support multiple formats in instr-codegen and
>>>>>>>>>>>>>>>> instr-runtime.  For compatibility concerns, the reader is
>>>>>>>>>>>>>>>> taught
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> support previous format, but the changes there are isolated
>>>>>>>>>>>>>>>> (also
>>>>>>>>>>>>>>>> expected to be removed in the future).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> My primary concern is that if the function name are not
>>>>>>>>>>>>>>>>> kept
>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>> stages,
>>>>>>>>>>>>>>>>> then it becomes difficult to analyze the profile data in
>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>> standalone
>>>>>>>>>>>>>>>>> way.
>>>>>>>>>>>>>>>>> Many times, I have used `llvm-profdata show
>>>>>>>>>>>>>>>>> -all-functions
>>>>>>>>>>>>>>>>> foo.profdata`
>>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>> the resulting profile data and then imported that data
>>>>>>>>>>>>>>>>> into
>>>>>>>>>>>>>>>>> Mathematica
>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>> analysis.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is certainly a very valid use case.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> My understanding of your proposal is that `llvm-profdata
>>>>>>>>>>>>>>>>> show
>>>>>>>>>>>>>>>>> -all-functions foo.profdata` will not show the actual
>>>>>>>>>>>>>>>>> function
>>>>>>>>>>>>>>>>> names
>>>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>> instead MD5 hashes,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> To support your use case, there are two solutions:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1) user can add -fcoverage-mapping option in the build
>>>>>>>>>>>>>>>> 2) introduce a new option -fprofile-instr-names that force
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> emission of the name sections in the .o file. This is
>>>>>>>>>>>>>>>> similar
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>> 1),
>>>>>>>>>>>>>>>> but no covmap section is needed.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> llvm-profdata tool  will be taught to read the name section
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> attach
>>>>>>>>>>>>>>>> function names to the profile records.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Needing to pass the executable to llvm-profdata would cause
>>>>>>>>>>>>>>> deployment
>>>>>>>>>>>>>>> issues for my customers in practice.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Why? The deployment needs to pass the profile data anyway
>>>>>>>>>>>>>> right?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, but not the executable.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The PGO training run is likely being run by a gameplay tester
>>>>>>>>>>>>> (non-programmer). In general the binary will not be lying
>>>>>>>>>>>>> around
>>>>>>>>>>>>> as a
>>>>>>>>>>>>> loose
>>>>>>>>>>>>> file anywhere, it will be part of a full package of the
>>>>>>>>>>>>> binary+assets
>>>>>>>>>>>>> (think
>>>>>>>>>>>>> like what will end up on a bluray disc). A game's binary
>>>>>>>>>>>>> *completely
>>>>>>>>>>>>> useless* without the assets, so except locally on a
>>>>>>>>>>>>> programmer's
>>>>>>>>>>>>> machine
>>>>>>>>>>>>> while they iterate/debug, there is no reason for a binary to
>>>>>>>>>>>>> ever
>>>>>>>>>>>>> exist
>>>>>>>>>>>>> as a
>>>>>>>>>>>>> standalone file.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not saying that needing the binary is insurmountable in any
>>>>>>>>>>>>> particular
>>>>>>>>>>>>> scenario. Just that it will cause a strict increase in the
>>>>>>>>>>>>> number
>>>>>>>>>>>>> of
>>>>>>>>>>>>> issues
>>>>>>>>>>>>> to deploying PGO.
>>>>>>>>>>>>
>>>>>>>>>>>> Your concern is acknowledged.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> These are much bigger "compatibility concerns" for me than for
>>>>>>>>>>>>> newer
>>>>>>>>>>>>> toolchains to accept the old format. For a change in format I
>>>>>>>>>>>>> can
>>>>>>>>>>>>> easily
>>>>>>>>>>>>> tell my users to replace an exe with a newer one and that is
>>>>>>>>>>>>> all
>>>>>>>>>>>>> they
>>>>>>>>>>>>> need
>>>>>>>>>>>>> to do and it takes 10 seconds, guaranteed. A workflow change is
>>>>>>>>>>>>> potentially
>>>>>>>>>>>>> a massive disruption and guaranteed to take more than 10
>>>>>>>>>>>>> seconds
>>>>>>>>>>>>> to
>>>>>>>>>>>>> fix
>>>>>>>>>>>>> (perhaps hours or days).
>>>>>>>>>>>>
>>>>>>>>>>>> ok.
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>  This
>>>>>>>>>>>>>> is no different from llvm-cov usage model.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> In practice, getting the performance of PGO is a higher
>>>>>>>>>>>>> priority
>>>>>>>>>>>>> for
>>>>>>>>>>>>> my
>>>>>>>>>>>>> users, so we should not assume that llvm-cov is being used.
>>>>>>>>>>>>
>>>>>>>>>>>> Glad to hear that :)
>>>>>>>>>>>>
>>>>>>>>>>>> thanks,
>>>>>>>>>>>>
>>>>>>>>>>>> David
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- Sean Silva
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Note that with 1) or 2), the user can still benefit from
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> reduced
>>>>>>>>>>>>>>>> profile size.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Let me reiterate that the size of the profile is not a
>>>>>>>>>>>>>>> problem
>>>>>>>>>>>>>>> I
>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>> observed in practice (nor have I heard of this being a
>>>>>>>>>>>>>>> problem
>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>> practice
>>>>>>>>>>>>>>> until this thread). Therefore I'm skeptical of any changes
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>> our
>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>> behavior or any new requirements that are not opt-in.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -- Sean Silva
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> which will make it more difficult for me to do this kind
>>>>>>>>>>>>>>>>> of analysis (would require using nm on the original
>>>>>>>>>>>>>>>>> binary,
>>>>>>>>>>>>>>>>> hashing
>>>>>>>>>>>>>>>>> everything, etc.).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> btw, feel free to attach the patch even if it in a rough
>>>>>>>>>>>>>>>>> state.
>>>>>>>>>>>>>>>>> It
>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>> still
>>>>>>>>>>>>>>>>> help to clarify the proposal and be a good talking point.
>>>>>>>>>>>>>>>>> Fine-grained
>>>>>>>>>>>>>>>>> patch
>>>>>>>>>>>>>>>>> review for caring about the rough parts will happen on
>>>>>>>>>>>>>>>>> llvm-commits;
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> rough parts will not distract the discussion here on
>>>>>>>>>>>>>>>>> llvm-dev.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -- Sean Silva
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>>> _______________________________________________
>>>>>>>>>>>>>>>>>> LLVM Developers mailing list
>>>>>>>>>>>>>>>>>> llvm-dev at lists.llvm.org
>>>>>>>>>>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>