[llvm-dev] RFC: Reducing Instr PGO size overhead
Bob Wilson via llvm-dev
llvm-dev at lists.llvm.org
Mon Dec 14 08:55:24 PST 2015
> On Dec 9, 2015, at 1:12 PM, Xinliang David Li via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> We are now very close to push the final stage of the PGO size
> reduction task. Here is the updated plan going forward:
>
> 1) In this round, the format of the indexed profile data won't be unchanged.
That’s a double negative. Was that intentional? I.e., are you changing the format?
> 2) There will be *no* changes to the user interfaces of any
> profiling-related tools, including llvm-profdata and llvm-cov -- the
> change will be transparent in terms of PGO usability.
> 3) The implementation will use compression for the function name
> section (the compression ratio is about 5:1). As a result, the linker
> input object size, unstripped binary size, stripped binary size,
> process image size, and raw profile data size will all be greatly
> reduced.
> 4) The change will also greatly improve the usability of coverage
> mapping due to the reduced data size in memory.
>
> Before the final patch, there are a few more small preparation patches:
> 1) abstract name reading into a class (ProfSymtab) (currently the
> reader uses/assumes the raw/uncompressed object section); 2) add
> compression support in ProfSymtab.
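
For readers who want to see concretely what this name compression could look like, here is a minimal, self-contained sketch using raw zlib. This is illustrative only -- the function, separator byte, and compression level are my own choices, not the actual ProfSymtab implementation:

  #include <zlib.h>
  #include <string>
  #include <vector>

  // Concatenate all PGO function names into one blob and deflate it with
  // zlib.  Mangled C++ names share long common substrings, which is where a
  // ratio on the order of the quoted 5:1 can come from.
  static bool compressFuncNames(const std::vector<std::string> &Names,
                                std::vector<unsigned char> &Out) {
    std::string Blob;
    for (const std::string &N : Names) {
      Blob += N;
      Blob += '\1'; // separator byte that cannot appear in a mangled name
    }
    uLongf DestLen = compressBound(Blob.size());
    Out.resize(DestLen);
    if (compress2(Out.data(), &DestLen,
                  reinterpret_cast<const Bytef *>(Blob.data()), Blob.size(),
                  Z_BEST_COMPRESSION) != Z_OK)
      return false;
    Out.resize(DestLen); // Blob.size() / Out.size() gives the compression ratio
    return true;
  }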
>
> thanks,
>
> David
>
>
>
>
>
> On Wed, Oct 14, 2015 at 12:30 AM, Xinliang David Li <davidxl at google.com> wrote:
>> I plan to divide the patch into a series of small patches. The
>> preparation patches will mostly be refactoring changes with NFC. This
>> will minimize the size of the final patch(es) with functional changes,
>> to help the discussion.
>>
>> thanks,
>>
>> David
>>
>> On Fri, Oct 9, 2015 at 4:06 PM, Xinliang David Li <davidxl at google.com> wrote:
>>> On Fri, Oct 9, 2015 at 3:58 PM, Sean Silva <chisophugis at gmail.com> wrote:
>>>>
>>>>
>>>> On Wed, Oct 7, 2015 at 11:12 PM, Xinliang David Li <davidxl at google.com>
>>>> wrote:
>>>>>
>>>>> There is no further response to this, so I will assume the general
>>>>> direction of Solution-3 is acceptable ;)
>>>
>>>> No response does not mean "LGTM".
>>>>
>>>
>>> What I meant is that the discussion can move on to the formal code
>>> review. I have not submitted the final patch for review yet. Until
>>> that is ready, please continue using this thread to voice your
>>> concerns.
>>>
>>> David
>>>
>>>
>>>
>>>
>>>> -- Sean Silva
>>>>
>>>>>
>>>>>
>>>>> Solution-3 can be further improved. Instead of using the static symbol
>>>>> table (with zero-size __llvm_prf_nm symbols) to store function names
>>>>> for profile display and coverage mapping, the function names can be
>>>>> stored compressed in a non-allocatable section. The compression ratio
>>>>> for function name strings can be very high (~5x). The covmapping data
>>>>> can also be made non-allocatable.
>>>>>
>>>>> thanks,
>>>>>
>>>>> David
>>>>>
>>>>> On Tue, Sep 29, 2015 at 9:52 AM, Xinliang David Li <davidxl at google.com>
>>>>> wrote:
>>>>>> Sorry for the late update. I finally found time to implement a solution
>>>>>> (Solution-3) that has the best size savings (for both PGO and coverage
>>>>>> testing) with symbolic information available. Here is a re-cap of what
>>>>>> we
>>>>>> have discussed so far:
>>>>>>
>>>>>> Solution-1:
>>>>>>
>>>>>> This is the original solution proposed. In this solution, a function
>>>>>> name's MD5 hash is used as the primary key (combined with the IR hash)
>>>>>> for function lookup. The __llvm_prf_names section won't be emitted into
>>>>>> the binary or dumped into the profile data unless -fcoverage-mapping is
>>>>>> also specified.
>>>>>>
>>>>>> Pros:
>>>>>>     1. maximal reduction of instrumentation binary process image size
>>>>>>     2. maximal reduction of object and unstripped binary size
>>>>>>     3. maximal reduction of raw profile data size
>>>>>>     4. maximal reduction of indexed profile data size
>>>>>>
>>>>>> Cons:
>>>>>>     1. the -fcoverage-mapping use case is not handled -- the size
>>>>>>        problem still exists
>>>>>>     2. profile dumps with llvm-profdata no longer have function names
>>>>>>        associated -- the user needs a post-processing tool to get that
>>>>>>        functionality
>>>>>>     3. function filtering with partial function names is not supported
>>>>>>     4. requires an incompatible format change
>>>>>>
>>>>>>
>>>>>> Solution-2: (http://reviews.llvm.org/D12715)
>>>>>>
>>>>>> In this solution, the MD5 hash string is used to replace the raw name
>>>>>> string.
>>>>>> Pros:
>>>>>>     1. very simple to implement
>>>>>>     2. good reduction of all sizes for typical large C++ applications
>>>>>>     3. no profile data format change is required
>>>>>>
>>>>>> Cons:
>>>>>>     1. still requires a 16-byte overhead per function -- can actually
>>>>>>        hurt C programs
>>>>>>     2. the -fcoverage-mapping use case is still not handled
>>>>>>     3. the problems with llvm-profdata still exist (no symbolic info,
>>>>>>        partial filtering support)
>>>>>>
>>>>>>
>>>>>> Solution-3:
>>>>>>
>>>>>> This is the new solution I am proposing. It is basically an enhancement
>>>>>> of Solution-1 with most of its weaknesses resolved. The differences from
>>>>>> Solution-1 are:
>>>>>>     1. Function name symbols are emitted into the symbol table as weak
>>>>>>        externs. They don't occupy any space at runtime and can be easily
>>>>>>        stripped.
>>>>>>     2. -fcoverage-mapping does not need special handling -- it
>>>>>>        automatically benefits from the same size savings.
>>>>>>     3. llvm-cov is changed to read symbol info from the symtab instead
>>>>>>        of reading it from the name section data.
>>>>>>     4. llvm-profdata is enhanced to take a binary as input and dump the
>>>>>>        profile with names attached. Function filtering is fully
>>>>>>        supported (an option can also be introduced to force dumping
>>>>>>        names into the binary and profile data, so that the llvm-profdata
>>>>>>        use case is not changed at all).
>>>>>>
>>>>>> Pros:
>>>>>>     1. all the pros of Solution-1
>>>>>>     2. size savings for the coverage-mapping case
>>>>>> Cons:
>>>>>>     A format change is required for the profile data and coverage
>>>>>>     mapping.
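
As an aside, here is a rough sketch of how point 1 above (zero-size weak name symbols) could be done at the IR level. The symbol prefix and linkage below are my guesses at the intent, purely for illustration, not necessarily what the patch does:

  #include "llvm/ADT/StringRef.h"
  #include "llvm/ADT/Twine.h"
  #include "llvm/IR/Constants.h"
  #include "llvm/IR/DerivedTypes.h"
  #include "llvm/IR/GlobalVariable.h"
  #include "llvm/IR/Module.h"

  // Create a zero-size, weak, hidden global whose *name* encodes the PGO
  // function name.  It adds an entry to the symbol table (so offline tools
  // can recover the name) but occupies no space in the process image and is
  // discarded by stripping.
  static llvm::GlobalVariable *emitNameSymbol(llvm::Module &M,
                                              llvm::StringRef PGOFuncName) {
    auto *Ty = llvm::ArrayType::get(llvm::Type::getInt8Ty(M.getContext()), 0);
    auto *GV = new llvm::GlobalVariable(
        M, Ty, /*isConstant=*/true, llvm::GlobalValue::WeakAnyLinkage,
        llvm::Constant::getNullValue(Ty), "__llvm_prf_nm_" + PGOFuncName);
    GV->setVisibility(llvm::GlobalValue::HiddenVisibility);
    return GV;
  }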
>>>>>>
>>>>>> The initial patch is here: http://reviews.llvm.org/D13251
>>>>>>
>>>>>> With this patch, the size of a release clang binary with coverage
>>>>>> mapping is
>>>>>> reduced from 986M to 569M.
>>>>>>
>>>>>> If there are no major concerns, I will carve out the patch into smaller
>>>>>> ones for review.
>>>>>>
>>>>>> thanks,
>>>>>>
>>>>>> David
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Sep 8, 2015 at 3:47 PM, Xinliang David Li <davidxl at google.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> yes -- it is a fixed-length (8-byte) blob which may include null
>>>>>>>>>>> bytes in the middle.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> For reference, an MD5 sum is 16 bytes (128 bits):
>>>>>>>>>> https://en.wikipedia.org/wiki/MD5
>>>>>>>>>
>>>>>>>>> yes, LLVM's MD5 hash only takes the lower 64 bits.
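
For concreteness, the 64-bit key is just the low half of the 128-bit MD5 digest of the PGO function name. A small sketch, assuming a recent llvm::MD5 interface with MD5Result::low() (the real helper lives in the profiling headers):

  #include "llvm/ADT/StringRef.h"
  #include "llvm/Support/MD5.h"
  #include <cstdint>

  // Hash a (possibly file-name-prefixed) PGO function name down to the
  // 64-bit key used for function lookup.
  static uint64_t hashFuncName(llvm::StringRef PGOFuncName) {
    llvm::MD5 Hash;
    Hash.update(PGOFuncName);
    llvm::MD5::MD5Result Result;
    Hash.final(Result);
    return Result.low(); // keep only the lower 64 bits of the 128-bit digest
  }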
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Or to say it another way, suppose that Itanium mangling required,
>>>>>>>>>>>> as a final step, replacing the string with its md5 sum in hex;
>>>>>>>>>>>> then all symbol names would be "small". My understanding is that
>>>>>>>>>>>> this is effectively all your patch is doing.
>>>>>>>>>>>
>>>>>>>>>>> The key type before the change is StringRef, while after the
>>>>>>>>>>> change the key type is uint64_t. Are you suggesting treating the
>>>>>>>>>>> uint64_t md5 sum key as a string of 8 bytes, or storing the md5
>>>>>>>>>>> hash in text form, which would double the size?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> How much does this change the benefit? If most of the benefit is
>>>>>>>>>> avoiding extraordinarily long mangled names then it may be
>>>>>>>>>> sufficient.
>>>>>>>>>>
>>>>>>>>>> With IR-level instrumentation like Rong is pursuing, the size may be
>>>>>>>>>> reduced sufficiently that we do not need the optimization proposed
>>>>>>>>>> in this thread. For example, Rong found a >2x size reduction on
>>>>>>>>>> Google's C++ benchmarks, which I assume are representative of the
>>>>>>>>>> extremely large Google binaries that are causing the problems
>>>>>>>>>> addressed by your proposal in this thread. The measurements you
>>>>>>>>>> mention for Clang in this thread provide similar size reductions, so
>>>>>>>>>> Rong's approach may be sufficient (especially because functions with
>>>>>>>>>> extremely large mangled names tend to be small inline functions in
>>>>>>>>>> header-only template libraries).
>>>>>>>>>
>>>>>>>>> Late instrumentation helps many cases. In some cases (as shown in
>>>>>>>>> SPEC), the reduction in size is not as large. Reducing PGO overhead
>>>>>>>>> will lower the bar for its adoption.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Of the points you mention in "Large size of overhead can limit the
>>>>>>>>>> usability of PGO greatly", many of the issues are hard limits that
>>>>>>>>>> prevent the use of PGO. Do you have a lower bound on how much the
>>>>>>>>>> size of the PGO data must be reduced in order to overcome the hard
>>>>>>>>>> limits?
>>>>>>>>>
>>>>>>>>> This is a static view: think about the situation where application
>>>>>>>>> size is ever increasing; also think about the situation where we want
>>>>>>>>> to collect more types of profile data; and think about the situation
>>>>>>>>> where users want to run PGO binaries on small devices with tiny
>>>>>>>>> memory/storage...
>>>>>>>>
>>>>>>>>
>>>>>>>> If we want to reduce memory overhead at runtime and reduce the size
>>>>>>>> of the raw profile data extracted from the target, there are clear
>>>>>>>> solutions. Consider that debug info does not need to be loaded into
>>>>>>>> the memory image of the target; why should information identifying
>>>>>>>> each counter need to be? A file containing raw profile counters is a
>>>>>>>> subset of a core dump; in most environments, a core dump does not need
>>>>>>>> to have debug info or symbol names in it, but can still be read in
>>>>>>>> full detail in conjunction with the original binary.
>>>>>>>
>>>>>>> Yes -- there are many alternatives:
>>>>>>> 1) emit the name-key mapping as side data at compile time, or
>>>>>>> 2) emit it into non-loadable sections of the object file.
>>>>>>>
>>>>>>> Compared with the above, LLVM's existing design does have its own
>>>>>>> advantage -- making it easier for tools to access 'debug' info for
>>>>>>> counters.
>>>>>>>
>>>>>>> LLVM's coverage testing, on the other hand, takes a hybrid approach: it
>>>>>>> emits the coverage map as rodata, but does not pass it to the profile
>>>>>>> dumper. I think it would be better to emit the covmap as side data not
>>>>>>> attached to the target binary.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thus, if we require that the binary be passed to llvm-profdata, there
>>>>>>>> is no fundamental reason that the memory image of the program, or the
>>>>>>>> raw data extracted from the program, must have any size overhead
>>>>>>>> besides the raw values of the counters themselves and any text size
>>>>>>>> increase for incrementing them. If we are willing to impose this
>>>>>>>> requirement on users, then as far as reducing memory overhead at
>>>>>>>> runtime and reducing the size of the raw profile data extracted from
>>>>>>>> the target, using hashed function names is clearly the wrong
>>>>>>>> direction.
>>>>>>>>
>>>>>>>> *Without* imposing the requirement of passing the binary to
>>>>>>>> llvm-profdata, I do like the ability to use hashed function names like
>>>>>>>> you are proposing. It is a simple solution for reducing the size
>>>>>>>> overhead of function name strings with little complexity, as it is
>>>>>>>> just swapping one string for another.
>>>>>>>
>>>>>>> Agree. The good news is that the overhead of hashed function names is
>>>>>>> small enough to make this approach attractive.
>>>>>>>
>>>>>>> thanks,
>>>>>>>
>>>>>>> David
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Obviously LLVM must be able to support the extremely large binaries
>>>>>>>>>> in your configuration (otherwise what use is LLVM as a compiler ;)
>>>>>>>>>> My questions are primarily aimed at establishing which tradeoffs are
>>>>>>>>>> acceptable for supporting this (both for LLVM and for you guys).
>>>>>>>>>
>>>>>>>>> As I said, with the modified proposal (after getting your feedback),
>>>>>>>>> no PGO user in LLVM land is going to lose any functionality. The end
>>>>>>>>> result will be a net win for general users of LLVM (even though your
>>>>>>>>> customers don't care about it), not just for 'us', as you have
>>>>>>>>> mentioned many times.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Btw, for us, the issue of PGO data size is not completely
>>>>>>>>>> immaterial, but it is very different from your use case. For us, the
>>>>>>>>>> primary issue is the additional memory use at run time, since PS4
>>>>>>>>>> games usually use "all" available memory. We had a problem with
>>>>>>>>>> UBSan where the large amount of memory required for storing the
>>>>>>>>>> UBSan diagnostic data at runtime required the game programmers to
>>>>>>>>>> manually change their memory map to make room. +Filipe, do you
>>>>>>>>>> remember how much memory UBSan was using that caused a problem?
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> My proposal does help reduce rodata size significantly.
>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, that is why I think this is a useful thing to do. I just want
>>>>>>>> to be careful about existing use cases and the relevant workflow
>>>>>>>> issues.
>>>>>>>>
>>>>>>>> -- Sean Silva
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> David
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -- Sean Silva
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> In the raw format, the md5 sum key can be an embedded field in the
>>>>>>>>>>> prf_data variable instead of a separate variable referenced by
>>>>>>>>>>> prf_data.
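
To make the layout point concrete, here is a hypothetical before/after of the per-function record (field names and layout are illustrative, not the actual __llvm_profile_data definition):

  #include <cstdint>

  // Before: the record references a separately emitted name string.
  struct ProfDataWithNamePtr {
    const char *NamePtr;   // points into the __llvm_prf_names section
    uint32_t NameSize;
    uint64_t FuncHash;     // structural hash of the function body
    uint64_t *Counters;    // this function's counter array
    uint32_t NumCounters;
  };

  // After: the 64-bit MD5 of the name is embedded directly in the record,
  // so no separate name variable needs to be referenced (or emitted).
  struct ProfDataWithNameKey {
    uint64_t NameRef;      // low 64 bits of MD5(function name)
    uint64_t FuncHash;
    uint64_t *Counters;
    uint32_t NumCounters;
  };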
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> If this is not the case, you should show your current patch so
>>>>>>>>>>>> that we can discuss things concretely.
>>>>>>>>>>>
>>>>>>>>>>> It is not. See above about the difference.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It will be very messy to support multiple formats in
>>>>>>>>>>>>>>> instr-codegen and instr-runtime. For compatibility, the reader
>>>>>>>>>>>>>>> is taught to support the previous format, but the changes there
>>>>>>>>>>>>>>> are isolated (and are expected to be removed in the future).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My primary concern is that if the function names are not kept
>>>>>>>>>>>>>>>> at all stages, then it becomes difficult to analyze the profile
>>>>>>>>>>>>>>>> data in a standalone way. Many times, I have used `llvm-profdata
>>>>>>>>>>>>>>>> show -all-functions foo.profdata` on the resulting profile data
>>>>>>>>>>>>>>>> and then imported that data into Mathematica for analysis.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is certainly a very valid use case.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My understanding of your proposal is that `llvm-profdata show
>>>>>>>>>>>>>>>> -all-functions foo.profdata` will not show the actual function
>>>>>>>>>>>>>>>> names but instead MD5 hashes,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> To support your use case, there are two solutions:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1) the user can add the -fcoverage-mapping option to the build
>>>>>>>>>>>>>>> 2) introduce a new option, -fprofile-instr-names, that forces
>>>>>>>>>>>>>>>    the emission of the name section in the .o file. This is
>>>>>>>>>>>>>>>    similar to 1), but no covmap section is needed.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The llvm-profdata tool will be taught to read the name section
>>>>>>>>>>>>>>> and attach function names to the profile records.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Needing to pass the executable to llvm-profdata would cause
>>>>>>>>>>>>>> deployment issues for my customers in practice.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Why? The deployment needs to pass the profile data anyway,
>>>>>>>>>>>>> right?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Yes, but not the executable.
>>>>>>>>>>>>
>>>>>>>>>>>> The PGO training run is likely being run by a gameplay tester
>>>>>>>>>>>> (non-programmer). In general the binary will not be lying around
>>>>>>>>>>>> as a loose file anywhere; it will be part of a full package of
>>>>>>>>>>>> binary+assets (think of what will end up on a Blu-ray disc). A
>>>>>>>>>>>> game's binary is *completely useless* without the assets, so
>>>>>>>>>>>> except locally on a programmer's machine while they iterate/debug,
>>>>>>>>>>>> there is no reason for a binary to ever exist as a standalone
>>>>>>>>>>>> file.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not saying that needing the binary is insurmountable in any
>>>>>>>>>>>> particular scenario, just that it will cause a strict increase in
>>>>>>>>>>>> the number of issues in deploying PGO.
>>>>>>>>>>>
>>>>>>>>>>> Your concern is acknowledged.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> These are much bigger "compatibility concerns" for me than having
>>>>>>>>>>>> newer toolchains accept the old format. For a change in format, I
>>>>>>>>>>>> can easily tell my users to replace an exe with a newer one; that
>>>>>>>>>>>> is all they need to do, and it takes 10 seconds, guaranteed. A
>>>>>>>>>>>> workflow change is potentially a massive disruption and guaranteed
>>>>>>>>>>>> to take more than 10 seconds to fix (perhaps hours or days).
>>>>>>>>>>>
>>>>>>>>>>> ok.
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is no different from the llvm-cov usage model.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> In practice, getting the performance benefit of PGO is a higher
>>>>>>>>>>>> priority for my users, so we should not assume that llvm-cov is
>>>>>>>>>>>> being used.
>>>>>>>>>>>
>>>>>>>>>>> Glad to hear that :)
>>>>>>>>>>>
>>>>>>>>>>> thanks,
>>>>>>>>>>>
>>>>>>>>>>> David
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -- Sean Silva
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> David
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Note that with 1) or 2), the user can still benefit from the
>>>>>>>>>>>>>>> reduced profile size.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let me reiterate that the size of the profile is not a problem
>>>>>>>>>>>>>> I have observed in practice (nor have I heard of this being a
>>>>>>>>>>>>>> problem in practice until this thread). Therefore I'm skeptical
>>>>>>>>>>>>>> of any changes to our default behavior or any new requirements
>>>>>>>>>>>>>> that are not opt-in.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Sean Silva
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> which will make it more difficult for me to do this kind of
>>>>>>>>>>>>>>>> analysis (it would require using nm on the original binary,
>>>>>>>>>>>>>>>> hashing everything, etc.).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> btw, feel free to attach the patch even if it is in a rough
>>>>>>>>>>>>>>>> state. It can still help to clarify the proposal and be a good
>>>>>>>>>>>>>>>> talking point. Fine-grained patch review for caring about the
>>>>>>>>>>>>>>>> rough parts will happen on llvm-commits; the rough parts will
>>>>>>>>>>>>>>>> not distract the discussion here on llvm-dev.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -- Sean Silva
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> thanks,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> David
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>>
>>>>
>>>>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev