<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Sep 7, 2015 at 10:38 PM, Xinliang David Li <span dir="ltr"><<a href="mailto:davidxl@google.com" target="_blank">davidxl@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">>> The key type before the change is StringRef, while the after the key<br>

>> type is uint64_t. Are you suggesting treating uint64_t md5 sum key as<br>

>> a string of 8 bytes or storing md5 has in text form which will double<br>

>> the size?<br>

><br>

><br>

> How much does this change the benefit? If most of the benefit is avoiding<br>

> extraordinarily long mangled names then it may be sufficient.<br>

<br>

</span>I implemented the same functionality using the approach suggested by you.<br>

<br>

1) There is no format change required for raw/indexed profile data<br>

2) when a user option is specified, md5hash string is used as the<br>

function key instead of of the raw function name<br>

<br>

In addition to that,  the option specified is also passed to the<br>

runtime and emitted into profile data. This allows different variants<br>

of the profile data encoded in the same format  to be automatically<br>

recognized by the client (clang -fprofile-use, llvm-profdata etc)<br>

without the need to specify the option again.<br></blockquote><div><br></div><div>Okay, this seems reasonable if it does not add a huge amount of complexity.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

There are some size increase, but increase seems to be in the range<br>

that is acceptable. Using clang as the test case. The indexed profile<br>

size is now 33.6 M (compared with 29.5M which is the optimal case).<br>

The raw profile size is 54.1M (compared with 40.1M from previous<br>

patch).<br></blockquote><div><br></div><div>Thanks for looking into this. If this is acceptable for you, then I think this approach is simpler.</div><div><br></div><div>-- Sean Silva</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<span class="HOEnZb"><font color="#888888"><br>

David<br>

</font></span><div class="HOEnZb"><div class="h5"><br>

><br>

> With IR-level instrumentation like Rong is pursuing the size may be reduced<br>

> sufficiently that we do not need the optimization proposed in this thread.<br>

> For example, Rong found >2x size reduction on Google's C++ benchmarks, which<br>

> I assume are representative of the extremely large Google binaries that are<br>

> causing the problems addressed by your proposal in this thread. The<br>

> measurements you mention for Clang in this thread provide similar size<br>

> reductions, so Rong's approach may be sufficient (especially because<br>

> functions with extremely large mangled names tend to be small inline<br>

> functions in header-only template libraries).<br>

><br>

> Of the points you mention in "Large size of overhead can limit the usability<br>

> of PGO greatly", many of the issues are hard limits that prevent the use of<br>

> PGO. Do you have a lower bound on how much the size of the PGO data must be<br>

> reduced in order to overcome the hard limits?<br>

><br>

> Obviously LLVM must be able to support the extremely large binaries in your<br>

> configuration (otherwise what use is LLVM as a compiler ;) My questions are<br>

> primarily aimed at establishing which tradeoffs are acceptable for<br>

> supporting this (both for LLVM and for you guys).<br>

><br>

> Btw, for us, the issue of PGO data size is not completely immaterial but is<br>

> very different from your use case. For us, the primary issue is the<br>

> additional memory use at run time, since PS4 games usually use "all"<br>

> available memory. We had a problem with UBSan where the large amount of<br>

> memory required for storing the UBSan diagnostic data at runtime required<br>

> the game programmers to manually change their memory map to make room.<br>

> +Filipe, do you remember how much memory UBSan was using that caused a<br>

> problem?<br>

><br>

> -- Sean Silva<br>

><br>

>><br>

>><br>

>> In the raw format, md5 sum key can be an embedded field in the<br>

>> prf_data variable instead of as different var referenced by prf_data.<br>

>><br>

>> ><br>

>> > If this is not the case, you should show your current patch so that we<br>

>> > can<br>

>> > discuss things concretely.<br>

>><br>

>> It is not. See above about the difference.<br>

>><br>

>> ><br>

>> >><br>

>> >> ><br>

>> >> ><br>

>> >> >><br>

>> >> >> It will be<br>

>> >> >> very messy to support multiple formats in instr-codegen and<br>

>> >> >> instr-runtime.  For compatibility concerns, the reader is taught to<br>

>> >> >> support previous format, but the changes there are isolated (also<br>

>> >> >> expected to be removed in the future).<br>

>> >> >><br>

>> >> >> ><br>

>> >> >> > My primary concern is that if the function name are not kept at<br>

>> >> >> > all<br>

>> >> >> > stages,<br>

>> >> >> > then it becomes difficult to analyze the profile data in a<br>

>> >> >> > standalone<br>

>> >> >> > way.<br>

>> >> >> > Many times, I have used `llvm-profdata show -all-functions<br>

>> >> >> > foo.profdata`<br>

>> >> >> > on<br>

>> >> >> > the resulting profile data and then imported that data into<br>

>> >> >> > Mathematica<br>

>> >> >> > for<br>

>> >> >> > analysis.<br>

>> >> >><br>

>> >> >> This is certainly a very valid use case.<br>

>> >> >><br>

>> >> >> >My understanding of your proposal is that `llvm-profdata show<br>

>> >> >> > -all-functions foo.profdata` will not show the actual function<br>

>> >> >> > names<br>

>> >> >> > but<br>

>> >> >> > instead MD5 hashes,<br>

>> >> >><br>

>> >> >> Yes.<br>

>> >> >><br>

>> >> >> To support your use case, there are two solutions:<br>

>> >> >><br>

>> >> >> 1) user can add -fcoverage-mapping option in the build<br>

>> >> >> 2) introduce a new option -fprofile-instr-names that force the<br>

>> >> >> emission of the name sections in the .o file. This is similar to 1),<br>

>> >> >> but no covmap section is needed.<br>

>> >> >><br>

>> >> >> llvm-profdata tool  will be taught to read the name section and<br>

>> >> >> attach<br>

>> >> >> function names to the profile records.<br>

>> >> ><br>

>> >> ><br>

>> >> > Needing to pass the executable to llvm-profdata would cause<br>

>> >> > deployment<br>

>> >> > issues for my customers in practice.<br>

>> >><br>

>> >> Why? The deployment needs to pass the profile data anyway right?<br>

>> ><br>

>> ><br>

>> > Yes, but not the executable.<br>

>> ><br>

>> > The PGO training run is likely being run by a gameplay tester<br>

>> > (non-programmer). In general the binary will not be lying around as a<br>

>> > loose<br>

>> > file anywhere, it will be part of a full package of the binary+assets<br>

>> > (think<br>

>> > like what will end up on a bluray disc). A game's binary *completely<br>

>> > useless* without the assets, so except locally on a programmer's machine<br>

>> > while they iterate/debug, there is no reason for a binary to ever exist<br>

>> > as a<br>

>> > standalone file.<br>

>> ><br>

>> > I'm not saying that needing the binary is insurmountable in any<br>

>> > particular<br>

>> > scenario. Just that it will cause a strict increase in the number of<br>

>> > issues<br>

>> > to deploying PGO.<br>

>><br>

>>  Your concern is acknowledged.<br>

>><br>

>> ><br>

>> > These are much bigger "compatibility concerns" for me than for newer<br>

>> > toolchains to accept the old format. For a change in format I can easily<br>

>> > tell my users to replace an exe with a newer one and that is all they<br>

>> > need<br>

>> > to do and it takes 10 seconds, guaranteed. A workflow change is<br>

>> > potentially<br>

>> > a massive disruption and guaranteed to take more than 10 seconds to fix<br>

>> > (perhaps hours or days).<br>

>><br>

>> ok.<br>

>><br>

>> ><br>

>> ><br>

>> >><br>

>> >>   This<br>

>> >> is no different from llvm-cov usage model.<br>

>> ><br>

>> ><br>

>> > In practice, getting the performance of PGO is a higher priority for my<br>

>> > users, so we should not assume that llvm-cov is being used.<br>

>><br>

>> Glad to hear that :)<br>

>><br>

>> thanks,<br>

>><br>

>> David<br>

>><br>

>> ><br>

>> > -- Sean Silva<br>

>> ><br>

>> >><br>

>> >><br>

>> >> David<br>

>> >><br>

>> >><br>

>> >><br>

>> >> ><br>

>> >> >><br>

>> >> >><br>

>> >> >> Note that with 1) or 2), the user can still benefit from the reduced<br>

>> >> >> profile size.<br>

>> >> ><br>

>> >> ><br>

>> >> > Let me reiterate that the size of the profile is not a problem I have<br>

>> >> > observed in practice (nor have I heard of this being a problem in<br>

>> >> > practice<br>

>> >> > until this thread). Therefore I'm skeptical of any changes to our<br>

>> >> > default<br>

>> >> > behavior or any new requirements that are not opt-in.<br>

>> >> ><br>

>> >> > -- Sean Silva<br>

>> >> ><br>

>> >> >><br>

>> >> >><br>

>> >> >> thanks,<br>

>> >> >><br>

>> >> >> David<br>

>> >> >><br>

>> >> >><br>

>> >> >><br>

>> >> >><br>

>> >> >> >which will make it more difficult for me to do this kind<br>

>> >> >> > of analysis (would require using nm on the original binary,<br>

>> >> >> > hashing<br>

>> >> >> > everything, etc.).<br>

>> >> >> ><br>

>> >> >> > btw, feel free to attach the patch even if it in a rough state. It<br>

>> >> >> > can<br>

>> >> >> > still<br>

>> >> >> > help to clarify the proposal and be a good talking point.<br>

>> >> >> > Fine-grained<br>

>> >> >> > patch<br>

>> >> >> > review for caring about the rough parts will happen on<br>

>> >> >> > llvm-commits;<br>

>> >> >> > the<br>

>> >> >> > rough parts will not distract the discussion here on llvm-dev.<br>

>> >> >> ><br>

>> >> >> > -- Sean Silva<br>

>> >> >> ><br>

>> >> >> >><br>

>> >> >> >><br>

>> >> >> >> thanks,<br>

>> >> >> >><br>

>> >> >> >> David<br>

>> >> >> >> _______________________________________________<br>

>> >> >> >> LLVM Developers mailing list<br>

>> >> >> >> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

>> >> >> >> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

>> >> >> ><br>

>> >> >> ><br>

>> >> ><br>

>> >> ><br>

>> ><br>

>> ><br>

><br>

><br>

</div></div></blockquote></div><br></div></div>