<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Feb 27, 2016 at 8:14 PM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">----- Original Message -----<br>
> From: "Sean Silva via llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>
> To: "Xinliang David Li" <<a href="mailto:davidxl@google.com">davidxl@google.com</a>><br>
> Cc: "llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>
> Sent: Saturday, February 27, 2016 8:50:05 PM<br>
> Subject: Re: [llvm-dev] Add support for in-process profile merging in profile-runtime<br>
><br>
><br>
><br>
> I have thought about this issue too, in the context of games. We may<br>
> want to turn profiling on only for certain frames (essentially, this<br>
> amounts to many small profile runs).<br>
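><br>
> (A minimal sketch of what that could look like, assuming the<br>
> compiler-rt profile runtime hooks declared below; game_running(),<br>
> want_profile_for(), and render_frame() are hypothetical game code.)<br>
><br>
>   #include <stdio.h><br>
><br>
>   /* Hooks exported by compiler-rt's profile runtime when building<br>
>    * with -fprofile-instr-generate. */<br>
>   void __llvm_profile_reset_counters(void);<br>
>   int __llvm_profile_write_file(void);<br>
>   void __llvm_profile_set_filename(const char *Name);<br>
><br>
>   /* Hypothetical game code. */<br>
>   int game_running(void);<br>
>   int want_profile_for(int frame);<br>
>   void render_frame(int frame);<br>
><br>
>   void game_loop(void) {<br>
>     static char name[64];                  /* outlives each call below */<br>
>     for (int frame = 0; game_running(); ++frame) {<br>
>       if (want_profile_for(frame)) {<br>
>         __llvm_profile_reset_counters();   /* start from clean counters */<br>
>         render_frame(frame);               /* the code being trained    */<br>
>         snprintf(name, sizeof(name), "frame-%d.profraw", frame);<br>
>         __llvm_profile_set_filename(name); /* one small file per frame  */<br>
>         __llvm_profile_write_file();       /* dump this small run       */<br>
>       } else {<br>
>         render_frame(frame);<br>
>       }<br>
>     }<br>
>   }<br>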
><br>
><br>
> However, I have not seen it demonstrated that this kind of refined<br>
> data collection will actually improve PGO results in practice.<br>
> The evidence I do have, though, is that IIRC Apple has found that<br>
> almost all of the benefit of PGO for the Clang binary can be obtained<br>
> with a handful of training runs of Clang. Are your findings<br>
> different?<br>
><br>
><br>
> Also, in general, I am very wary of file locking.<br>
<br>
</span>As am I (especially since it often does not operate correctly, or is very slow, on distributed file systems). </blockquote><div><br></div><div>Dumping thousands of copies of profiles can be more problematic, IMO.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Why don't you just read in an existing file at startup, when one exists, to pre-populate the counters section?<br></blockquote><div><br></div><div>No, this won't work when multiple processes are dumping profiles concurrently: each process would pre-populate from the same baseline and then overwrite the others' contributions when it exits, so without locking data is silently lost.</div><div><br></div><div>David </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
-Hal<br>
<div class="HOEnZb"><div class="h5"><br>
> This can cause a huge amount of slowdown for a build and has<br>
> potential portability problems. I don't see it as a substantially<br>
> better solution than wrapping clang in a script that runs clang and<br>
> then just calls llvm-profdata to do the merging. Running<br>
> llvm-profdata is cheap compared to doing locking in a highly parallel<br>
> situation like a build.<br>
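><br>
> (A minimal sketch of that wrapper idea, written in C only for<br>
> concreteness -- in practice it would be a few lines of shell. The<br>
> paths and file names are made up; the pieces I believe are real are<br>
> the %p pattern in LLVM_PROFILE_FILE and the `llvm-profdata merge -o<br>
> OUTPUT INPUTS...` interface.)<br>
><br>
>   #include <stdio.h><br>
>   #include <stdlib.h><br>
>   #include <sys/stat.h><br>
>   #include <sys/types.h><br>
>   #include <sys/wait.h><br>
>   #include <unistd.h><br>
><br>
>   /* Hypothetical wrapper installed in place of clang for the training<br>
>    * build.  Each wrapper invocation gets a private scratch directory,<br>
>    * so parallel compile jobs never touch each other's files and no<br>
>    * locking is needed. */<br>
>   int main(int argc, char **argv) {<br>
>     (void)argc;<br>
>     char dir[128], pattern[192], cmd[512];<br>
><br>
>     snprintf(dir, sizeof(dir), "/tmp/prof/job-%d", (int)getpid());<br>
>     mkdir("/tmp/prof", 0755);<br>
>     mkdir(dir, 0755);<br>
>     /* %p expands to the pid of each instrumented process. */<br>
>     snprintf(pattern, sizeof(pattern), "%s/clang-%%p.profraw", dir);<br>
>     setenv("LLVM_PROFILE_FILE", pattern, 1);<br>
><br>
>     pid_t child = fork();<br>
>     if (child == 0) {<br>
>       argv[0] = (char *)"/path/to/instrumented/clang";  /* made up */<br>
>       execv(argv[0], argv);<br>
>       _exit(127);<br>
>     }<br>
>     int status = 0;<br>
>     waitpid(child, &status, 0);<br>
><br>
>     /* Cheap per-job merge into a small indexed file; the raw files are<br>
>      * deleted right away, so disk usage stays bounded.  One final<br>
>      *   llvm-profdata merge -o final.profdata /tmp/prof/job-*.profdata<br>
>      * at the end of the build combines the per-job results. */<br>
>     snprintf(cmd, sizeof(cmd),<br>
>              "llvm-profdata merge -o /tmp/prof/job-%d.profdata %s/*.profraw"<br>
>              " && rm -rf %s", (int)getpid(), dir, dir);<br>
>     system(cmd);<br>
>     return WIFEXITED(status) ? WEXITSTATUS(status) : 1;<br>
>   }<br>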
><br>
><br>
><br>
><br>
><br>
> -- Sean Silva<br>
><br>
><br>
> On Sat, Feb 27, 2016 at 6:02 PM, Xinliang David Li via llvm-dev <<br>
> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a> > wrote:<br>
><br>
><br>
><br>
> One of the main missing features in the Clang/LLVM profile runtime is<br>
> support for online/in-process profile merging. Profile data collected<br>
> for different workloads of the same executable binary currently has<br>
> to be dumped to separate files and merged later by the offline<br>
> post-processing tool. This limitation makes it hard to handle cases<br>
> where the instrumented binary needs to be run with a large number of<br>
> small workloads, possibly in parallel. For instance, to do PGO for<br>
> Clang, we may choose to build a large project with the instrumented<br>
> Clang binary. This is problematic because:<br>
> 1) To avoid profiles from different runs overwriting one another, %p<br>
> substitution needs to be specified either on the command line or in<br>
> an environment variable so that each process dumps profile data into<br>
> its own file, named using its pid (see the illustration after this<br>
> list). This creates a huge disk storage requirement. For instance,<br>
> Clang's raw profile size is typically 80M -- if the instrumented<br>
> Clang is used to build a medium to large project (such as Clang<br>
> itself), the profile data can easily use up hundreds of gigabytes of<br>
> local storage.<br>
> 2) Pids can also be recycled, which means some of the profile data<br>
> may be overwritten without anyone noticing.<br>
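><br>
> (For concreteness, a minimal illustration of point 1. The %p naming<br>
> can be requested via the command line or environment as described<br>
> above, or at runtime through the compiler-rt hook used below; the<br>
> numbers in the comment just restate the 80M figure.)<br>
><br>
>   /* Hook exported by compiler-rt's profile runtime. */<br>
>   void __llvm_profile_set_filename(const char *Name);<br>
><br>
>   void setup_profile_output(void) {<br>
>     /* "%p" expands to the process id, so each clang process writes its<br>
>      * own raw profile instead of clobbering a shared file.  This is<br>
>      * equivalent to LLVM_PROFILE_FILE=clang-%p.profraw in the<br>
>      * environment.  At roughly 80M of raw profile per process, a build<br>
>      * that launches a few thousand compile processes leaves a few<br>
>      * hundred GB of raw profiles behind. */<br>
>     __llvm_profile_set_filename("clang-%p.profraw");<br>
>   }<br>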
><br>
><br>
> The way to solve this problem is to allow profile data to be merged<br>
> in process. I have a prototype implementation and plan to send it out<br>
> for review soon after some cleanup. By default, profile merging is<br>
> off, and it can be turned on with a user option or via an environment<br>
> variable. The following summarizes the issues involved in adding this<br>
> feature:<br>
> 1. the target platform needs to have file locking support (a sketch<br>
> of such a locked merge follows after this list);<br>
> 2. there needs to be an efficient way to identify the profile data<br>
> and associate it with the binary, using a binary/profdata signature;<br>
> 3. currently, without merging, profile data from shared libraries<br>
> (including dlopen/dlclose ones) is concatenated into the primary<br>
> profile file. This complicates matters, as the merger also needs to<br>
> find the matching shared libraries and to avoid unnecessary data<br>
> movement/copying;<br>
> 4. value profile data is variable in length, even for the same<br>
> binary.<br>
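><br>
> (Not the prototype itself -- just a minimal sketch of the kind of<br>
> locked read-modify-write that issue 1 implies, assuming POSIX flock()<br>
> and pretending the file is nothing but a flat array of counters;<br>
> merge_counters_locked(), counters, and num are stand-ins for the<br>
> runtime's real data, and the real code also has to handle the<br>
> signature check in 2, shared libraries in 3, and variable-length<br>
> value profiles in 4.)<br>
><br>
>   #include <fcntl.h><br>
>   #include <stddef.h><br>
>   #include <stdint.h><br>
>   #include <sys/file.h><br>
>   #include <unistd.h><br>
><br>
>   /* Merge this process's counters into a shared profile file while<br>
>    * holding an exclusive lock.  Error handling is trimmed. */<br>
>   int merge_counters_locked(const char *path,<br>
>                             uint64_t *counters, size_t num) {<br>
>     int fd = open(path, O_RDWR | O_CREAT, 0644);<br>
>     if (fd < 0)<br>
>       return -1;<br>
>     if (flock(fd, LOCK_EX) != 0) {   /* needs OS file locking support */<br>
>       close(fd);<br>
>       return -1;<br>
>     }<br>
><br>
>     /* Add in whatever an earlier process already dumped. */<br>
>     uint64_t old;<br>
>     for (size_t i = 0; i < num; ++i)<br>
>       if (pread(fd, &old, sizeof(old), i * sizeof(old)) ==<br>
>           (ssize_t)sizeof(old))<br>
>         counters[i] += old;<br>
><br>
>     /* Write the combined counters back in place. */<br>
>     for (size_t i = 0; i < num; ++i)<br>
>       pwrite(fd, &counters[i], sizeof(old), i * sizeof(old));<br>
><br>
>     flock(fd, LOCK_UN);<br>
>     close(fd);<br>
>     return 0;<br>
>   }<br>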
><br>
><br>
> All the above issues are resolved, and a Clang self-build with the<br>
> instrumented binary passes (with both -j1 and high parallelism).<br>
><br>
><br>
> If you have any concerns, please let me know.<br>
><br>
><br>
> thanks,<br>
><br>
><br>
> David<br>
><br>
><br>
> _______________________________________________<br>
> LLVM Developers mailing list<br>
> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>
> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>
><br>
><br>
<br>
</div></div><span class="HOEnZb"><font color="#888888">--<br>
Hal Finkel<br>
Assistant Computational Scientist<br>
Leadership Computing Facility<br>
Argonne National Laboratory<br>
</font></span></blockquote></div><br></div></div>