<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Feb 27, 2016 at 8:14 PM, Hal Finkel <span dir="ltr"><<a href="mailto:hfinkel@anl.gov" target="_blank">hfinkel@anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">----- Original Message -----<br>

> From: "Sean Silva via llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

> To: "Xinliang David Li" <<a href="mailto:davidxl@google.com">davidxl@google.com</a>><br>

> Cc: "llvm-dev" <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>><br>

> Sent: Saturday, February 27, 2016 8:50:05 PM<br>

> Subject: Re: [llvm-dev] Add support for in-process profile merging in profile-runtime<br>

><br>

><br>

><br>

> I have thought about this issue too, in the context of games. We may<br>

> want to turn profiling on only for certain frames (essentially, this is<br>

> many small profile runs).<br>

><br>

><br>

> However, I have not seen it demonstrated that this kind of refined<br>

> data collection will actually improve PGO results in practice.<br>

> The evidence I do have though is that IIRC Apple have found that<br>

> almost all of the benefits of PGO for the Clang binary can be gotten<br>

> with a handful of training runs of Clang. Are your findings<br>

> different?<br>

><br>

><br>

> Also, in general, I am very wary of file locking.<br>

<br>

</span>As am I (especially since it often does not operate correctly, or is very slow, on distributed file systems). </blockquote><div><br></div><div>Dumping thousands of copies of profiles can be more problematic IMO.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Why don't you just read in an existing file to pre-populate the counters section when it exists at startup?<br></blockquote><div><br></div><div>No, this won't work when multiple processes are dumping profiles concurrently.</div><div><br></div><div>David </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<br>

 -Hal<br>

<div class="HOEnZb"><div class="h5"><br>

> This can cause a huge<br>

> slowdown for a build and has potential portability<br>

> problems. I don't see it as a substantially better solution than<br>

> wrapping clang in a script that runs clang and then just calls<br>

> llvm-profdata to do the merging. Running llvm-profdata is cheap<br>

> compared to doing locking in a highly parallel situation like a<br>

> build.<br>
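<br>
For concreteness, the wrapper-script alternative described above might look roughly like the sketch below. It is not anyone's actual script: the helper names (profile_env, merge_command, run_and_merge) and the file names are hypothetical, and it only assumes the LLVM_PROFILE_FILE environment variable and the llvm-profdata merge command. Note that, as written, two wrappers finishing at the same moment can both read the old merged file and the later rename wins, so a highly parallel build would still need some serialization around the merge step.<br>
<pre>
```python
import os
import subprocess
import tempfile


def profile_env(raw_path, base_env=None):
    """Return a copy of the environment with LLVM_PROFILE_FILE set to raw_path."""
    env = dict(os.environ if base_env is None else base_env)
    env["LLVM_PROFILE_FILE"] = raw_path
    return env


def merge_command(raw_path, merged_path, tmp_out):
    """Build the llvm-profdata command that folds raw_path into merged_path."""
    inputs = [raw_path]
    if os.path.exists(merged_path):
        # Include the profile accumulated so far, if there is one.
        inputs.append(merged_path)
    return ["llvm-profdata", "merge", "-o", tmp_out] + inputs


def run_and_merge(compiler_argv, merged_path):
    """Run one instrumented compiler invocation, then merge its raw profile."""
    workdir = tempfile.mkdtemp(prefix="pgo-")
    raw_path = os.path.join(workdir, "run.profraw")
    status = subprocess.call(compiler_argv, env=profile_env(raw_path))
    if os.path.exists(raw_path):
        tmp_out = merged_path + ".tmp." + str(os.getpid())
        subprocess.check_call(merge_command(raw_path, merged_path, tmp_out))
        os.replace(tmp_out, merged_path)  # rename into place; NOT a lock
        os.remove(raw_path)               # keep only the merged profile
    os.rmdir(workdir)
    return status
```
</pre>
A build would call something like run_and_merge(["clang", "-c", "foo.c"], "merged.profdata") in place of invoking the instrumented clang directly; both file names are illustrative.<br>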

><br>

><br>

><br>

><br>

><br>

> -- Sean Silva<br>

><br>

><br>

> On Sat, Feb 27, 2016 at 6:02 PM, Xinliang David Li via llvm-dev <<br>

> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a> > wrote:<br>

><br>

><br>

><br>

> One of the main missing features in the Clang/LLVM profile runtime is<br>

> support for online/in-process profile merging.<br>

> Profile data collected from different workloads of the same<br>

> executable binary has to be merged later by the<br>

> offline post-processing tool. This limitation makes it hard to<br>

> handle cases where the instrumented binary needs to be run with<br>

> a large number of small workloads, possibly in parallel. For instance,<br>

> to do PGO for clang, we may choose to build a large project with the<br>

> instrumented Clang binary. This is hard because:<br>

> 1) to keep profiles from different runs from overwriting one another, %p<br>

> substitution needs to be specified in either the command line or an<br>

> environment variable so that each process dumps its profile data<br>

> into its own file, named using its pid. This creates a huge demand<br>

> for disk storage. For instance, clang's raw profile size is<br>

> typically 80M -- if the instrumented clang is used to build a medium<br>

> to large size project (such as clang itself), profile data can<br>

> easily use up hundreds of gigabytes of local storage.<br>

> 2) pids can also be recycled. This means that some of the profile data<br>

> may be overwritten without being noticed.<br>
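<br>
A back-of-the-envelope check of the storage estimate above, as a sketch only. The 80M per-profile figure is from the message (read here as 80 MB); the count of 3000 compiler processes is an assumed, illustrative number for a medium-to-large build, and the function name is hypothetical.<br>
<pre>
```python
def raw_profile_footprint_gb(profile_mb, num_processes):
    """Disk space in GB if every process writes its own raw profile file."""
    return profile_mb * num_processes / 1000.0


# 80 MB per process (from the message) times an assumed 3000 processes.
print(raw_profile_footprint_gb(80, 3000))  # prints 240.0
```
</pre>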

><br>

><br>

> The way to solve this problem is to allow profile data to be merged<br>

> in process. I have a prototype implementation and plan to send it<br>

> out for review soon after some cleanup. By default, profile<br>

> merging is off and can be turned on with a user option or via an<br>

> environment variable. The following summarizes the issues involved<br>

> in adding this feature:<br>

> 1. the target platform needs to have file locking support<br>

> 2. there needs to be an efficient way to identify the profile data and<br>

> associate it with the binary using a binary/profdata signature;<br>

> 3. Currently without merging, profile data from shared libraries<br>

> (including dlopen/dlclose ones) is concatenated into the primary<br>

> profile file. This can complicate matters, as the merger also needs<br>

> to find the matching shared libs, and the merger also needs to avoid<br>

> unnecessary data movement/copy;<br>

> 4. value profile data is variable in length even for the same binary.<br>
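<br>
To make the locking requirement in item 1 concrete, here is a minimal sketch of a lock-protected read-add-write merge. It is not the prototype mentioned above. The on-disk layout (a bare array of unsigned 64-bit counters) and the name merge_counters are assumptions for illustration. The real raw profile format is more involved, and the simple size check below only stands in for the binary/profdata signature of item 2. It uses POSIX flock via Python's fcntl module, so it is Unix-only. Holding the lock across the whole read-modify-write is what keeps concurrent processes from losing each other's updates.<br>
<pre>
```python
import fcntl
import os
import struct


def merge_counters(path, counters):
    """Add this process's counters into the shared file at path, under a lock."""
    fmt = "=%dQ" % len(counters)       # assumed layout: N unsigned 64-bit counters
    size = struct.calcsize(fmt)
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)  # block until we own the file
        try:
            data = f.read()
            if len(data) == size:
                old = struct.unpack(fmt, data)
                counters = [a + b for a, b in zip(old, counters)]
            elif data:
                # Stand-in for a real signature check against the binary.
                raise ValueError("profile layout does not match this binary")
            f.seek(0)
            f.write(struct.pack(fmt, *counters))
            f.truncate()
            f.flush()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
    return counters
```
</pre>
Each process would call merge_counters once at exit with its in-memory counters; the first caller creates the file and later callers accumulate into it.<br>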

><br>

><br>

> All the above issues are resolved, and a clang self-build with the<br>

> instrumented binary passes (with both -j1 and high parallelism).<br>

><br>

><br>

> If you have any concerns, please let me know.<br>

><br>

><br>

> thanks,<br>

><br>

><br>

> David<br>

><br>

><br>

> _______________________________________________<br>

> LLVM Developers mailing list<br>

> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><br>

><br>

><br>

><br>

><br>

<br>

</div></div><span class="HOEnZb"><font color="#888888">--<br>

Hal Finkel<br>

Assistant Computational Scientist<br>

Leadership Computing Facility<br>

Argonne National Laboratory<br>

</font></span></blockquote></div><br></div></div>