[llvm-dev] Add support for in-process profile merging in profile-runtime

Hal Finkel via llvm-dev llvm-dev at lists.llvm.org
Sat Feb 27 20:14:47 PST 2016


----- Original Message -----
> From: "Sean Silva via llvm-dev" <llvm-dev at lists.llvm.org>
> To: "Xinliang David Li" <davidxl at google.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Saturday, February 27, 2016 8:50:05 PM
> Subject: Re: [llvm-dev] Add support for in-process profile merging in profile-runtime
> 
> 
> 
> I have thought about this issue too, in the context of games. We may
> want to turn profiling on only for certain frames (essentially, this
> amounts to many small profile runs).
> 
> 
> However, I have not seen it demonstrated that this kind of refined
> data collection will actually improve PGO results in practice. The
> evidence I do have, though, is that IIRC Apple has found that almost
> all of the benefit of PGO for the Clang binary can be gotten with a
> handful of training runs of Clang. Are your findings different?
> 
> 
> Also, in general, I am very wary of file locking.

As am I (especially since it often does not operate correctly, or is very slow, on distributed file systems). Why not just read in an existing profile file at startup, when one exists, to pre-populate the counters section?
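
Very roughly, I'm picturing something like the sketch below running during the runtime's initialization. The file-reading helper is hypothetical (the real work is parsing the raw profile's counter section), but the in-memory side is just the runtime's existing counter region:

  /* Hypothetical sketch: if a profile file from a previous run already
     exists, add its counter values onto the freshly-zeroed in-memory
     counters, so the normal dump at exit writes the accumulated sum. */
  #include <stdint.h>
  #include <stdlib.h>

  /* Exposed by the profile runtime: bounds of the in-memory counter section. */
  extern uint64_t *__llvm_profile_begin_counters(void);
  extern uint64_t *__llvm_profile_end_counters(void);

  /* Hypothetical helper: read the counter section of an existing raw profile
     into Buf (expecting exactly N counters); returns 1 on success, 0 otherwise. */
  extern int __read_existing_counters(const char *Path, uint64_t *Buf, size_t N);

  static void __prepopulate_counters(const char *Path) {
    uint64_t *Begin = __llvm_profile_begin_counters();
    size_t N = (size_t)(__llvm_profile_end_counters() - Begin);

    uint64_t *Old = (uint64_t *)calloc(N, sizeof(uint64_t));
    if (!Old)
      return;
    /* Only meaningful if the file came from the same binary, i.e. the counter
       layout matches; otherwise the helper should fail and we simply skip. */
    if (__read_existing_counters(Path, Old, N))
      for (size_t I = 0; I < N; ++I)
        Begin[I] += Old[I];
    free(Old);
  }

Of course, concurrent writers would still race on the final dump, but for the serial many-small-runs case it avoids locking entirely.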

 -Hal

> This can cause a huge slowdown for a build and has potential
> portability problems. I don't see it as a substantially better
> solution than wrapping clang in a script that runs clang and then just
> calls llvm-profdata to do the merging. Running llvm-profdata is cheap
> compared to doing locking in a highly parallel situation like a build.
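> 
> To make that concrete, the wrapper could be as dumb as the following
> (sketch only; the instrumented-clang path and the /tmp layout are made
> up, and error handling is minimal):
> 
>   /* clang-wrapper.c: run the real (instrumented) clang with a unique raw
>      profile path, then fold that raw profile into a small per-pid
>      accumulator with llvm-profdata, and delete the raw file. */
>   #include <stdio.h>
>   #include <stdlib.h>
>   #include <unistd.h>
>   #include <sys/wait.h>
> 
>   static int run(char **Argv) {
>     pid_t Pid = fork();
>     if (Pid == 0) { execvp(Argv[0], Argv); _exit(127); }
>     int Status = 0;
>     waitpid(Pid, &Status, 0);
>     return WIFEXITED(Status) ? WEXITSTATUS(Status) : 1;
>   }
> 
>   int main(int argc, char **argv) {
>     char Raw[256], Acc[256], Tmp[256];
>     /* One raw file per compile job; one accumulated .profdata per wrapper
>        pid. A recycled pid is harmless here, since we merge, not overwrite. */
>     snprintf(Raw, sizeof Raw, "/tmp/prof/clang-%d.profraw", (int)getpid());
>     snprintf(Acc, sizeof Acc, "/tmp/prof/clang-%d.profdata", (int)getpid());
>     snprintf(Tmp, sizeof Tmp, "%s.new", Acc);
>     setenv("LLVM_PROFILE_FILE", Raw, 1);
> 
>     /* Run the real clang with the caller's arguments (path is an assumption). */
>     char **ClangArgv = (char **)calloc(argc + 1, sizeof(char *));
>     if (!ClangArgv) return 1;
>     ClangArgv[0] = "/usr/local/bin/clang-instrumented";
>     for (int I = 1; I < argc; ++I) ClangArgv[I] = argv[I];
>     int RC = run(ClangArgv);
>     free(ClangArgv);
>     if (RC != 0) return RC;
> 
>     /* Merge the fresh raw profile into the accumulator; llvm-profdata merge
>        accepts raw and indexed profiles together, so once the accumulator
>        exists it is simply listed as another input. */
>     if (access(Acc, F_OK) == 0) {
>       char *MergeArgv[] = {"llvm-profdata", "merge", "-o", Tmp, Raw, Acc, NULL};
>       RC = run(MergeArgv);
>     } else {
>       char *MergeArgv[] = {"llvm-profdata", "merge", "-o", Tmp, Raw, NULL};
>       RC = run(MergeArgv);
>     }
>     if (RC == 0) rename(Tmp, Acc);
>     unlink(Raw);
>     return RC;
>   }
> 
> At the end of the build you still run one final llvm-profdata merge
> over the (bounded number of) per-pid accumulators, but the raw files
> never pile up.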
> 
> 
> 
> 
> 
> -- Sean Silva
> 
> 
> On Sat, Feb 27, 2016 at 6:02 PM, Xinliang David Li via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> 
> 
> 
> One of the main missing features in the Clang/LLVM profile runtime is
> support for online/in-process profile merging. Profile data collected
> for different workloads of the same executable binary currently has to
> be dumped to separate files and merged later by the offline
> post-processing tool. This limitation makes it hard to handle cases
> where the instrumented binary needs to be run with a large number of
> small workloads, possibly in parallel. For instance, to do PGO for
> clang, we may choose to build a large project with the instrumented
> Clang binary. This is problematic because:
> 1) To keep profiles from different runs from overriding one another,
> %p substitution needs to be specified on the command line or in an
> environment variable so that each process dumps profile data into its
> own pid-named file (see the example after this list). This places a
> huge demand on disk storage: clang's raw profile is typically ~80 MB,
> so if the instrumented clang is used to build a medium-to-large
> project (such as clang itself), the profile data can easily consume
> hundreds of gigabytes of local storage.
> 2) pids can also be recycled, which means some profile data may be
> overwritten without anyone noticing.
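> 
> For concreteness, the way %p is used today looks roughly like this
> (paths are made up): whatever launches the build exports the pattern,
> and every instrumented clang underneath it expands %p to its own pid:
> 
>   #include <stdlib.h>
> 
>   int main(void) {
>     /* Each clang process writes its own /tmp/prof/clang-<pid>.profraw,
>        roughly 80 MB apiece, so a large parallel build leaves thousands
>        of such files behind. */
>     setenv("LLVM_PROFILE_FILE", "/tmp/prof/clang-%p.profraw", 1);
>     /* ... spawn the parallel build from here; children inherit the env ... */
>     return 0;
>   }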
> 
> 
> The way to solve this problem is to allow profile data to be merged in
> process. I have a prototype implementation and plan to send it out for
> review soon, after some cleanup. By default, profile merging is off;
> it can be turned on with a user option or via an environment variable.
> The following summarizes the issues involved in adding this feature (a
> rough sketch of the locked merge path follows the list):
> 1. The target platform needs to have file-locking support.
> 2. There needs to be an efficient way to identify the profile data and
> associate it with the binary, using a binary/profdata signature.
> 3. Currently, without merging, profile data from shared libraries
> (including dlopen/dlclose'd ones) is concatenated into the primary
> profile file. This complicates matters: the merger also needs to find
> the matching shared libraries and to avoid unnecessary data
> movement/copying.
> 4. Value profile data is variable in length, even for the same binary.
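> 
> To give a feel for the shape of it, the dump path with merging enabled
> looks roughly like the following (POSIX advisory locking; the three
> profile-format helpers are placeholders for what the runtime actually
> does, not real APIs):
> 
>   #include <fcntl.h>
>   #include <stdint.h>
>   #include <stdlib.h>
>   #include <string.h>
>   #include <unistd.h>
> 
>   /* Placeholders for the profile-format details (hypothetical).
>      read_counters and write_profile return 0 on success. */
>   extern int signature_matches(int FD);                               /* issue 2 */
>   extern int read_counters(int FD, uint64_t *Buf, size_t N);
>   extern int write_profile(int FD, const uint64_t *Counters, size_t N);
> 
>   static int merge_and_dump(const char *Path, uint64_t *Counters, size_t N) {
>     int FD = open(Path, O_RDWR | O_CREAT, 0644);
>     if (FD < 0)
>       return -1;
> 
>     /* Issue 1: the target must support file locking; take an exclusive
>        lock over the whole file while we read-modify-write it. */
>     struct flock Lock;
>     memset(&Lock, 0, sizeof(Lock));
>     Lock.l_type = F_WRLCK;
>     Lock.l_whence = SEEK_SET;   /* l_start = l_len = 0 => whole file */
>     if (fcntl(FD, F_SETLKW, &Lock) != 0) {
>       close(FD);
>       return -1;
>     }
> 
>     /* Issue 2: only merge with data produced by this very binary. */
>     if (lseek(FD, 0, SEEK_END) > 0 && signature_matches(FD)) {
>       uint64_t *Old = (uint64_t *)calloc(N, sizeof(uint64_t));
>       if (Old && read_counters(FD, Old, N) == 0)
>         for (size_t I = 0; I < N; ++I)
>           Counters[I] += Old[I];
>       free(Old);
>       /* Issues 3 and 4 (shared libraries, variable-length value profile
>          data) need their own handling and are elided here. */
>     }
> 
>     /* Rewrite the whole file with the merged data, then drop the lock. */
>     lseek(FD, 0, SEEK_SET);
>     int RC = write_profile(FD, Counters, N);
>     Lock.l_type = F_UNLCK;
>     fcntl(FD, F_SETLK, &Lock);
>     close(FD);
>     return RC;
>   }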
> 
> 
> All of the above issues have been resolved, and a clang self-build
> with the instrumented binary passes (with both -j1 and high
> parallelism).
> 
> 
> If you have any concerns, please let me know.
> 
> 
> thanks,
> 
> 
> David
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

