[llvm-dev] Add support for in-process profile merging in profile-runtime

Xinliang David Li via llvm-dev llvm-dev at lists.llvm.org
Sat Feb 27 18:02:30 PST 2016


One of the main features missing from the Clang/LLVM profile runtime is
support for online/in-process profile merging. Profile data collected for
different workloads of the same executable binary has to be dumped
separately and merged later by the offline post-processing tool.  This
limitation makes it hard to handle cases where the instrumented binary
needs to be run with a large number of small workloads, possibly in
parallel.  For instance, to do PGO for clang, we may choose to build a
large project with the instrumented Clang binary. This is hard because:
 1) to keep profiles from different runs from overwriting each other, %p
substitution needs to be specified in either the command line or an
environment variable so that each process dumps its profile data into
its own file named using its pid. This creates a huge demand for disk
storage. For instance, clang's raw profile size is typically 80M. If the
instrumented clang is used to build a medium to large size project (such as
clang itself), profile data can easily use up hundreds of gigabytes of
local storage.
 2) pids can also be recycled. This means that some of the profile data may
be overwritten without being noticed.
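
To make the storage estimate in 1) concrete, here is a back-of-the-envelope
calculation in Python. The 80M figure comes from the text above; the number
of compiler invocations is a hypothetical value chosen only for
illustration, not a measurement.

```python
RAW_PROFILE_MB = 80          # typical clang raw profile size, from the text
COMPILER_INVOCATIONS = 3000  # hypothetical: one raw profile per compile job


def total_profile_gb(profile_mb, invocations):
    """Disk space needed when every process writes its own raw profile."""
    return profile_mb * invocations / 1000.0


print(total_profile_gb(RAW_PROFILE_MB, COMPILER_INVOCATIONS))  # 240.0
```

With in-process merging, all of those runs would update a single profile
file, so under the same assumptions the footprint stays near the size of
one raw profile.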

The way to solve this problem is to allow profile data to be merged in
process.  I have a prototype implementation and plan to send it out for
review soon after some cleanup. By default, profile merging is off
and it can be turned on with a user option or via an environment variable.
The following summarizes the issues involved in adding this feature:
 1. the target platform needs to have file locking support;
 2. there needs to be an efficient way to identify the profile data and
associate it with the binary using a binary/profdata signature;
 3. currently, without merging, profile data from shared libraries
(including dlopen/dlclose ones) is concatenated into the primary profile
file. This complicates matters, as the merger needs to find the
matching shared libs and also needs to avoid unnecessary data
movement/copying;
 4. value profile data is variable in length even for the same binary.
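
As a rough illustration of issues 1 and 2 only, here is a minimal Python
sketch of a lock-then-merge update. It is not the prototype's code: the
on-disk layout (a signature, a counter count, then the counters), the
function names, and the use of flock() are all assumptions made for this
example, and it ignores shared libraries and value profile data (issues 3
and 4). It assumes a POSIX platform where the fcntl module is available.

```python
import fcntl
import os
import struct

# Hypothetical layout for this sketch only (NOT the real raw profile format):
# 8-byte signature, 8-byte counter count, then that many 8-byte counters.
HEADER = struct.Struct("<QQ")


def merge_into_file(path, signature, counters):
    """Add this process's counters to the profile at `path` under a lock."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Block until no other process is merging into this file.
        fcntl.flock(fd, fcntl.LOCK_EX)
        with os.fdopen(fd, "r+b", closefd=False) as f:
            data = f.read()
            merged = list(counters)
            if data:
                file_sig, count = HEADER.unpack_from(data, 0)
                # Refuse to merge a profile written by a different binary.
                if file_sig != signature or count != len(counters):
                    raise ValueError("profile does not match this binary")
                on_disk = struct.unpack_from("<%dQ" % count, data, HEADER.size)
                merged = [a + b for a, b in zip(on_disk, counters)]
            # Rewrite the file in place with the summed counters.
            f.seek(0)
            f.write(HEADER.pack(signature, len(merged)))
            f.write(struct.pack("<%dQ" % len(merged), *merged))
            f.truncate()
    finally:
        # Closing the descriptor also releases the flock.
        os.close(fd)


def read_profile(path):
    """Return (signature, counters) from a file written by merge_into_file."""
    with open(path, "rb") as f:
        data = f.read()
    sig, count = HEADER.unpack_from(data, 0)
    return sig, list(struct.unpack_from("<%dQ" % count, data, HEADER.size))
```

Each process would call merge_into_file() once at exit with its in-memory
counters. The exclusive lock serializes concurrent writers, and the
signature check keeps counters from a different build out of the file.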

All of the above issues have been resolved, and a clang self-build with the
instrumented binary passes (both with -j1 and with high parallelism).

If you have any concerns, please let me know.

thanks,

David