[llvm-dev] RFC: XRay Profiling in LLVM

Dean Michael Berris via llvm-dev llvm-dev at lists.llvm.org
Mon Mar 26 18:10:28 PDT 2018


FYI: Patch is now available for review in https://reviews.llvm.org/D44620.


On Thu, Feb 15, 2018 at 1:34 PM Dean Michael Berris <dean.berris at gmail.com>
wrote:

> # Objective
>
> Implement an XRay mode 'xray-profiling' that gathers stack trace
> latencies/durations and builds histograms to provide basic statistics about
> where time is going in an execution of the application.
>
> # Background
>
> XRay has two modes currently implemented in compiler-rt: a basic (nee
> naive) mode and flight data recorder (FDR) mode.
>
> Basic mode logging, when enabled, will collect traces and write verbose
> logs to disk. This mode has some unsophisticated filtering and stack-depth
> limiting support. This mode is useful for tracing short-lived applications,
> so that execution can be traced throughout the lifetime of the application.
> We've used basic mode logging to identify sources of latency in Clang/LLVM.
>
> FDR mode logging is designed to record traces and aggressively filter out
> events that fall below a latency threshold. This mode uses a fixed amount
> of heap space to record events as they're happening, and only writes out
> the logs on demand. FDR mode is used mostly for long-running applications
> and services.
>
> We also already have tools in LLVM that can turn these XRay traces into
> other forms that are easier to digest. One tool turns XRay traces into
> something that the Chrome Trace Viewer can render [1] and another
> summarises it to generate latency flame graphs [2].
>
> # Problem Statement
>
> Both basic and FDR mode logging can create massive traces. One example,
> where a fully-instrumented clang binary was traced with XRay's basic mode,
> we found that it could generate an 89GB trace compiling the canonical
> "Hello, World!" program.
>
> One way to address this problem is to limit the amount of memory being
> used by the implementation through flags. Unfortunately that also means
> basic mode XRay will block more on writing out the log files.
>
> Filtering at runtime also takes some CPU time, so the overheads of writing
> out the log as well as filtering the records become non-trivial for heavily
> instrumented programs.
>
> # Proposed Solution
>
> We'd like to propose a profiling mode to XRay, called 'xray-profiling'.
> The intent is to implement handlers that will generate profile data [3]
> that other tools can digest. We'll export the data in an XRay-specific
> format for a more compact representation. We intend to implement conversion
> tools to support the profile.proto format that pprof can handle as input.
>
> To do this, we need to record the "stack" of functions that are XRay
> instrumented, represented as a prefix tree (or forest, as we'll see later).
> This prefix tree will represent the XRay instrumented functions that have
> been entered, and then exited.
>
> A() -> B() -> C()
>     \
>       -> C()
>
> Each node in the tree will contain a histogram of the latencies/durations
> of the function by subtracting the timestamp at the time of exit from the
> timestamp at entry.
>
> When the tracing functionality is finalised and flushed, we end up with a
> data file that represents the prefix tree in a summarised form following
> the profile.proto format.
>
> # Logistics
>
> As with the Basic and FDR modes, the profiling mode will be implemented in
> compiler-rt and will be default-linked into any XRay instrumented binary.
>
> In the future we can provide more targeted archives that can be controlled
> in clang, such that users can determine which runtime implementations
> they'd like to be linked into the final binary. This work may be done in
> parallel, but is unrelated to the work involved with the XRay profiling
> mode implementation.
>
> # Call for Reviewers
>
> One of the issues we're finding with the XRay work is there's very little
> community knowledge about what's happening in the project. I would like to
> change that by encouraging more reviewers, interested in XRay, to please
> sign up to be a reviewer. Eventually, we'd like to grow the list of people
> contributing to XRay from the community and this is an opportunity to do so.
>
> # Open Questions
>
> * The triggering mechanism is still through programmatically turning
> on/off XRay instrumentation in an XRay-instrumented binary. We're exploring
> some other options based on ptrace (will send another RFC at some point
> about this) and a tool that will explicitly control the XRay tracing
> externally. The alternative to this is having a dedicated XRay thread that
> will periodically check whether a signal has been tripped, and do the
> standardised procedure for enabling/disabling/flushing trace data. Any
> alternatives to this that we may not have considered yet?
>
> Thanks in advance!
>
>
> References
>
> [0] https://github.com/google/pprof
> [1] http://lists.llvm.org/pipermail/llvm-dev/2017-December/119529.html
> [2] https://llvm.org/docs/XRayExample.html#flame-graph-generation
> [3] https://github.com/google/pprof/blob/master/proto/profile.proto
> --
> -- Dean
>
-- 
-- Dean
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20180327/d80ab673/attachment.html>


More information about the llvm-dev mailing list