[PATCH] D44620: [XRay][compiler-rt] XRay Profiling Mode

Dean Michael Berris via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sun Mar 18 22:24:58 PDT 2018


dberris created this revision.
dberris added reviewers: eizan, kpw, echristo.
Herald added a subscriber: mgorny.

The XRay Profiling Mode is an implementation which captures latency
deltas per function call, broken down by call stack. In theory this is
equivalent to the FDR mode implementation, but it aggregates function
call durations in memory using efficient data structures instead of
storing a log of timestamps. The implementation has a few major pieces,
which we describe in detail below.

Major Component #1: In-memory Function Call Trie Implementation

We implement a Function Call Trie data structure, along with its
supporting data types:

- `__xray::Allocator<...>`: A chunked arena block allocator.
- `__xray::Array<...>`: A segmented array implementation.
- `__xray::FunctionCallTrie`: A prefix tree that tracks the dynamic control flow of XRay instrumented functions.

At a high level, the FunctionCallTrie tracks dynamic control flow of
XRay instrumented functions, as they are entered and exited. The
Function Call Trie does this by implementing a prefix tree, which is
populated by tracking a shadow stack. This shadow stack keeps track of
XRay instrumented functions that have been entered, but have not yet
been exited.

When a function exits, we account for the time elapsed between its
entry and exit points, and do the same for any functions that were
exited implicitly in between. This lets the function call trie deal
with functions that were entered but never observed exiting, because of
unusual control flow at runtime. See the comments on the
`FunctionCallTrie` unit tests to see one specific (albeit contrived)
case of this support in action.
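
To make the shadow stack and the implicit-exit accounting concrete,
here is a minimal sketch. It is not the code in
`xray_function_call_trie.h`: the names `TrieNode`, `CallTrieSketch`,
`enter`, `exit`, `find` and `depth` are hypothetical, it uses
`std::map` and `std::vector` where the patch uses its own allocator and
segmented array, and it assumes that an exit unwinds the shadow stack
until it finds the matching function, charging inclusive time to every
frame it pops.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <map>
#include <memory>
#include <vector>

// Hypothetical node type: one node per distinct call path.
struct TrieNode {
  std::int32_t FuncId = 0;
  std::uint64_t CallCount = 0;
  std::uint64_t CumulativeTime = 0; // inclusive time, in timestamp units
  std::map<std::int32_t, std::unique_ptr<TrieNode>> Callees;
};

class CallTrieSketch {
  using ChildMap = std::map<std::int32_t, std::unique_ptr<TrieNode>>;
  struct Frame {
    TrieNode *Node;
    std::uint64_t EntryTSC;
  };

  ChildMap Roots;
  std::vector<Frame> ShadowStack; // functions entered but not yet exited

public:
  void enter(std::int32_t FuncId, std::uint64_t TSC) {
    // The new node is a child of whatever is on top of the shadow stack.
    ChildMap &Level =
        ShadowStack.empty() ? Roots : ShadowStack.back().Node->Callees;
    std::unique_ptr<TrieNode> &Slot = Level[FuncId];
    if (!Slot) {
      Slot.reset(new TrieNode());
      Slot->FuncId = FuncId;
    }
    ShadowStack.push_back({Slot.get(), TSC});
  }

  void exit(std::int32_t FuncId, std::uint64_t TSC) {
    // Pop frames until we reach the function being exited. Frames above
    // it never reported an exit, so they are treated as exited now.
    while (!ShadowStack.empty()) {
      Frame Top = ShadowStack.back();
      ShadowStack.pop_back();
      Top.Node->CallCount++;
      Top.Node->CumulativeTime += TSC - Top.EntryTSC;
      if (Top.Node->FuncId == FuncId)
        break;
    }
  }

  // Looks up the node for a call path such as {1, 2, 3}, or nullptr.
  const TrieNode *find(std::initializer_list<std::int32_t> Path) const {
    const ChildMap *Level = &Roots;
    const TrieNode *Node = nullptr;
    for (std::int32_t Id : Path) {
      auto It = Level->find(Id);
      if (It == Level->end())
        return nullptr;
      Node = It->second.get();
      Level = &Node->Callees;
    }
    return Node;
  }

  std::size_t depth() const { return ShadowStack.size(); }
};
```

In this sketch, entering functions 1, 2 and 3 and then exiting only
function 1 leaves the shadow stack empty, with time accounted on all
three nodes along the path 1 -> 2 -> 3.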

Since the `FunctionCallTrie` data type will be used in the function
entry/exit handlers, we favour efficiency in the operations we perform
there. This means we amortise the cost of memory allocation by
allocating only when we need to, in blocks that we want to be
cache-line-sized and aligned appropriately. To do that we implement our
own allocator and segmented array data structure, which support only a
very limited set of operations and are meant specifically for use in
the potentially hot paths.
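
As an illustration of the idea (not the code in `xray_allocator.h` or
`xray_segmented_array.h`), the sketch below hands out cache-line-sized,
cache-line-aligned blocks from larger chunks, and builds an append-only
array on top of them. Everything here is an assumption made for the
example: the names `BlockArena` and `SegmentedArray`, the 64-byte cache
line, the 64 blocks per chunk, and the use of `malloc` and
`std::vector` for bookkeeping, which a runtime handler would likely
avoid.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>
#include <type_traits>
#include <vector>

constexpr std::size_t kCacheLine = 64; // assumed cache line size

// Hands out kCacheLine-sized blocks carved from larger chunks, so the
// cost of calling malloc is amortised over many block allocations.
class BlockArena {
  static constexpr std::size_t kBlocksPerChunk = 64;
  std::vector<void *> RawChunks; // as returned by malloc, for free()
  unsigned char *Next = nullptr;
  std::size_t Remaining = 0;

public:
  BlockArena() = default;
  BlockArena(const BlockArena &) = delete;
  BlockArena &operator=(const BlockArena &) = delete;
  ~BlockArena() {
    for (void *P : RawChunks)
      std::free(P);
  }

  void *allocateBlock() {
    if (Remaining == 0) {
      // Over-allocate by one cache line so we can round the start up.
      void *Raw = std::malloc(kBlocksPerChunk * kCacheLine + kCacheLine);
      if (Raw == nullptr)
        return nullptr;
      RawChunks.push_back(Raw);
      std::uintptr_t Addr = reinterpret_cast<std::uintptr_t>(Raw);
      std::uintptr_t Aligned =
          (Addr + kCacheLine - 1) & ~(std::uintptr_t(kCacheLine) - 1);
      Next = reinterpret_cast<unsigned char *>(Aligned);
      Remaining = kBlocksPerChunk;
    }
    void *Block = Next;
    Next += kCacheLine;
    --Remaining;
    return Block;
  }

  std::size_t chunkCount() const { return RawChunks.size(); }
};

// Append-only array stored in fixed-size segments. Unlike std::vector,
// growing it never moves elements, so pointers to them stay valid.
template <typename T> class SegmentedArray {
  static_assert(std::is_trivially_copyable<T>::value, "sketch: POD only");
  static_assert(sizeof(T) <= kCacheLine, "element must fit in a block");
  static constexpr std::size_t kPerSegment = kCacheLine / sizeof(T);

  BlockArena &Arena;
  std::vector<T *> Segments;
  std::size_t Size = 0;

public:
  explicit SegmentedArray(BlockArena &A) : Arena(A) {}

  // Returns a pointer to the stored element, or nullptr on failure.
  T *append(const T &V) {
    if (Size == Segments.size() * kPerSegment) {
      void *Block = Arena.allocateBlock();
      if (Block == nullptr)
        return nullptr;
      Segments.push_back(static_cast<T *>(Block));
    }
    T *Slot = new (Segments[Size / kPerSegment] + Size % kPerSegment) T(V);
    ++Size;
    return Slot;
  }

  T &operator[](std::size_t I) {
    assert(I < Size);
    return Segments[I / kPerSegment][I % kPerSegment];
  }

  std::size_t size() const { return Size; }
};
```

The property that matters for a trie built on top of such an array is
that appending never relocates existing elements, so node pointers held
elsewhere (for example on a shadow stack) remain valid.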

Major Component #2: ProfileCollectorService

We also implement a global profile collector that can handle the storage
and serialisation of the FunctionCallTrie instances mapped to particular
threads. This is a globally managed profile collection API which
provides the following functionality:

- `ProfileCollectorService::post(...)`: Stores a copy of a `FunctionCallTrie` instance, associated with a Thread ID.

- `ProfileCollectorService::serialize()`: Serializes all the `FunctionCallTrie` instances into individual buffers.

- `ProfileCollectorService::nextBuffer(...)`: Provides the iterator functionality required for accessing the in-memory buffers through the `__xray_log_process_buffers(...)` API in XRay.
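
The post / serialize / iterate flow described above could look roughly
like the following sketch. It only mirrors the shape of the three
operations: the `sketch::Collector` class, its signatures, the `Record`
and `Buffer` types, the use of a flat vector of records in place of a
`FunctionCallTrie`, and the byte layout of the buffers are all
hypothetical and are not taken from `xray_profile_collector.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

namespace sketch {

// Stand-in for per-thread profile data (the patch uses FunctionCallTrie).
struct Record {
  std::int32_t FuncId;
  std::uint64_t CallCount;
  std::uint64_t CumulativeTime;
};
using ThreadProfile = std::vector<Record>;

// A view over one serialized buffer; {nullptr, 0} means "no buffer".
struct Buffer {
  const void *Data;
  std::size_t Size;
};

class Collector {
  std::mutex Mu;
  std::map<std::uint64_t, ThreadProfile> Posted;
  std::vector<std::vector<unsigned char>> Serialized;

public:
  // Stores a copy of the profile, keyed by thread ID.
  void post(const ThreadProfile &Profile, std::uint64_t ThreadId) {
    std::lock_guard<std::mutex> Lock(Mu);
    Posted[ThreadId] = Profile;
  }

  // Turns every posted profile into its own byte buffer. The layout
  // here is just the raw struct bytes (padding included).
  void serialize() {
    std::lock_guard<std::mutex> Lock(Mu);
    Serialized.clear();
    for (const auto &KV : Posted) {
      const ThreadProfile &Profile = KV.second;
      if (Profile.empty())
        continue;
      std::vector<unsigned char> Bytes(Profile.size() * sizeof(Record));
      std::memcpy(Bytes.data(), Profile.data(), Bytes.size());
      Serialized.push_back(std::move(Bytes));
    }
  }

  // Given the previously returned buffer (or {nullptr, 0} to start),
  // returns the next one, or {nullptr, 0} when there are no more.
  Buffer nextBuffer(Buffer Prev) {
    std::lock_guard<std::mutex> Lock(Mu);
    std::size_t Next = 0;
    if (Prev.Data != nullptr) {
      while (Next < Serialized.size() &&
             Serialized[Next].data() != Prev.Data)
        ++Next;
      ++Next; // step past the previous buffer
    }
    if (Next >= Serialized.size())
      return {nullptr, 0};
    return {Serialized[Next].data(), Serialized[Next].size()};
  }
};

} // namespace sketch
```

A consumer would call `nextBuffer` repeatedly, passing back the buffer
it last received, until it gets an empty buffer.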

We also have a few changes in the APIs and implementations of the
underlying components to make this profile collection service work.

Major Component #3: XRay Profiling Mode

Finally, we wire up the XRay Profiler runtime to collect data through
the FunctionCallTrie type and to export it through an in-memory
interface. We have tests for both single-threaded and multi-threaded
applications.

Future changes will introduce loading and conversion tools for handling
the serialized profile data.


https://reviews.llvm.org/D44620

Files:
  compiler-rt/lib/xray/CMakeLists.txt
  compiler-rt/lib/xray/tests/unit/CMakeLists.txt
  compiler-rt/lib/xray/tests/unit/allocator_test.cc
  compiler-rt/lib/xray/tests/unit/function_call_trie_test.cc
  compiler-rt/lib/xray/tests/unit/profile_collector_test.cc
  compiler-rt/lib/xray/tests/unit/segmented_array_test.cc
  compiler-rt/lib/xray/xray_allocator.h
  compiler-rt/lib/xray/xray_function_call_trie.h
  compiler-rt/lib/xray/xray_profile_collector.cc
  compiler-rt/lib/xray/xray_profile_collector.h
  compiler-rt/lib/xray/xray_profiler.cc
  compiler-rt/lib/xray/xray_profiler_flags.h
  compiler-rt/lib/xray/xray_profiler_flags.inc
  compiler-rt/lib/xray/xray_segmented_array.h
  compiler-rt/test/xray/TestCases/Posix/profiling-multi-threaded.cc
  compiler-rt/test/xray/TestCases/Posix/profiling-single-threaded.cc

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D44620.138879.patch
Type: text/x-patch
Size: 76319 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20180319/789482ca/attachment.bin>

