[llvm-dev] [XRay][RFC] Tooling for XRay Trace Analysis

Tue Sep 6 08:21:13 PDT 2016

(sorry for the delay)

On Tue, Aug 23, 2016 at 1:05 AM Dean Michael Berris <dean.berris at gmail.com>
wrote:

> Hi llvm-dev,
>
> I've been implementing a tool for analysing XRay traces. A recap of XRay's
> original RFC [0] mentions a tool that does function call accounting as a
> starting point. This is implemented currently in D21987 [1], and is being
> reviewed by David Blaikie.
>
> One key issue in that review is the dependency between the log format
> determined by the XRay runtime implementation in compiler-rt [2] and the
> tool reading these log entries.
>
> While it seems obvious that we should document clearly the file format of
> the traces (even supporting different versions) there's a clear dependency
> between the writer (XRay in compiler-rt) and the reader (the tool under
> development in LLVM). In this RFC, I'd like to explore some options
> regarding the coordination of these two moving pieces located in two places
> -- in particular, compiler-rt and the LLVM tools.
>
> # Problem Statement & Background
>
> XRay traces are only as useful as the analysis you can perform on them.
> While it's great to be able to look at stack traces, sometimes basic
> statistics and summaries are more digestible and give a more immediate
> picture of the operations performed by one run of a particular binary (or
> multiple runs of the same binary on different inputs). Recently I've shared
> some initial results of the analysis available in [1] on an instrumented
> build of Clang [3] -- and this is just one example of the kinds of analysis
> possible with the data. However, there's one wrinkle here:
>
>   The analysis should be developed independently of the logging
> implementation.
>
> There are many reasons for doing this, and while it's certainly possible to
> implement a custom logging handler for XRay-instrumented binaries to
> generate some statistics on the fly instead of logging the function calls,
> this increases the cost and friction of getting value out of using XRay.
>
> Given this constraint, here are a few problems:
>
>   - The runtime library and the tools reading the log should have a common
> understanding of the log format. For now we use a naive binary dump log
> file format. We understand that there are platform and encoding issues that
> come with this (endianness being one of them, size of fields being another
> across platforms) but that these could be mitigated with enough metadata at
> the beginning of the files to describe the encoding in a portable
> manner. Still, this is not easy, and more complex schemes impose a
> heavier cost on the runtime implementation.
>   - The analysis tools should be able to read different executable file
> formats -- currently we only support ELF 64-bit. Some analysis tools would
> be far more useful if they could convert the function ids generated by
> the XRay runtime, and having the instrumentation maps from the
> XRay-instrumented executables goes a long way toward converting function
> ids to function pointers, and eventually to demangled function names. This
> means the tool will have to support multiple file formats that
> XRay-instrumented binaries ought to be ported to (COFF, MachO, and ELF).
>   - Having the analysis work on common in-memory (or on-disk) data
> structures ensures maximum applicability. This means even if the log file
> format changes, the analysis should still be able to work as long as we're
> keeping at least the same information in the log required by the analyses.
> For example, a hypothetical tool for generating just a graph of function
> calls encountered in a trace with counts ought to be feasible without being
> tied to the format of the XRay trace being fed to the tool.
>
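
To make the first bullet concrete, here is a minimal sketch of a
self-describing file header. Everything in it is an assumption for
illustration: the `TraceHeader` fields, the magic bytes, and the helper
names are hypothetical and are not the format the XRay runtime writes. It
also fixes the byte order to little-endian instead of recording the
writer's endianness, which is one possible way to sidestep that issue
rather than the scheme described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical header layout; not the actual XRay log format.
struct TraceHeader {
  uint16_t Version;    // lets readers support older formats
  uint8_t AddressSize; // bytes per address field on the writing platform
  uint32_t RecordSize; // bytes per record following the header
};

constexpr uint8_t Magic[4] = {'X', 'R', 'A', 'Y'}; // hypothetical magic
constexpr size_t HeaderSize = 4 + 2 + 1 + 4;

// Append the low N bytes of V in little-endian order, regardless of host.
static void putLE(std::vector<uint8_t> &Out, uint64_t V, unsigned N) {
  for (unsigned I = 0; I < N; ++I)
    Out.push_back(static_cast<uint8_t>(V >> (8 * I)));
}

// Read N little-endian bytes starting at Offset.
static uint64_t getLE(const std::vector<uint8_t> &In, size_t Offset,
                      unsigned N) {
  uint64_t V = 0;
  for (unsigned I = 0; I < N; ++I)
    V |= static_cast<uint64_t>(In[Offset + I]) << (8 * I);
  return V;
}

std::vector<uint8_t> writeHeader(const TraceHeader &H) {
  std::vector<uint8_t> Out(Magic, Magic + sizeof(Magic));
  putLE(Out, H.Version, 2);
  putLE(Out, H.AddressSize, 1);
  putLE(Out, H.RecordSize, 4);
  return Out;
}

// Returns false if the buffer is too short or the magic does not match.
bool readHeader(const std::vector<uint8_t> &In, TraceHeader &H) {
  if (In.size() < HeaderSize)
    return false;
  for (size_t I = 0; I < sizeof(Magic); ++I)
    if (In[I] != Magic[I])
      return false;
  H.Version = static_cast<uint16_t>(getLE(In, 4, 2));
  H.AddressSize = static_cast<uint8_t>(getLE(In, 6, 1));
  H.RecordSize = static_cast<uint32_t>(getLE(In, 7, 4));
  return true;
}
```

The byte-at-a-time writer costs a little more than a raw struct dump,
which is the runtime cost trade-off the bullet mentions.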

This last requirement is the bit I'm slightly confused by and trying to
better understand. I could picture tools as taking a dependency on some
LLVM API for reading the original, platform-specific, binary format. This
would make the tool neutral to versioning and target.

But I take it you mean (as detailed later) to have a separate format (could
be a portable binary format, but currently discussing it as JSON/YAML/etc)
that things are converted from that makes them portable?

One of the reasons, I think you mentioned, is that while the log is already
a separate file, you really want the instrumentation map along with it, and
that's in the whole binary which you probably don't need. Am I following
correctly?

Should this extraction then be an extract and merge? (creating a file
containing a log and instrumentation map together in this generic format?)
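
As a rough sketch of why the instrumentation map needs to travel with the
log: resolving a function id from a trace record takes two lookups, id to
address and address to symbol. The `InstrMap` and `SymbolTable` types and
the `resolve` helper below are hypothetical stand-ins, not existing LLVM
APIs, and the fallback strings are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical: function id -> function address, as extracted from the
// instrumentation map of an XRay-instrumented binary.
using InstrMap = std::map<int32_t, uint64_t>;

// Hypothetical: function start address -> symbol name, as read from the
// binary's symbol table.
using SymbolTable = std::map<uint64_t, std::string>;

// Resolve a function id to a name, falling back to the address or the raw
// id when a lookup fails.
std::string resolve(int32_t FuncId, const InstrMap &IM,
                    const SymbolTable &Syms) {
  auto AddrIt = IM.find(FuncId);
  if (AddrIt == IM.end())
    return "<id:" + std::to_string(FuncId) + ">";
  auto SymIt = Syms.find(AddrIt->second);
  if (SymIt == Syms.end())
    return "<addr:" + std::to_string(AddrIt->second) + ">";
  return SymIt->second;
}
```

A merged file would only need to carry the first map (and optionally the
names), not the whole binary.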

>
> More concisely:
>
>   1. We ought to be able to share log writer/reader code between LLVM and
> compiler-rt.
>   2. Converting the trace format from platform-specific to
> platform-agnostic (and vice versa) ought to be possible.
>   3. The tooling ought to be extensible with more analysis implementations
> without being tied to the log format.
>
> # Proposed Solution
>
> In [1] I've gone ahead and implemented a tool, currently named 'llvm-xray'
> which supports sub-commands to do the following:
>
>   - `llvm-xray extract <xray-instrumented binary>` : Converts the
> xray_instr_map in the binary into more human- and machine-readable
> text (currently does JSON, but I understand YAML is already supported
> better in LLVM).
>   - `llvm-xray account <xray trace> -m <xray-instrumented binary>` :
> Performs function call accounting with basic statistics.
>
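
For readers unfamiliar with what "function call accounting" might involve,
here is a minimal sketch over an in-memory trace. The `Record` layout, the
`FuncStats` fields, and the `account` function are assumptions made for
this example; they are not taken from D21987. Durations are inclusive
(time in callees counts toward the caller), and exits that do not match
the top of the per-thread stack are simply skipped.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

enum class RecordKind { Enter, Exit };

// Hypothetical in-memory trace record.
struct Record {
  uint32_t ThreadId;
  int32_t FuncId;
  RecordKind Kind;
  uint64_t TSC; // timestamp counter value at the event
};

struct FuncStats {
  uint64_t Count = 0;
  uint64_t TotalCycles = 0;
  uint64_t MinCycles = UINT64_MAX;
  uint64_t MaxCycles = 0;
};

// Pair each exit with the matching entry on the same thread and accumulate
// per-function call counts and inclusive durations.
std::map<int32_t, FuncStats> account(const std::vector<Record> &Trace) {
  std::map<int32_t, FuncStats> Stats;
  // Per-thread stack of (function id, entry timestamp).
  std::map<uint32_t, std::vector<std::pair<int32_t, uint64_t>>> Stacks;
  for (const Record &R : Trace) {
    auto &Stack = Stacks[R.ThreadId];
    if (R.Kind == RecordKind::Enter) {
      Stack.emplace_back(R.FuncId, R.TSC);
      continue;
    }
    // Skip unmatched exits, e.g. from a truncated trace.
    if (Stack.empty() || Stack.back().first != R.FuncId)
      continue;
    uint64_t Cycles = R.TSC - Stack.back().second;
    Stack.pop_back();
    FuncStats &S = Stats[R.FuncId];
    ++S.Count;
    S.TotalCycles += Cycles;
    S.MinCycles = std::min(S.MinCycles, Cycles);
    S.MaxCycles = std::max(S.MaxCycles, Cycles);
  }
  return Stats;
}
```

Because the function only sees `Record` values, it does not care which
on-disk format they were read from.
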
> In the near future, we're looking to extend this tool to have the
> following (and similar) functionality:
>
>   - `llvm-xray dump <xray trace> -format={yaml,json,...}` : Takes an xray
> trace and turns it into some human-readable text format.
>   - `llvm-xray ingest <xray trace> -input-format={yaml,json...}` : Takes
> an xray trace in some human-readable text format, turns it into the binary
> format.
>
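
To illustrate the dump/ingest pairing, here is a small round-trip sketch.
It uses a made-up one-record-per-line text form as a stand-in for the
YAML/JSON discussed above, and `ingest` stops at in-memory records rather
than writing the binary format. The `Record` struct and both function
names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory trace record.
struct Record {
  uint32_t ThreadId;
  int32_t FuncId;
  bool IsEnter;
  uint64_t TSC;
};

// Write one line per record: "<thread> <func> <enter|exit> <tsc>".
std::string dump(const std::vector<Record> &Trace) {
  std::ostringstream OS;
  for (const Record &R : Trace)
    OS << R.ThreadId << ' ' << R.FuncId << ' '
       << (R.IsEnter ? "enter" : "exit") << ' ' << R.TSC << '\n';
  return OS.str();
}

// Parse the text form back into records; returns false on a malformed line.
bool ingest(const std::string &Text, std::vector<Record> &Out) {
  std::istringstream IS(Text);
  std::string Line;
  while (std::getline(IS, Line)) {
    if (Line.empty())
      continue;
    std::istringstream LS(Line);
    Record R{};
    std::string Kind;
    if (!(LS >> R.ThreadId >> R.FuncId >> Kind >> R.TSC))
      return false;
    if (Kind != "enter" && Kind != "exit")
      return false;
    R.IsEnter = (Kind == "enter");
    Out.push_back(R);
  }
  return true;
}
```

A regression test can then check that ingesting the output of `dump`
reproduces the original records.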

What's the need for this direction? Only for LLVM test purposes? Other
reasons?

For DWARF, for example, we just generate DWARF from existing code and test
that, rather than having a separate/independent format for generating DWARF
more directly. But we don't have complicated DWARF tools (llvm-dwp is as
close as we get, and in that case I just checked in binary object files
along with the source used to create them).

This is certainly an area of discussion, with tools like lld taking a few
different approaches (including a YAML format for specifying object files,
or using assembly files and just assembling them on the fly in test cases,
or checking in binary object files). So there's no clear pre-existing
answer in LLVM for this situation.

>   - `llvm-xray stack <xray trace> -input-format=... -format=...` :
> Recreates stack traces from an xray trace.
>   - `llvm-xray graph <xray trace> -input-format=...` : Creates a graph (in
> dot format) of the function call interactions from the trace file.
>
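
As a sketch of what the 'graph' sub-command could compute, the code below
rebuilds per-thread call stacks from entry/exit records, counts
caller-to-callee edges, and prints them in dot format. The `Record`
struct, the `callEdges` and `toDot` names, and the `f<id>` node naming are
all assumptions for illustration, not the planned implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory trace record.
struct Record {
  uint32_t ThreadId;
  int32_t FuncId;
  bool IsEnter;
};

using Edge = std::pair<int32_t, int32_t>; // (caller id, callee id)

// Track a call stack per thread; every entry made while another function
// is on the stack counts as one caller -> callee edge.
std::map<Edge, uint64_t> callEdges(const std::vector<Record> &Trace) {
  std::map<Edge, uint64_t> Edges;
  std::map<uint32_t, std::vector<int32_t>> Stacks;
  for (const Record &R : Trace) {
    std::vector<int32_t> &Stack = Stacks[R.ThreadId];
    if (R.IsEnter) {
      if (!Stack.empty())
        ++Edges[Edge(Stack.back(), R.FuncId)];
      Stack.push_back(R.FuncId);
    } else if (!Stack.empty() && Stack.back() == R.FuncId) {
      Stack.pop_back();
    }
  }
  return Edges;
}

// Render the edges as a dot digraph with call counts as edge labels.
std::string toDot(const std::map<Edge, uint64_t> &Edges) {
  std::ostringstream OS;
  OS << "digraph calls {\n";
  for (const auto &E : Edges)
    OS << "  f" << E.first.first << " -> f" << E.first.second
       << " [label=\"" << E.second << "\"];\n";
  OS << "}\n";
  return OS.str();
}
```

The stack reconstruction in `callEdges` is the same step a 'stack'
sub-command would need.
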
> This allows us to do a few things:
>
>   1. When testing xray in compiler-rt, use the "dump" tool to inspect the
> contents of the log generated from xray-instrumented binaries.

Might be worth considering whether dumping for testability should be
YAML/JSON or something else. (the DWARF and object dumping used in LLVM
isn't in any such format - it's just a format designed for humans which
works well enough for our FileCheck testing, etc)

But if we need a format change to feed it into other tools, then yes -
testing on that format (rather than having a JSON/YAML then a separate
dumping format) makes sense. I'm just trying to separate out the different
requirements and what implications they have on the design, etc.

> Similarly be able to synthesise xray binary traces in llvm lit tests using
> "ingest".
>   2. Extend the tool with more functionality without having to be gated on
> the definition of and/or implementation of the trace format. Since we can
> define the reader and writer implementation in one place, we can use the
> tool to enforce the format in regression tests (and as we evolve the format
> further, be able to support backward compatibility).
>
> # Proposed Plan of Action
>
> If the proposed solution is acceptable, the proposed plan of action is as
> follows (in chronological order):
>
>   0. Break up [1] into smaller pieces, starting with the base llvm-xray
> tool that literally "does nothing".
>   1. Implement the 'dump' and 'ingest' sub-commands as a single patch,
> with defined tests.
>   2. Update the logging implementation in [2] to use the 'dump'
> sub-command to test that entries in the log are what we expect them to be.
>   3. Implement the 'account' sub-command with tests seeded with data in
> lit tests.
>   4. Implement the 'stack' sub-command with tests seeded with data similar
> to #3.
>   5. Implement the 'graph' sub-command similar to #3.
>
> Note that we do not actually solve the issue of sharing the log
> writer/reader code between LLVM and compiler-rt directly, but rather we
> sidestep this in the meantime using the tool.
>
> # Open Questions
>
> - Is it possible to define the writer code in LLVM and have the
> compiler-rt implementation depend on it? I hear that this is going to be
> useful for something like the profiling library in compiler-rt too, so that
> the readers and writer implementations are both in LLVM. What are the
> technical roadblocks there, and in your opinion is this something worth
> fixing/enabling?
>

Sounds like other people have some ideas on that mentioned in the thread -
again, not an area I'm especially familiar with.

> - What is the preferred human-readable text file format to support in
> LLVM? I understand that there's already code to support parsing YAML, so
> this might be an obvious choice. OTOH JSON is really popular and there are
> a lot of parsers in other languages that can already deal with this file
> format. I'm happy to support both but was wondering whether there was a
> preference for YAML aside from the reason I already cited?
>

I really don't have much/any context here to make a judgement - I've
vaguely seen the existing YAML usage & know there was/is some in LLD, maybe
some being used over in the codeview debug info support (for generating
codeview debug info).

> - This proposal only talks of the tool itself, but the implementation of
> the tool involves some moving parts that are worth implementing as
> libraries and tested in isolation (or in combinations, some mocked and
> faked, etc). I'm a fan of writing unit tests for these things but I don't
> see a unittests/tools directory for these tool-specific internals testing.
> Is this something worth having? Any pointers on how to proceed with this
> unit-testing of tool-specific internals?
>

Generally we make the tools small and put any generically usable code in
libraries in LLVM (see libDebugInfo which was used for quite a while (&
parts of it still are) exclusively for llvm-dwarfdump (some parts are now
used in llvm-dwp and llvm-dsymutil)).

So if there's some reasonable library code you could put it in LLVM's lib
directory in an appropriate spot. Or you can add unit tests for a tool; I
don't think there's any philosophical reason to avoid that.

- Dave

>
> Cheers
>
> -- Dean
>
> [0] http://lists.llvm.org/pipermail/llvm-dev/2016-April/098901.html
> [1] https://reviews.llvm.org/D21987
> [2] https://reviews.llvm.org/D21982
> [3] http://lists.llvm.org/pipermail/llvm-dev/2016-July/102552.html