[PATCH] D24376: [XRay] Implement `llvm-xray convert` -- trace file conversion

Wed Nov 23 20:33:08 PST 2016

dberris added a comment.

In https://reviews.llvm.org/D24376#604995, @dblaikie wrote:

> In https://reviews.llvm.org/D24376#604864, @dberris wrote:
>
> > The other half is the new log format (still binary) is not conducive to this simple approach of loading up records with full information. The FDR mode log format is substantially different, in that per-thread buffers are chunked up into fixed-sized blocks and each record might be a metadata record (16 bytes) or a function record (8 bytes). That log has all sorts of information consolidated in very different ways (e.g. the thread ID is stored once, as a metadata record at the beginning of a buffer, CPU id's only show up when the CPU is first encountered, we store TSC deltas instead of full TSCs in function records), and is not conducive to just "load in memory, read records one by one".
> >
> > So then, the conversion tool ought to be able to convert the FDR mode logs into this simpler log format that we already support in this tool -- then let all the other tools upcoming (accounting, timeline visualisation, etc.) just use the simpler log format.
>
>
> I'm still a bit confused though - why not just convert to the yaml format at that point? (& why convert at all - if we're talking about other tools built on top of LLVM's library (tools built outside of that would presumably only rely on the YAML?) would use a common parser/reader API that could handle all formats, right? Rather than having the user run a conversion tool, then the tool they want to use)

YAML is a little big and verbose -- the blow-up from the binary format to YAML is a factor of at least 10. In practical terms, dealing with the binary format in terms of storage and transport is a good option.

YAML is convenient for human consumption, and if users ever build tools that use scripting languages or other things already deal with YAML (or JSON, or some other text format) then that's the convenient path to take.

Maybe the solution here is to move the log reading part into something that's part of the LLVM library, but that still has the problem that reconstruction of a consistent timeline representing events on multiple threads is much more complex than one that just takes a linear "simple" binary file. Then the XRay log file that has non-fixed-size records and more complex encoding ought to be converted to the simplified form.

So the compromise/trade-off being made here is that we localize the knowledge of dealing with the different XRay binary log formats to this tool, that can turn it into "the simplest XRay log format", that can be read by simpler libraries for consumption. Given that, and the plan of being able to convert these XRay log files into other non-XRay specific formats (Chrome trace viewer format which is JSON, or the perftools utilities, etc.) is a huge convenience. This convenience is not only for testing purposes but also for building and integrating with other existing tools that deal with all sorts of traces.

Does this make sense?

https://reviews.llvm.org/D24376