[PATCH] D21987: [XRay] Implement `llvm-xray extract`, start of the llvm-xray tool

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Fri Sep 9 09:03:20 PDT 2016


dblaikie added a comment.

> > Still some outstanding design issues to discuss re: file formats, conversions, portability, etc. But once that's done.
> >
> > I'm still concerned about reading/writing the binary format using memory mapping/splatting - I think the tools should be portable (in the same way that LLVM/Clang are - we don't build cross compilers, we just build tools that can read/write from/to any format on any platform), but even that's an implementation detail to a degree (unless making the tools that generic removes the need for the separate portable format/conversion tools).
>
> That's a tricky one. The problem with _not_ doing this memory-dump format is the cost of doing it in the XRay runtime.


Ah, sorry - I should clarify/be more specific.

I don't have so much of a problem with the writing side - I appreciate there are performance concerns there, though I wouldn't mind understanding them better: how important is performance once XRay features are turned on? (Obviously having XRay be as close to zero-cost as possible when not in use is important.) And what sort of cost do you see if the runtime did write a consistent record format - copying integers into known-width types, converting to a fixed endianness?
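To make the cost question concrete, here's roughly the kind of write I mean (purely illustrative - the record layout and names like writeRecord/FuncId/TSC are made up, and this isn't the actual XRay runtime code): known-width fields, always little-endian, done with plain shifts so the runtime wouldn't need any extra library dependency:

  // Illustrative only: a "consistent record format" write - fixed-width
  // fields, fixed (little) endianness, regardless of the host.
  #include <cstdint>

  static inline void writeLE32(uint8_t *P, uint32_t V) {
    P[0] = V & 0xff;
    P[1] = (V >> 8) & 0xff;
    P[2] = (V >> 16) & 0xff;
    P[3] = (V >> 24) & 0xff;
  }

  static inline void writeLE64(uint8_t *P, uint64_t V) {
    for (int I = 0; I < 8; ++I)
      P[I] = (V >> (8 * I)) & 0xff;
  }

  // Hypothetical record: the per-record cost over the raw
  // struct-memcpy approach is a handful of shifts and byte stores.
  static void writeRecord(uint8_t *Buf, int32_t FuncId, uint64_t TSC) {
    writeLE32(Buf, static_cast<uint32_t>(FuncId));
    writeLE64(Buf + 4, TSC);
  }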

But what I'm thinking about is whether the tool could read the input no matter which platform it's running on - each architecture would have a different, but known, format (it already does, really - it's just defined more implicitly in terms of "what I can read from memory with this struct overlay"). The reader could then do all the "read this many bytes of integer, switch the endianness from whatever was written to whatever this process needs" work. That way the output file becomes portable without changing the bits or incurring extra cost during the writing step.
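On the reading side, a sketch of what I mean (again with a hypothetical record layout - the point is just that the tool decodes explicit widths and endianness based on what the file says produced it, so it runs on any host), using the llvm/Support/Endian.h helpers:

  // Sketch only: decode a fixed-layout record for whichever endianness
  // the producing architecture used. The Record fields are made up.
  #include "llvm/Support/Endian.h"
  #include <cstdint>

  struct Record {
    int32_t FuncId;
    uint64_t TSC;
  };

  static Record readRecord(const uint8_t *P, bool InputIsLittleEndian) {
    using namespace llvm::support::endian;
    Record R;
    if (InputIsLittleEndian) {
      R.FuncId = static_cast<int32_t>(read32le(P));
      R.TSC = read64le(P + 4);
    } else {
      R.FuncId = static_cast<int32_t>(read32be(P));
      R.TSC = read64be(P + 4);
    }
    return R;
  }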

Having the tools only work on the architecture where the file was generated seems especially tricky/not-great - and I don't think we have anything like that in LLVM today. llvm-objdump works on COFF files for ARM even when it's run on x86 Linux, etc. (or should do, if it supports COFF - I haven't looked closely...).

> We try to be as cheap as possible there (in compiler-rt) so that we don't spend too much time just formatting the log when the process is writing it down to disk. The tradeoff we've made here consciously is that any conversion should happen externally -- i.e. the tool should be able to read this format, and then turn it into something more manageable offline. That's a very conscious trade-off here because we can't afford to use too much time or too much memory on the compiler-rt implementation.

> > Higher level design/coding style stuff:
> >
> > Might be worth pulling out each of the subcommands into their own utility function/file along with the command line arguments. Though I can see an argument (har har) to be made for keeping all the arguments in the same place - but maybe subcommand arguments make sense to be grouped together along with the functionality, separate from the main file.
>
> I thought about that, then it started to feel over-engineered -- since I'd have to figure out how to dispatch anyway from the main file, I still need to know which command to call based on which subcommand, this seemed like the more prudent thing to do.


Not sure I quite follow - I was thinking of something fairly simple: just taking the contents of the "if (Extract)" block in main and pulling it into a function with whatever parameters it needs (probably not many, since they're mostly subcommand arguments anyway). Those arguments could either be moved over to that other file (the implementation of extract) or left here and passed in - though that might be a lot of argument passing.
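Something along these lines is all I meant (a sketch, not the actual patch - names like ExtractCmd, runExtract, and InputFile are placeholders), with cl::SubCommand keeping the extract options and logic next to each other:

  // Conceptually xray-extract.cpp: the subcommand, its options, and its
  // implementation grouped together.
  #include "llvm/Support/CommandLine.h"
  #include "llvm/Support/Error.h"
  #include "llvm/Support/raw_ostream.h"
  using namespace llvm;

  static cl::SubCommand ExtractCmd("extract", "Extract instrumentation maps");
  static cl::opt<std::string> InputFile(cl::Positional,
                                        cl::desc("<input binary>"),
                                        cl::Required, cl::sub(ExtractCmd));

  static Error runExtract() {
    // ... read InputFile and dump its instrumentation map ...
    outs() << "extracting from " << InputFile << "\n";
    return Error::success();
  }

  // Conceptually llvm-xray.cpp: main() just parses and dispatches.
  int main(int argc, char **argv) {
    cl::ParseCommandLineOptions(argc, argv, "XRay tools\n");
    if (ExtractCmd) {
      if (Error E = runExtract()) {
        logAllUnhandledErrors(std::move(E), errs(), "llvm-xray: ");
        return 1;
      }
    }
    return 0;
  }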


https://reviews.llvm.org/D21987