<div dir="ltr"><div>I'm all for anything that allows people to test without having to use pre-canned binaries. I'm not particularly familiar with the minidump format, so I'm not sure what the best place for code relating to it would be, but I do agree that extending yaml2obj sounds like a good idea. From what you say, minidumps don't sound like they'd fit the ObjectFile class well, so I don't see an issue with a new MinidumpFile class, if it will work well with how yaml2obj is currently written.</div><div><br></div><div>James<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 6 Mar 2019 at 14:00, Pavel Labath <<a href="mailto:labath@google.com" target="_blank">labath@google.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hello all,<br>

<br>

yesterday I sent an email<br>

<<a href="http://lists.llvm.org/pipermail/lldb-dev/2019-March/014811.html" rel="noreferrer" target="_blank">http://lists.llvm.org/pipermail/lldb-dev/2019-March/014811.html</a>> to<br>

lldb-dev proposing a new tool in lldb for yamlization of minidump files.<br>

It's been suggested to me that instead of a new tool it may be better to<br>

add support for that format to obj2yaml. Hence, this email. :)<br>

<br>

As I expect most people are unfamiliar with this format, I'm going to<br>

start off with a brief introduction.<br>

<br>

Minidump is the native "core file" format for Windows systems. However,<br>

it is widely used on other systems too. Probably the most popular tools<br>

producing this format are the Google "breakpad" and "crashpad" crash<br>

reporting systems. LLDB has had support for this format since 2016, when it<br>

was added as a GSoC project by Dimitar Vlahovski. It is currently in active<br>

use and development by several lldb contributors.<br>

<br>

The format itself is fairly simple and extensible. The file starts off<br>

with a header containing some basic info and a collection of "streams".<br>

Each stream contains various types of information about the state of the<br>

process at the time when the snapshot (minidump) was taken. This<br>

includes information such as:<br>

- list of loaded modules<br>

- list of threads<br>

- chunks of process memory<br>

- etc.<br>

<br>
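As a rough illustration of the layout described above, here is a minimal<br>
Python sketch that reads the header and the stream directory. The field<br>
layout follows the public MINIDUMP_HEADER and MINIDUMP_DIRECTORY structures;<br>
the names list_streams and build_example are made up for this example and<br>
are not part of lldb or llvm.<br>

```python
import struct

# MINIDUMP_HEADER (32 bytes, little-endian): Signature, Version,
# NumberOfStreams, StreamDirectoryRva, CheckSum, TimeDateStamp, Flags.
HEADER = struct.Struct("<4s5IQ")
# One MINIDUMP_DIRECTORY entry: StreamType, DataSize, Rva.
DIRECTORY_ENTRY = struct.Struct("<3I")

# A few well-known stream type values, for readability only.
STREAM_NAMES = {3: "ThreadList", 4: "ModuleList", 5: "MemoryList", 7: "SystemInfo"}


def list_streams(data):
    """Return (stream_type, file_offset, size) for each directory entry."""
    signature, _version, count, directory_rva, *_ = HEADER.unpack_from(data, 0)
    if signature != b"MDMP":
        raise ValueError("not a minidump file")
    streams = []
    for index in range(count):
        entry_offset = directory_rva + index * DIRECTORY_ENTRY.size
        stream_type, size, rva = DIRECTORY_ENTRY.unpack_from(data, entry_offset)
        streams.append((stream_type, rva, size))
    return streams


def build_example():
    """Build a tiny synthetic file: header, one directory entry, 4 data bytes."""
    payload = struct.pack("<1I", 0)  # a thread list with NumberOfThreads = 0
    directory_rva = HEADER.size
    payload_rva = directory_rva + DIRECTORY_ENTRY.size
    header = HEADER.pack(b"MDMP", 0xA793, 1, directory_rva, 0, 0, 0)
    entry = DIRECTORY_ENTRY.pack(3, len(payload), payload_rva)
    return header + entry + payload


if __name__ == "__main__":
    for stream_type, offset, size in list_streams(build_example()):
        name = STREAM_NAMES.get(stream_type, "Unknown")
        print(f"{name}: offset={offset} size={size}")
```

The sketch only walks the directory; decoding the contents of each stream<br>
(modules, threads, memory ranges) would need a separate structure per type.<br>

<br>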

The problem I'm trying to solve right now is how to write tests for this<br>

functionality. We currently don't have any tool which could create<br>

minidump files from human-readable descriptions of them, so our tests<br>

are relying on checking in opaque binary blobs. This makes reviewing the<br>

changes hard and also complicates creating test cases (real-world<br>

minidumps tend to be large). In other words, we are missing a tool like<br>

yaml2minidump.<br>

<br>
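To make the idea concrete, here is a minimal Python sketch of the "write"<br>
direction: it turns a small human-readable description into minidump bytes.<br>
The description format (a list of dicts with "Type" and "Content" keys) is<br>
purely hypothetical and stands in for whatever YAML schema would eventually<br>
be chosen; write_minidump is likewise an invented name.<br>

```python
import struct

# MINIDUMP_HEADER: Signature, Version, NumberOfStreams, StreamDirectoryRva,
# CheckSum, TimeDateStamp, Flags (32 bytes, little-endian).
HEADER = struct.Struct("<4s5IQ")
# MINIDUMP_DIRECTORY entry: StreamType, DataSize, Rva.
DIRECTORY_ENTRY = struct.Struct("<3I")


def write_minidump(streams):
    """Serialize a list of {"Type": int, "Content": hex string} descriptions.

    Assumption for this sketch: stream contents are given as raw hex and are
    laid out back to back after the directory, with no alignment padding.
    """
    directory_rva = HEADER.size
    data_rva = directory_rva + len(streams) * DIRECTORY_ENTRY.size
    directory = b""
    body = b""
    for stream in streams:
        content = bytes.fromhex(stream["Content"])
        # Each directory entry records where its stream's bytes will land.
        directory += DIRECTORY_ENTRY.pack(
            stream["Type"], len(content), data_rva + len(body)
        )
        body += content
    header = HEADER.pack(b"MDMP", 0xA793, len(streams), directory_rva, 0, 0, 0)
    return header + directory + body


# Hypothetical description: an empty thread list and an empty memory list.
description = [
    {"Type": 3, "Content": "00000000"},
    {"Type": 5, "Content": "00000000"},
]

if __name__ == "__main__":
    print(write_minidump(description).hex())
```

A test could then check in the short description instead of a binary blob,<br>
and generate the minidump at test time.<br>

<br>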

=== end of introduction ===<br>

<br>

While we could create an lldb tool for converting between minidump and<br>

yaml files, there is some appeal in making everything available from a<br>

single tool (i.e., yaml2obj). The main obstacle to that is that there is<br>

currently no support for parsing these files in llvm, and apart from<br>

yaml2obj, it's not clear to me whether any other llvm tool/project would<br>

benefit from this functionality being available in the main llvm<br>

project. For example, tools like llvm-readelf have support for ELF core<br>

files, but this is mostly a byproduct of the fact that ELF core files<br>

are similar to ELF executables. However, there is no "executable" form<br>

of minidumps.<br>

<br>

So I am asking this question: Do you think having minidump parsing code<br>

in llvm is a good idea?<br>

<br>

To give you an idea of what this involves, the current minidump parser<br>

in lldb is about 2000 LOC. It's already fairly independent of the rest<br>

of lldb, though it would need to be cleaned up a bit to be up to llvm<br>

standards. My expectation is that the yaml conversion code would add<br>

another 1-2 kLOC.<br>

<br>

The natural place for this in llvm would seem to be the Object library,<br>

so I'd propose for this code to be placed there. The thing I'm not sure<br>

about is whether it makes sense to integrate this into the existing<br>

ObjectFile hierarchy. While the minidump "streams" could be represented<br>

as sections, I'm not sure we'd be doing anyone a favour by doing that.<br>

The ObjectFile sections assume they are referring to sections in regular<br>

object files, which have things like relocations, symbol lists, etc., and<br>

minidump streams have none of those. Therefore I'm leaning towards the<br>

option of just implementing this as a standalone MinidumpFile class.<br>

This would be kind of similar to the existing ELFFile class, only there wouldn't<br>

be an ELFObjectFile sitting on top of that.<br>

<br>

Please let me know what you think,<br>

pavel<br>

</blockquote></div>