[lldb-dev] new tool (core2yaml) + a new top-level library (Formats)

Tue Mar 5 13:47:26 PST 2019

Hi Pavel,

On Tue, Mar 5, 2019 at 8:31 AM Pavel Labath via lldb-dev <
lldb-dev at lists.llvm.org> wrote:

> Hello all,
>
> I have just posted a large-ish patch series for review (D58971, D58973,
> D58975, D58976), and I want to use this opportunity to draw more
> attention to it and highlight various bikeshedding
> opportunities^H^H^Htopics for discussion :).
>
> The new tool is called core2yaml, and its goal is to fill the gap in
> the testing story for core files. As you might know, at present, the
> only way to test core file parsing code (*) is to check in an opaque
> binary blob and have the debugger open that. This presents a couple of
> challenges:
> - it's really hard to review what is inside the core file
> - one has to jump through various hoops to create a "small" core file
> This tool fixes both issues by enabling one to check in text files,
> with human-readable content. The yaml files can also be easily edited to
> prune out the content which is not relevant for the test. While that's
> not my goal at present, I am hoping that this will one day enable us to
> write self-contained tests for the unwinder, as the core file can be
> used to synthesize (or capture&reduce) interesting unwinder scenarios.
>
> Since I also needed to find a home for the new code I was writing, I
> thought this would be a good opportunity to create a new library for
> various stuff. The goals I was trying to achieve are:
> - make the yaml code a library. The reason for that is that we have a
> number of unittests using checked in binaries, and I thought it would be
> nice to be able to convert those to use yaml representation as well.
> - make the existing minidump parsing code more easily accessible. The
> parsing code currently lives in source/Plugins/Process/minidump, and it
> is impossible to use without pulling in the rest of lldb (which the tool
> doesn't need).
> The solution I came up with here is a new "Formats" library. I chose a
> fairly generic name, because I realized that we have code for
> (de)serializing a bunch of small formats, which don't really have a good
> place to live in. Currently I needed a parser for linux /proc/PID/maps
> files and minidump files, but I am hoping that a generic name would
> enable us to one day move the gdb-remote protocol code there (which is
> also currently buried in some plugin code, which makes it hard to depend
> on from lldb-server), as well as the future debug-info-server, if it
> ever comes into existence.
>
> Discussion topic #1: The library name and scope.
> There are lots of other ways this could be organized. One of the names I
> considered was "BinaryFormat" for symmetry with llvm, but then I chose
> to drop the "Binary" part as it seemed to me we have plenty of
> non-binary formats as well. As for its dependencies I currently have it
> depending on Utility and nothing else (as far as lldb libraries go). I
> can imagine using some Host code might be useful there too, but I would
> like to avoid any other lldb dependencies right now. Another question is
> whether this should be a single library or a bunch of smaller ones. I
> chose a single library now because the things I initially plan to put
> there are fairly small (/proc/pid/maps parser is 200 LOC), but I can see
> how we may want to create sub-libraries for things that grow big (the
> debug-info server code might turn out to be one of those) or that have
> some additional dependencies.
>
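
(A note for readers unfamiliar with the /proc/PID/maps format mentioned above: each line has the form "start-end perms offset dev inode [path]". The sketch below shows roughly what parsing one such line involves. It is not the code from the patch series; the MapsEntry struct and the parseMapsLine function are hypothetical names made up for illustration.)

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical record for one line of /proc/PID/maps (not lldb's type).
struct MapsEntry {
  uint64_t Start = 0;
  uint64_t End = 0;
  uint64_t Offset = 0;
  std::string Perms; // e.g. "r-xp"
  std::string Path;  // empty for anonymous mappings
};

// Parses "start-end perms offset dev inode [path]".
// Returns false if the mandatory fields cannot be read.
bool parseMapsLine(const std::string &Line, MapsEntry &E) {
  std::istringstream In(Line);
  char Dash = 0;
  std::string Dev;
  uint64_t Inode = 0;

  // Addresses and the offset are hexadecimal; the inode is decimal.
  In >> std::hex >> E.Start >> Dash >> E.End >> E.Perms >> E.Offset >> Dev >>
      std::dec >> Inode;
  if (!In || Dash != '-')
    return false;

  // The path is optional and may contain spaces, so take the rest of the line.
  E.Path.clear();
  std::getline(In >> std::ws, E.Path);
  return true;
}
```

A caller would feed it one line at a time, for example "00400000-0040b000 r-xp 00000000 fd:01 1234 /bin/cat", and check the return value before using the entry.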

I don't have strong opinions here, nor do I have a better suggestion for
the name.

> Discussion topic #2: tool name and scope
> A case could be made to integrate this functionality into the llvm
> yaml2obj utilities. Here I chose not to do that because the minidump
> format is not at all implemented in llvm, and I do not see a use case
> for it to be implemented/moved there. A stronger case could be made to
> put the elf core code there, since llvm already supports reading elf
> files. While originally being in favour of that, I eventually adopted
> the view that doing this in lldb would be better because:
> - it would bring more symmetry with minidumps
> - it would enable us to do fine-grained yamlization for things that we
> care about (e.g., registers), which is something that would probably be
> uninteresting to the rest of llvm.
>

I don't know much about the minidump format or code, but it sounds
reasonable to me to have support for it in yaml2obj, which would be a
sufficient motivation to have the code live there. As you mention in your
footnote, MachO core files are already supported, and it sounds like ELF
could reuse a bunch of existing code as well. So having everything in LLVM
would give you even more symmetry. I also doubt anyone would mind having
more fine-grained yamlization: even if you cannot use it to reduce a test,
it's nicer to see structure than a binary blob (imho). Anyway, that's just
my take, I guess this is more of a question for the LLVM mailing list.

> Discussion topic #3: Use of .def files in lldb. In one of the patches I
> create a .def textual header to be used for avoiding repetitive code
> when dealing with various constants. This is fairly common practice in llvm,
> but would be a first in lldb.
>

I think this is a good idea. Although not exactly the same, we already got
our feet wet with a tablegen file in the driver.
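
For anyone who hasn't seen the pattern, here is a minimal sketch of the .def ("X-macro") idea. It is not taken from the patch: the names EXAMPLE_STREAM_TYPES, HANDLE_STREAM, StreamType and streamTypeName are made up for illustration, and the list is kept in a macro so that the example fits in one file. In a real tree the list would live in its own .def header and be #included at each expansion site.

```cpp
#include <cstdint>
#include <string>

// Stand-in for the contents of a hypothetical .def file: one entry per
// constant. The names and values here are illustrative only.
#define EXAMPLE_STREAM_TYPES(HANDLE)                                           \
  HANDLE(ThreadList, 3)                                                        \
  HANDLE(ModuleList, 4)                                                        \
  HANDLE(MemoryList, 5)

// First expansion: generate the enumerators.
enum class StreamType : uint32_t {
#define HANDLE_STREAM(NAME, VALUE) NAME = VALUE,
  EXAMPLE_STREAM_TYPES(HANDLE_STREAM)
#undef HANDLE_STREAM
};

// Second expansion: generate the enum-to-string mapping from the same list,
// so adding a constant only requires touching the list itself.
std::string streamTypeName(StreamType T) {
  switch (T) {
#define HANDLE_STREAM(NAME, VALUE)                                             \
  case StreamType::NAME:                                                       \
    return #NAME;
    EXAMPLE_STREAM_TYPES(HANDLE_STREAM)
#undef HANDLE_STREAM
  }
  return "Unknown";
}
```

The point is that the enum and the name table cannot drift apart, because both are generated from the single list.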

> Discussion topic #4: Overlap with "process plugin dump". This tool has
> some overlap with the given command for minidump files, which also
> provides a textual description of minidump files. In case we are ok with
> tweaking the interface of that command slightly (and ok with some yaml
> artefacts in its output), it should be possible to reimplement that
> command on top of the yaml serialization library.
>
> Discussion topic #5: Anything else I haven't thought of.
>
> regards,
> pavel
>
> (*) This is not entirely true for MachO core files, where yaml2obj is
> already able to convert the core files into text form. However, it is
> definitely true for ELF and minidump core files, and even the MachO yaml
> form isn't that well suited for manual viewing or reduction.
> _______________________________________________
> lldb-dev mailing list
> lldb-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev
>