[Lldb-commits] [PATCH] D55434: ObjectFileBreakpad: Implement sections

Fri Jan 4 01:57:06 PST 2019

labath added a comment.

I was under the impression that I had convinced you of this direction, but your questions make it sound like you're going back to the "standalone" breakpad file idea (which I am not fond of). I'll try to explain again what I'm doing here. This is going to be somewhat repetitive (for which I apologise), but I am trying to explain this from a slightly different angle this time.

The sections I'm creating here aren't the kind of sections that will be loaded in memory. They're non-loadable sections (like the sections without the SHF_ALLOC flag in ELF), whose only purpose is to carry data around, similar to how .debug_info is a non-loadable section that carries DWARF data. In this code, I'm not trying to infer anything about the layout of the described executable file from the data in the breakpad file. I am only presenting a view **of** the data in the breakpad file, so that the connection to the real object file can happen in SymbolFileBreakpad. So, ObjectFileBreakpad will create a section whose contents will be _literally_:

  PUBLIC 1010 0 function1
  PUBLIC 1020 0 function2
  ...

Then, SymbolFileBreakpad will take this section, parse it (like SymbolFileDWARF parses .debug_info), cross-reference the information with the real object, and create appropriate symbols. This happens in D56173 <https://reviews.llvm.org/D56173>. I am currently working on other patches which take the line records from the breakpad file and create line tables. So here, ObjectFileBreakpad will provide a "FUNC" section (because in breakpad files line records are attached to the preceding FUNC record), similar to how ObjectFileELF provides .debug_line. Then SymbolFileBreakpad parses and presents it to LLDB (like SymbolFileDWARF parses .debug_line).
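To make the division of labour concrete, here is a minimal sketch of the parsing side, assuming the documented `PUBLIC [m] address parameter_size name` record layout with hexadecimal numbers. The names `PublicRecord` and `parse_public_section` are hypothetical and are not taken from this patch or from D56173; the real SymbolFileBreakpad code is C++ and may well look different.

```python
from typing import List, NamedTuple


class PublicRecord(NamedTuple):
    """One PUBLIC record (hypothetical helper type, not an LLDB class)."""
    address: int      # module-relative address, hex in the file
    param_size: int   # size of the parameters on the stack, hex in the file
    name: str         # symbol name; may contain spaces


def parse_public_section(text: str) -> List[PublicRecord]:
    """Parse the literal contents of a section made of PUBLIC records."""
    records = []
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        if len(parts) < 2 or parts[0] != "PUBLIC":
            continue  # not a PUBLIC record, ignore it in this sketch
        rest = parts[1]
        if rest.startswith("m "):
            rest = rest[2:]  # optional "m" marker, not interpreted here
        fields = rest.split(None, 2)
        if len(fields) != 3:
            continue  # malformed record
        records.append(
            PublicRecord(int(fields[0], 16), int(fields[1], 16), fields[2]))
    return records


# The section contents from the example above, taken verbatim.
section = "PUBLIC 1010 0 function1\nPUBLIC 1020 0 function2\n"
symbols = parse_public_section(section)
```

The point of the sketch is only that the section is an opaque blob of text as far as the object file is concerned; everything that gives the records meaning happens in the parser.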

In this sense, a breakpad file should be similar to a symbol-only ELF file (the kind you produce with `strip --only-keep-debug`) -- this one also doesn't contain any loadable sections, and is merely a container for the symbol data.

When I speak about "discontinuity" in this patch, it means discontinuity in the descriptions themselves, not in the data being described. So a breakpad file like:

  FUNC 1000 10 0 function1
  FILE 0 /tmp/foo.c
  FUNC 1010 10 0 function2

is discontinuous because the two FUNC records are not next to each other, even though the functions themselves are positioned one after the other. (I don't know why anyone would produce files like these, but the breakpad format description https://chromium.googlesource.com/breakpad/breakpad/+/master/docs/symbol_files.md explicitly allows that.)
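A rough sketch of what "discontinuity" means here, assuming sections are formed from runs of consecutive records of the same type and that lines not starting with a record keyword (such as line records) stay with the preceding record. The function `group_records` and the keyword set are illustrative assumptions, not the code in this patch.

```python
from typing import List, Tuple

# Record keywords assumed for this sketch (from the breakpad format docs).
KEYWORDS = {"MODULE", "INFO", "FILE", "FUNC", "PUBLIC", "STACK"}


def group_records(text: str) -> List[Tuple[str, List[str]]]:
    """Group consecutive records of the same type into (type, lines) runs."""
    sections: List[Tuple[str, List[str]]] = []
    for line in text.splitlines():
        if not line.strip():
            continue
        token = line.split(None, 1)[0]
        if token not in KEYWORDS:
            # No keyword (e.g. a line record): keep it with the previous run.
            if sections:
                sections[-1][1].append(line)
            continue
        if sections and sections[-1][0] == token:
            sections[-1][1].append(line)      # same type, extend the run
        else:
            sections.append((token, [line]))  # type changed, start a new run
    return sections


example = (
    "FUNC 1000 10 0 function1\n"
    "FILE 0 /tmp/foo.c\n"
    "FUNC 1010 10 0 function2\n"
)
kinds = [kind for kind, _ in group_records(example)]
```

For the file above, `kinds` comes out as FUNC, FILE, FUNC: the two FUNC records end up in separate runs even though the functions they describe are adjacent in the address space.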

Also note that neither ObjectFileBreakpad nor SymbolFileBreakpad creates any loadable sections. I don't think that is necessary, as that can be done elsewhere (and better). I just use whatever sections are present in the main object file of the module. In practice this will either be a real loadable object file (elf/macho/coff), or a placeholder object file that is created when opening a minidump file. I think this makes sense for several reasons:

- determining the limits of the loadable section from the breakpad info is hard. There will always be some loaded data (various file headers, etc.) before the first symbol described by the breakpad file. And we also cannot be sure of the upper limit of the section if the last symbol is a PUBLIC symbol (as they don't have a size). On the other hand, creating this from the minidump info is easy, as it knows the exact ranges (coming from /proc/pid/maps and similar).
- better composability: having ObjectFileBreakpad be standalone will not allow us to get rid of the placeholder object files, as those will still be needed in cases when we don't have even the breakpad info. So we would need two branches in ObjectFileBreakpad (for when we have an object file vs. when we don't), and then placeholder files on top of that. Making breakpad files not be standalone lets us get rid of one of the branches in ObjectFileBreakpad.
- Overall, I think object files should be as self-contained as possible. They should not infer anything based on the information in other object files or elsewhere. They should just present the data that's present in the file itself and nothing more. Combining data from multiple files should be done at a different level.
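The first point can be illustrated with a small sketch. It assumes each symbol is reduced to an (address, size) pair, with a size of `None` standing in for a PUBLIC record; `guess_bounds` and the sample values are hypothetical and only meant to show why the guess is unreliable.

```python
from typing import List, Optional, Tuple

Symbol = Tuple[int, Optional[int]]  # (address, size); size is None for PUBLIC


def guess_bounds(symbols: List[Symbol]) -> Tuple[int, Optional[int]]:
    """Best-effort guess of a loadable range from symbol records alone."""
    ordered = sorted(symbols, key=lambda s: s[0])
    # Lower bound: the first described symbol. Anything loaded before it
    # (file headers and the like) is invisible from here.
    lower = ordered[0][0]
    # Upper bound: only known if the last symbol carries a size.
    last_address, last_size = ordered[-1]
    upper = None if last_size is None else last_address + last_size
    return lower, upper


# Last symbol is a FUNC with a size: an upper bound can be computed.
with_func_last = guess_bounds([(0x1000, 0x10), (0x1010, 0x10)])
# Last symbol is a PUBLIC without a size: the upper bound is unknown.
with_public_last = guess_bounds([(0x1000, 0x10), (0x1010, None)])
```

Even in the favourable case the lower bound is only the address of the first symbol, which is why taking the ranges from the minidump is the more reliable source.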

If this still hasn't convinced you :), and you think the standalone breakpad file is the better way to go, then I'd like to understand what advantages you see there, because right now I don't see any. In previous comments you were worried about being able to use a breakpad file without the matching exe file. If that isn't clear from the above, then I can reiterate that I do intend to support that flow. In fact, it is my primary use case. The only difference is that I intend to achieve it via a combination of placeholder object files plus symbol information from breakpad files, rather than breakpad (object) files alone.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55434/new/

https://reviews.llvm.org/D55434