[lldb-dev] New DWARF parser proposal
Richard Mitton
richard at codersnotes.com
Fri Aug 30 13:27:18 PDT 2013
DWARF parsing is currently very slow. On my machine, loading the 'clang'
binary into lldb takes 14 seconds (vs. gdb's 29 seconds). The actual I/O
cost of reading that much data is only around 2 seconds.
The DWARF parser has to read the entire .debug_info section into memory
and parse it. It would be great if we didn't have to do this. OS X has
the .apple_names section et al., which give lldb an index automatically
without having to parse anything.
However, this is an Apple extension and does not exist on other
platforms. There are a bunch of accelerator tables that the DWARF spec
allows for, but they're all unusable. pubnames is either absent or
incomplete, aranges might not be present, and if they were generated by
the compiler rather than the linker, then they may not cover all the
object files in the binary (causing lldb to miss symbols). So we end up
in the situation we have now, where we cannot use or trust the
accelerators, and have to parse everything anyway to build our own index.
I believe lldb does this the wrong way. For even a simple "b main"
test, it touches the entire debug_info section *4 times* (at least),
which in the clang binary example is 600MB of data each time:
- pass 1 (extract DIEs)
- pass 2 (index DIEs)
- pass 3 (extract DIEs)
- pass 4 (build arange tables)
I believe the key problems of the current design are:
1) lldb tries to build its own copy of the DIE array, rather than just
referring to the existing data in-place. This adds significant
management overhead to every function accessing this data.
2) lldb goes to great lengths to avoid reading the entire debug
information up front (even though it will ultimately need to anyway) and
to avoid keeping it in memory. This in fact causes it to *reload* the
data several times, as each further operation performs lazy
initialization and triggers a re-parse.
If we just accepted that we are forced to load all the data once, it
would actually be faster. My suggestion therefore is to write an
optimized single-pass DWARF indexer to replace the current DWARF loader,
with the following properties:
- Always read the debug info exactly once upon module load (unless we
can guarantee the Apple extensions are present).
- Use the entire debug_info section in-place, without trying to build a
copy. Not having separate stages for extraction and indexing will allow
efficient data traversal.
- Make use of the abbreviation tables to pre-build decoder structures
for each DWARF tag type. In most cases we know the size of each DIE the
moment we read its abbreviation code, and can skip it in one operation
if needed without having to parse its attributes. Because we run in one
pass, we never even have to look at DIEs we don't need. (A rough sketch
of this follows the list.)
- Track the parent scope as we go, on a stack, so we don't have to keep
doing lookups that walk the DIE tree. The current parser walks up the
tree to find what scope it's in, even though it already parsed the
parent scope container. (Also sketched below.)
- Build arange tables automatically as we go, ignoring any that might
already be present. Since we have already touched and extracted the
range data anyway, it would be trivial to build an accelerator table
essentially for free.
- For strings, we should pre-pool the DWARF string table once up front,
to avoid repeatedly pooling strings for each DIE. (Sketched below too.)
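
To make the abbreviation point concrete, here's a rough sketch of the
kind of decoder I have in mind. Names like AbbrevDecoder and
SkipFormValue are made up for illustration, not existing lldb code:

    // Built once per abbreviation declaration when the CU's abbrev table
    // is read. If every attribute form has a fixed encoded size for this
    // CU (DW_FORM_data4, DW_FORM_addr, DW_FORM_ref4, ...), we store the
    // total up front; otherwise fixed_size stays -1.
    struct AbbrevDecoder {
      uint16_t tag;            // DW_TAG_* of DIEs using this abbreviation
      bool     has_children;
      int32_t  fixed_size;     // total attribute bytes, or -1 if variable
      std::vector<std::pair<uint16_t, uint16_t> > attrs; // (DW_AT, DW_FORM)
    };

    // Skip one DIE in-place. 'offset' points just past its ULEB128 abbrev
    // code. In the common case this is a single add, with no attribute
    // parsing at all.
    uint64_t SkipDIE(const AbbrevDecoder &d, const uint8_t *debug_info,
                     uint64_t offset) {
      if (d.fixed_size >= 0)
        return offset + d.fixed_size;
      for (size_t i = 0; i < d.attrs.size(); ++i)
        offset = SkipFormValue(debug_info, offset, d.attrs[i].second);
      return offset;
    }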
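
The single-pass walk itself would then look roughly like this (again
purely illustrative; ReadULEB128, IndexDIE and 'decoders' stand in for
whatever we actually end up using). Because children immediately follow
their parent in debug_info, a plain stack always tells us the enclosing
scope, and the same loop can record address ranges for the arange table:

    // One forward pass over a compile unit's DIE data, [offset, end).
    void IndexCU(const uint8_t *debug_info, uint64_t offset, uint64_t end) {
      std::vector<uint64_t> scope_stack;   // offsets of enclosing DIEs
      while (offset < end) {
        const uint64_t die_offset = offset;
        const uint64_t abbrev_code = ReadULEB128(debug_info, &offset);
        if (abbrev_code == 0) {            // end-of-children marker
          if (!scope_stack.empty())
            scope_stack.pop_back();
          continue;
        }
        const AbbrevDecoder &d = decoders[abbrev_code];
        const uint64_t parent = scope_stack.empty() ? 0 : scope_stack.back();
        // Index by name, record low_pc/high_pc for the arange table, etc.,
        // pulling only the attributes this tag needs.
        IndexDIE(die_offset, d, parent);
        offset = SkipDIE(d, debug_info, offset);
        if (d.has_children)
          scope_stack.push_back(die_offset);
      }
    }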
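
And the string pre-pooling is just a single walk over .debug_str.
ConstString is lldb's existing interned-string class; the offset map
here is only a sketch of the idea:

    // Intern every .debug_str entry once, keyed by its section offset, so
    // that a DW_FORM_strp attribute later becomes one map lookup instead
    // of a strlen + pool insertion per DIE.
    std::map<uint64_t, ConstString> PoolDebugStr(const char *str_data,
                                                 uint64_t str_size) {
      std::map<uint64_t, ConstString> pool;
      uint64_t offset = 0;
      while (offset < str_size) {
        const char *s = str_data + offset;
        pool.insert(std::make_pair(offset, ConstString(s)));
        offset += strlen(s) + 1;           // step past the terminating NUL
      }
      return pool;
    }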
With this approach we use the DIE data as-is in memory, without having
to make our own copy.
Parent chains should ideally only be used during parsing. If
parents/siblings are really needed after the initial parse, one easy
solution would be to just store that in a separate hash table.
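
Something as simple as this would do, filled in from the same scope
stack during the single pass (again just a sketch):

    // DIE offset -> parent DIE offset, filled in only for the DIEs we end
    // up indexing (or for all of them, if the memory cost turns out to be
    // acceptable). Afterwards, finding a parent is one hash lookup rather
    // than a walk back through the DIE data.
    llvm::DenseMap<uint64_t, uint64_t> m_die_to_parent;

    // Inside the indexing loop above:
    //   m_die_to_parent[die_offset] = parent;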
I welcome discussion on this. I think it's important for lldb not to
have any delays when loading programs, and since we cannot control what
the compilers will supply to us, we have to address this on our end.
--
Richard Mitton
richard at codersnotes.com