[lldb-dev] Adding DWARF5 accelerator table support to llvm

Wed Jun 13 06:56:32 PDT 2018

Hello again,

It's been nearly six months since my first email, so it's a good time
to recap what has been done here so far. I am happy to report that
stages 1-3 (i.e. producer/consumer in llvm and integration with lldb)
of my original plan are now complete with one caveat.

The caveat is that the .debug_names section is presently not a full
drop-in replacement for the .apple_*** sections. The reason for that
is that there is no equivalent to the .apple_objc section (which links
an objc class/category name to all of its methods). I did not
implement that, because I do not see a way to embed that kind of
information in this section without some sort of an extension. Given
that this was not required for my use case, I felt it would be best to
leave this to the people working on objc support (*looks at Jonas*) to
work out the details of how to represent that.

Nonetheless, I believe that the emitted .debug_names section contains
all the data that is required by the standard, and it is sufficient to
pass all tests in the lldb integration test suite on linux (this
doesn't include objc tests). Simple benchmarks also show a large
performance improvement. I have some numbers to illustrate that
(measurements taken by using a release build of lldb to debug a debug
build of clang, clang was built with -mllvm -accel-tables=Dwarf to
enable the accelerator generation, usage of the tables was controlled
by a setting in lldb):
- setting a breakpoint on a non-existing function without the use of
accelerator tables:
real    0m5.554s
user    0m43.764s
sys     0m6.748s
(The majority of this time is spent on building a debug info index,
which is a one-shot thing. Subsequent breakpoints would be fast.)

- setting a breakpoint on a non-existing function with accelerator tables:
real    0m3.517s
user    0m3.136s
sys     0m0.376s
(With the index already present, we are able to quickly determine that
there is no match and finish)

- setting a breakpoint on all "dump" functions without the use of
accelerator tables:
real    0m21.544s
user    0m59.588s
sys     0m6.796s
(Apart from building the index, now we must also parse a bunch of
compile units and line tables to resolve the breakpoint locations)

- setting a breakpoint on all "dump" functions with accelerator tables:
real    0m23.644s
user    0m22.692s
sys     0m0.948s
(Here we see that this extra work is actually the bottleneck now.
Preliminary analysis shows that the majority of this time is spent
inserting line table entries into the middle of a vector, which means
it should be possible to fix this with a smarter implementation.)
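
To illustrate why inserting into the middle of a vector is costly, here
is a minimal sketch. It is not lldb's actual line table code: LineEntry,
InsertSorted and BuildSorted are hypothetical names made up for this
example. Keeping the vector sorted on every insertion can shift O(n)
elements per call, while appending everything and sorting once is
O(n log n) overall.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a line table row; not lldb's real type.
struct LineEntry {
  std::uint64_t address;
  std::uint32_t line;
};

inline bool ByAddress(const LineEntry &a, const LineEntry &b) {
  return a.address < b.address;
}

// Keeps the table sorted after every insertion. Each call may shift
// O(n) elements, so n out-of-order insertions cost O(n^2) in total.
void InsertSorted(std::vector<LineEntry> &table, LineEntry entry) {
  auto pos = std::upper_bound(table.begin(), table.end(), entry, ByAddress);
  table.insert(pos, entry);
}

// Takes all entries at once and sorts them in a single pass:
// O(n log n) in total. stable_sort keeps the input order of entries
// with equal addresses, which matches what InsertSorted produces.
std::vector<LineEntry> BuildSorted(std::vector<LineEntry> entries) {
  std::stable_sort(entries.begin(), entries.end(), ByAddress);
  return entries;
}
```

Both functions produce the same ordering for the same input sequence,
so the batch variant could in principle replace the incremental one
wherever all entries are known before the table is first queried.
Whether that holds for lldb's line table construction is an assumption
I have not verified.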

As far as object file sizes go, in the resulting clang binary (2.3GB),
the new .debug_names section takes up about 160MB (7%), which isn't
negligible, but considering that it supersedes the
.debug_pubnames/.debug_pubtypes tables whose combined size is 490MB
(21% of the binary), switching to this table (and dropping the other
two) will have a positive impact on the binary size. Further
reductions can be made by merging the individual indexes into one
large index as a part of the link step (which will also increase
debugger speed), but it's hard to quantify the exact impact of that.

With all of this in mind, I'd like to encourage you to give the new
tables a try. All you need to do is pass -mllvm -accel-tables=Dwarf to
clang while building your project. lldb should use the generated
tables automatically. I'm particularly interested in the interop
scenario. I've checked that readelf is able to make sense of the
generated tables, but if you have any other producer/consumer of these
tables which is independent of llvm, I'd like to know whether we are
compatible with it.

I'd also like to make the new functionality more easily accessible to
users. I am not sure what our policy here is, but I was thinking of
either including this functionality in -glldb (on non-apple targets),
or adding a separate -g flag for it (-gdebug-names-section?), with
the goal of eventual inclusion into -glldb. I exclude apple targets
because: a) they already have a thing that works and the lack of
.apple_objc would be a pessimization; b) the different debug info
distribution model means it requires more testing and code (dsymutil).
For other targets this should bring a big performance improvement when
debugging with lldb. The lack of .apple_objc will mean lldb will have
to either index the objc compile units manually, or implement a more
complicated lookup using other information in the section. However,
Objective C is not that widespread outside of apple platforms, so the
impact of this should be minimal.

What do you think?

regards,
pavel