[llvm-dev] Increasing address pool reuse/reducing .o file size in DWARFv5

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Mon Dec 30 12:08:03 PST 2019

tl;dr: in DWARFv5, using DW_AT_ranges even when the range is contiguous
reduces linked, uncompressed debug_addr size for optimized builds by 93%
and reduces total .o file size (with compression and split) by 15%. It does
grow .dwo file size a bit. DWARFv5 with no compression and no split shows
the net effect if all bytes are counted equally: the -O3 clang binary grows
by 0.4%, and the -O0 clang binary shrinks by 0.1%.
Should we enable this strategy by default for DWARFv5, for DWARFv5+Split
DWARF, or not by default at all/only under a flag?

So, I've brought this up a few times before: DWARFv5 does a pretty
good job of reducing relocations (& reducing .o file size with Split DWARF)
by allowing many uses of addresses to be expressed as address+offset.
debug_rnglists and debug_loclists allow a "base_address" entry followed by
offset_pairs, an improvement over similar functionality in DWARFv4 because
the offset pairs can be uleb encoded, so they can be quite compact.
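
As a rough illustration of why the base address + offset_pair encoding is
compact, here is a minimal Python sketch that encodes a one-range list. The
DW_RLE_* values are the DWARFv5 entry kinds; the helper names (uleb128,
encode_rnglist) are made up for this example and are not LLVM APIs.

```python
# DWARFv5 range list entry kinds used in this sketch.
DW_RLE_end_of_list = 0x00
DW_RLE_base_addressx = 0x01
DW_RLE_offset_pair = 0x04


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_rnglist(base_index: int, ranges) -> bytes:
    """Hypothetical helper: one base address (an index into debug_addr),
    then each range as a pair of ULEB128 offsets from that base."""
    out = bytearray([DW_RLE_base_addressx])
    out += uleb128(base_index)
    for start, end in ranges:
        out.append(DW_RLE_offset_pair)
        out += uleb128(start)
        out += uleb128(end)
    out.append(DW_RLE_end_of_list)
    return bytes(out)


encoded = encode_rnglist(0, [(0x10, 0x40)])
print(len(encoded))  # 6
```

The only address in that list is the debug_addr index; the range itself is
two small offsets, so no further relocation is needed for it.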

But one place where DWARFv5 misses a chance to reduce relocations further is
direct addresses from debug_info, such as DW_AT_low_pc.

For a while I've wondered if we could use an extension form for
addr+offset, and I prototyped this without an extension attribute, but
instead using exprloc. This has slightly higher overhead to express the...
expression. (it's 9 bytes in total, could be as few as 5 with a custom form)

But I had another idea that's more instantly deployable: Why not use
DW_AT_ranges even when the range is contiguous? That way the low_pc that
previously couldn't use an existing address pool entry + offset, could use
the rnglist support for base address.
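
To make the effect on the address pool concrete, here is a toy Python model.
It is only a sketch: the Scope type, the section names and the offsets are
invented, and it assumes each range list uses the start of the containing
section as its base address.

```python
from typing import NamedTuple


class Scope(NamedTuple):
    name: str
    section: str  # section whose start address is in the address pool
    start: int    # offset of the scope's first byte from the section start
    size: int


def pool_with_low_pc(scopes):
    """DW_AT_low_pc per scope: one pool entry per distinct start address."""
    return {(s.section, s.start) for s in scopes}


def pool_with_ranges(scopes):
    """DW_AT_ranges per scope: only the section start is an address;
    each scope is then described by offsets from that base."""
    return {(s.section, 0) for s in scopes}


scopes = [
    Scope("main", ".text.main", 0, 120),
    Scope("inlined helper", ".text.main", 16, 24),
    Scope("lexical block", ".text.main", 48, 40),
    Scope("other", ".text.other", 0, 64),
]

print(len(pool_with_low_pc(scopes)))  # 4
print(len(pool_with_ranges(scopes)))  # 2
```

In this model the saving comes from nested scopes (inlined subroutines,
lexical blocks) that start at a non-zero offset inside a function, which is
why it would matter most for optimized builds.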

The only unnecessary address pool entries that remain that I've found are
DW_AT_low_pc for DW_TAG_labels - but there's only a handful of those in
most code. So the "ranges everywhere" strategy gets the address count for
optimized clang down from 4758 (the v4 address pool used 9923 addresses...)
to 342, with about 4 "extra" addresses for DW_TAG_labels.
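
A quick arithmetic check of those counts, assuming they are the numbers
behind the 93% figure in the tl;dr (the reduction helper is just for this
example):

```python
def reduction(before: int, after: int) -> float:
    """Fractional reduction when going from `before` entries to `after`."""
    return 1.0 - after / before


v4_addresses = 9923            # DWARFv4 address pool
v5_addresses = 4758            # DWARFv5 without ranges-always
v5_ranges_always = 342         # DWARFv5 with ranges-always

print(f"{reduction(v5_addresses, v5_ranges_always):.1%}")  # 92.8%
print(f"{reduction(v4_addresses, v5_addresses):.1%}")      # 52.1%
```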

This could also be a bit less costly if DWARFv5 rnglists didn't use a
separate offset table (instead encoding the offsets directly in debug_info,
rather than using indexes)

I have patches for both the addr+offset exprloc and for the ranges-always,
both with -mllvm flags - do people think they're both worth committing for
experimentation? Neither? Default on in some cases (like Split DWARF)?

- Dave