[llvm-dev] Increasing address pool reuse/reducing .o file size in DWARFv5

Adrian Prantl via llvm-dev llvm-dev at lists.llvm.org
Wed Jan 8 10:49:06 PST 2020


I think this sounds like a good plan for Linux. I would like to see the numbers for Darwin (i.e. non-split DWARF) before deciding whether we should just make it the default. Eric's suggestion of having this committed as an option first seems like a good step in that direction. If it is an advantage across the board, we can remove the option and just make this the default behavior.

thanks,
adrian

> On Dec 30, 2019, at 12:08 PM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> tl;dr: in DWARFv5, using DW_AT_ranges even when the range is contiguous reduces linked, uncompressed debug_addr size for optimized builds by 93% and reduces total .o file size (with compression and Split DWARF) by 15%. It does grow .dwo file size a bit. DWARFv5 with no compression and no Split DWARF shows the net effect if all bytes are counted equally: the -O3 clang binary grows by 0.4%, the -O0 clang binary shrinks by 0.1%.
> Should we enable this strategy by default for DWARFv5, for DWARFv5+Split DWARF, or not by default at all/only under a flag?
> 
> 
> 
> So, I've brought this up a few times before: DWARFv5 does a pretty good job of reducing relocations (& reducing .o file size with Split DWARF) by allowing many uses of addresses to be expressed as some kind of address+offset. debug_rnglists and debug_loclists allow a "base_address" entry followed by offset_pairs, which is an improvement over similar functionality in DWARFv4 because the offset pairs can be ULEB128 encoded, so they can be quite compact.
> 
> But one place where DWARFv5 misses a chance to reduce relocations further is direct addresses from debug_info, such as DW_AT_low_pc.
> 
> For a while I've wondered if we could use an extension form for addr+offset. I prototyped this without an extension, using exprloc instead. That has slightly higher overhead to express the expression: it's 9 bytes in total, and could be as few as 5 with a custom form.
> 
> But I had another idea that's more immediately deployable: why not use DW_AT_ranges even when the range is contiguous? That way a low_pc that previously couldn't be expressed as an existing address pool entry + offset can use the rnglist support for base addresses.
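
To make the quoted idea concrete, here is a minimal sketch (not taken from either patch) of how a single contiguous range could be written as a DWARFv5 range list: one base_addressx entry naming an existing address pool index, one offset_pair with ULEB128 offsets, and a terminator. The DW_RLE_* values are the DWARFv5 encodings; the helper names `uleb128` and `contiguous_rnglist` and the sample offsets are assumptions made up for illustration.

```python
# Range list entry kinds as assigned by DWARFv5.
DW_RLE_end_of_list = 0x00
DW_RLE_base_addressx = 0x01
DW_RLE_offset_pair = 0x04


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def contiguous_rnglist(base_index: int, start_off: int, end_off: int) -> bytes:
    """Encode one contiguous range [base+start_off, base+end_off).

    base_index is an index into the address pool (debug_addr); the range
    itself needs no pool entry and no relocation of its own.
    """
    return (
        bytes([DW_RLE_base_addressx]) + uleb128(base_index)
        + bytes([DW_RLE_offset_pair]) + uleb128(start_off) + uleb128(end_off)
        + bytes([DW_RLE_end_of_list])
    )


# Hypothetical function at offsets 0x40..0x9c from pool entry 0.
encoded = contiguous_rnglist(0, 0x40, 0x9C)
print(encoded.hex(), len(encoded))
```

The list body here is 7 bytes, and in DWARFv5 the DIE additionally refers to it through the rnglists offset table, which is part of the .dwo growth mentioned in the tl;dr. What it buys is that the function's start address no longer needs its own address pool entry.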
> 
> The only unnecessary address pool entries I've found that remain are the DW_AT_low_pc of DW_TAG_labels, but there's only a handful of those in most code. So the "ranges everywhere" strategy gets the address count for optimized clang down from 4758 (the v4 address pool used 9923 addresses) to 342, with about 4 "extra" addresses for DW_TAG_labels.
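
A toy model of why the address count drops, offered as a sketch under stated assumptions rather than a description of the patch: assume every function's DW_AT_low_pc needs its own address pool entry, while with ranges everywhere each section contributes a single base address that all of its functions share. The `Func` type, the section names, and the one-base-per-section rule are all hypothetical.

```python
from typing import List, NamedTuple


class Func(NamedTuple):
    section: str  # section the function is emitted into
    start: int    # offset of the function within that section
    size: int     # length of the function in bytes


def pool_entries_low_pc(funcs: List[Func]) -> int:
    """Each distinct function start address costs one address pool entry."""
    return len({(f.section, f.start) for f in funcs})


def pool_entries_ranges_always(funcs: List[Func]) -> int:
    """Assumption: one shared base address per section; ranges are offsets."""
    return len({f.section for f in funcs})


funcs = [
    Func(".text", 0, 16),
    Func(".text", 16, 32),
    Func(".text", 48, 8),
    Func(".text.startup", 0, 4),
]
print(pool_entries_low_pc(funcs), pool_entries_ranges_always(funcs))
```

In this model the pool size scales with the number of sections instead of the number of functions, which is the shape of the 4758 to 342 reduction reported above. The real numbers depend on how the compiler picks base addresses.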
> 
> This could also be a bit less costly if DWARFv5 rnglists didn't use a separate offset table, and instead encoded the offsets directly in debug_info rather than using indexes.
> 
> I have patches for both the addr+offset exprloc and for the ranges-always, both with -mllvm flags - do people think they're both worth committing for experimentation? Neither? Default on in some cases (like Split DWARF)?
> 
> Thanks,
> - Dave


