[llvm-dev] Increasing address pool reuse/reducing .o file size in DWARFv5

Robinson, Paul
Wed Jan 8 13:26:55 PST 2020

On some previous occasion that introduced additional indirection
(I don't remember the details), my debugger people groused about the
additional performance cost of chasing down data in a different
object-file section.  So we (Sony) might be happier with low_pc as
expressions than with a ranges-always solution.

But it's hard to say without data, and getting both modes in, at least
as a temporary thing, sounds like a good plan.

> -----Original Message-----
> From: aprantl at apple.com
> Sent: Wednesday, January 8, 2020 1:49 PM
> To: David Blaikie <dblaikie at gmail.com>
> Cc: llvm-dev <llvm-dev at lists.llvm.org>; Jonas Devlieghere
> <jdevlieghere at apple.com>; Robinson, Paul <paul.robinson at sony.com>; Eric
> Christopher <echristo at gmail.com>; Frederic Riss <friss at apple.com>
> Subject: Re: Increasing address pool reuse/reducing .o file size in
> DWARFv5
>
> I think this sounds like a good plan for Linux. I would like to see the
> numbers for Darwin (= non-split DWARF) to decide whether we should just
> make that the default. Eric's suggestion of having this committed as an
> option first seems like a good step in that direction. If it is an
> advantage across the board we can remove the option and just make this the
> default behavior.
> thanks,
> adrian
> > On Dec 30, 2019, at 12:08 PM, David Blaikie <dblaikie at gmail.com> wrote:
> >
> > tl;dr: in DWARFv5, using DW_AT_ranges even when the range is contiguous
> reduces linked, uncompressed debug_addr size for optimized builds by 93%
> and reduces total .o file size (with compression and split) by 15%. It
> does grow .dwo file size a bit - DWARFv5, no compression, not split shows
> the net effect if all bytes are equal: -O3 clang binary grows by 0.4%, -O0
> clang binary shrinks by 0.1%
> > Should we enable this strategy by default for DWARFv5, for DWARFv5+Split
> DWARF, or not by default at all/only under a flag?
> >
> >
> >
> > So, I've brought this up a few times before - that DWARFv5 does a pretty
> good job of reducing relocations (& reducing .o file size with Split
> DWARF) by allowing many uses of addresses to include some kind of
> address+offset (debug_rnglists and loclists allowing "base_address" then
> offset_pairs (an improvement over similar functionality in DWARFv4 because
> the offset pairs can be uleb encoded - so they can be quite compact))
> >
> > But one place where DWARFv5 misses a chance to reduce relocations further
> is direct addresses from debug_info, such as DW_AT_low_pc.
> >
> > For a while I've wondered if we could use an extension form for
> addr+offset, and I prototyped this without an extension attribute, but
> instead using exprloc. This has slightly higher overhead to express the...
> expression. (it's 9 bytes in total, could be as few as 5 with a custom
> form)
> >
> > But I had another idea that's more instantly deployable: Why not use
> DW_AT_ranges even when the range is contiguous? That way the low_pc that
> previously couldn't use an existing address pool entry + offset, could use
> the rnglist support for base address.
> >
> > The only unnecessary address pool entries that remain that I've found
> are DW_AT_low_pc for DW_TAG_labels - but there's only a handful of those
> in most code. So the "ranges everywhere" strategy gets the addresses for
> optimized clang down from 4758 (v4 address pool used 9923 addresses... )
> to 342, with about 4 "extra" addresses for DW_TAG_labels.
> >
> > This could also be a bit less costly if DWARFv5 rnglists didn't use a
> separate offset table (instead encoding the offsets directly in
> debug_info, rather than using indexes)
> >
> > I have patches for both the addr+offset exprloc and for the ranges-
> always, both with -mllvm flags - do people think they're both worth
> committing for experimentation? Neither? Default on in some cases (like
> Split DWARF)?
> >
> > Thanks,
> > - Dave
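
For readers following along, the "base_address then offset_pairs" scheme
described in the quoted message can be sketched at the byte level. This is
a minimal illustration, not LLVM's emitter: the DW_RLE_* values are the
DWARFv5 range list entry kinds, while the helper names, the pool index, and
the offsets are hypothetical and chosen only for the example.

```python
# Illustrative only: one contiguous range written as a DWARFv5 range list
# that reuses an address pool entry as its base address.

DW_RLE_end_of_list = 0x00
DW_RLE_base_addressx = 0x01
DW_RLE_offset_pair = 0x04


def uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def contiguous_rnglist(base_index: int, start_off: int, end_off: int) -> bytes:
    """Build base_addressx, one offset_pair, then end_of_list."""
    return (
        bytes([DW_RLE_base_addressx]) + uleb128(base_index)
        + bytes([DW_RLE_offset_pair]) + uleb128(start_off) + uleb128(end_off)
        + bytes([DW_RLE_end_of_list])
    )


# Hypothetical function at [base + 0x40, base + 0x9c), where the base is
# address pool entry 0. No new pool entry is needed for the function itself.
entry = contiguous_rnglist(base_index=0, start_off=0x40, end_off=0x9C)
print(entry.hex(), len(entry))
```

Because the offsets are ULEB128, small functions near the base cost one or
two bytes per bound; the figure here says nothing about the extra offset
table that rnglists also carry, which the message mentions as a cost.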

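The address pool effect of "ranges everywhere" can also be modelled with a
toy count. This is a sketch under stated assumptions, not a measurement:
the section layout below is invented, and the model assumes that a direct
DW_AT_low_pc needs one pool entry per distinct address while a range list
needs only one base address per section. The real numbers in the thread
(4758 versus 342) come from the clang build, not from anything like this.

```python
# Toy model, not LLVM's implementation: count distinct address pool entries
# under the two strategies discussed in the thread.

from typing import Dict, List, Tuple

# Hypothetical layout: section name -> [(function name, offset in section)].
Layout = Dict[str, List[Tuple[str, int]]]

LAYOUT: Layout = {
    ".text": [("f", 0x00), ("g", 0x40), ("h", 0x9C)],
    ".text.startup": [("init", 0x00), ("fini", 0x20)],
}


def pool_entries_direct_low_pc(layout: Layout) -> int:
    """Each DW_AT_low_pc names its own address, so every distinct
    (section, offset) pair needs its own pool entry."""
    return len({(sec, off) for sec, funcs in layout.items() for _, off in funcs})


def pool_entries_ranges_always(layout: Layout) -> int:
    """Each function is described as base + offsets, so only one base
    address per non-empty section needs a pool entry."""
    return sum(1 for funcs in layout.values() if funcs)


print(pool_entries_direct_low_pc(LAYOUT))   # one entry per function start
print(pool_entries_ranges_always(LAYOUT))   # one entry per section
```

With one function per section (for example under -ffunction-sections) the
two counts in this model coincide, so the saving depends on how many
functions share a section.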
