[llvm-dev] Increasing address pool reuse/reducing .o file size in DWARFv5

Vedant Kumar via llvm-dev llvm-dev at lists.llvm.org
Fri Jan 10 12:57:01 PST 2020


I don't totally follow the proposed encoding change & would appreciate a small example.

Is the idea to replace e.g. an 'AT_low_pc (<direct address>) + relocation for <direct address>' with an 'AT_low_pc (<indirection into a pool of addresses> + offset)', s.t. the cost of a relocation for the address is paid down the more it's used? How do you figure the offset out?
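
To make the question concrete, here is a toy model of how I read the proposal. The function names and addresses are made up for illustration; none of this is taken from LLVM or from the patches. It assumes the offset is just the distance from a same-section base address, which the assembler could resolve without a relocation:

```python
# Toy model: count address-pool entries needed for function low_pc values.
# All names and addresses are hypothetical; this is not LLVM code.

def pool_direct(func_starts):
    """Each distinct low_pc gets its own pool entry (one relocation each)."""
    pool = []
    for addr in func_starts:
        if addr not in pool:
            pool.append(addr)
    return pool

def pool_base_plus_offset(func_starts, section_base):
    """One pool entry for the section start; every low_pc is then
    encoded as (pool index of the base, offset from the base)."""
    pool = [section_base]
    encoded = [(0, addr - section_base) for addr in func_starts]
    return pool, encoded

section_base = 0x1000
func_starts = [0x1000, 0x1040, 0x10A0, 0x1100]

direct = pool_direct(func_starts)
shared, encoded = pool_base_plus_offset(func_starts, section_base)
print(len(direct), len(shared), encoded)
```

In this model the direct scheme needs four pool entries and the base+offset scheme needs one, so the relocation for the base is amortized over every function in the section.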

thanks,
vedant

> On Jan 8, 2020, at 1:33 PM, David Blaikie via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> Sounds good all round - I'll commit these two modes, and maybe even the third (given Sony's interest & possible interest in changing their consumer to handle it) of a custom form to eke out the last few bytes from the more direct addr+offset encoding.
> 
> I'll follow up here with flag names and revision numbers once they're in.
> 
> On Wed, Jan 8, 2020 at 1:26 PM Robinson, Paul <paul.robinson at sony.com> wrote:
> On some previous occasion that introduced additional indirection
> (don't remember the details) my debugger people groused about the
> additional performance cost of chasing down data in a different 
> object-file section.  So we (Sony) might be happier with low_pc as
> expressions, than with a ranges-always solution.
> 
> But hard to say without data, and getting both modes in at least
> as a temporary thing sounds like a good plan.
> --paulr
> 
> 
> > -----Original Message-----
> > From: aprantl at apple.com <aprantl at apple.com>
> > Sent: Wednesday, January 8, 2020 1:49 PM
> > To: David Blaikie <dblaikie at gmail.com>
> > Cc: llvm-dev <llvm-dev at lists.llvm.org>; Jonas Devlieghere
> > <jdevlieghere at apple.com>; Robinson, Paul <paul.robinson at sony.com>; Eric
> > Christopher <echristo at gmail.com>; Frederic Riss <friss at apple.com>
> > Subject: Re: Increasing address pool reuse/reducing .o file size in
> > DWARFv5
> > 
> > I think this sounds like a good plan for Linux. I would like to see the
> > numbers for Darwin (= non-split DWARF) to decide whether we should just
> > make that the default. Eric's suggestion of having this committed as an
> > option first seems like a good step in that direction. If it is an
> > advantage across the board we can remove the option and just make this the
> > default behavior.
> > 
> > thanks,
> > adrian
> > 
> > > On Dec 30, 2019, at 12:08 PM, David Blaikie <dblaikie at gmail.com> wrote:
> > >
> > > tl;dr: in DWARFv5, using DW_AT_ranges even when the range is contiguous
> > reduces linked, uncompressed debug_addr size for optimized builds by 93%
> > and reduces total .o file size (with compression and split) by 15%. It
> > does grow .dwo file size a bit - DWARFv5, no compression, not split shows
> > the net effect if all bytes are equal: -O3 clang binary grows by 0.4%, -O0
> > clang binary shrinks by 0.1%
> > > Should we enable this strategy by default for DWARFv5, for DWARFv5+Split
> > DWARF, or not by default at all/only under a flag?
> > >
> > >
> > >
> > > So, I've brought this up a few times before - that DWARFv5 does a pretty
> > good job of reducing relocations (& reducing .o file size with Split
> > DWARF) by allowing many uses of addresses to include some kind of
> > address+offset (debug_rnglists and loclists allowing "base_address" then
> > offset_pairs (an improvement over similar functionality in DWARFv4 because
> > the offset pairs can be uleb encoded - so they can be quite compact))
> > >
> > > > But one place where DWARFv5 misses a chance to reduce relocations further is
> > > direct addresses from debug_info, such as DW_AT_low_pc.
> > >
> > > For a while I've wondered if we could use an extension form for
> > addr+offset, and I prototyped this without an extension attribute, but
> > instead using exprloc. This has slightly higher overhead to express the...
> > expression. (it's 9 bytes in total, could be as few as 5 with a custom
> > form)
> > >
> > > But I had another idea that's more instantly deployable: Why not use
> > DW_AT_ranges even when the range is contiguous? That way the low_pc that
> > previously couldn't use an existing address pool entry + offset, could use
> > the rnglist support for base address.
> > >
> > > The only unnecessary address pool entries that remain that I've found
> > are DW_AT_low_pc for DW_TAG_labels - but there's only a handful of those
> > in most code. So the "ranges everywhere" strategy gets the addresses for
> > optimized clang down from 4758 (v4 address pool used 9923 addresses... )
> > > to 342, with about 4 "extra" addresses for DW_TAG_labels.
> > >
> > > This could also be a bit less costly if DWARFv5 rnglists didn't use a
> > separate offset table (instead encoding the offsets directly in
> > debug_info, rather than using indexes)
> > >
> > > I have patches for both the addr+offset exprloc and for the ranges-
> > always, both with -mllvm flags - do people think they're both worth
> > committing for experimentation? Neither? Default on in some cases (like
> > Split DWARF)?
> > >
> > > Thanks,
> > > - Dave
> 
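
For reference, the compactness of the uleb-encoded offset pairs mentioned in the quoted thread can be sketched as follows. This is a generic unsigned LEB128 encoder with illustrative offsets (not values from the clang measurements), and it assumes 8-byte addresses for the comparison:

```python
def uleb128(value):
    """Encode a non-negative integer as unsigned LEB128 bytes."""
    if value < 0:
        raise ValueError("ULEB128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F      # low 7 bits of the remaining value
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)

# Hypothetical offset pair (start, end) relative to a base address.
start_off, end_off = 0x40, 0xA0
pair_size = len(uleb128(start_off)) + len(uleb128(end_off))
fixed_size = 2 * 8  # two full 8-byte addresses (assumed address size)
print(pair_size, fixed_size)
```

Here the offset pair takes 3 bytes against 16 bytes for two fixed-size addresses, before counting the relocations the fixed-size form would also need.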
