[llvm-dev] Increasing address pool reuse/reducing .o file size in DWARFv5

David Blaikie via llvm-dev llvm-dev at lists.llvm.org
Sun Jan 12 11:43:39 PST 2020


On Fri, Jan 10, 2020 at 12:57 PM Vedant Kumar <vedant_kumar at apple.com>
wrote:

> I don't totally follow the proposed encoding change & would appreciate a
> small example.
>
> Is the idea to replace e.g. an 'AT_low_pc (<direct address>) + relocation
> for <direct address>' with an 'AT_low_pc (<indirection into a pool of
> addresses> + offset)',
>

With Split DWARF or with DWARFv5 in LLVM at the moment, all addresses are
indirected already. So it's:

Replace "AT_low_pc (<indirection into a pool of addresses>)" with an
"AT_low_pc (<indirection into a pool of addresses> + offset)".


> s.t. the cost of a relocation for the address is paid down the more it's
> used?
>

Right - specifically to reduce the pool of addresses down to, ideally, one
address per section/indivisible chunk of machine code (per subsection in
MachO, for instance), whereas currently there are many addresses per
section.
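
To make that concrete, here is a rough sketch in Python (not LLVM code; the
function names and the (section, offset) model of a function's low_pc are
made up purely for illustration) that compares how many address pool entries
each scheme needs:

```python
def pool_per_address(functions):
    """Current scheme: each distinct low_pc gets its own address pool entry.

    `functions` is a list of (section, offset) pairs, a hypothetical stand-in
    for each function's start address. Returns (encoded low_pcs, pool size),
    where each encoded low_pc is (pool index, offset) and the offset is 0.
    """
    pool = {}
    encoded = []
    for section, offset in functions:
        index = pool.setdefault((section, offset), len(pool))
        encoded.append((index, 0))
    return encoded, len(pool)


def pool_per_section(functions):
    """Proposed scheme: one pool entry per section, low_pc = (index, offset).

    The pooled address for a section is the lowest low_pc seen in it.
    """
    base = {}
    for section, offset in functions:
        base[section] = min(base.get(section, offset), offset)
    pool = {}
    encoded = []
    for section, offset in functions:
        index = pool.setdefault(section, len(pool))
        encoded.append((index, offset - base[section]))
    return encoded, len(pool)


# Three functions in .text, one in .text.startup (illustrative values only).
funcs = [(".text", 0x10), (".text", 0x40), (".text", 0x90), (".text.startup", 0x0)]
print(pool_per_address(funcs)[1])  # pool entries today
print(pool_per_section(funcs)[1])  # pool entries with addr+offset
```

In this toy model the pool shrinks from one entry per function to one entry
per section, and each pool entry is what costs a relocation in the .o file.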


> How do you figure the offset out?
>

Label difference - same as is done for DW_AT_high_pc today in DWARFv4 and
DWARFv5 in LLVM. high_pc currently uses the low_pc address as the address it
is relative to; in this proposed situation, we'd use a symbol that's in the
first bit of debug info in the section (or subsection in MachO). So the low_pc
of the subprogram/function, for instance, or if there are two functions in the
same section with debug info for both, the low_pc of the first of those
functions, etc.
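
As a sketch of that label-difference arithmetic (again hypothetical Python,
not what the compiler emits: `Func`, `encode_section` and the field names are
invented for this example, and offsets are section-relative values the
assembler would already know without a relocation):

```python
from dataclasses import dataclass


@dataclass
class Func:
    name: str
    begin: int  # section-relative offset of the function's start label
    end: int    # section-relative offset of the function's end label


def encode_section(funcs):
    """Express every function in one section relative to a single base symbol.

    The base is the low_pc of the first function with debug info in the
    section; it is the only address that would need a pool entry.
    """
    ordered = sorted(funcs, key=lambda f: f.begin)
    base = ordered[0]
    encoded = {}
    for f in ordered:
        encoded[f.name] = {
            # Proposed: low_pc as (pooled base address) + label difference.
            "low_pc_offset": f.begin - base.begin,
            # As today: high_pc as a label difference from the function's own low_pc.
            "high_pc_offset": f.end - f.begin,
        }
    return base.name, encoded


base_name, encoded = encode_section([Func("g", 0x40, 0x70), Func("f", 0x10, 0x40)])
print(base_name, encoded)
```

Both differences are between labels in the same section, so they are plain
constants in the object file rather than relocated addresses.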


>
> thanks,
> vedant
>
> On Jan 8, 2020, at 1:33 PM, David Blaikie via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
> Sounds good all round - I'll commit these two modes, and maybe even the
> third (given Sony's interest & possible interest in changing their consumer
> to handle it) of a custom form to eke out the last few bytes from the more
> direct addr+offset encoding.
>
> I'll follow up here with flag names and revision numbers once they're in.
>
> On Wed, Jan 8, 2020 at 1:26 PM Robinson, Paul <paul.robinson at sony.com>
> wrote:
>
>> On some previous occasion that introduced additional indirection
>> (don't remember the details) my debugger people groused about the
>> additional performance cost of chasing down data in a different
>> object-file section.  So we (Sony) might be happier with low_pc as
>> expressions, than with a ranges-always solution.
>>
>> But hard to say without data, and getting both modes in at least
>> as a temporary thing sounds like a good plan.
>> --paulr
>>
>>
>> > -----Original Message-----
>> > From: aprantl at apple.com <aprantl at apple.com>
>> > Sent: Wednesday, January 8, 2020 1:49 PM
>> > To: David Blaikie <dblaikie at gmail.com>
>> > Cc: llvm-dev <llvm-dev at lists.llvm.org>; Jonas Devlieghere
>> > <jdevlieghere at apple.com>; Robinson, Paul <paul.robinson at sony.com>; Eric
>> > Christopher <echristo at gmail.com>; Frederic Riss <friss at apple.com>
>> > Subject: Re: Increasing address pool reuse/reducing .o file size in
>> > DWARFv5
>> >
>> > I think this sounds like a good plan for Linux. I would like to see the
>> > numbers for Darwin (= non-split DWARF) to decide whether we should just
>> > make that the default. Eric's suggestion of having this committed as an
>> > option first seems like a good step in that direction. If it is an
>> > advantage across the board we can remove the option and just make this
>> the
>> > default behavior.
>> >
>> > thanks,
>> > adrian
>> >
>> > > On Dec 30, 2019, at 12:08 PM, David Blaikie <dblaikie at gmail.com>
>> wrote:
>> > >
>> > > tl;dr: in DWARFv5, using DW_AT_ranges even when the range is
>> contiguous
>> > reduces linked, uncompressed debug_addr size for optimized builds by 93%
>> > and reduces total .o file size (with compression and split) by 15%. It
>> > does grow .dwo file size a bit - DWARFv5, no compression, not split
>> shows
>> > the net effect if all bytes are equal: -O3 clang binary grows by 0.4%,
>> -O0
>> > clang binary shrinks by 0.1%
>> > > Should we enable this strategy by default for DWARFv5, for
>> DWARFv5+Split
>> > DWARF, or not by default at all/only under a flag?
>> > >
>> > >
>> > >
>> > > So, I've brought this up a few times before - that DWARFv5 does a
>> pretty
>> > good job of reducing relocations (& reducing .o file size with Split
>> > DWARF) by allowing many uses of addresses to include some kind of
>> > address+offset (debug_rnglists and loclists allowing "base_address" then
>> > offset_pairs (an improvement over similar functionality in DWARFv4
>> because
>> > the offset pairs can be uleb encoded - so they can be quite compact))
>> > >
>> > > But one place where DWARFv5 misses a chance to reduce relocations further is
>> > direct addresses from debug_info, such as DW_AT_low_pc.
>> > >
>> > > For a while I've wondered if we could use an extension form for
>> > addr+offset, and I prototyped this without an extension attribute, but
>> > instead using exprloc. This has slightly higher overhead to express
>> the...
>> > expression. (it's 9 bytes in total, could be as few as 5 with a custom
>> > form)
>> > >
>> > > But I had another idea that's more instantly deployable: Why not use
>> > DW_AT_ranges even when the range is contiguous? That way the low_pc that
>> > previously couldn't use an existing address pool entry + offset, could
>> use
>> > the rnglist support for base address.
>> > >
>> > > The only unnecessary address pool entries that remain that I've found
>> > are DW_AT_low_pc for DW_TAG_labels - but there's only a handful of those
>> > in most code. So the "ranges everywhere" strategy gets the addresses for
>> > optimized clang down from 4758 (v4 address pool used 9923 addresses... )
>> > to 342, with about 4 "extra" addresses for DW_TAG_labels.
>> > >
>> > > This could also be a bit less costly if DWARFv5 rnglists didn't use a
>> > separate offset table (instead encoding the offsets directly in
>> > debug_info, rather than using indexes)
>> > >
>> > > I have patches for both the addr+offset exprloc and for the ranges-
>> > always, both with -mllvm flags - do people think they're both worth
>> > committing for experimentation? Neither? Default on in some cases (like
>> > Split DWARF)?
>> > >
>> > > Thanks,
>> > > - Dave
>>