[llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).

Tue Dec 5 07:13:48 PST 2017

From: Rui Ueyama [mailto:ruiu at google.com]
Sent: Monday, December 04, 2017 9:09 PM
To: Robinson, Paul
Cc: George Rimar; llvm-dev at lists.llvm.org
Subject: Re: [llvm-dev] [RFC] - Deduplication of debug information in linkers (LLD).

On Mon, Dec 4, 2017 at 10:49 AM, Robinson, Paul via llvm-dev <llvm-dev at lists.llvm.org> wrote:
Thanks for providing the experimental data!  It clearly shows the value of type sections in DWARF.
Regarding why type sections are off by default, aside from the issue of consumers needing to understand them, there is a size penalty to type sections that becomes more evident in smaller projects (meaning, fewer compilation units).  The size penalty can be balanced against the amount of deduplication for a net win, if you have enough duplicates that you can eliminate.  But it is a tradeoff.

By a size penalty, do you mean the size of the final executable or of the intermediate object files? If it is the object files, how large is the penalty? I wonder if the current situation is a reasonable trade-off.

When we emit a type section instead of emitting the type directly into .debug_info, we effectively extract the type description and move it into the type section.  However, the type section carries its own overhead: a header, some wrapper around the type information, and possibly some additional context.  The result is necessarily bigger than the original description.  References to the type also become bigger; each is at least 8 bytes, rather than the usual 4 bytes.  This overhead repeats for each type moved to a type section, and all of it adds up to a bigger intermediate object file.  I have not tried to measure how much this is for "typical" compilation units.  IIRC, LLVM chooses to move enums and aggregates into type units; it does not assess the size of a type description as part of its heuristic.
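
As a rough illustration of that overhead, here is a minimal sketch of the per-type growth in one object file. The 8-byte and 4-byte reference sizes come from the description above (DW_FORM_ref_sig8 versus DW_FORM_ref4), and 23 bytes is the size of a 32-bit DWARF v4 type unit header. The wrapper size and the example reference count are made-up assumptions, not measurements, and the function name is hypothetical.

```python
# Rough model of how much one type adds to an object file when it is moved
# from .debug_info into its own type unit. Illustrative numbers only.

REF_IN_INFO = 4        # bytes per reference to a type kept in .debug_info (DW_FORM_ref4)
REF_TO_TYPE_UNIT = 8   # bytes per reference by type signature (DW_FORM_ref_sig8)
TYPE_UNIT_HEADER = 23  # 32-bit DWARF v4 type unit header
WRAPPER_BYTES = 10     # ASSUMPTION: unit DIE plus any extra context; varies per type


def object_file_growth(num_refs, header=TYPE_UNIT_HEADER, wrapper=WRAPPER_BYTES):
    """Extra bytes in one object file for one type moved to a type unit."""
    ref_growth = num_refs * (REF_TO_TYPE_UNIT - REF_IN_INFO)
    return header + wrapper + ref_growth


# A hypothetical type referenced 5 times within the compilation unit.
print(object_file_growth(5))  # 23 + 10 + 5*4 = 53
```

Under this model the growth does not depend on how large the type description is, so a tiny enum pays the same fixed cost as a large class.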

If none of the type sections are duplicated in other object files, then the final executable grows by exactly as much as the object files did.  To the extent that there are duplicates the linker can eliminate, you start to claw back the space consumed by the overhead.  If you have enough duplicates to eliminate, you have a net size win in the executable.
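
The break-even point can be sketched with a small model. It assumes (as a simplification) that the linker keeps exactly one copy of a duplicated type section, while the larger references stay behind in every object file's .debug_info. All byte counts below are hypothetical, and `executable_delta` is a name made up for this example.

```python
def executable_delta(type_size, unit_overhead, ref_growth, copies):
    """Change in executable debug-info bytes for one type that appears in
    `copies` object files, comparing type sections against plain .debug_info.

    type_size     -- bytes of the type description itself
    unit_overhead -- header + wrapper bytes added by the type section
    ref_growth    -- extra reference bytes per object file (not deduplicated)
    """
    without_units = copies * type_size
    # One surviving type section, plus the bigger references in every object.
    with_units = copies * ref_growth + (type_size + unit_overhead)
    return with_units - without_units


# Hypothetical 200-byte type, 33 bytes of section overhead, 20 bytes of
# reference growth per object file.
for copies in (1, 2, 10):
    print(copies, executable_delta(200, 33, 20, copies))
# 1 53      (no duplicates: pure overhead)
# 2 -127    (one duplicate removed: already a net win)
# 10 -1567
```

In the same model, a type whose description is no larger than its per-object reference growth never breaks even, no matter how many copies are eliminated.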

--paulr