[llvm-dev] Reducing DWARF emitter memory consumption

Peter Collingbourne via llvm-dev llvm-dev at lists.llvm.org
Wed Feb 10 14:42:08 PST 2016

On Fri, Feb 05, 2016 at 05:35:14PM -0800, David Blaikie wrote:
> On Fri, Feb 5, 2016 at 5:04 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> > Thanks, I'll look into that. (Though earlier you told me that debug info
> > for types could be extended while walking the IR, so I wouldn't have
> > thought that would have worked.)
> >
> >
> Yeah, had to think about it more - and as I think about it - I'm moderately
> sure type units (which don't include these latent extensions) will be
> pretty close to static. With just the stmt_list relocation in non-fission
> type units which /should/ still be knowable up-front.

I've implemented a change that does this, and looked at its impact on memory
consumption and binary size when running "llc" on Chromium's 50 largest (by
bitcode size) translation units. Bottom line: *huge* savings in total memory
consumption, a median of 17% compared to before the change and a median of 7%
compared to type units disabled.

(I'm not yet confident that my patch is correct: some of the section sizes
are different, and I'll need to double-check what's going on there. I'll
send it out once I'm confident in it.)

I think we can do better, though. With type units enabled, the size of
.debug_info as a fraction of (.debug_info + .debug_types) is median ~40%,
so I think there's another ~12% that can be saved by avoiding DIE/DIEValue
retention for .debug_info, bringing the total to ~30%. I expect the numbers
with type units disabled to be in the same ballpark (with type units enabled,
we consume ~25% more space in the object file on .debug_info + .debug_types,
so with them disabled the proportional savings may be less, but the absolute
memory consumption should be lower). This also roughly lines up with the heap
profiler figures from before.
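
As a back-of-the-envelope check of the estimate above, the following sketch
extrapolates the measured type unit saving to the rest of the debug info. It
assumes that memory saved scales linearly with section size, which is a
simplification and not something the measurements establish; the function
name extrapolate_saving is made up for illustration.

```python
def extrapolate_saving(measured_saving, covered_fraction):
    """Scale a saving measured on one part of the data to the remainder.

    Assumption: memory saved is proportional to the amount of debug info
    covered. This is a simplification for illustration only.
    """
    remaining_fraction = 1.0 - covered_fraction
    return measured_saving * remaining_fraction / covered_fraction


# Figures quoted in the message (all medians, all approximate).
type_unit_saving = 0.17      # memory saved by the type unit change
debug_info_fraction = 0.40   # .debug_info / (.debug_info + .debug_types)

# Type units account for the remaining ~60% of the debug info.
debug_types_fraction = 1.0 - debug_info_fraction

extra_saving = extrapolate_saving(type_unit_saving, debug_types_fraction)
total_saving = type_unit_saving + extra_saving

print(f"extra: {extra_saving:.1%}, total: {total_saving:.1%}")
```

Under that linear assumption the extra saving comes out at roughly 11% and
the total at roughly 28%, which is in the same ballpark as the ~12% and ~30%
quoted above.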

My conclusion from all this: I think we should do it, and I think it would
especially help in LTO mode with type units disabled: the type units feature
is redundant with LTO deduplication and would therefore add unnecessary bloat
to object files, which would mean increased memory usage (I measured a ~10%
median increase in memory usage comparing the current type units implementation
against type units disabled -- not an entirely fair comparison, but probably
good enough).

I have a plan in mind for doing this incrementally: we will start using the
more efficient data structure at the leaves of the DIE tree, and gradually
expand out to the root. You'll see what that looks like once I have my first
patch ready.
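
To illustrate the general idea of moving from the leaves towards the root,
here is a toy sketch. It is not the patch's design, which this message does
not describe: the Die class, the encode function and its byte layout are all
hypothetical, and the encoding is not DWARF. It only shows how heavyweight
nodes can be replaced by flat bytes one level at a time, starting at the
leaves.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Die:
    """Hypothetical heavyweight node that retains its children as objects."""
    tag: int
    children: List[Union["Die", bytes]] = field(default_factory=list)


def encode(tag: int, children: List[bytes]) -> bytes:
    """Toy encoding (NOT DWARF): tag byte, child payloads, zero terminator."""
    return bytes([tag]) + b"".join(children) + b"\x00"


def compact_one_level(node: Die) -> Union[Die, bytes]:
    """Flatten every node whose children are all flat already.

    Each call moves the boundary between retained objects and flat bytes
    one level closer to the root.
    """
    if all(isinstance(child, bytes) for child in node.children):
        return encode(node.tag, node.children)
    node.children = [
        compact_one_level(child) if isinstance(child, Die) else child
        for child in node.children
    ]
    return node


# A small tree: the root has one inner node (with two leaves) and one leaf.
node = Die(1, [Die(2, [Die(3), Die(4)]), Die(5)])

passes = 0
while isinstance(node, Die):
    node = compact_one_level(node)
    passes += 1

print(passes, node.hex())
```

In this toy version the tree of depth three needs three passes before the
root itself is flat; the intermediate states mix object nodes near the root
with byte strings below them.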


