<div dir="ltr"><br><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Feb 10, 2016 at 2:42 PM, Peter Collingbourne <span dir="ltr">&lt;<a href="mailto:peter@pcc.me.uk" target="_blank">peter@pcc.me.uk</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On Fri, Feb 05, 2016 at 05:35:14PM -0800, David Blaikie wrote:<br>

> On Fri, Feb 5, 2016 at 5:04 PM, Peter Collingbourne &lt;<a href="mailto:peter@pcc.me.uk">peter@pcc.me.uk</a>&gt; wrote:<br>

><br>

> > Thanks, I'll look into that. (Though earlier you told me that debug info<br>

> > for types could be extended while walking the IR, so I wouldn't have<br>

> > thought<br>

> > that would have worked.)<br>

> ><br>

> ><br>

> Yeah, had to think about it more - and as I think about it - I'm moderately<br>

> sure type units (which don't include these latent extensions) will be<br>

> pretty close to static. With just the stmt_list relocation in non-fission<br>

> type units which /should/ still be knowable up-front.<br>

<br>

</span>I've implemented a change which does this, and looked at impact on memory<br>

consumption and binary size running "llc" on Chromium's 50 largest (by<br>

bitcode size) translation units. Bottom line: *huge* savings in total memory<br>

consumption, median 17% when compared to before the change, median 7% when<br>

compared to type units disabled.<br>

<br>

(I'm not yet confident that my patch is correct: some of the section sizes

are different and I'll need to double check what's going on there. I'll

send it out once I'm confident in it.)<br>

<br>

I think we can do better, though. With type units enabled, the size of<br>

.debug_info as a fraction of (.debug_info + .debug_types) is median ~40%,<br>

so I think there's another ~12% that can be saved by avoiding DIE/DIEValue<br>

retention for debug_info, bringing the total to ~30%. I expect numbers with<br>

type units disabled to be in the same ballpark (with type units enabled,<br>

we consume ~25% more space in the object file on .debug_info + .debug_types,<br>

so the proportional savings may be less, but the absolute memory consumption<br>

should be lower).  This also roughly lines up with the heap profiler figures<br>

from before.<br>
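<br>
A rough sketch of how the ~12% and ~30% estimates above can be reproduced, assuming (this is an assumption, not something measured) that the memory saved scales linearly with the share of debug data whose DIEs are no longer retained. The function name extrapolate_savings is made up for illustration; the inputs are the 17% median saving and the ~40% .debug_info share quoted above.<br>
<pre>
```python
def extrapolate_savings(measured_saving, info_fraction):
    """Extrapolate a saving measured on type units to .debug_info.

    measured_saving: fraction of total memory saved by not retaining
                     DIEs for type units (0.17 in the figures above).
    info_fraction:   .debug_info / (.debug_info + .debug_types)
                     (about 0.40 in the figures above).

    Assumption: savings are proportional to section size.
    """
    types_fraction = 1.0 - info_fraction
    # Saving per unit of debug data, applied to the .debug_info share.
    extra = measured_saving * info_fraction / types_fraction
    return extra, measured_saving + extra


extra, total = extrapolate_savings(0.17, 0.40)
print(f"extra ~{extra:.1%}, total ~{total:.1%}")
```
</pre>
With these inputs the sketch gives about 11% extra and about 28% in total, which is in the same ballpark as the ~12% and ~30% figures.<br>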

<br>

My conclusion from all this: I think we should do it, and I think it would<br>

especially help in LTO mode with type units disabled: the type units feature<br>

is redundant with LTO deduplication and would therefore add unnecessary bloat<br>

to object files, which would mean increased memory usage (I measured a ~10%<br>

median increase in memory usage comparing the current type units implementation<br>

against type units disabled -- not an entirely fair comparison, but probably<br>

good enough).<br></blockquote><div><br></div><div>Oh, that's a fair point, for sure - you're particularly interested in LTO where, I agree totally, type units are entirely overhead.<br><br>Pity, as I sort of liked that solution - not having to complicate the DIE hierarchy, etc.<br><br>But improving the DIE hierarchy itself has more general benefits, for sure - outside just type units, and outside just LLVM itself. Into tools like llvm-dsymutil, which is nice.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">I have a plan in mind for doing this incrementally: we will start using the<br>

more efficient data structure at the leaves of the DIE tree, and gradually<br>

expand out to the root. You'll see what that looks like once I have my first<br>

patch ready.</blockquote><div><br>Hmm, rightio - I'm not sure I quite picture why/how it would be incremental up the tree. If you want to just chat about it or sketch out the general design, I'd be happy to hear it, or we can talk over a patch if that's easier. (Just want to save you time if working up the patch is going to be a lot of work and we might end up making larger design changes to it.)<br><br>- Dave</div></div></div></div>