<div dir="ltr"><div>Hi Manman,<br><br>Thanks for sending this summary and your plans for further progress - it's great to see the impact your changes have had and your ideas for future direction.</div><br><div class="gmail_extra"><div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Type uniquing for C++ is in. Some data for Xalan with -flto -g:</div><div><div>9.9MB raw dwarf size, peak memory usage at 2.8GB</div>
<div>The raw dwarf size was 58MB, memory usage was 7GB back in May, 2013.</div>
<div>Other efforts at size reduction helped, and type uniquing improved on top of those.</div></div><div><br></div><div><div>Data on building clang with "-flto -g" after type uniquing:</div><div> 3.4GB MDNodes after parsing all bc files, 7GB MDNodes after linking all bc files</div>
</div></div></blockquote><div><br></div><div>What's the change between parsing and linking?</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">
<div>
<div>    4.6GB DIEs</div></div></div></blockquote><div><br></div><div>The DIEs seem to be a substantial part of the footprint: at 4.6GB they are larger than the 3.4GB of MDNodes after parsing (before linking). I think it might be important to do the CU-at-a-time work sooner rather than later, as I'm concerned about the design impact it will have on existing and future work. It's already going to substantially change the cross-CU DIE references, potentially changing the cost/benefit of that feature, since we cannot inject DIEs from later CUs into prior ones.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div> 4G MCContext</div></div></div></blockquote><div><br></div><div>What's the data in the MCContext that's relevant to debug info?</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div> --> The memory usage is still too big.</div></div></blockquote><div><br></div><div>
Do we have an idea of what size is "small enough"? It would be useful to have a goal.</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">
<div>So how to reduce the memory footprint at MDNode level:<br></div><div><div> 1> Combine integers into MDString and further combining MDStrings (see PR17891)</div>
<div> A partial implementation on the important debug info nodes can reduce the MDNodes from 7GB to 5.7GB</div></div></div></blockquote><div><br></div><div>I think this'll be an interesting, and potentially valuable, change even in non-LTO cases, but not necessarily where I would start just now.</div>
<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div> 2> Release MDNodes that are only used by source modules (I will send out a proposal)</div>
<div>
An estimation based on partial implementation: this will reduce MDNodes from 5.7GB to 3.9GB</div></div></div></blockquote><div><br></div><div>I'll keep an eye out for your proposal, as I can't quite picture what you've got in mind from this brief description.<br>
<br>- David</div></div></div></div>