<div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote">On Tue, Nov 12, 2013 at 2:20 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote"><div class="im">On Tue, Nov 12, 2013 at 2:08 PM, Manman Ren <span dir="ltr"><<a href="mailto:manman.ren@gmail.com" target="_blank">manman.ren@gmail.com</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><br><br><div class="gmail_quote"><div>On Tue, Nov 12, 2013 at 1:01 PM, David Blaikie <span dir="ltr"><<a href="mailto:dblaikie@gmail.com" target="_blank">dblaikie@gmail.com</a>></span> wrote:<br>


<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Hi Manman,<br><br>Thanks for sending this summary and progress plans - it's great to see the impact your changes have had and ideas for future direction.</div>


<br><div class="gmail_extra"><div class="gmail_quote"><div>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Type uniquing for C++ is in. Some data for Xalan with -flto -g:</div><div><div>9.9MB raw DWARF size, peak memory usage at 2.8GB</div>


<div>The raw DWARF size was 58MB and memory usage was 7GB back in May 2013.</div>

<div>Other efforts at size reduction helped, and type uniquing improved on top of those.</div></div><div><br></div><div><div>Data on building clang with "-flto -g" after type uniquing:</div><div>  3.4GB MDNodes after parsing all bc files, 7GB MDNodes after linking all bc files</div>


</div></div></blockquote><div><br></div></div><div>What's the change between parsing and linking?</div></div></div></div></blockquote></div><div>Parsing means reading in all bc files to source modules. Linking means linking in the source modules to the destination module.</div>


<div>Extra MDNodes can be generated for the destination module.</div></div></div></div></blockquote><div><br></div></div><div>OK, that's perhaps strange - do you have any ideas about what MDNodes we create when linking modules together? If anything, I would expect a reduction in size as MDNodes are deduplicated across multiple modules. Are you measuring this after the original modules have been unloaded? Are we not unloading those modules once we've created the merged module?</div>

</div></div></div></blockquote><div><br></div><div>We don't unload the source modules. Even if we did unload them, the MDNodes would remain, because they belong to the Context and are shared among the modules.</div><div>My proposal will suggest an interface to delete the source modules and remove from the Context the MDNodes that are used only by the source modules.</div>
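<div>To make the idea of removing MDNodes used only by the source modules concrete, here is a minimal sketch of a reachability sweep over a toy node graph. It is not LLVM code: ToyNode, ToyContext, reachableFrom and sourceOnlyNodes are hypothetical names, and the assumption is simply that a node can be dropped once nothing in the destination module can reach it.</div>

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Toy stand-in for an MDNode: only a list of operand node ids (hypothetical).
struct ToyNode {
  std::vector<std::size_t> operands;
};

// Toy stand-in for a context that owns every node, whichever module uses it.
struct ToyContext {
  std::vector<ToyNode> nodes;
};

// Collect every node reachable from the given roots with an iterative DFS.
std::unordered_set<std::size_t>
reachableFrom(const ToyContext &ctx, const std::vector<std::size_t> &roots) {
  std::unordered_set<std::size_t> seen;
  std::vector<std::size_t> worklist(roots.begin(), roots.end());
  while (!worklist.empty()) {
    std::size_t id = worklist.back();
    worklist.pop_back();
    if (!seen.insert(id).second)
      continue; // already visited; this also terminates on cycles
    for (std::size_t op : ctx.nodes[id].operands)
      worklist.push_back(op);
  }
  return seen;
}

// Ids of nodes the destination roots cannot reach. Under the assumption
// above, these are the nodes used only by the source modules.
std::vector<std::size_t>
sourceOnlyNodes(const ToyContext &ctx,
                const std::vector<std::size_t> &destRoots) {
  std::unordered_set<std::size_t> live = reachableFrom(ctx, destRoots);
  std::vector<std::size_t> dead;
  for (std::size_t id = 0; id < ctx.nodes.size(); ++id)
    if (live.count(id) == 0)
      dead.push_back(id);
  return dead;
}
```

<div>In this sketch the caller passes the node ids that the destination module refers to directly; everything else owned by the context is reported as removable, including nodes that only reference each other in a cycle.</div>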

<div><br></div><div>There are a few cases where we generate MDNodes when linking modules:</div><div>1> when an MDNode points to a value, such as a Function*, that differs between the source module and the destination module.</div><div>2> when we have a cycle in the MDNode graph, all nodes in the cycle will be created for the destination module.</div>
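<div>The two cases above can be illustrated with a toy remapping pass. This is a sketch under stated assumptions, not the actual linker logic: ToyNode, mapNode and countRecreated are hypothetical names, a node is reused only if none of its operands changed, and a placeholder is registered for each node before its operands are visited so that cycles terminate.</div>

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy stand-in for an MDNode (hypothetical).
struct ToyNode {
  std::vector<int> operands;       // ids of other nodes
  bool pointsAtLocalValue = false; // e.g. refers to a per-module function
};

// Map source node n to the id the destination will use.
// A result equal to n means the node is reused; anything else is a new node.
int mapNode(const std::vector<ToyNode> &src, std::map<int, int> &mapped,
            int &nextFreshId, int n) {
  std::map<int, int>::iterator it = mapped.find(n);
  if (it != mapped.end())
    return it->second; // may be a placeholder if n is still being mapped

  // Register a placeholder (a negative id) so a cycle back to n terminates.
  mapped[n] = -(n + 1);

  // Case 1: the node refers to a value that differs per module.
  bool changed = src[n].pointsAtLocalValue;

  // A node also changes when any operand maps to something different.
  // Inside a cycle an operand maps to a placeholder, which always differs
  // from the original id, so every node in the cycle is re-created (case 2).
  for (int op : src[n].operands)
    if (mapNode(src, mapped, nextFreshId, op) != op)
      changed = true;

  mapped[n] = changed ? nextFreshId++ : n;
  return mapped[n];
}

// Number of nodes that had to be created for the destination.
int countRecreated(const std::vector<ToyNode> &src) {
  std::map<int, int> mapped;
  int first = static_cast<int>(src.size());
  int nextFreshId = first;
  for (int n = 0; n < first; ++n)
    mapNode(src, mapped, nextFreshId, n);
  return nextFreshId - first;
}
```

<div>In this model an acyclic node whose operands are all shared is reused as-is, while a two-node cycle is duplicated even though neither node refers to a module-specific value.</div>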

<div> </div><div>When we load in the source modules, the types are already de-duplicated (i.e., multiple source modules will share the same type if possible).</div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div class="im">

<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div>

<div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div dir="ltr">

<div>

<div>  4.6GB DIEs</div></div></div></blockquote><div><br></div><div>It seems like the DIEs are a substantial (more than the pre-linked, but post-parsed BC files) part of the footprint. I think it might be important to do the CU-at-a-time work sooner rather than later as I'm concerned about the design impact it will have on existing and future work (it's already going to substantially change the cross-CU-DIE references, potentially changing the cost/benefit of that feature since we cannot inject DIEs from later CUs into prior ones).</div>


<div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>  4GB MCContext</div></div></div></blockquote><div><br></div><div>What's the data in the MCContext that's relevant to debug info?</div>


</div></div></div></blockquote><div><br></div></div><div>One data point on "Xalan":</div><div>without -g, MCContext allocates 45MB,</div><div>with -g, MCContext allocates 286MB.</div></div></div></div></blockquote>


<div><br></div></div><div>OK, it might be useful to understand which parts account for that - maybe the Values (ints, strings, etc) themselves are being attributed to the MCContext rather than to the MDNode sizes you were reporting above? Not really sure.</div>

</div></div></div></blockquote><div><br></div><div>Same here. I will look into that when I have time. Or does somebody else already have the answer?</div><div><br></div><div>Manman</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div> </div>

</div></div></div>

</blockquote></div><br></div></div>