<html><head><meta http-equiv="Content-Type" content="text/html charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;" class=""><br class=""><div><blockquote type="cite" class=""><div class="">On Feb 5, 2016, at 6:14 PM, Mehdi Amini via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" class="">llvm-dev@lists.llvm.org</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><div style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;" class=""><blockquote type="cite" class=""><div class=""><br class="Apple-interchange-newline">On Feb 5, 2016, at 6:02 PM, David Blaikie <<a href="mailto:dblaikie@gmail.com" class="">dblaikie@gmail.com</a>> wrote:</div><br class="Apple-interchange-newline"><div class=""><br class="Apple-interchange-newline"><br class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><div class="gmail_quote" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">On Fri, Feb 5, 2016 at 5:56 PM, Mehdi Amini via llvm-dev<span class="Apple-converted-space"> </span><span dir="ltr" class=""><<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>></span><span class="Apple-converted-space"> </span>wrote:<br 
class=""><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><div class="" style="word-wrap: break-word;"><br class=""><div class=""><span class=""><blockquote type="cite" class=""><div class="">On Feb 5, 2016, at 5:53 PM, Eric Christopher <<a href="mailto:echristo@gmail.com" target="_blank" class="">echristo@gmail.com</a>> wrote:</div><br class=""><div class=""><div dir="ltr" class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px;"><br class=""><br class=""><div class="gmail_quote"><div dir="ltr" class="">On Fri, Feb 5, 2016 at 5:51 PM Mehdi Amini via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>> wrote:<br class=""></div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><br class="">> On Feb 5, 2016, at 5:40 PM, Peter Collingbourne <<a href="mailto:peter@pcc.me.uk" target="_blank" class="">peter@pcc.me.uk</a>> wrote:<br class="">><br class="">> On Fri, Feb 05, 2016 at 04:58:45PM -0800, Mehdi Amini wrote:<br class="">>><br class="">>>> On Feb 5, 2016, at 3:17 PM, Peter Collingbourne via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank" class="">llvm-dev@lists.llvm.org</a>> wrote:<br class="">>>><br class="">>>> Hi all,<br class="">>>><br class="">>>> We have profiled [1] the memory usage in LLVM when LTO'ing Chromium, and<br class="">>>> we've found that one of the top consumers of memory is the DWARF emitter in<br class="">>>> lib/CodeGen/AsmPrinter/Dwarf*.<br class="">>><br class="">>> I'm staring at the profile attached to the post #15 on the link you posted, 
can you confirm that the Dwarf emitter accounts for 6.7%+15.6%=22.3% of the total allocated memory?<br class="">>> If I understand the numbers correctly, this does not tell us anything about how much the Dwarf emitter contributes to the *peak memory* usage (could be more, could be nothing...).<br class="">><br class="">> I think these nodes represent allocations from the DWARF emitter:<br class="">><br class="">> DwarfDebug::DwarfDebug 9.5%<br class="">> DwarfDebug::endFunction 15.6%<br class="">> DIEValueList::addValue 9.1%<br class="">> total 34.2%<br class="">><br class="">> I believe they are totals, but my reading of the code is that the DWARF<br class="">> emitter does not deallocate its memory until the end of code generation,<br class=""><br class="">That's sad :(<br class=""><br class="">> so total ~= peak in this case.<br class=""><br class="">Assuming the peak occurs during CodeGen (which is what I see on my profile), that sounds pretty reasonable!<br class=""><br class="">Thanks for the information (and the work!).<br class=""><br class="">Another question I have is how much worse the split codegen makes the situation. Naively, there will be a lot of redundancy in the split modules; for ThinLTO, Teresa has to proceed with care to limit the amount of duplication.<br class=""><br class=""></blockquote><div class=""><br class=""></div><div class="">Hmm? Can you reword this slightly? I'm not sure what you're asking here.</div></div></div></div></blockquote><div class=""><br class=""></div></span><div class="">The parallel split codegen will take the big LTO module with all the debug info and produce multiple modules.</div><div class="">When splitting into multiple modules, you may have functions from the same DICompileUnit ending up in multiple modules. 
All the retained types would be pulled in.</div></div></div></blockquote><div class=""> </div><blockquote class="gmail_quote" style="margin: 0px 0px 0px 0.8ex; border-left-width: 1px; border-left-color: rgb(204, 204, 204); border-left-style: solid; padding-left: 1ex;"><div class="" style="word-wrap: break-word;"><div class=""><div class="">(this is assuming you are already taking care of not pulling the DICompileUnit when no function referencing it is in the split module).</div><div class="">Then each thread would do redundant work processing this type hierarchy (and other debug info).</div><div class=""><br class=""></div><div class="">For ThinLTO, Teresa is taking care (review waiting here: <a href="http://reviews.llvm.org/D16440" target="_blank" class="">http://reviews.llvm.org/D16440</a><span class="Apple-converted-space"> </span>) to try to import as little as possible, and turn type definitions into declarations when possible.</div></div></div></blockquote><div class=""><br class=""></div><div class=""><div class="">Right - I don't think we'd ever need to import a definition - just rely on the fact that we will produce a type definition somewhere in the output (this may present problems for LLDB - it's certainly had issues with type declarations appearing where it would expect a definition (eg: a type that inherits from a declaration instead of a definition) not sure if that problem extends to the case of by-value function parameters)<br class=""><br class="">So the impact of that cross-module importing should be pretty low for ThinLTO. 
But any work Peter does should be equally beneficial to ThinLTO, since it still has to emit the types, build all the DIEs, etc, etc.<br class=""></div></div></div></div></blockquote><div class=""><br class=""></div></div><span style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; float: none; display: inline !important;" class="">I'm not sure you really answered my question, though; I may be misunderstanding what you mean here.</span><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class=""></div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">I'm not concerned about ThinLTO; any improvement on the DwarfEmitter would be beneficial for any CodeGen. 
I'll try to make my question clearer:</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class=""></div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">There is a "parallel code generator" for LTO that was added by Peter especially to address Chrome LTO builds. I *assume* the memory consumption measurement we are talking about is using this scheme (the number of threads is not mentioned).</div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;"><br class=""></div><div class="" style="font-family: Helvetica; font-size: 12px; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px;">When using the multi-threaded codegen, my concern would be that your 24 threads (random number here...) may be emitting the same DWARF information again and again, which would make the 30% memory usage not surprising. 
Since we noticed this has a huge impact on ThinLTO, I was pointing out an *orthogonal* way of addressing the memory concern for Chrome LTO.</div></div></blockquote><div><br class=""></div><div>Update: Peter told me on IRC that he believes the measurement was made with single-threaded codegen. I wonder how much worse the number would be with threading enabled :)</div><div><br class=""></div><div><br class=""></div><div>-- </div><div>Mehdi</div></div></body></html>