[llvm-dev] Reducing DWARF emitter memory consumption

Mehdi Amini via llvm-dev llvm-dev at lists.llvm.org
Fri Feb 5 17:56:51 PST 2016


> On Feb 5, 2016, at 5:53 PM, Eric Christopher <echristo at gmail.com> wrote:
> 
> 
> 
> On Fri, Feb 5, 2016 at 5:51 PM Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> 
> > On Feb 5, 2016, at 5:40 PM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> >
> > On Fri, Feb 05, 2016 at 04:58:45PM -0800, Mehdi Amini wrote:
> >>
> >>> On Feb 5, 2016, at 3:17 PM, Peter Collingbourne via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> >>>
> >>> Hi all,
> >>>
> >>> We have profiled [1] the memory usage in LLVM when LTO'ing Chromium, and
> >>> we've found that one of the top consumers of memory is the DWARF emitter in
> >>> lib/CodeGen/AsmPrinter/Dwarf*.
> >>
> >> I'm staring at the profile attached to post #15 at the link you posted; can you confirm that the Dwarf emitter accounts for 6.7%+15.6%=22.3% of the total allocated memory?
> >> If I understand the numbers correctly, this does not tell us anything about how much the Dwarf emitter contributes to the *peak memory* usage (could be more, could be nothing...).
> >
> > I think these nodes represent allocations from the DWARF emitter:
> >
> > DwarfDebug::DwarfDebug 9.5%
> > DwarfDebug::endFunction 15.6%
> > DIEValueList::addValue 9.1%
> > total 34.2%
> >
> > I believe they are totals, but my reading of the code is that the DWARF
> > emitter does not deallocate its memory until the end of code generation,
> 
> That's sad :(
> 
> > so total ~= peak in this case.
> 
> Assuming the peak occurs during CodeGen (which is what I see on my profile), that sounds pretty reasonable!
> 
> Thanks for the information (and the work!).
> 
> Another question I have is: how much worse does the split codegen make the situation? Naively there will be a lot of redundancy in the split modules; for ThinLTO, Teresa has to proceed with care to limit the amount of duplication.
> 
> 
> Hmm? Can you reword this slightly? I'm not sure what you're asking here.

The parallel split codegen will take the big LTO module with all the debug info and produce multiple modules.
When splitting into multiple modules, you may have functions from the same DICompileUnit ending up in several of them, and all the retained types would be pulled in with each copy.
(This assumes you are already taking care not to pull in the DICompileUnit when no function referencing it is present in the split module; a rough sketch of what I mean follows below.)
Each thread would then do redundant work processing this type hierarchy (and the rest of the debug info).
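
To make the parenthetical above concrete, here is a minimal sketch of the kind of filtering I have in mind. It is not actual splitter code: the helper name collectLiveCompileUnits is made up, and it assumes the debug-info API where each llvm::Function carries its DISubprogram (Function::getSubprogram()) and the subprogram points back at its unit (DISubprogram::getUnit()).

  // Sketch only: collect the DICompileUnits referenced by function
  // definitions that survived the split, so a partition does not pull in
  // a unit (and its retained types) that none of its functions need.
  #include "llvm/ADT/SmallPtrSet.h"
  #include "llvm/IR/DebugInfoMetadata.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/Module.h"

  static llvm::SmallPtrSet<const llvm::DICompileUnit *, 4>
  collectLiveCompileUnits(const llvm::Module &M) {
    llvm::SmallPtrSet<const llvm::DICompileUnit *, 4> LiveCUs;
    for (const llvm::Function &F : M) {
      if (F.isDeclaration())
        continue;                      // no body, nothing to emit for it
      if (const llvm::DISubprogram *SP = F.getSubprogram())
        LiveCUs.insert(SP->getUnit()); // unit referenced by a real definition
    }
    return LiveCUs;
  }

Any entry of !llvm.dbg.cu that is not in that set (together with its retainedTypes) could then be dropped from the split module rather than being re-processed by every thread.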

For ThinLTO, Teresa is taking care (review pending here: http://reviews.llvm.org/D16440) to import as little as possible, and to turn type definitions into declarations when possible.

-- 
Mehdi
