[PATCH] [PATCH/RFC] Use a BumpAllocator for DIEs

Duncan P. N. Exon Smith dexonsmith at apple.com
Tue Jan 20 12:19:21 PST 2015


> On 2015 Jan 20, at 11:28, Frederic Riss <friss at apple.com> wrote:
> 
> Hi dblaikie, echristo,
> 
> First of all, although this patch passes the testsuite, I don't
> think it is ready to go in, at least not without a thorough
> discussion of its downsides.
> 
> What the patch does is move away from using unique_ptr<>s to
> track DIE object memory and allocate most of them using a
> BumpAllocator. Making releasing the memory for the DIE tree
> basically free. (DIEBlocks and DIELocs are still allocated using
> the standard allocator, but they were already using separate
> bookkeeping, so this doesn't complicate the patch).
> 
> I'm going down this road because the teardown of the DIE tree
> uses a lot of CPU time, especially in dsymutil's usecase. In
> my prototype, the DIE destructors take ~40 seconds to run for
> DIE trees worth of 700MB of debug infomration. That's quite a
> lot especially considering that dsymutil's runtime on this kind
> of data should be around 2 minutes. I really need to find an
> acceptale solution for dsymutil's usecase.
> 
> This patch achieves the performance gains that I expect, but there
> is one big downside: it limits the number of attributes a DIE can
> accept to a fixed maximum (well technically, if you remove the
> assert in the code, it will work, but it will leak the attribute
> vectors going over the limit).

Is there a way to explicitly walk through and clear these vectors?
Or is that already too expensive (in most cases it'll be a no-op,
but you'll still need to do the walk)?

(IIRC, `SmallVector<>::clear()` deallocates.)

> 
> Imposing this limit on users seems too strong as a constraint. 
> Especially considering not all frontends are in-tree. I can imagine
> more flexible solutions involving complicated allocation strategies,
> but I wanted to ask for opinions before I work on something that I
> fear would end up much more complicated than this patch. Maybe
> there's an easier solution that I haven't seen, happy to hear
> suggestions.
> 
> BTW, the runtime difference is definitely measurable on a clang
> debug build also, although the gained seconds do not represent
> 40% of the runtime like in dsymutil's case. On my machine, a clang
> debug build goes from:
> 
> real    14m38.308s
> user    45m26.856s
> sys     3m53.519s
> 
> without the patch, to:
> 
> real    14m19.762s
> user    44m14.438s
> sys     3m45.520s
> 
> Shaving more than one minute of user time and 15 to 20 seconds from
> wall-clock time.
> 
> http://reviews.llvm.org/D7072
> 
> Files:
>  include/llvm/CodeGen/DIE.h
>  lib/CodeGen/AsmPrinter/DIE.cpp
>  lib/CodeGen/AsmPrinter/DIEHash.cpp
>  lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
>  lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
>  lib/CodeGen/AsmPrinter/DwarfDebug.cpp
>  lib/CodeGen/AsmPrinter/DwarfFile.cpp
>  lib/CodeGen/AsmPrinter/DwarfUnit.cpp
>  lib/CodeGen/AsmPrinter/DwarfUnit.h
>  unittests/CodeGen/DIEHashTest.cpp
> 
> EMAIL PREFERENCES
>  http://reviews.llvm.org/settings/panel/emailpreferences/
> <D7072.18447.patch>_______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits




More information about the llvm-commits mailing list