[LLVMdev] RFC: Reduce the memory footprint of DIEs (and DIEValues)

Duncan P. N. Exon Smith dexonsmith at apple.com
Wed May 20 15:56:58 PDT 2015


To make this a little more concrete, I just hacked up a couple of
patches that achieve step #1.  (0004 is the key patch, and probably
should be split up somehow before commit.)  I'll collect some
results and report back.


-------------- next part --------------
A non-text attachment was scrubbed...
Name: all.patch
Type: application/octet-stream
Size: 73699 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/d59e8a6f/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-CodeGen-Remove-redundant-DIETypeSignature-dump.patch
Type: application/octet-stream
Size: 1259 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/d59e8a6f/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-CodeGen-Remove-the-vtable-entry-from-DIEValue.patch
Type: application/octet-stream
Size: 23708 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/d59e8a6f/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-CodeGen-Make-DIEValue-Ty-private-NFC.patch
Type: application/octet-stream
Size: 721 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/d59e8a6f/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-WIP-Change-DIEValue-to-be-stored-by-value.patch
Type: application/octet-stream
Size: 81233 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/d59e8a6f/attachment-0004.obj>
-------------- next part --------------


> On 2015 May 20, at 11:28, Duncan P. N. Exon Smith <duncan at exonsmith.com> wrote:
> 
> Pete Cooper and I have been looking at memory profiles of running llc on
> verify-uselistorder.lto.opt.bc (ld -save-temps dump just before CodeGen
> of building verify-uselistorder with -flto -g).  I've attached
> leak-backend.patch, which we're using to make Intrustruments more
> accurate (instead of effectively leaking things onto BumpPtrAllocators,
> really leak them with malloc()).  (I've collected this data on top of a
> few not-yet-committed patches to cheapen `MCSymbol` and
> `EmitLabelDifference()` that chop around 8% of memory off the top, but
> otherwise these numbers should be reproducible in ToT.)
> 
> The `DIE` class is huge.  Directly, it accounts for about 15% of backend
> memory:
> 
>    Bytes Used	Count		Symbol Name
>      77.87 MB       8.4%	318960	 	   llvm::DwarfUnit::createAndAddDIE(unsigned int, llvm::DIE&, llvm::DINode const*)
>      46.34 MB       5.0%	189810	 	   llvm::DwarfCompileUnit::constructVariableDIEImpl(llvm::DbgVariable const&, bool)
>      25.57 MB       2.7%	104752	 	   llvm::DwarfCompileUnit::constructInlinedScopeDIE(llvm::LexicalScope*)
>       8.19 MB       0.8%	33547	 	   llvm::DwarfCompileUnit::constructImportedEntityDIE(llvm::DIImportedEntity const*)
> 
> A lot of this is the pair of `SmallVector<, 12>` it has for its values
> (look into `DIEAbbrev` for the second one).  Here's a histogram of how
> many DIEs have each value count:
> 
>    # of Values  DIEs with #  with # or fewer
>              0         3128             3128
>              1       109522           112650
>              2       180382           293032
>              3        90836           383868
>              4       115552           499420
>              5        90713           590133
>              6         4125           594258
>              7        17211           611469
>              8        18144           629613
>              9        22805           652418
>             10          325           652743
>             11          203           652946
>             12          245           653191
> 
> It's crazy that we're paying for 12 up front on every DIE.  (This is
> a reformatted version of num-values-with-totals.txt, which I've
> attached along with a few other histograms Pete collected.)
> 
> The `DIEValue`s themselves, which get leaked on the BumpPtrAllocator,
> also take up a huge amount of memory (around 4%):
> 
>    Graph	 Category	Persistent Bytes	# Persistent	# Transient	Total Bytes	# Total	Transient/Total Bytes
>    0	llvm::DIEInteger	19.91 MB	652389	0	19.91 MB	652389	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>    0	llvm::DIEString	13.83 MB	302181	0	13.83 MB	302181	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>    0	llvm::DIEEntry	10.91 MB	357506	0	10.91 MB	357506	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>    0	llvm::DIEDelta	10.03 MB	328542	0	10.03 MB	328542	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>    0	llvm::DIELabel	5.14 MB	168551	0	5.14 MB	168551	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>    0	llvm::DIELoc	3.41 MB	13154	0	3.41 MB	13154	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>    0	llvm::DIELocList	1.86 MB	61055	0	1.86 MB	61055	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>    0	llvm::DIEBlock	11.69 KB	44	0	11.69 KB	44	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>    0	llvm::DIEExpr	32 Bytes	1	0	32 Bytes	1	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
> 
> We can do better.
> 
> 1. DIEValue should be a discriminated union that's passed by value
>    instead of pointer.  Most types just have 1 pointer of data.  There
>    are four "big" ones, which still need a side-allocation on the
>    BumpPtrAllocator: DIELoc, DIEBlock, DIEString, and DIEDelta.
>    Even for these, the side allocation just needs to store the data
>    itself (skipping the discriminator and the vtable entry).
> 2. The contents of DIE's Abbrev field should be integrated with the
>    list of DIEValues.  In particular, DIEValue should contain a
>    `dwarf::Form` and `dwarf::Attribute`.  In total, `sizeof(DIEValue)`
>    will still be just two pointers (1st pointer: discriminator, Form,
>    and Attribute; 2nd pointer: data).  DIE should stop storing a
>    `DIEAbbrev` itself, instead constructing one on demand, renaming
>    `DIE::getAbbrev()` to
>    `DIE::getOrCreateAbbrev(FoldingSet<DIEAbbrev>&)` or some such.
> 3. DIE's list of DIEValues is currently a `SmallVector<, 12>`, but a
>    histogram Pete ran shows that half of DIEs have 2 or fewer values,
>    and 85% have 4 or fewer values.  We're paying for 12 (!) upfront
>    right now for each DIE.  Instead, we should optimize for 2-4
>    DIEValues.  Not sure whether a std::forward_list would suffice, or if
>    we should do something fancy like:
> 
>        struct List {
>          DIEValue Values[2];
>          PointerIntPair<List *, 1> NextAndSize;
>        };
> 
>    Either way we should move the allocations to a BumpPtrAllocator
>    (trivial if it's a list instead of vector).
> 4. `DIEBlock` and `DIELoc` inherit both from `DIEValue` and `DIE`, but
>    they're only ever used as the former.  This is just a convenience
>    for building up and emitting their DIEValues.  Now that we've trimmed
>    down and simplified that functionality in `DIE`, we can extract it
>    out and make it reusable -- `DIELoc` should "have-a" DIEValue list,
>    not "be-a" DIE.
> 5. The children of DIE are stored in a `vector<unique_ptr<DIE>>`, which
>    requires side allocations.  If we use an intrusively linked list,
>    it'll be easy to avoid side allocations without hitting the
>    pointer-validity problem highlighted in the header file.
> 6. Now that DIE has no side allocations, we can move all the DIEs to a
>    BumpPtrAllocator and remove the malloc traffic.
> 
> <leak-backend.patch><num-children-by-tag.txt><num-values-by-tag.txt><num-values-with-totals.txt><num-values.txt>



More information about the llvm-dev mailing list