[LLVMdev] RFC: Reduce the memory footprint of DIEs (and DIEValues)

Duncan P. N. Exon Smith dexonsmith at apple.com
Wed May 20 15:52:55 PDT 2015


> On 2015 May 20, at 11:47, Frédéric Riss <friss at apple.com> wrote:
> 
> This is awesome.
> 
>> On May 20, 2015, at 11:28 AM, Duncan P. N. Exon Smith <duncan at exonsmith.com> wrote:
>> 
>> Pete Cooper and I have been looking at memory profiles of running llc on
>> verify-uselistorder.lto.opt.bc (ld -save-temps dump just before CodeGen
>> of building verify-uselistorder with -flto -g).  I've attached
>> leak-backend.patch, which we're using to make Intrustruments more
>> accurate (instead of effectively leaking things onto BumpPtrAllocators,
>> really leak them with malloc()).  (I've collected this data on top of a
>> few not-yet-committed patches to cheapen `MCSymbol` and
>> `EmitLabelDifference()` that chop around 8% of memory off the top, but
>> otherwise these numbers should be reproducible in ToT.)
>> 
>> The `DIE` class is huge.  Directly, it accounts for about 15% of backend
>> memory:
>> 
>>   Bytes Used	Count		Symbol Name
>>     77.87 MB       8.4%	318960	 	   llvm::DwarfUnit::createAndAddDIE(unsigned int, llvm::DIE&, llvm::DINode const*)
>>     46.34 MB       5.0%	189810	 	   llvm::DwarfCompileUnit::constructVariableDIEImpl(llvm::DbgVariable const&, bool)
>>     25.57 MB       2.7%	104752	 	   llvm::DwarfCompileUnit::constructInlinedScopeDIE(llvm::LexicalScope*)
>>      8.19 MB       0.8%	33547	 	   llvm::DwarfCompileUnit::constructImportedEntityDIE(llvm::DIImportedEntity const*)
>> 
>> A lot of this is the pair of `SmallVector<, 12>` it has for its values
>> (look into `DIEAbbrev` for the second one).  Here's a histogram of how
>> many DIEs have each value count:
>> 
>>   # of Values  DIEs with #  with # or fewer
>>             0         3128             3128
>>             1       109522           112650
>>             2       180382           293032
>>             3        90836           383868
>>             4       115552           499420
>>             5        90713           590133
>>             6         4125           594258
>>             7        17211           611469
>>             8        18144           629613
>>             9        22805           652418
>>            10          325           652743
>>            11          203           652946
>>            12          245           653191
>> 
>> It's crazy that we're paying for 12 up front on every DIE.  (This is
>> a reformatted version of num-values-with-totals.txt, which I've
>> attached along with a few other histograms Pete collected.)
>> 
>> The `DIEValue`s themselves, which get leaked on the BumpPtrAllocator,
>> also take up a huge amount of memory (around 4%):
>> 
>>   Graph	 Category	Persistent Bytes	# Persistent	# Transient	Total Bytes	# Total	Transient/Total Bytes
>>   0	llvm::DIEInteger	19.91 MB	652389	0	19.91 MB	652389	<XRRatioObject: 0x608025658ea0> %0.00, %0.00
>>   0	llvm::DIEString	13.83 MB	302181	0	13.83 MB	302181	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>>   0	llvm::DIEEntry	10.91 MB	357506	0	10.91 MB	357506	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>>   0	llvm::DIEDelta	10.03 MB	328542	0	10.03 MB	328542	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>>   0	llvm::DIELabel	5.14 MB	168551	0	5.14 MB	168551	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>>   0	llvm::DIELoc	3.41 MB	13154	0	3.41 MB	13154	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>>   0	llvm::DIELocList	1.86 MB	61055	0	1.86 MB	61055	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>>   0	llvm::DIEBlock	11.69 KB	44	0	11.69 KB	44	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>>   0	llvm::DIEExpr	32 Bytes	1	0	32 Bytes	1	<XRRatioObject: 0x608025658ea0>  %0.00, %0.00
>> 
>> We can do better.
>> 
>> 1. DIEValue should be a discriminated union that's passed by value
>>   instead of pointer.  Most types just have 1 pointer of data.  There
>>   are four "big" ones, which still need a side-allocation on the
>>   BumpPtrAllocator: DIELoc, DIEBlock, DIEString, and DIEDelta.
>>   Even for these, the side allocation just needs to store the data
>>   itself (skipping the discriminator and the vtable entry).
>> 2. The contents of DIE's Abbrev field should be integrated with the
>>   list of DIEValues.  In particular, DIEValue should contain a
>>   `dwarf::Form` and `dwarf::Attribute`.  In total, `sizeof(DIEValue)`
>>   will still be just two pointers (1st pointer: discriminator, Form,
>>   and Attribute; 2nd pointer: data).  DIE should stop storing a
>>   `DIEAbbrev` itself, instead constructing one on demand, renaming
>>   `DIE::getAbbrev()` to
>>   `DIE::getOrCreateAbbrev(FoldingSet<DIEAbbrev>&)` or some such.
>> 3. DIE's list of DIEValues is currently a `SmallVector<, 12>`, but a
>>   histogram Pete ran shows that half of DIEs have 2 or fewer values,
>>   and 85% have 4 or fewer values.  We're paying for 12 (!) upfront
>>   right now for each DIE.  Instead, we should optimize for 2-4
>>   DIEValues.  Not sure whether a std::forward_list would suffice, or if
>>   we should do something fancy like:
> 
> DIEValues are already allocated from a BumpPtrAllocator, thus using a
> forward_list wouldn’t be practical. You’d need to use a ilist also or your
> fancy alternative.

Well, we could pass a BumpPtrAllocator to forward_list somehow, but
maybe an ilist would be better.  I'll look more closely once I get
there :).

> 
>>       struct List {
>>         DIEValue Values[2];
>>         PointerIntPair<List *, 1> NextAndSize;
>>       };
>> 
>>   Either way we should move the allocations to a BumpPtrAllocator
>>   (trivial if it's a list instead of vector).
>> 4. `DIEBlock` and `DIELoc` inherit both from `DIEValue` and `DIE`, but
>>   they're only ever used as the former.  This is just a convenience
>>   for building up and emitting their DIEValues.  Now that we've trimmed
>>   down and simplified that functionality in `DIE`, we can extract it
>>   out and make it reusable -- `DIELoc` should "have-a" DIEValue list,
>>   not "be-a" DIE.
> 
> Much needed cleanup!
> 
>> 5. The children of DIE are stored in a `vector<unique_ptr<DIE>>`, which
>>   requires side allocations.  If we use an intrusively linked list,
>>   it'll be easy to avoid side allocations without hitting the
>>   pointer-validity problem highlighted in the header file.
>> 6. Now that DIE has no side allocations, we can move all the DIEs to a
>>   BumpPtrAllocator and remove the malloc traffic.
> 
> Thanks!
> Fred
> 
>> <leak-backend.patch><num-children-by-tag.txt><num-values-by-tag.txt><num-values-with-totals.txt><num-values.txt>
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev





More information about the llvm-dev mailing list