[LLVMdev] RFC: Reduce the memory footprint of DIEs (and DIEValues)

Duncan P. N. Exon Smith dexonsmith at apple.com
Wed May 20 17:39:23 PDT 2015


With just those four patches, memory usage went *up* slightly.  Add in
the 5th patch (which does #2 below), and we get an overall memory drop
of 4%.

The intermediate memory increase makes sense.  While the first four
patches reduce the number (and size) of `DIEValue` allocations, they
increase the `SmallVector` overhead: each inline slot now holds a whole
`DIEValue` instead of a pointer.
0005 (attached) squeezes the abbreviation data into `DIEValue` for
free, next to the discriminator for the union.  The 5 patches together
are strictly an improvement to memory usage.
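
To illustrate the kind of layout 0005 is aiming for, here is a minimal
sketch (not the code in the patch).  `PackedValue`, its field widths, and
the three-kind enum are made up for illustration; the only point is that
the discriminator, attribute, and form share the first word, leaving the
second word for the payload.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for DIEValue; names and field widths are assumptions.
class PackedValue {
public:
  enum Kind : uint8_t { Integer, Label, Entry }; // illustrative subset only

  static PackedValue makeInteger(uint16_t Attr, uint16_t F, uint64_t Int) {
    PackedValue V(Integer, Attr, F);
    V.Data.Int = Int;
    return V;
  }
  static PackedValue makeEntry(uint16_t Attr, uint16_t F, const void *Target) {
    PackedValue V(Entry, Attr, F);
    V.Data.Ptr = Target;
    return V;
  }

  Kind getKind() const { return Ty; }
  uint16_t getAttribute() const { return Attribute; }
  uint16_t getForm() const { return Form; }
  uint64_t getInt() const {
    assert(Ty == Integer && "payload is not an integer");
    return Data.Int;
  }
  const void *getPtr() const {
    assert(Ty != Integer && "payload is not a pointer");
    return Data.Ptr;
  }

private:
  PackedValue(Kind K, uint16_t Attr, uint16_t F)
      : Ty(K), Attribute(Attr), Form(F) {
    Data.Int = 0;
  }

  // First word: discriminator plus the abbreviation data.
  Kind Ty;
  uint16_t Attribute; // e.g. DW_AT_byte_size == 0x0b
  uint16_t Form;      // e.g. DW_FORM_data1 == 0x0b
  // Second word: the payload, interpreted according to Ty.
  union {
    uint64_t Int;
    const void *Ptr;
  } Data;
};

// The abbreviation data rides in space that would otherwise be padding.
static_assert(sizeof(PackedValue) <= 2 * sizeof(uint64_t),
              "value should fit in two 64-bit words");
```

With this shape, a list of values carries everything needed to rebuild
an abbreviation, so no separate per-DIE attribute/form list is required.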

It's nice to see the 4% memory drop, but this is all prep work for #3,
where I expect the biggest memory usage improvements.
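
For reference, the "fancy" chunked list floated in #3 could look roughly
like the sketch below.  This is a guess at one possible shape, not a
patch: `Value`, `Chunk`, and `ValueList` are hypothetical names, the low
pointer bit is a hand-rolled stand-in for `PointerIntPair<List *, 1>`
(here it records whether the second slot is in use), and plain `new`
stands in for a BumpPtrAllocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Stand-in for a two-word DIEValue.
struct Value {
  uint64_t Header;
  uint64_t Data;
};

// One chunk: two inline values plus a next pointer whose low bit records
// whether Values[1] is occupied.  A chunk always holds at least one value.
struct Chunk {
  Value Values[2];
  uintptr_t NextAndFull = 0;

  Chunk *next() const {
    return reinterpret_cast<Chunk *>(NextAndFull & ~uintptr_t(1));
  }
  bool full() const { return (NextAndFull & 1) != 0; }
  void setNext(Chunk *N) {
    NextAndFull = reinterpret_cast<uintptr_t>(N) | (NextAndFull & 1);
  }
  void setFull() { NextAndFull |= 1; }
};
static_assert(alignof(Chunk) >= 2, "low pointer bit must be free");

class ValueList {
  Chunk *Head = nullptr;
  Chunk *Tail = nullptr;

public:
  ValueList() = default;
  ValueList(const ValueList &) = delete;
  ValueList &operator=(const ValueList &) = delete;
  ~ValueList() {
    for (Chunk *C = Head; C;) {
      Chunk *N = C->next();
      delete C;
      C = N;
    }
  }

  void push_back(Value V) {
    if (Tail && !Tail->full()) { // second slot of the last chunk is free
      Tail->Values[1] = V;
      Tail->setFull();
      return;
    }
    Chunk *C = new Chunk(); // a bump allocation in a real implementation
    C->Values[0] = V;
    if (Tail)
      Tail->setNext(C);
    else
      Head = C;
    Tail = C;
  }

  size_t size() const {
    size_t N = 0;
    for (const Chunk *C = Head; C; C = C->next())
      N += C->full() ? 2 : 1;
    return N;
  }

  // Visit the values in insertion order.
  template <typename Fn> void forEach(Fn F) const {
    for (const Chunk *C = Head; C; C = C->next()) {
      F(C->Values[0]);
      if (C->full())
        F(C->Values[1]);
    }
  }
};
```

A DIE with one or two values costs a single chunk here, and nothing is
reserved up front for values that never arrive.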

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0005-WIP-Store-abbreviation-data-directly-in-DIEValue.patch
Type: application/octet-stream
Size: 25110 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/3f1b2889/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: all-2.patch
Type: application/octet-stream
Size: 84163 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/3f1b2889/attachment-0001.obj>
-------------- next part --------------

> On 2015 May 20, at 15:56, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> To make this a little more concrete, I just hacked up a couple of
> patches that achieve step #1.  (0004 is the key patch, and probably
> should be split up somehow before commit.)  I'll collect some
> results and report back.
> 
> 
> <all.patch><0001-CodeGen-Remove-redundant-DIETypeSignature-dump.patch><0002-CodeGen-Remove-the-vtable-entry-from-DIEValue.patch><0003-CodeGen-Make-DIEValue-Ty-private-NFC.patch><0004-WIP-Change-DIEValue-to-be-stored-by-value.patch>
> 
>> On 2015 May 20, at 11:28, Duncan P. N. Exon Smith <duncan at exonsmith.com> wrote:
>> 
>> Pete Cooper and I have been looking at memory profiles of running llc on
>> verify-uselistorder.lto.opt.bc (ld -save-temps dump just before CodeGen
>> of building verify-uselistorder with -flto -g).  I've attached
>> leak-backend.patch, which we're using to make Instruments more
>> accurate (instead of effectively leaking things onto BumpPtrAllocators,
>> really leak them with malloc()).  (I've collected this data on top of a
>> few not-yet-committed patches to cheapen `MCSymbol` and
>> `EmitLabelDifference()` that chop around 8% of memory off the top, but
>> otherwise these numbers should be reproducible in ToT.)
>> 
>> The `DIE` class is huge.  Directly, it accounts for about 15% of backend
>> memory:
>> 
>>   Bytes Used  Percent   Count  Symbol Name
>>     77.87 MB     8.4%  318960  llvm::DwarfUnit::createAndAddDIE(unsigned int, llvm::DIE&, llvm::DINode const*)
>>     46.34 MB     5.0%  189810  llvm::DwarfCompileUnit::constructVariableDIEImpl(llvm::DbgVariable const&, bool)
>>     25.57 MB     2.7%  104752  llvm::DwarfCompileUnit::constructInlinedScopeDIE(llvm::LexicalScope*)
>>      8.19 MB     0.8%   33547  llvm::DwarfCompileUnit::constructImportedEntityDIE(llvm::DIImportedEntity const*)
>> 
>> A lot of this is the pair of `SmallVector<, 12>` it has for its values
>> (look into `DIEAbbrev` for the second one).  Here's a histogram of how
>> many DIEs have each value count:
>> 
>>   # of Values  DIEs with #  with # or fewer
>>             0         3128             3128
>>             1       109522           112650
>>             2       180382           293032
>>             3        90836           383868
>>             4       115552           499420
>>             5        90713           590133
>>             6         4125           594258
>>             7        17211           611469
>>             8        18144           629613
>>             9        22805           652418
>>            10          325           652743
>>            11          203           652946
>>            12          245           653191
>> 
>> It's crazy that we're paying for 12 up front on every DIE.  (This is
>> a reformatted version of num-values-with-totals.txt, which I've
>> attached along with a few other histograms Pete collected.)
>> 
>> The `DIEValue`s themselves, which get leaked on the BumpPtrAllocator,
>> also take up a huge amount of memory (around 4%):
>> 
>>   Category          Persistent Bytes  # Persistent  # Transient  Total Bytes  # Total
>>   llvm::DIEInteger          19.91 MB        652389            0     19.91 MB   652389
>>   llvm::DIEString           13.83 MB        302181            0     13.83 MB   302181
>>   llvm::DIEEntry            10.91 MB        357506            0     10.91 MB   357506
>>   llvm::DIEDelta            10.03 MB        328542            0     10.03 MB   328542
>>   llvm::DIELabel             5.14 MB        168551            0      5.14 MB   168551
>>   llvm::DIELoc               3.41 MB         13154            0      3.41 MB    13154
>>   llvm::DIELocList           1.86 MB         61055            0      1.86 MB    61055
>>   llvm::DIEBlock            11.69 KB            44            0     11.69 KB       44
>>   llvm::DIEExpr             32 Bytes             1            0     32 Bytes        1
>> 
>> We can do better.
>> 
>> 1. DIEValue should be a discriminated union that's passed by value
>>   instead of pointer.  Most types just have 1 pointer of data.  There
>>   are four "big" ones, which still need a side-allocation on the
>>   BumpPtrAllocator: DIELoc, DIEBlock, DIEString, and DIEDelta.
>>   Even for these, the side allocation just needs to store the data
>>   itself (skipping the discriminator and the vtable entry).
>> 2. The contents of DIE's Abbrev field should be integrated with the
>>   list of DIEValues.  In particular, DIEValue should contain a
>>   `dwarf::Form` and `dwarf::Attribute`.  In total, `sizeof(DIEValue)`
>>   will still be just two pointers (1st pointer: discriminator, Form,
>>   and Attribute; 2nd pointer: data).  DIE should stop storing a
>>   `DIEAbbrev` itself, instead constructing one on demand, renaming
>>   `DIE::getAbbrev()` to
>>   `DIE::getOrCreateAbbrev(FoldingSet<DIEAbbrev>&)` or some such.
>> 3. DIE's list of DIEValues is currently a `SmallVector<, 12>`, but a
>>   histogram Pete ran shows that nearly half of DIEs (about 45%) have 2
>>   or fewer values, and about 76% have 4 or fewer values (going by the
>>   table above).  We're paying for 12 (!) upfront
>>   right now for each DIE.  Instead, we should optimize for 2-4
>>   DIEValues.  Not sure whether a std::forward_list would suffice, or if
>>   we should do something fancy like:
>> 
>>       struct List {
>>         DIEValue Values[2];
>>         PointerIntPair<List *, 1> NextAndSize;
>>       };
>> 
>>   Either way we should move the allocations to a BumpPtrAllocator
>>   (trivial if it's a list instead of vector).
>> 4. `DIEBlock` and `DIELoc` inherit both from `DIEValue` and `DIE`, but
>>   they're only ever used as the former.  This is just a convenience
>>   for building up and emitting their DIEValues.  Now that we've trimmed
>>   down and simplified that functionality in `DIE`, we can extract it
>>   out and make it reusable -- `DIELoc` should "have-a" DIEValue list,
>>   not "be-a" DIE.
>> 5. The children of DIE are stored in a `vector<unique_ptr<DIE>>`, which
>>   requires side allocations.  If we use an intrusively linked list,
>>   it'll be easy to avoid side allocations without hitting the
>>   pointer-validity problem highlighted in the header file.
>> 6. Now that DIE has no side allocations, we can move all the DIEs to a
>>   BumpPtrAllocator and remove the malloc traffic.
>> 
>> <leak-backend.patch><num-children-by-tag.txt><num-values-by-tag.txt><num-values-with-totals.txt><num-values.txt>
> 

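For anyone who wants to recheck the histogram quoted above, the
cumulative shares can be recomputed directly.  A small sketch: the counts
are copied from the "DIEs with #" column, the helper names are made up,
and treating the 0..12 rows as the complete population is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// "DIEs with #" column from the histogram; the index is the value count.
static const uint64_t DIEsWithCount[] = {3128,  109522, 180382, 90836, 115552,
                                         90713, 4125,   17211,  18144, 22805,
                                         325,   203,    245};
static const size_t NumRows = sizeof(DIEsWithCount) / sizeof(DIEsWithCount[0]);

// Number of DIEs with at most MaxValues values (the "with # or fewer" column).
uint64_t diesWithAtMost(size_t MaxValues) {
  assert(MaxValues < NumRows && "value count outside the table");
  uint64_t Sum = 0;
  for (size_t I = 0; I <= MaxValues; ++I)
    Sum += DIEsWithCount[I];
  return Sum;
}

// Same figure as a percentage of all DIEs listed in the table.
double percentWithAtMost(size_t MaxValues) {
  return 100.0 * static_cast<double>(diesWithAtMost(MaxValues)) /
         static_cast<double>(diesWithAtMost(NumRows - 1));
}
```

`diesWithAtMost(4)` reproduces the 499420 in the table, and
`percentWithAtMost(4)` gives the corresponding share of the 653191 DIEs
listed.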

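Point 5 in the quoted list depends on children being linked through the
nodes themselves, so adding a child never moves an existing node.  A
minimal sketch of that idea: `Node` and `NodeArena` are hypothetical
names, and the `std::deque` is only a stand-in for a BumpPtrAllocator
(it is used here because appending to it leaves references to existing
elements valid).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>

// Hypothetical stand-in for DIE with an intrusive singly linked child list.
struct Node {
  unsigned Tag = 0;
  Node *Parent = nullptr;
  Node *FirstChild = nullptr;
  Node *LastChild = nullptr;
  Node *NextSibling = nullptr;

  // Append Child to this node's children; no side allocation is needed
  // because the link lives inside the child itself.
  void addChild(Node &Child) {
    assert(!Child.Parent && "child already has a parent");
    Child.Parent = this;
    if (LastChild)
      LastChild->NextSibling = &Child;
    else
      FirstChild = &Child;
    LastChild = &Child;
  }

  size_t numChildren() const {
    size_t N = 0;
    for (const Node *C = FirstChild; C; C = C->NextSibling)
      ++N;
    return N;
  }
};

// Owns every node; nodes are never moved or freed individually.
struct NodeArena {
  std::deque<Node> Storage;

  Node &create(unsigned Tag) {
    Storage.emplace_back();
    Storage.back().Tag = Tag;
    return Storage.back();
  }
};
```

Because a parent only stores two pointers regardless of how many children
it has, pointers to nodes handed out earlier stay valid as the tree grows.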
