[LLVMdev] RFC: Reduce the memory footprint of DIEs (and DIEValues)
Duncan P. N. Exon Smith
dexonsmith at apple.com
Wed May 20 15:56:58 PDT 2015
To make this a little more concrete, I just hacked up a couple of
patches that achieve step #1. (0004 is the key patch, and probably
should be split up somehow before commit.) I'll collect some
results and report back.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: all.patch
Type: application/octet-stream
Size: 73699 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/d59e8a6f/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-CodeGen-Remove-redundant-DIETypeSignature-dump.patch
Type: application/octet-stream
Size: 1259 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/d59e8a6f/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-CodeGen-Remove-the-vtable-entry-from-DIEValue.patch
Type: application/octet-stream
Size: 23708 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/d59e8a6f/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-CodeGen-Make-DIEValue-Ty-private-NFC.patch
Type: application/octet-stream
Size: 721 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/d59e8a6f/attachment-0003.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-WIP-Change-DIEValue-to-be-stored-by-value.patch
Type: application/octet-stream
Size: 81233 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150520/d59e8a6f/attachment-0004.obj>
-------------- next part --------------
> On 2015 May 20, at 11:28, Duncan P. N. Exon Smith <duncan at exonsmith.com> wrote:
>
> Pete Cooper and I have been looking at memory profiles of running llc on
> verify-uselistorder.lto.opt.bc (ld -save-temps dump just before CodeGen
> of building verify-uselistorder with -flto -g). I've attached
> leak-backend.patch, which we're using to make Instruments more
> accurate (instead of effectively leaking things onto BumpPtrAllocators,
> really leak them with malloc()). (I've collected this data on top of a
> few not-yet-committed patches to cheapen `MCSymbol` and
> `EmitLabelDifference()` that chop around 8% of memory off the top, but
> otherwise these numbers should be reproducible in ToT.)
>
> The `DIE` class is huge. Directly, it accounts for about 15% of backend
> memory:
>
>   Bytes Used         Count  Symbol Name
>     77.87 MB  8.4%  318960  llvm::DwarfUnit::createAndAddDIE(unsigned int, llvm::DIE&, llvm::DINode const*)
>     46.34 MB  5.0%  189810  llvm::DwarfCompileUnit::constructVariableDIEImpl(llvm::DbgVariable const&, bool)
>     25.57 MB  2.7%  104752  llvm::DwarfCompileUnit::constructInlinedScopeDIE(llvm::LexicalScope*)
>      8.19 MB  0.8%   33547  llvm::DwarfCompileUnit::constructImportedEntityDIE(llvm::DIImportedEntity const*)
>
> A lot of this is the pair of `SmallVector<, 12>` it has for its values
> (look into `DIEAbbrev` for the second one). Here's a histogram of how
> many DIEs have each value count:
>
>   # of Values   DIEs with #   DIEs with # or fewer
>             0          3128                   3128
>             1        109522                 112650
>             2        180382                 293032
>             3         90836                 383868
>             4        115552                 499420
>             5         90713                 590133
>             6          4125                 594258
>             7         17211                 611469
>             8         18144                 629613
>             9         22805                 652418
>            10           325                 652743
>            11           203                 652946
>            12           245                 653191
>
> It's crazy that we're paying for 12 up front on every DIE. (This is
> a reformatted version of num-values-with-totals.txt, which I've
> attached along with a few other histograms Pete collected.)
>
> The `DIEValue`s themselves, which get leaked on the BumpPtrAllocator,
> also take up a huge amount of memory (around 4%):
>
>   Graph  Category          Persistent Bytes  # Persistent  # Transient  Total Bytes  # Total  Transient/Total Bytes
>   0      llvm::DIEInteger          19.91 MB        652389            0     19.91 MB   652389  %0.00, %0.00
>   0      llvm::DIEString           13.83 MB        302181            0     13.83 MB   302181  %0.00, %0.00
>   0      llvm::DIEEntry            10.91 MB        357506            0     10.91 MB   357506  %0.00, %0.00
>   0      llvm::DIEDelta            10.03 MB        328542            0     10.03 MB   328542  %0.00, %0.00
>   0      llvm::DIELabel             5.14 MB        168551            0      5.14 MB   168551  %0.00, %0.00
>   0      llvm::DIELoc               3.41 MB         13154            0      3.41 MB    13154  %0.00, %0.00
>   0      llvm::DIELocList           1.86 MB         61055            0      1.86 MB    61055  %0.00, %0.00
>   0      llvm::DIEBlock            11.69 KB            44            0     11.69 KB       44  %0.00, %0.00
>   0      llvm::DIEExpr             32 Bytes             1            0     32 Bytes        1  %0.00, %0.00
>
> We can do better.
>
> 1. DIEValue should be a discriminated union that's passed by value
> instead of pointer. Most types just have 1 pointer of data. There
> are four "big" ones, which still need a side-allocation on the
> BumpPtrAllocator: DIELoc, DIEBlock, DIEString, and DIEDelta.
> Even for these, the side allocation just needs to store the data
> itself (skipping the discriminator and the vtable entry).
> 2. The contents of DIE's Abbrev field should be integrated with the
> list of DIEValues. In particular, DIEValue should contain a
> `dwarf::Form` and `dwarf::Attribute`. In total, `sizeof(DIEValue)`
> will still be just two pointers (1st pointer: discriminator, Form,
> and Attribute; 2nd pointer: data). DIE should stop storing a
> `DIEAbbrev` itself, instead constructing one on demand, renaming
> `DIE::getAbbrev()` to
> `DIE::getOrCreateAbbrev(FoldingSet<DIEAbbrev>&)` or some such.
> 3. DIE's list of DIEValues is currently a `SmallVector<, 12>`, but a
> histogram Pete ran shows that half of DIEs have 2 or fewer values,
> and 85% have 4 or fewer values. We're paying for 12 (!) upfront
> right now for each DIE. Instead, we should optimize for 2-4
> DIEValues. Not sure whether a std::forward_list would suffice, or if
> we should do something fancy like:
>
>      struct List {
>        DIEValue Values[2];
>        PointerIntPair<List *, 1> NextAndSize;
>      };
>
> Either way we should move the allocations to a BumpPtrAllocator
> (trivial if it's a list instead of vector).
> 4. `DIEBlock` and `DIELoc` inherit both from `DIEValue` and `DIE`, but
> they're only ever used as the former. This is just a convenience
> for building up and emitting their DIEValues. Now that we've trimmed
> down and simplified that functionality in `DIE`, we can extract it
> out and make it reusable -- `DIELoc` should "have-a" DIEValue list,
> not "be-a" DIE.
> 5. The children of DIE are stored in a `vector<unique_ptr<DIE>>`, which
> requires side allocations. If we use an intrusively linked list,
> it'll be easy to avoid side allocations without hitting the
> pointer-validity problem highlighted in the header file.
> 6. Now that DIE has no side allocations, we can move all the DIEs to a
> BumpPtrAllocator and remove the malloc traffic.
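
As a rough illustration of items 1 and 2 above (this is not taken from the
attached patches), a by-value discriminated union could look something like
the sketch below. Every name in it (`sketch::Value`, `Kind`, `DeltaData`,
`makeInteger`, `makeDelta`) is hypothetical, and plain `uint16_t` fields
stand in for `dwarf::Attribute` and `dwarf::Form`. The only point is that
discriminator, attribute, and form fit in the first word, while the second
word holds either the data or a pointer to a side allocation that stores
nothing but data.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names throughout; this is not LLVM's DIEValue.
namespace sketch {

enum class Kind : uint8_t { None, Integer, Entry, Delta };

// A "big" payload that does not fit in one pointer. It would live in a
// side allocation and carries only data: no discriminator, no vtable.
struct DeltaData {
  const void *Hi;
  const void *Lo;
};

struct Entry; // opaque stand-in for a referenced DIE

class Value {
  // First word: discriminator plus the abbreviation data.
  Kind K = Kind::None;
  uint16_t Attribute = 0; // stand-in for dwarf::Attribute
  uint16_t Form = 0;      // stand-in for dwarf::Form
  // Second word: the payload itself, or a pointer to a side allocation.
  union {
    uint64_t Int;
    const Entry *Ent;
    const DeltaData *Delta;
  };

public:
  Value() : Int(0) {}

  static Value makeInteger(uint16_t A, uint16_t F, uint64_t V) {
    Value R;
    R.K = Kind::Integer;
    R.Attribute = A;
    R.Form = F;
    R.Int = V;
    return R;
  }

  static Value makeDelta(uint16_t A, uint16_t F, const DeltaData *D) {
    Value R;
    R.K = Kind::Delta;
    R.Attribute = A;
    R.Form = F;
    R.Delta = D;
    return R;
  }

  Kind kind() const { return K; }
  uint16_t attribute() const { return Attribute; }
  uint16_t form() const { return Form; }

  // Accessors check the discriminator before touching the union.
  uint64_t getInteger() const {
    assert(K == Kind::Integer && "not an integer value");
    return Int;
  }
  const DeltaData *getDelta() const {
    assert(K == Kind::Delta && "not a delta value");
    return Delta;
  }
};

// Two 64-bit words, with no vtable pointer and no separate abbrev entry.
static_assert(sizeof(Value) <= 16, "value should fit in two 64-bit words");

} // namespace sketch
```

Values of this shape can be copied and stored inline in whatever container
the DIE uses, so small kinds such as integers need no allocation at all.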
>
> <leak-backend.patch><num-children-by-tag.txt><num-values-by-tag.txt><num-values-with-totals.txt><num-values.txt>
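
For item 3, here is a minimal standalone sketch of the "something fancy"
option, under several assumptions of mine: `Value` is a placeholder integer
rather than a real DIEValue, a `std::deque` stands in for the
BumpPtrAllocator (it keeps element addresses stable as it grows at the
end), and the single spare bit is read as "the second slot of this node is
in use". The names `Node`, `ValueList`, `NextAndFull`, and `forEach` are
made up for the example, and the low pointer bit is packed by hand instead
of using `PointerIntPair`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

namespace sketch {

using Value = uint64_t; // placeholder for a small by-value DIEValue

// Two inline values per node, plus a next pointer whose low bit records
// whether the second slot is occupied.
struct Node {
  Value Values[2] = {0, 0};
  uintptr_t NextAndFull = 0;

  Node *next() const {
    return reinterpret_cast<Node *>(NextAndFull & ~uintptr_t(1));
  }
  bool full() const { return (NextAndFull & 1) != 0; }
  void setNext(Node *N) {
    NextAndFull = reinterpret_cast<uintptr_t>(N) | (NextAndFull & 1);
  }
  void setFull() { NextAndFull |= 1; }
};
static_assert(alignof(Node) >= 2, "low pointer bit must be free");

class ValueList {
  Node *Head = nullptr;
  Node *Tail = nullptr;

public:
  // Nodes are carved out of the arena and never freed individually.
  void push_back(std::deque<Node> &Arena, Value V) {
    if (Tail && !Tail->full()) {
      Tail->Values[1] = V; // fill the second slot of the last node
      Tail->setFull();
      return;
    }
    Arena.emplace_back();
    Node *N = &Arena.back();
    N->Values[0] = V; // a node always holds at least one value
    if (Tail)
      Tail->setNext(N);
    else
      Head = N;
    Tail = N;
  }

  // Visit the values in insertion order.
  template <class Fn> void forEach(Fn F) const {
    for (Node *N = Head; N; N = N->next()) {
      F(N->Values[0]);
      if (N->full())
        F(N->Values[1]);
    }
  }
};

} // namespace sketch
```

With this layout, a DIE with one or two values needs a single node, and the
DIE itself only stores the two list pointers instead of inline storage
for 12 values.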