<div dir="ltr">Just wanted to say awesome data!<div><br></div><div>-- Sean Silva</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, May 20, 2015 at 11:28 AM, Duncan P. N. Exon Smith <span dir="ltr"><<a href="mailto:duncan@exonsmith.com" target="_blank">duncan@exonsmith.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Pete Cooper and I have been looking at memory profiles of running llc on<br>
verify-uselistorder.lto.opt.bc (ld -save-temps dump just before CodeGen<br>
of building verify-uselistorder with -flto -g). I've attached<br>
leak-backend.patch, which we're using to make Instruments more<br>
accurate (instead of effectively leaking things onto BumpPtrAllocators,<br>
really leak them with malloc()). (I've collected this data on top of a<br>
few not-yet-committed patches to cheapen `MCSymbol` and<br>
`EmitLabelDifference()` that chop around 8% of memory off the top, but<br>
otherwise these numbers should be reproducible in ToT.)<br>
<br>
The `DIE` class is huge. Directly, it accounts for about 15% of backend<br>
memory:<br>
<br>
Bytes Used Count Symbol Name<br>
77.87 MB 8.4% 318960 llvm::DwarfUnit::createAndAddDIE(unsigned int, llvm::DIE&, llvm::DINode const*)<br>
46.34 MB 5.0% 189810 llvm::DwarfCompileUnit::constructVariableDIEImpl(llvm::DbgVariable const&, bool)<br>
25.57 MB 2.7% 104752 llvm::DwarfCompileUnit::constructInlinedScopeDIE(llvm::LexicalScope*)<br>
8.19 MB 0.8% 33547 llvm::DwarfCompileUnit::constructImportedEntityDIE(llvm::DIImportedEntity const*)<br>
<br>
A lot of this is the pair of `SmallVector<, 12>` it has for its values<br>
(look into `DIEAbbrev` for the second one). Here's a histogram of how<br>
many DIEs have each value count:<br>
<br>
# of Values DIEs with # with # or fewer<br>
0 3128 3128<br>
1 109522 112650<br>
2 180382 293032<br>
3 90836 383868<br>
4 115552 499420<br>
5 90713 590133<br>
6 4125 594258<br>
7 17211 611469<br>
8 18144 629613<br>
9 22805 652418<br>
10 325 652743<br>
11 203 652946<br>
12 245 653191<br>
<br>
It's crazy that we're paying for 12 up front on every DIE. (This is<br>
a reformatted version of num-values-with-totals.txt, which I've<br>
attached along with a few other histograms Pete collected.)<br>
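<br>
As a quick check on the cumulative column, here is a minimal sketch that<br>
recomputes it from the per-count numbers in the table above. The names<br>
`DIEsWithNumValues`, `diesWithAtMost()` and `fractionWithAtMost()` are<br>
made up for illustration; nothing like them exists in the tree.<br>
<br>
```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of DIEs with exactly N values, for N = 0..12, copied from the
// histogram above.
static const std::array<std::uint64_t, 13> DIEsWithNumValues = {
    3128,  109522, 180382, 90836, 115552, 90713, 4125,
    17211, 18144,  22805,  325,   203,    245};

// Number of DIEs that have MaxValues or fewer values.
inline std::uint64_t diesWithAtMost(std::size_t MaxValues) {
  std::uint64_t Total = 0;
  for (std::size_t I = 0; I <= MaxValues && I < DIEsWithNumValues.size(); ++I)
    Total += DIEsWithNumValues[I];
  return Total;
}

// Fraction (0.0 to 1.0) of all DIEs that have MaxValues or fewer values.
inline double fractionWithAtMost(std::size_t MaxValues) {
  const std::uint64_t All = diesWithAtMost(DIEsWithNumValues.size() - 1);
  return static_cast<double>(diesWithAtMost(MaxValues)) /
         static_cast<double>(All);
}
```
<br>
For example, `fractionWithAtMost(4)` is the share of DIEs whose values<br>
would fit in a four-element inline buffer.<br>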
<br>
The `DIEValue`s themselves, which get leaked on the BumpPtrAllocator,<br>
also take up a huge amount of memory (around 4%):<br>
<br>
Graph Category Persistent Bytes # Persistent # Transient Total Bytes # Total Transient/Total Bytes<br>
0 llvm::DIEInteger 19.91 MB 652389 0 19.91 MB 652389 %0.00, %0.00<br>
0 llvm::DIEString 13.83 MB 302181 0 13.83 MB 302181 %0.00, %0.00<br>
0 llvm::DIEEntry 10.91 MB 357506 0 10.91 MB 357506 %0.00, %0.00<br>
0 llvm::DIEDelta 10.03 MB 328542 0 10.03 MB 328542 %0.00, %0.00<br>
0 llvm::DIELabel 5.14 MB 168551 0 5.14 MB 168551 %0.00, %0.00<br>
0 llvm::DIELoc 3.41 MB 13154 0 3.41 MB 13154 %0.00, %0.00<br>
0 llvm::DIELocList 1.86 MB 61055 0 1.86 MB 61055 %0.00, %0.00<br>
0 llvm::DIEBlock 11.69 KB 44 0 11.69 KB 44 %0.00, %0.00<br>
0 llvm::DIEExpr 32 Bytes 1 0 32 Bytes 1 %0.00, %0.00<br>
<br>
We can do better.<br>
<br>
1. DIEValue should be a discriminated union that's passed by value<br>
instead of pointer. Most types just have 1 pointer of data. There<br>
are four "big" ones, which still need a side-allocation on the<br>
BumpPtrAllocator: DIELoc, DIEBlock, DIEString, and DIEDelta.<br>
Even for these, the side allocation just needs to store the data<br>
itself (skipping the discriminator and the vtable entry).<br>
2. The contents of DIE's Abbrev field should be integrated with the<br>
list of DIEValues. In particular, DIEValue should contain a<br>
`dwarf::Form` and `dwarf::Attribute`. In total, `sizeof(DIEValue)`<br>
will still be just two pointers (1st pointer: discriminator, Form,<br>
and Attribute; 2nd pointer: data). DIE should stop storing a<br>
`DIEAbbrev` itself, instead constructing one on demand, renaming<br>
`DIE::getAbbrev()` to<br>
`DIE::getOrCreateAbbrev(FoldingSet<DIEAbbrev>&)` or some such.<br>
3. DIE's list of DIEValues is currently a `SmallVector<, 12>`, but a<br>
histogram Pete ran shows that about 45% of DIEs have 2 or fewer<br>
values, and about 76% have 4 or fewer values. We're paying for 12 (!)<br>
upfront right now for each DIE. Instead, we should optimize for 2-4<br>
DIEValues. Not sure whether a std::forward_list would suffice, or if<br>
we should do something fancy like:<br>
<br>
struct List {<br>
DIEValue Values[2];<br>
PointerIntPair<List *, 1> NextAndSize;<br>
};<br>
<br>
Either way we should move the allocations to a BumpPtrAllocator<br>
(trivial if it's a list instead of vector).<br>
4. `DIEBlock` and `DIELoc` inherit both from `DIEValue` and `DIE`, but<br>
they're only ever used as the former. This is just a convenience<br>
for building up and emitting their DIEValues. Now that we've trimmed<br>
down and simplified that functionality in `DIE`, we can extract it<br>
out and make it reusable -- `DIELoc` should "have-a" DIEValue list,<br>
not "be-a" DIE.<br>
5. The children of DIE are stored in a `vector<unique_ptr<DIE>>`, which<br>
requires side allocations. If we use an intrusively linked list,<br>
it'll be easy to avoid side allocations without hitting the<br>
pointer-validity problem highlighted in the header file.<br>
6. Now that DIE has no side allocations, we can move all the DIEs to a<br>
BumpPtrAllocator and remove the malloc traffic.<br>
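<br>
To make points 1 and 2 concrete, here is a rough sketch of a by-value<br>
`DIEValue` that packs the discriminator, attribute and form into the<br>
first word and the payload into the second. Everything here is<br>
hypothetical: `DIEValueSketch`, its `Kind` enum and the `Attribute`/`Form`<br>
typedefs are stand-ins for illustration, not the real `dwarf::` types,<br>
and the 16-byte size check assumes a typical ABI.<br>
<br>
```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for dwarf::Attribute and dwarf::Form.
using Attribute = std::uint16_t;
using Form = std::uint16_t;

class DIEValueSketch {
public:
  // Discriminator. The "big" kinds keep their data in a side allocation
  // and store only a pointer to it here.
  enum Kind : std::uint8_t { None, Integer, Entry, Label, String, Delta, Loc, Block };

  // Small values live directly in the payload word.
  static DIEValueSketch makeInteger(Attribute A, Form F, std::uint64_t V) {
    DIEValueSketch Val(Integer, A, F);
    Val.Data.Int = V;
    return Val;
  }

  // Big values point at data owned elsewhere (e.g. a bump allocator).
  static DIEValueSketch makeOutOfLine(Kind K, Attribute A, Form F,
                                      const void *Payload) {
    DIEValueSketch Val(K, A, F);
    Val.Data.Ptr = Payload;
    return Val;
  }

  Kind getKind() const { return Ty; }
  Attribute getAttribute() const { return Attr; }
  Form getForm() const { return Frm; }

  std::uint64_t getInteger() const {
    assert(Ty == Integer && "not an integer value");
    return Data.Int;
  }
  const void *getPointer() const {
    assert(Ty != None && Ty != Integer && "not an out-of-line value");
    return Data.Ptr;
  }

private:
  DIEValueSketch(Kind K, Attribute A, Form F) : Ty(K), Attr(A), Frm(F) {
    Data.Int = 0;
  }

  // First word: discriminator, attribute and form.
  Kind Ty;
  Attribute Attr;
  Form Frm;
  // Second word: the data itself, or a pointer to the side allocation.
  union {
    std::uint64_t Int;
    const void *Ptr;
  } Data;
};

// Two pointers' worth of storage on a 64-bit host.
static_assert(sizeof(DIEValueSketch) == 16, "expected a 16-byte value");
```
<br>
Since the attribute and form travel with each value, an abbreviation<br>
could in principle be rebuilt on demand by walking a DIE's values.<br>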
<br>
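The "something fancy" list from point 3 could look roughly like the<br>
sketch below. It is simplified on purpose: a plain pointer plus a byte<br>
count stands in for the `PointerIntPair`, a `std::deque` stands in for<br>
the BumpPtrAllocator, and `Value`, `Chunk`, `ChunkArena` and `ValueList`<br>
are hypothetical names used only for illustration.<br>
<br>
```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Stand-in for a by-value DIEValue.
struct Value {
  std::uint64_t Bits;
};

// Two values per node, then a link to the next node.
struct Chunk {
  Value Values[2];
  Chunk *Next = nullptr;
  unsigned char Size = 0; // number of slots in use: 0, 1 or 2
};

// Stand-in for a bump allocator: std::deque never relocates existing
// elements on emplace_back, so the returned pointers stay valid.
class ChunkArena {
  std::deque<Chunk> Storage;

public:
  Chunk *create() {
    Storage.emplace_back();
    return &Storage.back();
  }
};

// The first two values live inline in Head; later ones go into chunks
// taken from the arena.
class ValueList {
  Chunk Head;
  Chunk *Tail = nullptr; // nullptr means Head is still the last chunk

public:
  void push_back(Value V, ChunkArena &Arena) {
    Chunk *Last = Tail ? Tail : &Head;
    if (Last->Size == 2) {
      // Last chunk is full: link a fresh one from the arena.
      Chunk *New = Arena.create();
      Last->Next = New;
      Tail = Last = New;
    }
    Last->Values[Last->Size++] = V;
  }

  std::size_t size() const {
    std::size_t N = 0;
    for (const Chunk *C = &Head; C; C = C->Next)
      N += C->Size;
    return N;
  }

  // Visit the values in insertion order.
  template <typename Fn> void forEach(Fn F) const {
    for (const Chunk *C = &Head; C; C = C->Next)
      for (unsigned I = 0; I < C->Size; ++I)
        F(C->Values[I]);
  }
};
```
<br>
A DIE with one or two values never touches the arena in this layout;<br>
each further pair of values costs one extra chunk.<br>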
<br></blockquote></div><br></div>