[LLVMdev] RFC: Reduce the memory footprint of DIEs (and DIEValues)

Duncan P. N. Exon Smith dexonsmith at apple.com
Sat Jun 27 09:40:15 PDT 2015


> On 2015 Jun 25, at 15:22, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> 
> On Thu, Jun 25, 2015 at 3:21 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> > On 2015-Jun-25, at 15:13, David Blaikie <dblaikie at gmail.com> wrote:
> >
> > Looks good (might eventually iterate a bit on the generic list data structures (names, API, etc) - but they're factored out enough)
> >
> > Does DIE::get need to be templated on the allocator type?
> 
> Nope, that's from when I was prototyping, just missed that site.
> (I was using a `MallocAllocator` and leaking everything so that I
> could get better information from the heap profile).  I'll fix that
> and commit.
> 
> Thanks

Thanks for the detailed review!

In my panic to fix MSVC issues, I forgot (until now) to reply with my
commits :/.

r240733 and r240736 committed the two linked lists, and then r240748
and r240750 fixed some MSVC compile errors.

> 
> >
> > On Thu, Jun 25, 2015 at 2:57 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> >
> > > On 2015-Jun-25, at 13:13, David Blaikie <dblaikie at gmail.com> wrote:
> > >
> > >
> > >
> > > On Thu, Jun 25, 2015 at 1:01 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> > >
> > > > On 2015-Jun-25, at 12:44, David Blaikie <dblaikie at gmail.com> wrote:
> > > >
> > > >
> > > >
> > > > On Thu, Jun 25, 2015 at 12:04 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> > > >
> > > > > On 2015-Jun-25, at 12:00, David Blaikie <dblaikie at gmail.com> wrote:
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Jun 25, 2015 at 11:49 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> > > > >
> > > > > > On 2015-Jun-25, at 10:48, David Blaikie <dblaikie at gmail.com> wrote:
> > > > > >
> > > > > > I'm not sure of the merits of handing around unique_ptrs with a null deleter (I could see it in some generic template machinery - but this is pretty explicitly never-owning). I get the desire to tag the conceptual ownership, and I appreciate that - but without any checking I suspect it'll just become incorrect over time and we won't notice.
> > > > >
> > > > > IMO, where tagging the ownership is valuable is before the DIE is
> > > > > inserted in the tree (and in this patch that's the only place it
> > > > > remains).  But maybe `unique_ptr` is kind of a funny name for what's
> > > > > going on.
> > > > >
> > > > > Maybe a way to keep it from bitrotting would be to make some of the
> > > > > API private, something like:
> > > > >
> > > > >     class OwnedDIE {
> > > > >       friend class DIE;
> > > > >       DIE *Die = nullptr;
> > > > >
> > > > >       OwnedDIE(const OwnedDIE &) = delete;
> > > > >       OwnedDIE &operator=(const OwnedDIE &) = delete;
> > > > >
> > > > >       OwnedDIE(DIE *Die) : Die(Die) {}
> > > > >       DIE *release() {
> > > > >         auto *D = Die;
> > > > >         Die = nullptr;
> > > > >         return D;
> > > > >       }
> > > > >
> > > > >     public:
> > > > >       OwnedDIE() = default;
> > > > >       ~OwnedDIE() = default;
> > > > >       OwnedDIE(OwnedDIE &X) : Die(X.Die) { X.Die = nullptr; }
> > > > >       OwnedDIE &operator=(OwnedDIE &X) {
> >
> > (I meant `&&` for both of these :/.)
> >
> > > > >         Die = X.Die;
> > > > >         X.Die = nullptr;
> > > > >         return *this;
> > > > >       }
> > > > >       DIE *get() const { return Die; }
> > > > >
> > > > > > As soon as you've got 'get()' (yeah, op* could be abused, but that's a bit more explicit work to take the address of the reference and go stash that somewhere) I think it becomes pretty easy to accidentally duplicate or otherwise play fast-and-loose with ownership.
> > > >
> > > > FWIW, it's impossible to play fast and loose between `DIE::get()` and
> > > > `DIE::addChild()` with this API.  You can only mess around with
> > > > top-level DIEs.  I'm not sure that's worth much though?
> > > >
> > > > I guess the only thing preventing you from adding the same DIE as a child to multiple nodes is the parental check? So the parent pointer acts as a sort of intrusive ownership tracking mechanism... :s doesn't exactly fill me with confidence.
> > >
> > > Right.  (Unless we keep something like an `OwnedDIE`.)
> > >
> > > *nod* but without checking, still easy to abuse an OwnedDIE.
> > >
> > > Anyway, rather than going around in circles here: Without any checks (except the incidental one of the parent pointer) it feels misleading to talk about ownership. I would suggest/prefer abandoning the concept and thinking about it as the BumpPtrAllocator owning these allocations.
> > >
> > > But I leave that up to you - keep it the way it is if you prefer.
> >
> > David and I talked offline and stopped circling each other, and then
> > Fred and I talked about the implications for dsymutil.  Basically,
> > changing to the strict API I proposed above would require double
> > pointers in places in `dsymutil` -- raw pointer + container that
> > could be moved -- and there isn't really much benefit.
> >
> > I'll add a note in the commit message that gives some background,
> > but it seems prudent just to admit these are bumpptr-allocated.
> >
> > *idea, dropped*
> >
> > David, how do the updated patches look?  (Note, I committed r240701
> > in the meantime.)
> >
> >
> >
> > >
> > > > That said, since they're bump pointer allocated anyway, there's no reason we couldn't have a multi-parented DIE anyway. (I mean it'd be silly because it'd print lots of redundant DWARF, but... *shrug*)
> > >
> > > Well, the `DIE::Offset` is stored intrusively in a `DIE`, so this
> > > isn't quite legal.
> > >
> > > Ah, right right.
> > >
> > > Also, `DIEHash::computeTypeSignature()` requires walking the parent
> > > list -- otherwise, we could almost get rid of it (there are a
> > > couple of other users, but IIRC they're not as hard to refactor).
> > >
> > > Hmm - I thought one of them was the stuff used to do cross-DIE referencing which was pretty hard or ugly to do in other ways, if I remember the rather long review that went on for that change...
> > >
> > > But if it's for types in particular - we only produce type signatures when producing the CU signature (we don't use type signatures for type units) and so we can probably keep track of the parent chain during the hash walk and re-use it rather than walking up it.
> >
> > Hmm; if you can get rid of the need in type signatures, maybe I'll
> > have a closer look at the other uses.  They seemed tractable to me.
> >
> > The only other one I wasn't sure about was a place where we need to
> > look up the unit (by finding the *final* node in the list).  If I
> > can't get rid of it entirely, we could at least rename `Parent` to
> > `Unit` and save the walk.
> >
> >
> > >
> > >
> > > >
> > > > >       DIE &operator*() const { return *Die; }
> > > > >       DIE *operator->() const { return Die; }
> > > > >       explicit operator bool() const { return Die; }
> > > > >     };
> > > > >
> > > > >     class DIE {
> > > > >     public:
> > > > >       // Calls OwnedDIE::OwnedDIE(DIE *).
> > > > >       static OwnedDIE get(BumpPtrAllocator &Alloc, dwarf::Tag Tag);
> > > > >       // Calls OwnedDIE::release().
> > > > >       DIE &addChild(OwnedDIE Child);
> > > > >     };
> > > > >
> > > > > I guess I'm not sure how much better this is, although it would
> > > > > prevent llvm-dsymutil from compiling without some more
> > > > > reorganization (currently it calls `release()` and recreates
> > > > > `unique_ptr<>` between functions).
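
For reference, a minimal self-contained sketch of the move-only handle idea, with the `&&` correction applied. `Die`, `OwnedDie`, and the free function `addChild` are hypothetical stand-ins, not the LLVM classes, and `release()` is left public here only to keep the example short:

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a DIE; only what the sketch needs.
struct Die {
  int Tag;
  Die *Parent = nullptr;
  explicit Die(int Tag) : Tag(Tag) {}
};

// Non-owning, move-only handle: it records who is responsible for
// attaching the DIE, but it never frees anything.
class OwnedDie {
  Die *D = nullptr;

public:
  OwnedDie() = default;
  explicit OwnedDie(Die *D) : D(D) {}
  OwnedDie(const OwnedDie &) = delete;
  OwnedDie &operator=(const OwnedDie &) = delete;
  OwnedDie(OwnedDie &&X) noexcept : D(X.D) { X.D = nullptr; }
  OwnedDie &operator=(OwnedDie &&X) noexcept {
    if (this != &X) {
      D = X.D;
      X.D = nullptr;
    }
    return *this;
  }

  // Give up the pointer and leave the handle empty.
  Die *release() {
    Die *Out = D;
    D = nullptr;
    return Out;
  }
  explicit operator bool() const { return D != nullptr; }
};

// Attaching consumes the handle, so the same handle cannot be used to
// attach a DIE twice.
Die &addChild(Die &Parent, OwnedDie Child) {
  Die *C = Child.release();
  assert(C && "cannot attach an empty handle");
  C->Parent = &Parent;
  return *C;
}
```

Nothing here checks that a handle was eventually attached, which is the gap the thread keeps returning to.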
> > > > >
> > > > > > If we're going to keep it - could we add some checking in debug builds somehow? (skip the bump ptr allocation entirely in debug build? use a typed bump ptr allocator (it calls dtors) in a debug build? Just a side table hash set to check construction/destruction pairs and an assert that it's empty on exit?)
> > > > >
> > > > > The `OwnedDIE` idea above adds a bit more safety, but it wouldn't
> > > > > > catch "leaks", and a consumer (like dsymutil) could
> > > > > choose to store top-level DIEs outside an `OwnedDIE` container.
> > > > >
> > > > > Thoughts?
> > > > >
> > > > > Here's a claim I'll try on for size:
> > > > >
> > > > > Either we should have a mechanism that enforces ownership-like semantics (which I think boils down to: checked pseudo-destruction in asserts builds) or we probably should just forget about it and treat these things as raw pointers with no specific sense of ownership. Just accept that they were all created with the BumpPtrAllocator (perhaps make the API restrictions stronger in this regard? delete op new/delete? I forget how all that works) and move on.
> > > >
> > > > Hmm, I guess we already check the `Parent` pointer when adding
> > > > children, so maybe you're right.
> > > >
> > > > >
> > > > > > Was there a reason to squash the DIEValue and DIE allocator changes together? (probably related to the DIEBlock? (both a value and a DIE) - I thought I remembered reading something from you about changing/splitting that? ah, that's your other patch - if that came first then we could split the allocation changes for DIEValue and DIE into separate patches? But that's too much work to rejig it all?)
> > > > >
> > > > > I think I can split the DIEValue/DIE changes.  New patches early
> > > > > afternoon I hope.
> > > > >
> > > > > >
> > > > > > in DwarfLinker::patchLineTableForUnit why the change back to a range-for from std::find_if?
> > > > >
> > > > > I don't remember =/.  I'll see if I can get the original flow working,
> > > > > and if I can't I'll know why I changed it.
> > > > >
> > > > > >
> > > > > > (I do get tired of seeing allocators passed in/around everywhere - would it make more sense to wrap the allocator in a type with utility functions for generating things? I suppose it's no great win, but feels a bit better, maybe? eg: "DIE::get(DIEValueAllocator, dwarf::DW_TAG_inlined_subroutine);" -> "DIEBuilder.get(dwarf::DW_TAG_inlined_subroutine)" or similar - for values it's trickier, the DIE and the allocator are both needed every time, but there's no easy access to the allocator from values or dies... *shrug* )
> > > > >
> > > > > Maybe I'll play with some builder ideas, see if it reduces the churn
> > > > > at all?  We could have:
> > > > >
> > > > >     DIEBuilder DieB(Alloc);
> > > > >     auto Die = DieB.get(Tag);
> > > > >     DIEValueBuilder DieValueB = DieB.addValues(Die);
> > > > >     DieValueB.addValue(SomeDieValue);
> > > > >
> > > > > I'm not sure this will actually be better in practice though...  I'll
> > > > > play around after posting new patches.
> > > > >
> > > > > Yeah - don't worry too much if it doesn't look useful. And it can be orthogonal cleanup if we one day discover a great API. But I appreciate giving it a bit of thought.
> > > > >
> > > > > - David
> > > > >
> > > > >
> > > > > > On Wed, Jun 24, 2015 at 1:11 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> > > > > >
> > > > > > > On 2015 Jun 17, at 11:36, David Blaikie <dblaikie at gmail.com> wrote:
> > > > > > >
> > > > > > >> On Wed, Jun 17, 2015 at 11:26 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> > > > > > >>
> > > > > > >> > On 2015 Jun 16, at 21:40, Yaron Keren <yaron.keren at gmail.com> wrote:
> > > > > > >> >
> > > > > > >> > Hi Duncan,
> > > > > > >> >
> > > > > > > >> > Singly linked lists are used throughout LLVM: ArrayRecycler, Registry, LiveInterval, SCEVUnknown, CGBlockInfo, MacroArgCache, ...DIEValueList.
> > > > > > >> >
> > > > > > >> > What do you think about implementing a singly-linked ADT?
> > > > > > >> >
> > > > > > >>
> > > > > > >> Interesting idea.
> > > > > > >>
> > > > > > >> I've taken a look at the other singly-linked lists you mentioned.  Most
> > > > > > >> of these are some version of a free/recycling list (ArrayRecycler,
> > > > > > >> Registry, MacroArgCache, etc.), which need super simple push/pop API and
> > > > > > >> no memory management.
> > > > > > >>
> > > > > > >> LiveInterval and CGBlock also need iteration and a way to conditionally
> > > > > > >> remove elements (something like `remove_if()`), and in a few cases it
> > > > > > >> would be convenient to have API to destruct and/or destroy nodes.
> > > > > > >>
> > > > > > >> I think these would all be straightforward to formalize into an ADT.  We
> > > > > > >> could even use it to implement a value-based `llvm::forward_list<>` that
> > > > > > >> avoided the sorting bug in libstdc++ (which gets "stable" exactly
> > > > > > >> backwards) that David hit when refactoring tablegen.
> > > > > > >>
> > > > > > >> Unfortunately, I think it'd be hard to reuse for DIE::Children and
> > > > > > >> DIEValueList.  The latter wants to preserve insertion order and needs
> > > > > > >> `push_back()` API (the former can technically get away with a timely
> > > > > > >> call to `reverse()`, but it's awkward).  Supporting `push_back()`
> > > > > > >> efficiently requires pointing at the last element somehow, either via an
> > > > > > >> extra pointer or circular logic (what my latest patch does for
> > > > > > >> DIE::Children).
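
A minimal sketch of the circular trick described above, assuming an intrusive node with a single `Next` pointer. `Node` and `CircularList` are hypothetical names, not the types from the patches:

```cpp
#include <cassert>
#include <vector>

// Hypothetical intrusive node.
struct Node {
  int Value;
  Node *Next = nullptr;
  explicit Node(int V) : Value(V) {}
};

// The list stores only a pointer to the last node.  Last->Next is the
// first node, so both ends are reachable from that one pointer.
class CircularList {
  Node *Last = nullptr;

public:
  bool empty() const { return !Last; }

  // O(1) append that preserves insertion order.
  void push_back(Node &N) {
    assert(!N.Next && "node is already in a list");
    if (Last) {
      N.Next = Last->Next; // The new tail points at the existing head.
      Last->Next = &N;
    } else {
      N.Next = &N; // A single node is its own head.
    }
    Last = &N;
  }

  // Collect the values in insertion order, starting from the head.
  std::vector<int> values() const {
    std::vector<int> Out;
    if (!Last)
      return Out;
    Node *N = Last->Next;
    do {
      Out.push_back(N->Value);
      N = N->Next;
    } while (N != Last->Next);
    return Out;
  }
};
```

The cost is that iteration has to detect wrap-around instead of stopping at a null pointer, which is one reason the iterator would differ from that of a plain singly-linked list.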
> > > > > > >
> > > > > > > One of the other problems I hit when trying to value-ify the DIE children is that the point of DIE creation is, in some particular cases, divorced from the knowledge of which container it's going to go in to. Yet pointer identity is needed before then.
> > > > > > >
> > > > > > > At least that's how I recall it - may be worth going down that path again just so we can write it up clearly for posterity, I dunno.
> > > > > >
> > > > > > Nope, I saw this too.
> > > > > >
> > > > > > In theory, you could get std::list to work here, by inserting into a
> > > > > > singleton linked list when you first need the reference, and splicing
> > > > > > it into its parent's linked list once it's created.  However, I think a
> > > > > > value-based list is too expensive anyway (see below)...
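
The singleton-then-splice idea can be sketched with the standard library alone. `Entry`, `createDetached`, and `attach` are hypothetical names for illustration; the point is only that `std::list::splice` relinks nodes without copying, so a reference taken early stays valid:

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// Hypothetical payload standing in for a value-based DIE.
struct Entry {
  std::string Name;
};

// Create the element in a one-element staging list so that its address
// is available before the parent container is known.
Entry &createDetached(std::list<Entry> &Staging, std::string Name) {
  Staging.push_back(Entry{std::move(Name)});
  return Staging.back();
}

// Relink the staged element onto the end of Parent.  No element is
// copied or moved, so existing references and pointers remain valid.
void attach(std::list<Entry> &Parent, std::list<Entry> &Staging) {
  Parent.splice(Parent.end(), Staging);
}
```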
> > > > > >
> > > > > > > If it is possible, then I would love/prefer to use an existing non-intrusive container (or writing one if we have to). at least could prototype with std::list, then, as you say, implement a std::forward_list-like device with the bonus that we could fix sort and do the circular trick to allow insertion at the end, etc.
> > > > > >
> > > > > > Unfortunately the BumpPtrAllocator/skip-unlinking-the-list part of these
> > > > > > patches is really necessary, making std::list pretty awkward.  My payload
> > > > > > usually takes ~2 minutes when profiling, and when I profiled linked lists
> > > > > > > without the BumpPtrAllocator it was taking >15 minutes to tear down (I
> > > > > > killed it, so I don't know how much longer it would have taken).
> > > > > >
> > > > > > Moreover, using a BumpPtrAllocator this way fixes the slow DIE teardown
> > > > > > that Fred was battling in llvm-dsymutil.
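
To illustrate why skipping the unlink step helps, here is a toy arena, assuming trivially destructible nodes. `ToyArena`, `ListNode`, and `makeNode` are hypothetical and much simpler than `BumpPtrAllocator`; teardown frees a handful of slabs and never visits individual nodes:

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical toy arena: bump an offset inside fixed-size slabs.
class ToyArena {
  static constexpr std::size_t SlabSize = 4096;
  std::vector<std::unique_ptr<unsigned char[]>> Slabs;
  std::size_t Used = SlabSize; // Forces a slab on the first allocation.

public:
  void *allocate(std::size_t Size, std::size_t Align) {
    assert(Size <= SlabSize && "toy arena does not handle large objects");
    std::size_t Start = (Used + Align - 1) / Align * Align;
    if (Start + Size > SlabSize) {
      Slabs.push_back(
          std::unique_ptr<unsigned char[]>(new unsigned char[SlabSize]));
      Start = 0;
    }
    Used = Start + Size;
    return Slabs.back().get() + Start;
  }
  std::size_t numSlabs() const { return Slabs.size(); }
  // The implicit destructor releases every slab; no per-node work.
};

struct ListNode {
  int Value;
  ListNode *Next;
};
static_assert(std::is_trivially_destructible<ListNode>::value,
              "nodes are never destroyed individually");

// Placement-new a node into the arena and link it in front of Next.
ListNode *makeNode(ToyArena &A, int Value, ListNode *Next) {
  void *Mem = A.allocate(sizeof(ListNode), alignof(ListNode));
  return new (Mem) ListNode{Value, Next};
}
```

With per-node heap allocation, destroying the list means one deallocation per node; here the number of deallocations scales with the number of slabs instead.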
> > > > > >
> > > > > > > (& sorry, will get to proper review of this patch soon... )
> > > > > >
> > > > > > (Sadly, I completely missed this reply of yours until this morning (when
> > > > > > I was already preparing the patches below).)
> > > > > >
> > > > > > I've rebased the patches on top of r240566, and squashed them together
> > > > > > with shared implementation for the two linked lists in die-diet.patch.
> > > > > >
> > > > > > > Yesterday (on top of r240244), I reran llc to check memory usage on the
> > > > > > verify-uselistorder.lto.opt.bc payload.  This patch drops peak memory
> > > > > > usage from 800MB down to 720MB, 10%.  I also timed before/after the patch
> > > > > > (no heap profile, CPU speed locked down, best of 5 runs) and measured a
> > > > > > 1% speedup.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > > (I've also attached die-block-is-not-a-die.patch, which fixes the
> > > > > > > DIELoc/DIEBlock/DIE relationship.  I've been holding off on committing
> > > > > > > it since it's built on top of the die-diet.patch.)
> > > > > >
> > > > > > >
> > > > > > >> We could configure this stuff via templates -- I'd be open to the idea
> > > > > > >> -- but I think the intersection between the implementations would be
> > > > > > >> practically nil.  Even the iterator implementations need to be
> > > > > > >> completely different.  If these are the only `push_back()`-enabled
> > > > > > >> slists in tree, is it premature to abstract it?  Would we even want the
> > > > > > >> same name as "normal" slists?
> > > > > > >>
> > > > > > >> (Regardless, I think it would be a great idea for someone to ADT-ify the
> > > > > > >> other linked lists!)
> > > > > > >
> >
> >
> >
> 
> 




