[LLVMdev] RFC: Reduce the memory footprint of DIEs (and DIEValues)

Thu Jun 25 15:13:53 PDT 2015

Looks good (might eventually iterate a bit on the generic list data
structures (names, API, etc) - but they're factored out enough)

Does DIE::get need to be templated on the allocator type?

On Thu, Jun 25, 2015 at 2:57 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

>
> > On 2015-Jun-25, at 13:13, David Blaikie <dblaikie at gmail.com> wrote:
> >
> >
> >
> > On Thu, Jun 25, 2015 at 1:01 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> >
> > > On 2015-Jun-25, at 12:44, David Blaikie <dblaikie at gmail.com> wrote:
> > >
> > >
> > >
> > > On Thu, Jun 25, 2015 at 12:04 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > >
> > > > On 2015-Jun-25, at 12:00, David Blaikie <dblaikie at gmail.com> wrote:
> > > >
> > > >
> > > >
> > > > On Thu, Jun 25, 2015 at 11:49 AM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > > >
> > > > > On 2015-Jun-25, at 10:48, David Blaikie <dblaikie at gmail.com>
> wrote:
> > > > >
> > > > > I'm not sure of the merits of handing around unique_ptrs with a
> null deleter (I could see it in some generic template machinery - but this
> is pretty explicitly never-owning). I get the desire to tag the conceptual
> ownership, and I appreciate that - but without any checking I suspect it'll
> just become incorrect over time and we won't notice.
> > > >
> > > > IMO, where tagging the ownership is valuable is before the DIE is
> > > > inserted in the tree (and in this patch that's the only place it
> > > > remains).  But maybe `unique_ptr` is kind of a funny name for what's
> > > > going on.
> > > >
> > > > Maybe a way to keep it from bitrotting would be to make some of the
> > > > API private, something like:
> > > >
> > > >     class OwnedDIE {
> > > >       friend class DIE;
> > > >       DIE *Die = nullptr;
> > > >
> > > >       OwnedDIE(const OwnedDIE &) = delete;
> > > >       OwnedDIE &operator=(const OwnedDIE &) = delete;
> > > >
> > > >       OwnedDIE(DIE *Die) : Die(Die) {}
> > > >       DIE *release() {
> > > >         auto *D = Die;
> > > >         Die = nullptr;
> > > >         return D;
> > > >       }
> > > >
> > > >     public:
> > > >       OwnedDIE() = default;
> > > >       ~OwnedDIE() = default;
> > > >       OwnedDIE(OwnedDIE &X) : Die(X.Die) { X.Die = nullptr; }
> > > >       OwnedDIE &operator=(OwnedDIE &X) {
>
> (I meant `&&` for both of these :/.)
>
> > > >         Die = X.Die;
> > > >         X.Die = nullptr;
> > > >         return *this;
> > > >       }
> > > >       DIE *get() const { return Die; }
> > > >
> > > > As soon as you've got 'get()' (yeah, op* could be abused, but that's
> a bit more expliict work to take the address of the reference and go stash
> that somewhere) I think it becomes pretty easy to accidentally duplicate or
> otherwise play fast-and-loose with ownership.
> > >
> > > FWIW, it's impossible to play fast and loose between `DIE::get()` and
> > > `DIE::addChild()` with this API.  You can only mess around with
> > > top-level DIEs.  I'm not sure that's worth much though?
> > >
> > > I guess the only thing preventing you from adding the same DIE as a
> child to multiple nodes is the parental check? So the parent pointer acts
> as a sort of intrusive ownership tracking mechanism... :s doesn't exactly
> fill me with confidence.
> >
> > Right.  (Unless we keep something like an `OwnedDIE`.)
> >
> > *nod* but without checking, still easy to abuse an OwnedDIE.
> >
> > Anyway, rather than going around in circles here: Without any checks
> (except the incidental one of the parent pointer) it feels misleading to
> talk about ownership. I would suggest/prefer abandoning the concept and
> thinking about it as the BumpPtrAllocator owning these allocations.
> >
> > But I leave that up to you - keep it the way it is if you prefer.
>
> David and I talked offline and stopped circling each other, and then
> Fred and I talked about the implications for dsymutil.  Basically,
> changing to the strict API I proposed above would require double
> pointers in places in `dsymutil` -- raw pointer + container that
> could be moved -- and there isn't really much benefit.
>
> I'll add a note in the commit message that gives some background,
> but it seems prudent just to admit these are bumpptr-allocated.
>
> *idea, dropped*
>
> David, how do the updated patches look?  (Note, I committed r240701
> in the meantime.)
>
>
>
> >
> > > That said, since they're bump pointer allocated anyway, there's no
> reason we couldn't have a multi-parented DIE anyway. (I mean it'd be silly
> because it'd print lots of redundant DWARF, but... *shrug*)
> >
> > Well, the `DIE::Offset` is stored intrusively in a `DIE`, so this
> > isn't quite legal.
> >
> > Ah, right right.
> >
> > Also, `DIEHash::computeTypeSignature()` requires walking the parent
> > list -- otherwise, we could almost get rid of it (there are a
> > couple of other users, but IIRC they're not as hard to refactor).
> >
> > Hmm - I thought one of them was the stuff used to do cross-DIE
> referencing which was pretty hard or ugly to do in other ways, if I
> remember the rather long review that went on for that change...
> >
> > But if it's for types in particular - we only produce type signatures
> when producing the CU signature (we don't use type signatures for type
> units) and so we can probably keep track of the parent chain during the
> hash walk and re-use it rather than walking up it.
>
> Hmm; if you can get rid of the need in type signatures, maybe I'll
> have a closer look at the other uses.  They seemed tractable to me.
>
> The only other one I wasn't sure about was a place where we need to
> look up the unit (by finding the *final* node in the list).  If I
> can't get rid of it entirely, we could at least rename `Parent` to
> `Unit` and save the walk.
>
>
> >
> >
> > >
> > > >       DIE &operator*() const { return *Die; }
> > > >       DIE *operator->() const { return Die; }
> > > >       explicit operator bool() const { return Die; }
> > > >     };
> > > >
> > > >     class DIE {
> > > >     public:
> > > >       // Calls OwnedDIE::OwnedDIE(DIE *).
> > > >       static OwnedDIE get(BumpPtrAllocator &Alloc, dwarf::Tag Tag);
> > > >       // Calls OwnedDIE::release().
> > > >       DIE &addChild(OwnedDIE Child);
> > > >     };
> > > >
> > > > I guess I'm not sure how much better this is, although it would
> > > > prevent llvm-dsymutil from compiling without some more
> > > > reorganization (currently it calls `release()` and recreates
> > > > `unique_ptr<>` between functions).
> > > >
> > > > > If we're going to keep it - could we add some checking in debug
> builds somehow? (skip the bump ptr allocation entirely in debug build? use
> a typed bump ptr allocator (it calls dtors) in a debug build? Just a side
> table hash set to check construction/destruction pairs and an assert that
> it's empty on exit?)
> > > >
> > > > The `OwnedDIE` idea above adds a bit more safety, but it wouldn't
> > > > catch "leaks", and a consumer (like dsymutil) and a consumer could
> > > > choose to store top-level DIEs outside an `OwnedDIE` container.
> > > >
> > > > Thoughts?
> > > >
> > > > Here's a claim I'll try on for size:
> > > >
> > > > Either we should have a mechanism that enforces ownership-like
> semantics (which I think boils down to: checked pseudo-destruction in
> asserts builds) or we probably should just forget about it and treat these
> things as raw pointers with no specific sense of ownership. Just accept
> that they were all created with the BumpPtrAllocator (perhaps make the API
> restrictions stronger in this regard? delete op new/delete? I forget how
> all that works) and move on.
> > >
> > > Hmm, I guess we already check the `Parent` pointer when adding
> > > children, so maybe you're right.
> > >
> > > >
> > > > > Was there a reason to squash the DIEValue and DIE allocator
> changes together? (probably related to the DIEBlock? (both a value and a
> DIE) - I thought I remembered reading something from you about
> changing/splitting that? ah, that's your other patch - if that came first
> then we could split the allocation changes for DIEValue and DIE into
> separate patches? But that's too much work to rejig it all?)
> > > >
> > > > I think I can split the DIEValue/DIE changes.  New patches early
> > > > afternoon I hope.
> > > >
> > > > >
> > > > > in DwarfLinker::patchLineTableForUnit why the change back to a
> range-for from std::find_if?
> > > >
> > > > I don't remember =/.  I'll see if I can get the original flow
> working,
> > > > and if I can't I'll know why I changed it.
> > > >
> > > > >
> > > > > (I do get tired of seeing allocators passed in/around everywhere -
> would it make more sense to wrap the allocator in a type with utility
> functions for generating things? I suppose it's no great win, but feels a
> bit better, maybe? eg: "DIE::get(DIEValueAllocator,
> dwarf::DW_TAG_inlined_subroutine);" ->
> "DIEBuilder.get(dwarf::DW_TAG_inlined_subroutine)" or similar - for values
> it's trickier, the DIE and the allocator are both needed every time, but
> there's no easy access to the allocator from values or dies... *shrug* )
> > > >
> > > > Maybe I'll play with some builder ideas, see if it reduces the churn
> > > > at all?  We could have:
> > > >
> > > >     DIEBuilder DieB(Alloc);
> > > >     auto Die = DieB.get(Tag);
> > > >     DIEValueBuilder DieValueB = DieB.addValues(Die);
> > > >     DieValueB.addValue(SomeDieValue);
> > > >
> > > > I'm not sure this will actually be better in practice though...  I'll
> > > > play around after posting new patches.
> > > >
> > > > Yeah - don't worry too much if it doesn't look useful. And it can be
> orthogonal cleanup if we one day discover a great API. But I appreciate
> giving it a bit of thought.
> > > >
> > > > - David
> > > >
> > > >
> > > > > On Wed, Jun 24, 2015 at 1:11 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > > > >
> > > > > > On 2015 Jun 17, at 11:36, David Blaikie <dblaikie at gmail.com>
> wrote:
> > > > > >
> > > > > >> On Wed, Jun 17, 2015 at 11:26 AM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > > > > >>
> > > > > >> > On 2015 Jun 16, at 21:40, Yaron Keren <yaron.keren at gmail.com>
> wrote:
> > > > > >> >
> > > > > >> > Hi Duncan,
> > > > > >> >
> > > > > >> > Singly linked lists are used throught LLVM: ArrayRecycler,
> Registry, LiveInterval, SCEVUnknown, CGBlockInfo, MacroArgCache,
> ...DIEValueList.
> > > > > >> >
> > > > > >> > What do you think about implementing a singly-linked ADT?
> > > > > >> >
> > > > > >>
> > > > > >> Interesting idea.
> > > > > >>
> > > > > >> I've taken a look at the other singly-linked lists you
> mentioned.  Most
> > > > > >> of these are some version of a free/recycling list
> (ArrayRecycler,
> > > > > >> Registry, MacroArgCache, etc.), which need super simple
> push/pop API and
> > > > > >> no memory management.
> > > > > >>
> > > > > >> LiveInterval and CGBlock also need iteration and a way to
> conditionally
> > > > > >> remove elements (something like `remove_if()`), and in a few
> cases it
> > > > > >> would be convenient to have API to destruct and/or destroy
> nodes.
> > > > > >>
> > > > > >> I think these would all be straightforward to formalize into an
> ADT.  We
> > > > > >> could even use it to implement a value-based
> `llvm::forward_list<>` that
> > > > > >> avoided the sorting bug in libstdc++ (which gets "stable"
> exactly
> > > > > >> backwards) that David hit when refactoring tablegen.
> > > > > >>
> > > > > >> Unfortunately, I think it'd be hard to reuse for DIE::Children
> and
> > > > > >> DIEValueList.  The latter wants to preserve insertion order and
> needs
> > > > > >> `push_back()` API (the former can technically get away with a
> timely
> > > > > >> call to `reverse()`, but it's awkward).  Supporting
> `push_back()`
> > > > > >> efficiently requires pointing at the last element somehow,
> either via an
> > > > > >> extra pointer or circular logic (what my latest patch does for
> > > > > >> DIE::Children).
> > > > > >
> > > > > > One of the other problems I hit when trying to value-ify the DIE
> children is that the point of DIE creation is, in some particular cases,
> divorced from the knowledge of which container it's going to go in to. Yet
> pointer identity is needed before then.
> > > > > >
> > > > > > At least that's how I recall it - may be worth going down that
> path again just so we can write it up clearly for posterity, I dunno.
> > > > >
> > > > > Nope, I saw this too.
> > > > >
> > > > > In theory, you could get std::list to work here, by inserting into
> a
> > > > > singleton linked list when you first need the reference, and
> splicing
> > > > > it into its parent's linked list once it's created.  However, I
> think a
> > > > > value-based list is too expensive anyway (see below)...
> > > > >
> > > > > > If it is possible, then I would love/prefer to use an existing
> non-intrusive container (or writing one if we have to). at least could
> prototype with std::list, then, as you say, implement a
> std::forward_list-like device with the bonus that we could fix sort and do
> the circular trick to allow insertion at the end, etc.
> > > > >
> > > > > Unfortunately the BumpPtrAllocator/skip-unlinking-the-list part of
> these
> > > > > patches is really necessary, making std::list pretty awkward.  My
> payload
> > > > > usually takes ~2 minutes when profiling, and when I profiled
> linked lists
> > > > > without the BumpPtrAllocator it was talking >15 minutes to
> teardown (I
> > > > > killed it, so I don't know how much longer it would have taken).
> > > > >
> > > > > Moreover, using a BumpPtrAllocator this way fixes the slow DIE
> teardown
> > > > > that Fred was battling in llvm-dsymutil.
> > > > >
> > > > > > (& sorry, will get to proper review of this patch soon... )
> > > > >
> > > > > (Sadly, I completely missed this reply of yours until this morning
> (when
> > > > > I was already preparing the patches below).)
> > > > >
> > > > > I've rebased the patches on top of r240566, and squashed them
> together
> > > > > with shared implementation for the two linked lists in
> die-diet.patch.
> > > > >
> > > > > Yesterday (on top of 240244), I reran llc to check memory usage on
> the
> > > > > verify-uselistorder.lto.opt.bc payload.  This patch drops peak
> memory
> > > > > usage from 800MB down to 720MB, 10%.  I also timed before/after
> the patch
> > > > > (no heap profile, CPU speed locked down, best of 5 runs) and
> measured a
> > > > > 1% speedup.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > (I've also attached die-block-is-not-a-die.patch, which fixes the
> > > > > DIELoc/DIEBlock/DIE relationship.  I've been holding off on
> committing
> > > > > it since it's built on top of the die-diet.patch.
> > > > >
> > > > >
> > > > >
> > > > > )
> > > > >
> > > > > >
> > > > > >> We could configure this stuff via templates -- I'd be open to
> the idea
> > > > > >> -- but I think the intersection between the implementations
> would be
> > > > > >> practically nil.  Even the iterator implementations need to be
> > > > > >> completely different.  If these are the only
> `push_back()`-enabled
> > > > > >> slists in tree, is it premature to abstract it?  Would we even
> want the
> > > > > >> same name as "normal" slists?
> > > > > >>
> > > > > >> (Regardless, I think it would be a great idea for someone to
> ADT-ify the
> > > > > >> other linked lists!)
> > > > > >
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150625/519aee30/attachment.html>