[LLVMdev] RFC: Reduce the memory footprint of DIEs (and DIEValues)

David Blaikie dblaikie at gmail.com
Thu Jun 25 12:44:36 PDT 2015


On Thu, Jun 25, 2015 at 12:04 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

>
> > On 2015-Jun-25, at 12:00, David Blaikie <dblaikie at gmail.com> wrote:
> >
> >
> >
> > On Thu, Jun 25, 2015 at 11:49 AM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> >
> > > On 2015-Jun-25, at 10:48, David Blaikie <dblaikie at gmail.com> wrote:
> > >
> > > I'm not sure of the merits of handing around unique_ptrs with a null
> deleter (I could see it in some generic template machinery - but this is
> pretty explicitly never-owning). I get the desire to tag the conceptual
> ownership, and I appreciate that - but without any checking I suspect it'll
> just become incorrect over time and we won't notice.
> >
> > IMO, where tagging the ownership is valuable is before the DIE is
> > inserted in the tree (and in this patch that's the only place it
> > remains).  But maybe `unique_ptr` is kind of a funny name for what's
> > going on.
> >
> > Maybe a way to keep it from bitrotting would be to make some of the
> > API private, something like:
> >
> >     class OwnedDIE {
> >       friend class DIE;
> >       DIE *Die = nullptr;
> >
> >       OwnedDIE(const OwnedDIE &) = delete;
> >       OwnedDIE &operator=(const OwnedDIE &) = delete;
> >
> >       OwnedDIE(DIE *Die) : Die(Die) {}
> >       DIE *release() {
> >         auto *D = Die;
> >         Die = nullptr;
> >         return D;
> >       }
> >
> >     public:
> >       OwnedDIE() = default;
> >       ~OwnedDIE() = default;
> >       OwnedDIE(OwnedDIE &X) : Die(X.Die) { X.Die = nullptr; }
> >       OwnedDIE &operator=(OwnedDIE &X) {
> >         Die = X.Die;
> >         X.Die = nullptr;
> >         return *this;
> >       }
> >       DIE *get() const { return Die; }
> >
> > As soon as you've got 'get()' (yeah, op* could be abused, but that's a
> bit more expliict work to take the address of the reference and go stash
> that somewhere) I think it becomes pretty easy to accidentally duplicate or
> otherwise play fast-and-loose with ownership.
>
> FWIW, it's impossible to play fast and loose between `DIE::get()` and
> `DIE::addChild()` with this API.  You can only mess around with
> top-level DIEs.  I'm not sure that's worth much though?
>

I guess the only thing preventing you from adding the same DIE as a child
to multiple nodes is the parental check? So the parent pointer acts as a
sort of intrusive ownership tracking mechanism... :s doesn't exactly fill
me with confidence.

That said, since they're bump pointer allocated anyway, there's no reason
we couldn't have a multi-parented DIE anyway. (I mean it'd be silly because
it'd print lots of redundant DWARF, but... *shrug*)


> >       DIE &operator*() const { return *Die; }
> >       DIE *operator->() const { return Die; }
> >       explicit operator bool() const { return Die; }
> >     };
> >
> >     class DIE {
> >     public:
> >       // Calls OwnedDIE::OwnedDIE(DIE *).
> >       static OwnedDIE get(BumpPtrAllocator &Alloc, dwarf::Tag Tag);
> >       // Calls OwnedDIE::release().
> >       DIE &addChild(OwnedDIE Child);
> >     };
> >
> > I guess I'm not sure how much better this is, although it would
> > prevent llvm-dsymutil from compiling without some more
> > reorganization (currently it calls `release()` and recreates
> > `unique_ptr<>` between functions).
> >
> > > If we're going to keep it - could we add some checking in debug builds
> somehow? (skip the bump ptr allocation entirely in debug build? use a typed
> bump ptr allocator (it calls dtors) in a debug build? Just a side table
> hash set to check construction/destruction pairs and an assert that it's
> empty on exit?)
> >
> > The `OwnedDIE` idea above adds a bit more safety, but it wouldn't
> > catch "leaks", and a consumer (like dsymutil) and a consumer could
> > choose to store top-level DIEs outside an `OwnedDIE` container.
> >
> > Thoughts?
> >
> > Here's a claim I'll try on for size:
> >
> > Either we should have a mechanism that enforces ownership-like semantics
> (which I think boils down to: checked pseudo-destruction in asserts builds)
> or we probably should just forget about it and treat these things as raw
> pointers with no specific sense of ownership. Just accept that they were
> all created with the BumpPtrAllocator (perhaps make the API restrictions
> stronger in this regard? delete op new/delete? I forget how all that works)
> and move on.
>
> Hmm, I guess we already check the `Parent` pointer when adding
> children, so maybe you're right.
>
> >
> > > Was there a reason to squash the DIEValue and DIE allocator changes
> together? (probably related to the DIEBlock? (both a value and a DIE) - I
> thought I remembered reading something from you about changing/splitting
> that? ah, that's your other patch - if that came first then we could split
> the allocation changes for DIEValue and DIE into separate patches? But
> that's too much work to rejig it all?)
> >
> > I think I can split the DIEValue/DIE changes.  New patches early
> > afternoon I hope.
> >
> > >
> > > in DwarfLinker::patchLineTableForUnit why the change back to a
> range-for from std::find_if?
> >
> > I don't remember =/.  I'll see if I can get the original flow working,
> > and if I can't I'll know why I changed it.
> >
> > >
> > > (I do get tired of seeing allocators passed in/around everywhere -
> would it make more sense to wrap the allocator in a type with utility
> functions for generating things? I suppose it's no great win, but feels a
> bit better, maybe? eg: "DIE::get(DIEValueAllocator,
> dwarf::DW_TAG_inlined_subroutine);" ->
> "DIEBuilder.get(dwarf::DW_TAG_inlined_subroutine)" or similar - for values
> it's trickier, the DIE and the allocator are both needed every time, but
> there's no easy access to the allocator from values or dies... *shrug* )
> >
> > Maybe I'll play with some builder ideas, see if it reduces the churn
> > at all?  We could have:
> >
> >     DIEBuilder DieB(Alloc);
> >     auto Die = DieB.get(Tag);
> >     DIEValueBuilder DieValueB = DieB.addValues(Die);
> >     DieValueB.addValue(SomeDieValue);
> >
> > I'm not sure this will actually be better in practice though...  I'll
> > play around after posting new patches.
> >
> > Yeah - don't worry too much if it doesn't look useful. And it can be
> orthogonal cleanup if we one day discover a great API. But I appreciate
> giving it a bit of thought.
> >
> > - David
> >
> >
> > > On Wed, Jun 24, 2015 at 1:11 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > >
> > > > On 2015 Jun 17, at 11:36, David Blaikie <dblaikie at gmail.com> wrote:
> > > >
> > > >> On Wed, Jun 17, 2015 at 11:26 AM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> > > >>
> > > >> > On 2015 Jun 16, at 21:40, Yaron Keren <yaron.keren at gmail.com>
> wrote:
> > > >> >
> > > >> > Hi Duncan,
> > > >> >
> > > >> > Singly linked lists are used throught LLVM: ArrayRecycler,
> Registry, LiveInterval, SCEVUnknown, CGBlockInfo, MacroArgCache,
> ...DIEValueList.
> > > >> >
> > > >> > What do you think about implementing a singly-linked ADT?
> > > >> >
> > > >>
> > > >> Interesting idea.
> > > >>
> > > >> I've taken a look at the other singly-linked lists you mentioned.
> Most
> > > >> of these are some version of a free/recycling list (ArrayRecycler,
> > > >> Registry, MacroArgCache, etc.), which need super simple push/pop
> API and
> > > >> no memory management.
> > > >>
> > > >> LiveInterval and CGBlock also need iteration and a way to
> conditionally
> > > >> remove elements (something like `remove_if()`), and in a few cases
> it
> > > >> would be convenient to have API to destruct and/or destroy nodes.
> > > >>
> > > >> I think these would all be straightforward to formalize into an
> ADT.  We
> > > >> could even use it to implement a value-based `llvm::forward_list<>`
> that
> > > >> avoided the sorting bug in libstdc++ (which gets "stable" exactly
> > > >> backwards) that David hit when refactoring tablegen.
> > > >>
> > > >> Unfortunately, I think it'd be hard to reuse for DIE::Children and
> > > >> DIEValueList.  The latter wants to preserve insertion order and
> needs
> > > >> `push_back()` API (the former can technically get away with a timely
> > > >> call to `reverse()`, but it's awkward).  Supporting `push_back()`
> > > >> efficiently requires pointing at the last element somehow, either
> via an
> > > >> extra pointer or circular logic (what my latest patch does for
> > > >> DIE::Children).
> > > >
> > > > One of the other problems I hit when trying to value-ify the DIE
> children is that the point of DIE creation is, in some particular cases,
> divorced from the knowledge of which container it's going to go in to. Yet
> pointer identity is needed before then.
> > > >
> > > > At least that's how I recall it - may be worth going down that path
> again just so we can write it up clearly for posterity, I dunno.
> > >
> > > Nope, I saw this too.
> > >
> > > In theory, you could get std::list to work here, by inserting into a
> > > singleton linked list when you first need the reference, and splicing
> > > it into its parent's linked list once it's created.  However, I think a
> > > value-based list is too expensive anyway (see below)...
> > >
> > > > If it is possible, then I would love/prefer to use an existing
> non-intrusive container (or writing one if we have to). at least could
> prototype with std::list, then, as you say, implement a
> std::forward_list-like device with the bonus that we could fix sort and do
> the circular trick to allow insertion at the end, etc.
> > >
> > > Unfortunately the BumpPtrAllocator/skip-unlinking-the-list part of
> these
> > > patches is really necessary, making std::list pretty awkward.  My
> payload
> > > usually takes ~2 minutes when profiling, and when I profiled linked
> lists
> > > without the BumpPtrAllocator it was talking >15 minutes to teardown (I
> > > killed it, so I don't know how much longer it would have taken).
> > >
> > > Moreover, using a BumpPtrAllocator this way fixes the slow DIE teardown
> > > that Fred was battling in llvm-dsymutil.
> > >
> > > > (& sorry, will get to proper review of this patch soon... )
> > >
> > > (Sadly, I completely missed this reply of yours until this morning
> (when
> > > I was already preparing the patches below).)
> > >
> > > I've rebased the patches on top of r240566, and squashed them together
> > > with shared implementation for the two linked lists in die-diet.patch.
> > >
> > > Yesterday (on top of 240244), I reran llc to check memory usage on the
> > > verify-uselistorder.lto.opt.bc payload.  This patch drops peak memory
> > > usage from 800MB down to 720MB, 10%.  I also timed before/after the
> patch
> > > (no heap profile, CPU speed locked down, best of 5 runs) and measured a
> > > 1% speedup.
> > >
> > >
> > >
> > >
> > > (I've also attached die-block-is-not-a-die.patch, which fixes the
> > > DIELoc/DIEBlock/DIE relationship.  I've been holding off on committing
> > > it since it's built on top of the die-diet.patch.
> > >
> > >
> > >
> > > )
> > >
> > > >
> > > >> We could configure this stuff via templates -- I'd be open to the
> idea
> > > >> -- but I think the intersection between the implementations would be
> > > >> practically nil.  Even the iterator implementations need to be
> > > >> completely different.  If these are the only `push_back()`-enabled
> > > >> slists in tree, is it premature to abstract it?  Would we even want
> the
> > > >> same name as "normal" slists?
> > > >>
> > > >> (Regardless, I think it would be a great idea for someone to
> ADT-ify the
> > > >> other linked lists!)
> > > >
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150625/306c425f/attachment.html>


More information about the llvm-commits mailing list