[LLVMdev] RFC: Reduce the memory footprint of DIEs (and DIEValues)

David Blaikie dblaikie at gmail.com
Thu Jun 25 12:46:44 PDT 2015


On Thu, Jun 25, 2015 at 12:27 PM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

>
> > On 2015-Jun-25, at 12:22, Duncan P. N. Exon Smith <dexonsmith at apple.com>
> wrote:
> >
> >>
> >> On 2015-Jun-25, at 12:04, Duncan P. N. Exon Smith <dexonsmith at apple.com>
> wrote:
> >>
> >>>
> >>> On 2015-Jun-25, at 12:00, David Blaikie <dblaikie at gmail.com> wrote:
> >>>
> >>>
> >>>
> >>> On Thu, Jun 25, 2015 at 11:49 AM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> >>>
> >>>> On 2015-Jun-25, at 10:48, David Blaikie <dblaikie at gmail.com> wrote:
> >>>>
> >>>> I'm not sure of the merits of handing around unique_ptrs with a null
> deleter (I could see it in some generic template machinery - but this is
> pretty explicitly never-owning). I get the desire to tag the conceptual
> ownership, and I appreciate that - but without any checking I suspect it'll
> just become incorrect over time and we won't notice.
> >>>
> >>> IMO, where tagging the ownership is valuable is before the DIE is
> >>> inserted in the tree (and in this patch that's the only place it
> >>> remains).  But maybe `unique_ptr` is kind of a funny name for what's
> >>> going on.
> >>>
> >>> Maybe a way to keep it from bitrotting would be to make some of the
> >>> API private, something like:
> >>>
> >>>   class OwnedDIE {
> >>>     friend class DIE;
> >>>     DIE *Die = nullptr;
> >>>
> >>>     OwnedDIE(const OwnedDIE &) = delete;
> >>>     OwnedDIE &operator=(const OwnedDIE &) = delete;
> >>>
> >>>     OwnedDIE(DIE *Die) : Die(Die) {}
> >>>     DIE *release() {
> >>>       auto *D = Die;
> >>>       Die = nullptr;
> >>>       return D;
> >>>     }
> >>>
> >>>   public:
> >>>     OwnedDIE() = default;
> >>>     ~OwnedDIE() = default;
> >>>     OwnedDIE(OwnedDIE &X) : Die(X.Die) { X.Die = nullptr; }
> >>>     OwnedDIE &operator=(OwnedDIE &X) {
> >>>       Die = X.Die;
> >>>       X.Die = nullptr;
> >>>       return *this;
> >>>     }
> >>>     DIE *get() const { return Die; }
> >>>
> >>> As soon as you've got 'get()' (yeah, op* could be abused, but that's a
> bit more expliict work to take the address of the reference and go stash
> that somewhere) I think it becomes pretty easy to accidentally duplicate or
> otherwise play fast-and-loose with ownership.
> >>
> >> FWIW, it's impossible to play fast and loose between `DIE::get()` and
> >> `DIE::addChild()` with this API.  You can only mess around with
> >> top-level DIEs.  I'm not sure that's worth much though?
> >>
> >>>
> >>>     DIE &operator*() const { return *Die; }
> >>>     DIE *operator->() const { return Die; }
> >>>     explicit operator bool() const { return Die; }
> >>>   };
> >>>
> >>>   class DIE {
> >>>   public:
> >>>     // Calls OwnedDIE::OwnedDIE(DIE *).
> >>>     static OwnedDIE get(BumpPtrAllocator &Alloc, dwarf::Tag Tag);
> >>>     // Calls OwnedDIE::release().
> >>>     DIE &addChild(OwnedDIE Child);
> >>>   };
> >>>
> >>> I guess I'm not sure how much better this is, although it would
> >>> prevent llvm-dsymutil from compiling without some more
> >>> reorganization (currently it calls `release()` and recreates
> >>> `unique_ptr<>` between functions).
> >>>
> >>>> If we're going to keep it - could we add some checking in debug
> builds somehow? (skip the bump ptr allocation entirely in debug build? use
> a typed bump ptr allocator (it calls dtors) in a debug build? Just a side
> table hash set to check construction/destruction pairs and an assert that
> it's empty on exit?)
> >>>
> >>> The `OwnedDIE` idea above adds a bit more safety, but it wouldn't
> >>> catch "leaks", and a consumer (like dsymutil) and a consumer could
> >>> choose to store top-level DIEs outside an `OwnedDIE` container.
> >>>
> >>> Thoughts?
> >>>
> >>> Here's a claim I'll try on for size:
> >>>
> >>> Either we should have a mechanism that enforces ownership-like
> semantics (which I think boils down to: checked pseudo-destruction in
> asserts builds) or we probably should just forget about it and treat these
> things as raw pointers with no specific sense of ownership. Just accept
> that they were all created with the BumpPtrAllocator (perhaps make the API
> restrictions stronger in this regard? delete op new/delete? I forget how
> all that works) and move on.
> >>
> >> Hmm, I guess we already check the `Parent` pointer when adding
> >> children, so maybe you're right.
> >>
> >>>
> >>>> Was there a reason to squash the DIEValue and DIE allocator changes
> together? (probably related to the DIEBlock? (both a value and a DIE) - I
> thought I remembered reading something from you about changing/splitting
> that? ah, that's your other patch - if that came first then we could split
> the allocation changes for DIEValue and DIE into separate patches? But
> that's too much work to rejig it all?)
> >>>
> >>> I think I can split the DIEValue/DIE changes.  New patches early
> >>> afternoon I hope.
> >
> > For now I left in DIE::UniquePtr.  I'm happy to switch to OwnedDIE
> > (which enforces ownership between `DIE::get()` and `DIE::addChild()`)
> > or just remove the container entirely.
> >
> > Split patches attached.
> >
> > <0001-die-diet-v2.patch><0002-die-diet-v2.patch>
> >>>
> >>>>
> >>>> in DwarfLinker::patchLineTableForUnit why the change back to a
> range-for from std::find_if?
> >>>
> >>> I don't remember =/.  I'll see if I can get the original flow working,
> >>> and if I can't I'll know why I changed it.
> >
> > I can't find a reason.  The original flow works just fine, so I went
> > back to it in the new patches.
>
>
> Nevermind, found the reason (in my quick check I recompiled the wrong
> module):
>
> /Users/dexonsmith/data/llvm/diet/tools/dsymutil/DwarfLinker.cpp:2375:33:
> error: no member named 'values_begin' in 'llvm::DIE'
>         std::find_if(OutputDIE->values_begin(), OutputDIE->values_end(),
>                      ~~~~~~~~~  ^
>
> (etc.)
>
> I guess here's what I was thinking:
>
>   - Can't reorg first, since the current code needs an iterator.
>

My take on this is to reorg first, but do so by introducing an
llvm::find_if(Range, Predicate) - see llvm::all_of as an example of adding
range algorithms (STLExtras.h)


>   - Awkward to reorg second, since we need extra API for just one
>     commit.
>   - Easiest to just rewrite it.
>
> Happy to do whatever here.


> >
> >>>
> >>>>
> >>>> (I do get tired of seeing allocators passed in/around everywhere -
> would it make more sense to wrap the allocator in a type with utility
> functions for generating things? I suppose it's no great win, but feels a
> bit better, maybe? eg: "DIE::get(DIEValueAllocator,
> dwarf::DW_TAG_inlined_subroutine);" ->
> "DIEBuilder.get(dwarf::DW_TAG_inlined_subroutine)" or similar - for values
> it's trickier, the DIE and the allocator are both needed every time, but
> there's no easy access to the allocator from values or dies... *shrug* )
> >>>
> >>> Maybe I'll play with some builder ideas, see if it reduces the churn
> >>> at all?  We could have:
> >>>
> >>>   DIEBuilder DieB(Alloc);
> >>>   auto Die = DieB.get(Tag);
> >>>   DIEValueBuilder DieValueB = DieB.addValues(Die);
> >>>   DieValueB.addValue(SomeDieValue);
> >>>
> >>> I'm not sure this will actually be better in practice though...  I'll
> >>> play around after posting new patches.
> >>>
> >>> Yeah - don't worry too much if it doesn't look useful. And it can be
> orthogonal cleanup if we one day discover a great API. But I appreciate
> giving it a bit of thought.
> >>>
> >>> - David
> >>>
> >>>
> >>>> On Wed, Jun 24, 2015 at 1:11 PM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> >>>>
> >>>>> On 2015 Jun 17, at 11:36, David Blaikie <dblaikie at gmail.com> wrote:
> >>>>>
> >>>>>> On Wed, Jun 17, 2015 at 11:26 AM, Duncan P. N. Exon Smith <
> dexonsmith at apple.com> wrote:
> >>>>>>
> >>>>>>> On 2015 Jun 16, at 21:40, Yaron Keren <yaron.keren at gmail.com>
> wrote:
> >>>>>>>
> >>>>>>> Hi Duncan,
> >>>>>>>
> >>>>>>> Singly linked lists are used throught LLVM: ArrayRecycler,
> Registry, LiveInterval, SCEVUnknown, CGBlockInfo, MacroArgCache,
> ...DIEValueList.
> >>>>>>>
> >>>>>>> What do you think about implementing a singly-linked ADT?
> >>>>>>>
> >>>>>>
> >>>>>> Interesting idea.
> >>>>>>
> >>>>>> I've taken a look at the other singly-linked lists you mentioned.
> Most
> >>>>>> of these are some version of a free/recycling list (ArrayRecycler,
> >>>>>> Registry, MacroArgCache, etc.), which need super simple push/pop
> API and
> >>>>>> no memory management.
> >>>>>>
> >>>>>> LiveInterval and CGBlock also need iteration and a way to
> conditionally
> >>>>>> remove elements (something like `remove_if()`), and in a few cases
> it
> >>>>>> would be convenient to have API to destruct and/or destroy nodes.
> >>>>>>
> >>>>>> I think these would all be straightforward to formalize into an
> ADT.  We
> >>>>>> could even use it to implement a value-based `llvm::forward_list<>`
> that
> >>>>>> avoided the sorting bug in libstdc++ (which gets "stable" exactly
> >>>>>> backwards) that David hit when refactoring tablegen.
> >>>>>>
> >>>>>> Unfortunately, I think it'd be hard to reuse for DIE::Children and
> >>>>>> DIEValueList.  The latter wants to preserve insertion order and
> needs
> >>>>>> `push_back()` API (the former can technically get away with a timely
> >>>>>> call to `reverse()`, but it's awkward).  Supporting `push_back()`
> >>>>>> efficiently requires pointing at the last element somehow, either
> via an
> >>>>>> extra pointer or circular logic (what my latest patch does for
> >>>>>> DIE::Children).
> >>>>>
> >>>>> One of the other problems I hit when trying to value-ify the DIE
> children is that the point of DIE creation is, in some particular cases,
> divorced from the knowledge of which container it's going to go in to. Yet
> pointer identity is needed before then.
> >>>>>
> >>>>> At least that's how I recall it - may be worth going down that path
> again just so we can write it up clearly for posterity, I dunno.
> >>>>
> >>>> Nope, I saw this too.
> >>>>
> >>>> In theory, you could get std::list to work here, by inserting into a
> >>>> singleton linked list when you first need the reference, and splicing
> >>>> it into its parent's linked list once it's created.  However, I think
> a
> >>>> value-based list is too expensive anyway (see below)...
> >>>>
> >>>>> If it is possible, then I would love/prefer to use an existing
> non-intrusive container (or writing one if we have to). at least could
> prototype with std::list, then, as you say, implement a
> std::forward_list-like device with the bonus that we could fix sort and do
> the circular trick to allow insertion at the end, etc.
> >>>>
> >>>> Unfortunately the BumpPtrAllocator/skip-unlinking-the-list part of
> these
> >>>> patches is really necessary, making std::list pretty awkward.  My
> payload
> >>>> usually takes ~2 minutes when profiling, and when I profiled linked
> lists
> >>>> without the BumpPtrAllocator it was talking >15 minutes to teardown (I
> >>>> killed it, so I don't know how much longer it would have taken).
> >>>>
> >>>> Moreover, using a BumpPtrAllocator this way fixes the slow DIE
> teardown
> >>>> that Fred was battling in llvm-dsymutil.
> >>>>
> >>>>> (& sorry, will get to proper review of this patch soon... )
> >>>>
> >>>> (Sadly, I completely missed this reply of yours until this morning
> (when
> >>>> I was already preparing the patches below).)
> >>>>
> >>>> I've rebased the patches on top of r240566, and squashed them together
> >>>> with shared implementation for the two linked lists in die-diet.patch.
> >>>>
> >>>> Yesterday (on top of 240244), I reran llc to check memory usage on the
> >>>> verify-uselistorder.lto.opt.bc payload.  This patch drops peak memory
> >>>> usage from 800MB down to 720MB, 10%.  I also timed before/after the
> patch
> >>>> (no heap profile, CPU speed locked down, best of 5 runs) and measured
> a
> >>>> 1% speedup.
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> (I've also attached die-block-is-not-a-die.patch, which fixes the
> >>>> DIELoc/DIEBlock/DIE relationship.  I've been holding off on committing
> >>>> it since it's built on top of the die-diet.patch.
> >>>>
> >>>>
> >>>>
> >>>> )
> >>>>
> >>>>>
> >>>>>> We could configure this stuff via templates -- I'd be open to the
> idea
> >>>>>> -- but I think the intersection between the implementations would be
> >>>>>> practically nil.  Even the iterator implementations need to be
> >>>>>> completely different.  If these are the only `push_back()`-enabled
> >>>>>> slists in tree, is it premature to abstract it?  Would we even want
> the
> >>>>>> same name as "normal" slists?
> >>>>>>
> >>>>>> (Regardless, I think it would be a great idea for someone to
> ADT-ify the
> >>>>>> other linked lists!)
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150625/bd395951/attachment.html>


More information about the llvm-commits mailing list