[PATCH] [DebugInfo] Add a getParent method to DWARFDebugInformationEntryMinimal.

David Blaikie dblaikie at gmail.com
Tue Jan 20 11:15:56 PST 2015


On Mon, Jan 19, 2015 at 10:43 AM, Frédéric Riss <friss at apple.com> wrote:

>
> On Jan 19, 2015, at 9:42 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Mon, Jan 19, 2015 at 8:05 AM, Frédéric Riss <friss at apple.com> wrote:
>
>>
>> On Jan 17, 2015, at 2:39 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>
>>
>> On Fri, Jan 16, 2015 at 4:16 PM, Frederic Riss <friss at apple.com> wrote:
>>
>>> Hi samsonov, dblaikie,
>>>
>>> The DIE representation of libDebugInfo is designed to be minimal
>>> (as the class name implies). The information that's there allows
>>> to walk the DIE tree down, but you can't get back to the root
>>> from an arbitrary node.
>>>
>>> This patch adds this ability with the goal of using it in
>>> llvm-dsymutil. I could also use this information in dwarfdump
>>> itself to easily support dumping the full DIE hierarchy of a
>>> selected DIE when I finally implement a search option.
>>>
>>> The added information definitely has a cost: the size of
>>> the structure is increased by 8 bytes on 64 bits archs
>>> due to padding and this combined to the added logic makes
>>> the raw parsing between 5 and 10% slower.
>>>
>>> Just for llvm-dsymutil, this is optional. I can gather the
>>> parent chains in a separate path and store them, but I think
>>> that the performance/aded functionality benefit is worth
>>> it. WDYT?
>>>
>>
>> I doubt the performance of dwarfdump is terribly important - but I would
>> imagine the performance of dsymutil will be/is, so it might be worth
>> looking at the alternatives that don't increase the memory cost of all DIEs?
>>
>>
>> Yes speed is more important for dsymutil than dwarfdump. I was more
>> concerned about other clients like llvm-symbolizer that might also parse
>> the DIE tree. I fail to see the connection between that fact and the
>> conclusion you draw though. If we don’t store the parent in libDebugInfo, I
>> will have to store it in dsymutil anyway (as I already said, not such a big
>> deal).
>>
>
> I don't know the dsymutil use cases/features enough (read: at all) to know
> where you need this information - but I was assuming you wouldn't be able
> to reach a DIE object without having walked its parents, so you could keep
> just those parents you walked to get it - which is a lower memory overhead
> than having every DIE keep its parent all the time. Does that make sense
> (in general and in the dsymutil use cases)?
>
>
> dsymutil will walk the DIE tree and mark the DIEs it wants to keep.
>

At this point it's build in-memory DIE objects for every DIE (not just the
ones to keep), yes?


> When you do that, you also need to mark the referenced DIEs as kept, some
> of which might be forward references. And when marking a DIE as kept, you
> need to mark all its parents as such.
>

And to do this it reaches the referenced DIE by folliwing a cross-tree (not
parent-child) pointer that takes it straight to the referenced DIE? (I
don't know what the in-memory representation looks like at this level,
whether we just store the offset, or a pointer, etc).

So we're going down, across, then back up - fair enough. The only
alternative then would be, when we're building the cross-DIE references, we
could store a parent chain in the reference there. If there are relatively
few cross-DIE references that might be cheaper than keeping a parent
pointer in every node. (these parent chains could be shared, reference
counted by all the users pointing to a given node for example)

Not sure that's worth it, but a thought depending on what you think the
memory tradeoffs are likely to be like in this space for dsymutil.

In any case I think that's up to you, as the owner/developer of
llvm-dsymutil and I don't think llvm-dwarfdump should care either way. It's
not really a performance sensitive tool.


> Now, before you think too much about it, I first wrote that algorithm in a
> way that didn’t need the parent information beforehand, but this prevented
> me from generating the exact same result as the original dsymutil. So I
> needed to add a pass to compute the parent chains before doing the marking
> algorithm.
>
>
>> (I imagine a search query that responds with a path, rather than a node,
>> would be a reasonable API, for example - but I dunno)
>>
>>
>> To be more precise about the use I envisioned for dwarfdump itself: the
>> accelerator tables provide a mapping from names to DIE offset, allowing to
>> do fast names lookups. Let’s say I implement a —search=<name> option to
>> dwarfdump using those tables. If you just want to print the target DIE then
>> libDebugInfo handles that just fine. If you want to print the whole path
>> that leads to that DIE, you have to do some additional work that storing
>> the parent would avoid.
>>
>
> Right - as I was mentioning above, I would imagine this search query would
> have to walk the DIE hierarchy anyway (if you just jump into the actual
> object file memory at the specified offset you can reconstruct the parent
> chain easily anyway - so you have to at least walk the DIEs in some
> lightweight  manner to reach the DIE you're searching for - so you can keep
> those parents you're handling and return them with as the search result).
>
>
> I have the feeling we are talking past each other a bit here, sorry if I’m
> just misunderstanding and restating obvious things. libDebugInfo provides
> an unconditional lightweight walk over all the DIEs and stores them in a
> vector. What the patch does is adding the parent information gathering to
> this initial walk. A libDebugInfoClient that wants the parent information
> can easily reconstruct it, but it requires another full walk over the DIEs.
>
> Using the accelerator table, I would like to do something like:
> uint64_t DIEOffset = Accel.lookup(Name);
> DebugInformationEntryMinimal *DIE = CUSection.getDIEForOffset(DIEOffset);
> printDIEWithParents(DIE);
>
> The last step isn’t possible today without an additional walk over the DIE
> vector. And I’d like your opinion (and Alexey’s) about the tradeoff
> involved in generating/storing the parent chains unconditionally in
> extractDIEsToVector(). Or of course, if you have an idea to gather the
> information during the initial walk without using additional memory when
> it’s not needed, then that’s even better!
>
> Fred
>
> It's just a thought,
>
> - David
>
>
>>
>> Fred
>>
>>
>>
>>>
>>> http://reviews.llvm.org/D7025
>>>
>>> Files:
>>>   include/llvm/DebugInfo/DWARFDebugInfoEntry.h
>>>   lib/DebugInfo/DWARFUnit.cpp
>>>
>>> Index: include/llvm/DebugInfo/DWARFDebugInfoEntry.h
>>> ===================================================================
>>> --- include/llvm/DebugInfo/DWARFDebugInfoEntry.h
>>> +++ include/llvm/DebugInfo/DWARFDebugInfoEntry.h
>>> @@ -30,13 +30,16 @@
>>>    /// Offset within the .debug_info of the start of this entry.
>>>    uint32_t Offset;
>>>
>>> +  /// How many to substract to "this" to get the parent.
>>> +  uint32_t ParentIdx;
>>> +
>>>    /// How many to add to "this" to get the sibling.
>>>    uint32_t SiblingIdx;
>>>
>>>    const DWARFAbbreviationDeclaration *AbbrevDecl;
>>>  public:
>>>    DWARFDebugInfoEntryMinimal()
>>> -    : Offset(0), SiblingIdx(0), AbbrevDecl(nullptr) {}
>>> +    : Offset(0), ParentIdx(0), SiblingIdx(0), AbbrevDecl(nullptr) {}
>>>
>>>    void dump(raw_ostream &OS, DWARFUnit *u, unsigned recurseDepth,
>>>              unsigned indent = 0) const;
>>> @@ -66,6 +69,11 @@
>>>      return SiblingIdx > 0 ? this + SiblingIdx : nullptr;
>>>    }
>>>
>>> +  // Returns the parent DIE or null if this is the root DIE.
>>> +  const DWARFDebugInfoEntryMinimal *getParent() const {
>>> +    return ParentIdx ? this - ParentIdx : nullptr;
>>> +  }
>>> +
>>>    // We know we are kept in a vector of contiguous entries, so we know
>>>    // we don't need to store our child pointer, if we have a child it
>>> will
>>>    // be the next entry in the list...
>>> @@ -82,6 +90,15 @@
>>>        SiblingIdx = 0;
>>>    }
>>>
>>> +  void setParent(const DWARFDebugInfoEntryMinimal *Parent) {
>>> +    if (Parent) {
>>> +      // We know we are kept in a vector of contiguous entries, so we
>>> know
>>> +      // our parent will be soewhere before "this".
>>> +      ParentIdx = this - Parent;
>>> +    } else
>>> +      ParentIdx = 0;
>>> +  }
>>> +
>>>    const DWARFAbbreviationDeclaration *getAbbreviationDeclarationPtr()
>>> const {
>>>      return AbbrevDecl;
>>>    }
>>> Index: lib/DebugInfo/DWARFUnit.cpp
>>> ===================================================================
>>> --- lib/DebugInfo/DWARFUnit.cpp
>>> +++ lib/DebugInfo/DWARFUnit.cpp
>>> @@ -143,12 +143,18 @@
>>>    if (DieArray.size() <= 1)
>>>      return;
>>>
>>> -  std::vector<DWARFDebugInfoEntryMinimal *> ParentChain;
>>> +  // Make sure the vector is never empty so that we don't have to
>>> +  // check for emptyness inside the loop bellow.
>>> +  std::vector<DWARFDebugInfoEntryMinimal *> ParentChain(1,&DieArray[0]);
>>>    DWARFDebugInfoEntryMinimal *SiblingChain = nullptr;
>>>    for (auto &DIE : DieArray) {
>>>      if (SiblingChain) {
>>>        SiblingChain->setSibling(&DIE);
>>>      }
>>> +    // Note that this will call setParent with itself on the root
>>> +    // node. This will result in a null offset which is handled as 'no
>>> +    // parent'.
>>> +    DIE.setParent(ParentChain.back());
>>>      if (const DWARFAbbreviationDeclaration *AbbrDecl =
>>>              DIE.getAbbreviationDeclarationPtr()) {
>>>        // Normal DIE.
>>> @@ -165,7 +171,7 @@
>>>      }
>>>    }
>>>    assert(SiblingChain == nullptr || SiblingChain == &DieArray[0]);
>>> -  assert(ParentChain.empty());
>>> +  assert(ParentChain.size() == 1 && ParentChain[0] == &DieArray[0]);
>>>  }
>>>
>>>  void DWARFUnit::extractDIEsToVector(
>>>
>>> EMAIL PREFERENCES
>>>   http://reviews.llvm.org/settings/panel/emailpreferences/
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150120/fc2504f6/attachment.html>


More information about the llvm-commits mailing list