[PATCH] [DebugInfo] Add a getParent method to DWARFDebugInformationEntryMinimal.
David Blaikie
dblaikie at gmail.com
Tue Jan 20 11:15:56 PST 2015
On Mon, Jan 19, 2015 at 10:43 AM, Frédéric Riss <friss at apple.com> wrote:
>
> On Jan 19, 2015, at 9:42 AM, David Blaikie <dblaikie at gmail.com> wrote:
>
>
>
> On Mon, Jan 19, 2015 at 8:05 AM, Frédéric Riss <friss at apple.com> wrote:
>
>>
>> On Jan 17, 2015, at 2:39 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>
>>
>> On Fri, Jan 16, 2015 at 4:16 PM, Frederic Riss <friss at apple.com> wrote:
>>
>>> Hi samsonov, dblaikie,
>>>
>>> The DIE representation of libDebugInfo is designed to be minimal
>>> (as the class name implies). The information that's there allows
>>> to walk the DIE tree down, but you can't get back to the root
>>> from an arbitrary node.
>>>
>>> This patch adds this ability with the goal of using it in
>>> llvm-dsymutil. I could also use this information in dwarfdump
>>> itself to easily support dumping the full DIE hierarchy of a
>>> selected DIE when I finally implement a search option.
>>>
>>> The added information definitely has a cost: the size of
>>> the structure is increased by 8 bytes on 64 bits archs
>>> due to padding and this combined to the added logic makes
>>> the raw parsing between 5 and 10% slower.
>>>
>>> Just for llvm-dsymutil, this is optional. I can gather the
>>> parent chains in a separate path and store them, but I think
>>> that the performance/aded functionality benefit is worth
>>> it. WDYT?
>>>
>>
>> I doubt the performance of dwarfdump is terribly important - but I would
>> imagine the performance of dsymutil will be/is, so it might be worth
>> looking at the alternatives that don't increase the memory cost of all DIEs?
>>
>>
>> Yes speed is more important for dsymutil than dwarfdump. I was more
>> concerned about other clients like llvm-symbolizer that might also parse
>> the DIE tree. I fail to see the connection between that fact and the
>> conclusion you draw though. If we don’t store the parent in libDebugInfo, I
>> will have to store it in dsymutil anyway (as I already said, not such a big
>> deal).
>>
>
> I don't know the dsymutil use cases/features enough (read: at all) to know
> where you need this information - but I was assuming you wouldn't be able
> to reach a DIE object without having walked its parents, so you could keep
> just those parents you walked to get it - which is a lower memory overhead
> than having every DIE keep its parent all the time. Does that make sense
> (in general and in the dsymutil use cases)?
>
>
> dsymutil will walk the DIE tree and mark the DIEs it wants to keep.
>
At this point it's build in-memory DIE objects for every DIE (not just the
ones to keep), yes?
> When you do that, you also need to mark the referenced DIEs as kept, some
> of which might be forward references. And when marking a DIE as kept, you
> need to mark all its parents as such.
>
And to do this it reaches the referenced DIE by folliwing a cross-tree (not
parent-child) pointer that takes it straight to the referenced DIE? (I
don't know what the in-memory representation looks like at this level,
whether we just store the offset, or a pointer, etc).
So we're going down, across, then back up - fair enough. The only
alternative then would be, when we're building the cross-DIE references, we
could store a parent chain in the reference there. If there are relatively
few cross-DIE references that might be cheaper than keeping a parent
pointer in every node. (these parent chains could be shared, reference
counted by all the users pointing to a given node for example)
Not sure that's worth it, but a thought depending on what you think the
memory tradeoffs are likely to be like in this space for dsymutil.
In any case I think that's up to you, as the owner/developer of
llvm-dsymutil and I don't think llvm-dwarfdump should care either way. It's
not really a performance sensitive tool.
> Now, before you think too much about it, I first wrote that algorithm in a
> way that didn’t need the parent information beforehand, but this prevented
> me from generating the exact same result as the original dsymutil. So I
> needed to add a pass to compute the parent chains before doing the marking
> algorithm.
>
>
>> (I imagine a search query that responds with a path, rather than a node,
>> would be a reasonable API, for example - but I dunno)
>>
>>
>> To be more precise about the use I envisioned for dwarfdump itself: the
>> accelerator tables provide a mapping from names to DIE offset, allowing to
>> do fast names lookups. Let’s say I implement a —search=<name> option to
>> dwarfdump using those tables. If you just want to print the target DIE then
>> libDebugInfo handles that just fine. If you want to print the whole path
>> that leads to that DIE, you have to do some additional work that storing
>> the parent would avoid.
>>
>
> Right - as I was mentioning above, I would imagine this search query would
> have to walk the DIE hierarchy anyway (if you just jump into the actual
> object file memory at the specified offset you can reconstruct the parent
> chain easily anyway - so you have to at least walk the DIEs in some
> lightweight manner to reach the DIE you're searching for - so you can keep
> those parents you're handling and return them with as the search result).
>
>
> I have the feeling we are talking past each other a bit here, sorry if I’m
> just misunderstanding and restating obvious things. libDebugInfo provides
> an unconditional lightweight walk over all the DIEs and stores them in a
> vector. What the patch does is adding the parent information gathering to
> this initial walk. A libDebugInfoClient that wants the parent information
> can easily reconstruct it, but it requires another full walk over the DIEs.
>
> Using the accelerator table, I would like to do something like:
> uint64_t DIEOffset = Accel.lookup(Name);
> DebugInformationEntryMinimal *DIE = CUSection.getDIEForOffset(DIEOffset);
> printDIEWithParents(DIE);
>
> The last step isn’t possible today without an additional walk over the DIE
> vector. And I’d like your opinion (and Alexey’s) about the tradeoff
> involved in generating/storing the parent chains unconditionally in
> extractDIEsToVector(). Or of course, if you have an idea to gather the
> information during the initial walk without using additional memory when
> it’s not needed, then that’s even better!
>
> Fred
>
> It's just a thought,
>
> - David
>
>
>>
>> Fred
>>
>>
>>
>>>
>>> http://reviews.llvm.org/D7025
>>>
>>> Files:
>>> include/llvm/DebugInfo/DWARFDebugInfoEntry.h
>>> lib/DebugInfo/DWARFUnit.cpp
>>>
>>> Index: include/llvm/DebugInfo/DWARFDebugInfoEntry.h
>>> ===================================================================
>>> --- include/llvm/DebugInfo/DWARFDebugInfoEntry.h
>>> +++ include/llvm/DebugInfo/DWARFDebugInfoEntry.h
>>> @@ -30,13 +30,16 @@
>>> /// Offset within the .debug_info of the start of this entry.
>>> uint32_t Offset;
>>>
>>> + /// How many to substract to "this" to get the parent.
>>> + uint32_t ParentIdx;
>>> +
>>> /// How many to add to "this" to get the sibling.
>>> uint32_t SiblingIdx;
>>>
>>> const DWARFAbbreviationDeclaration *AbbrevDecl;
>>> public:
>>> DWARFDebugInfoEntryMinimal()
>>> - : Offset(0), SiblingIdx(0), AbbrevDecl(nullptr) {}
>>> + : Offset(0), ParentIdx(0), SiblingIdx(0), AbbrevDecl(nullptr) {}
>>>
>>> void dump(raw_ostream &OS, DWARFUnit *u, unsigned recurseDepth,
>>> unsigned indent = 0) const;
>>> @@ -66,6 +69,11 @@
>>> return SiblingIdx > 0 ? this + SiblingIdx : nullptr;
>>> }
>>>
>>> + // Returns the parent DIE or null if this is the root DIE.
>>> + const DWARFDebugInfoEntryMinimal *getParent() const {
>>> + return ParentIdx ? this - ParentIdx : nullptr;
>>> + }
>>> +
>>> // We know we are kept in a vector of contiguous entries, so we know
>>> // we don't need to store our child pointer, if we have a child it
>>> will
>>> // be the next entry in the list...
>>> @@ -82,6 +90,15 @@
>>> SiblingIdx = 0;
>>> }
>>>
>>> + void setParent(const DWARFDebugInfoEntryMinimal *Parent) {
>>> + if (Parent) {
>>> + // We know we are kept in a vector of contiguous entries, so we
>>> know
>>> + // our parent will be soewhere before "this".
>>> + ParentIdx = this - Parent;
>>> + } else
>>> + ParentIdx = 0;
>>> + }
>>> +
>>> const DWARFAbbreviationDeclaration *getAbbreviationDeclarationPtr()
>>> const {
>>> return AbbrevDecl;
>>> }
>>> Index: lib/DebugInfo/DWARFUnit.cpp
>>> ===================================================================
>>> --- lib/DebugInfo/DWARFUnit.cpp
>>> +++ lib/DebugInfo/DWARFUnit.cpp
>>> @@ -143,12 +143,18 @@
>>> if (DieArray.size() <= 1)
>>> return;
>>>
>>> - std::vector<DWARFDebugInfoEntryMinimal *> ParentChain;
>>> + // Make sure the vector is never empty so that we don't have to
>>> + // check for emptyness inside the loop bellow.
>>> + std::vector<DWARFDebugInfoEntryMinimal *> ParentChain(1,&DieArray[0]);
>>> DWARFDebugInfoEntryMinimal *SiblingChain = nullptr;
>>> for (auto &DIE : DieArray) {
>>> if (SiblingChain) {
>>> SiblingChain->setSibling(&DIE);
>>> }
>>> + // Note that this will call setParent with itself on the root
>>> + // node. This will result in a null offset which is handled as 'no
>>> + // parent'.
>>> + DIE.setParent(ParentChain.back());
>>> if (const DWARFAbbreviationDeclaration *AbbrDecl =
>>> DIE.getAbbreviationDeclarationPtr()) {
>>> // Normal DIE.
>>> @@ -165,7 +171,7 @@
>>> }
>>> }
>>> assert(SiblingChain == nullptr || SiblingChain == &DieArray[0]);
>>> - assert(ParentChain.empty());
>>> + assert(ParentChain.size() == 1 && ParentChain[0] == &DieArray[0]);
>>> }
>>>
>>> void DWARFUnit::extractDIEsToVector(
>>>
>>> EMAIL PREFERENCES
>>> http://reviews.llvm.org/settings/panel/emailpreferences/
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150120/fc2504f6/attachment.html>
More information about the llvm-commits
mailing list