[llvm] r191792 - Debug Info: remove duplication of DIEs when a DIE is part of the type system

Tue Oct 1 15:58:07 PDT 2013

On Tue, Oct 1, 2013 at 2:34 PM, David Blaikie <dblaikie at gmail.com> wrote:

>
>
>
> On Tue, Oct 1, 2013 at 2:18 PM, Manman Ren <manman.ren at gmail.com> wrote:
>
>>
>>
>>
>> On Tue, Oct 1, 2013 at 1:57 PM, David Blaikie <dblaikie at gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Tue, Oct 1, 2013 at 12:52 PM, Manman Ren <manman.ren at gmail.com>wrote:
>>>
>>>> Author: mren
>>>> Date: Tue Oct  1 14:52:23 2013
>>>> New Revision: 191792
>>>>
>>>> URL: http://llvm.org/viewvc/llvm-project?rev=191792&view=rev
>>>> Log:
>>>> Debug Info: remove duplication of DIEs when a DIE is part of the type
>>>> system
>>>> and it is shared across CUs.
>>>>
>>>
>>> This may require some discussion/consideration.
>>>
>>> The first thing that comes to my mind is that this is changing the DIE
>>> structure from trees to DAGs and, perhaps importantly, is going to make
>>> parent/child relationships incomplete (ie: A child of B no longer implies
>>> that B is the parent of A) which is a bit concerning.
>>>
>>
>> Hi David,
>>
>> Thanks for reviewing the commit. I don't quite get why the parent/child
>> relationship is going to change.
>> The patch removes duplication of DIEs so we don't create one DIE for each
>> CU that corresponds to the same MDNode.
>> We are not making the DIE to have multiple parents. Instead we may refer
>> to the DIE in another CU.
>>
>
> Yep, my mistake - this won't affect parent/child relationships, just
> references.
>
> The idea is that we're going to have references across-CUs? Do you have
> any idea if debuggers tolerate/handle this kind of referencing?
>
Yes, the DIE will be referenced across-CUs. We have been using ref_addr for
some cases, and lldb is fine with it.

>
>
>>
>>
>>>
>>>
>>>>
>>>> We add a few maps in DwarfDebug to map MDNodes for the type system to
>>>> the
>>>> corresponding DIEs: MDTypeNodeToDieMap, MDSPNodeToDieMap, and
>>>> MDStaticMemberNodeToDieMap. These DIEs can be shared across CUs, that
>>>> is why we
>>>> keep the maps in DwarfDebug instead of CompileUnit.
>>>>
>>>
>>> Why nod simply map MDNode* to DIE the same as we do in the CompileUnit?
>>> We should be able to do this for all DIEs, no? (essentially just move the
>>> map up from CompileUnit to DwarfDebug)
>>>
>>
>> Yes, we can use a single map instead of 3 maps, I figured that with 3
>> maps, each map can be smaller and it is faster to query.
>>
>
> I tend to agree, to a degree. But what I meant is that we might be able to
> make support for this more generic, rather than only doing this for
> specific kinds of entities. If the DIE map moved up from CompileUnit to
> DwarfDebug we'd just generically add/check for things there, rather than
> doing this for specific elements.
>
> But maybe this won't work - perhaps that would cause us to find, say, a
> namespace from one CU and add elements from another CU to that namespace.
>
> I suppose we'd need a different lookup for when we're adding children or
> attributes, rather than just adding references. Which is perhaps where
> you've already arrived (sorry, have to think through this all to catch up
> to you) - adding only elements that we might want to reference (well, no,
> that's incomplete - what about a namespace alias? In that case we could,
> technically, reference a namespace from an existing CU rather than
> introducing a new copy in the current CU) and then looking them up in the
> relevant contexts.
>
I tried to move the mapping for MDNodes related to the type system, from
CompileUnit to DwarfDebug, and I didn't include namespace in this commit.
Theoretically, for everything that can be shared across CUs, we should add
it to the DwarfDebug map to remove duplication.

>
> So a strawman alternative proposal would be: add all the MDNode/DIE
> mappings to both the local CompileUnit and 'global' DwarfDebug maps and
> have different lookup functions depending on the context - using the
> DwarfDebug mapping when we want to reference anything for any reason, and
> the CompileUnit mapping when we want to add children (eg: adding new things
> to a namespace, etc) or attributes.
>
I don't quite get why we need two mappings for one MDNode.
I tried to categorize MDNodes to two groups, one group belonging to
CompileUnit (Variables etc that will be referenced in a single CU), the
other group belonging to DwarfDebug (type system that can be shared across
CUs).

>
>
>>
>>> But I'm not sure how effective this would be - like I said, might need
>>> some thought/discussion to understand your solution & consider alternatives.
>>>
>>>
>>>>
>>>> Sometimes, when we try to add an attribute to a DIE, the DIE is not yet
>>>> added
>>>> to its owner yet, so we don't know whether we should use ref_addr or
>>>> ref4.
>>>> We create a worklist that will be processed during finalization to add
>>>> attributes with the correct form (ref_addr or ref4).
>>>>
>>>
>>> Do you have an example of this as a test case? (I really wish we had an
>>> easy way to see test coverage - so I could just tell at a glance whether
>>> the worklist code is being exercised by your test cases... oh well)
>>>
>> Yes, we don't always immediately add a newly-created DIE to a parent. So
>> it is possible to have DIEs without an owner when calling addDIEEntry.
>> One interesting example is:
>> getOrCreateTypeDIE --> addToContextOwner --> getOrCreateContextDIE (the
>> original type DIE can be referenced during the construction of the context,
>> and when that happens, the type DIE does not have an owner yet)
>>
>
> But you know that the reference to that type DIE will be from within the
> same CU as the type itself - since it's a child. So you know the reference
> will be a ref4, no?
>
For this case, yes, it is going to be ref4. But I am not sure whether we
can say the same for all cases when a DIE is not added to an owner.
It also means we need to pass in an extra flag telling addDIEEntry that we
are sure it is a ref4 and it is not an easy task to track the flag across
the chain of
function calls: getOrCreateTypeDIE --> addToContextOwner -->
getOrCreateContextDIE --> ... --> addDIEEntry.

Thanks,
Manman

>
>>
>>
>>>
>>>
>>>>
>>>> We add addDIEEntry to DwarfDebug to be a wrapper around DIE->addValue.
>>>> It checks
>>>> whether we know the correct form, if not, we update the worklist
>>>> (DIEEntryWorklist).
>>>>
>>>> A testing case is added to show that we only create a single DIE for a
>>>> type
>>>> MDNode and we use ref_addr to refer to the type DIE.
>>>>
>>>> Modified:
>>>>     llvm/trunk/lib/CodeGen/AsmPrinter/DIE.cpp
>>>>     llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h
>>>>     llvm/trunk/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
>>>>     llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
>>>>     llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h
>>>>     llvm/trunk/test/Linker/type-unique-simple-a.ll
>>>>
>>>
>>>> Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h
>>>> URL:
>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h?rev=191792&r1=191791&r2=191792&view=diff
>>>>
>>>> ==============================================================================
>>>> --- llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h (original)
>>>> +++ llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h Tue Oct  1 14:52:23 2013
>>>> @@ -152,6 +152,9 @@ namespace llvm {
>>>>      /// Climb up the parent chain to get the compile unit DIE this DIE
>>>> belongs
>>>>      /// to.
>>>>      DIE *getCompileUnit();
>>>> +    /// Similar to getCompileUnit, returns null when DIE is not added
>>>> to an
>>>> +    /// owner yet.
>>>> +    DIE *checkCompileUnit();
>>>>
>>>
>>> Why is this not just "getParent() == NULL"? Why do you need a walk for
>>> the DIE?
>>>
>> Yeah, that should work.
>>
>> Thanks,
>> Manman
>>
>>>
>>>
>>>>      void setTag(uint16_t Tag) { Abbrev.setTag(Tag); }
>>>>      void setOffset(unsigned O) { Offset = O; }
>>>>      void setSize(unsigned S) { Size = S; }
>>>>
>>>>
>>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131001/da5680a9/attachment.html>