[llvm] r191792 - Debug Info: remove duplication of DIEs when a DIE is part of the type system

Tue Oct 1 17:58:24 PDT 2013

On Tue, Oct 1, 2013 at 4:10 PM, David Blaikie <dblaikie at gmail.com> wrote:

>
>  The idea is that we're going to have references across-CUs? Do you have
>>> any idea if debuggers tolerate/handle this kind of referencing?
>>>
>> Yes, the DIE will be referenced across-CUs. We have been using ref_addr
>> for some cases,
>>
>
> In which cases do we already do this?
>
In DwarfDebug::updateSubprogramScopeDIE, when we add abstract_origin of a
subprogram.
I agree that we are now using ref_addr broadly for LTO builds.

>
>
>> and lldb is fine with it.
>>
>
> lldb isn't the only DWARF consumer.
>
Other debuggers should work with ref_addr.

>
> I tried to move the mapping for MDNodes related to the type system, from
>> CompileUnit to DwarfDebug, and I didn't include namespace in this commit.
>> Theoretically, for everything that can be shared across CUs, we should
>> add it to the DwarfDebug map to remove duplication.
>>
>
> Then it would be nice to do this as a single, generic solution over all
> things that can be referenced rather than piecemeal with special cases for
> each kind of thing, I think.
>
>
>>
>>
>>>
>>> So a strawman alternative proposal would be: add all the MDNode/DIE
>>> mappings to both the local CompileUnit and 'global' DwarfDebug maps and
>>> have different lookup functions depending on the context - using the
>>> DwarfDebug mapping when we want to reference anything for any reason, and
>>> the CompileUnit mapping when we want to add children (eg: adding new things
>>> to a namespace, etc) or attributes.
>>>
>> I don't quite get why we need two mappings for one MDNode.
>>
>
> Because, say we have a namespace node - we need a CU-specific mapping so
> that when we need to add entities to the namespace (another type, global
> variable, etc) we can lookup the CU-local instance of the namespace. But
> when we only need to refer to the namespace (eg: for a namespace alias) we
> can lookup a cross-CU mapping.
>
If we already have multiple DIEs for the same MDNode (that is why we keep a
CU-specific mapping, right?), when referring to the namespace DIE, can we
just use CU-specific mapping?
I am not familiar with how a namespace alias works, so don't quite know why
we need a cross-CU mapping when referring to the namespace.

> This is exactly what you've already implemented - the mappings you added
> to DwarfDebug are the second mapping - the first mapping already exists
> within the CompileUnit and is still used for the cases where we want to add
> members (at least I hope we are, because we need to do that in some cases).
>
Oops, I moved the mapping from CompileUnit to DwarfDebug. Are you referring
to "the namespace alias" by "some cases"?

>
>
>> I tried to categorize MDNodes to two groups, one group belonging to
>> CompileUnit (Variables etc that will be referenced in a single CU), the
>> other group belonging to DwarfDebug (type system that can be shared across
>> CUs).
>>
>
> I don't believe that will be sufficient, but I could be wrong. Namespaces
> seem like a pretty solid example where a thing would need to be mapped in
> both a CU-specific and cross-CU way.
>
> For types, currently there are cases where we insert member DIEs into type
> DIEs lazily (related to the vtable debug info optimization, wherein we emit
> only a declaration of a type when that type is dynamic (has a vtable) and
> the vtable isn't emitted in this translation unit).
>
Do we have an existing testing case for this? I can try to extend it to
multiple CUs to check what will happen.

> It might not be wrong to have us insert those member DIEs in an existing
> instance of the type in another compile_unit - but I'm hesitant to assume
> that's good and right.
>
>
>>
>>
>>>
>>>
>>>>
>>>>> But I'm not sure how effective this would be - like I said, might need
>>>>> some thought/discussion to understand your solution & consider alternatives.
>>>>>
>>>>>
>>>>>>
>>>>>> Sometimes, when we try to add an attribute to a DIE, the DIE is not
>>>>>> yet added
>>>>>> to its owner yet, so we don't know whether we should use ref_addr or
>>>>>> ref4.
>>>>>> We create a worklist that will be processed during finalization to add
>>>>>> attributes with the correct form (ref_addr or ref4).
>>>>>>
>>>>>
>>>>> Do you have an example of this as a test case? (I really wish we had
>>>>> an easy way to see test coverage - so I could just tell at a glance whether
>>>>> the worklist code is being exercised by your test cases... oh well)
>>>>>
>>>> Yes, we don't always immediately add a newly-created DIE to a parent.
>>>> So it is possible to have DIEs without an owner when calling addDIEEntry.
>>>> One interesting example is:
>>>> getOrCreateTypeDIE --> addToContextOwner --> getOrCreateContextDIE (the
>>>> original type DIE can be referenced during the construction of the context,
>>>> and when that happens, the type DIE does not have an owner yet)
>>>>
>>>
>>> But you know that the reference to that type DIE will be from within the
>>> same CU as the type itself - since it's a child. So you know the reference
>>> will be a ref4, no?
>>>
>> For this case, yes, it is going to be ref4. But I am not sure whether we
>> can say the same for all cases when a DIE is not added to an owner.
>> It also means we need to pass in an extra flag telling addDIEEntry that
>> we are sure it is a ref4 and it is not an easy task to track the flag
>> across the chain of
>> function calls: getOrCreateTypeDIE --> addToContextOwner -->
>> getOrCreateContextDIE --> ... --> addDIEEntry.
>>
>
> Why would you need an extra flag? If we know that any time we come back
> around to the same entity that isn't already inserted (ie: the condition
> you already know because you use it to decide to add things to a work list)
> it must be a local reference, so you answer the question immediately rather
> than using a work list.
>
How can we tell that we are coming back around to the same entity? Maybe I
misunderstood something here.
When we see a DIE without an owner and add things to a work list, I don't
think there is a guarantee that the DIE and the Entry belong to the same CU.

Thanks,
Manman

>
> I could be wrong, but I want to have a good reason to do it some other way
> not just "maybe sometimes this might not happen".
>
> - David
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131001/3adefa0d/attachment.html>