[llvm] r191792 - Debug Info: remove duplication of DIEs when a DIE is part of the type system

David Blaikie dblaikie at gmail.com
Tue Oct 1 16:10:00 PDT 2013

> The idea is that we're going to have references across-CUs? Do you have
>> any idea if debuggers tolerate/handle this kind of referencing?
> Yes, the DIE will be referenced across-CUs. We have been using ref_addr
> for some cases,

In which cases do we already do this?

> and lldb is fine with it.

lldb isn't the only DWARF consumer.

I tried to move the mapping for MDNodes related to the type system, from
> CompileUnit to DwarfDebug, and I didn't include namespace in this commit.
> Theoretically, for everything that can be shared across CUs, we should add
> it to the DwarfDebug map to remove duplication.

Then it would be nice to do this as a single, generic solution over all
things that can be referenced rather than piecemeal with special cases for
each kind of thing, I think.

>> So a strawman alternative proposal would be: add all the MDNode/DIE
>> mappings to both the local CompileUnit and 'global' DwarfDebug maps and
>> have different lookup functions depending on the context - using the
>> DwarfDebug mapping when we want to reference anything for any reason, and
>> the CompileUnit mapping when we want to add children (eg: adding new things
>> to a namespace, etc) or attributes.
> I don't quite get why we need two mappings for one MDNode.

Because, say we have a namespace node - we need a CU-specific mapping so
that when we need to add entities to the namespace (another type, global
variable, etc) we can lookup the CU-local instance of the namespace. But
when we only need to refer to the namespace (eg: for a namespace alias) we
can lookup a cross-CU mapping.

This is exactly what you've already implemented - the mappings you added to
DwarfDebug are the second mapping - the first mapping already exists within
the CompileUnit and is still used for the cases where we want to add
members (at least I hope we are, because we need to do that in some cases).

> I tried to categorize MDNodes to two groups, one group belonging to
> CompileUnit (Variables etc that will be referenced in a single CU), the
> other group belonging to DwarfDebug (type system that can be shared across
> CUs).

I don't believe that will be sufficient, but I could be wrong. Namespaces
seem like a pretty solid example where a thing would need to be mapped in
both a CU-specific and cross-CU way.

For types, currently there are cases where we insert member DIEs into type
DIEs lazily (related to the vtable debug info optimization, wherein we emit
only a declaration of a type when that type is dynamic (has a vtable) and
the vtable isn't emitted in this translation unit). It might not be wrong
to have us insert those member DIEs in an existing instance of the type in
another compile_unit - but I'm hesitant to assume that's good and right.

>>>> But I'm not sure how effective this would be - like I said, might need
>>>> some thought/discussion to understand your solution & consider alternatives.
>>>>> Sometimes, when we try to add an attribute to a DIE, the DIE is not
>>>>> yet added
>>>>> to its owner yet, so we don't know whether we should use ref_addr or
>>>>> ref4.
>>>>> We create a worklist that will be processed during finalization to add
>>>>> attributes with the correct form (ref_addr or ref4).
>>>> Do you have an example of this as a test case? (I really wish we had an
>>>> easy way to see test coverage - so I could just tell at a glance whether
>>>> the worklist code is being exercised by your test cases... oh well)
>>> Yes, we don't always immediately add a newly-created DIE to a parent. So
>>> it is possible to have DIEs without an owner when calling addDIEEntry.
>>> One interesting example is:
>>> getOrCreateTypeDIE --> addToContextOwner --> getOrCreateContextDIE (the
>>> original type DIE can be referenced during the construction of the context,
>>> and when that happens, the type DIE does not have an owner yet)
>> But you know that the reference to that type DIE will be from within the
>> same CU as the type itself - since it's a child. So you know the reference
>> will be a ref4, no?
> For this case, yes, it is going to be ref4. But I am not sure whether we
> can say the same for all cases when a DIE is not added to an owner.
> It also means we need to pass in an extra flag telling addDIEEntry that we
> are sure it is a ref4 and it is not an easy task to track the flag across
> the chain of
> function calls: getOrCreateTypeDIE --> addToContextOwner -->
> getOrCreateContextDIE --> ... --> addDIEEntry.

Why would you need an extra flag? If we know that any time we come back
around to the same entity that isn't already inserted (ie: the condition
you already know because you use it to decide to add things to a work list)
it must be a local reference, so you answer the question immediately rather
than using a work list.

I could be wrong, but I want to have a good reason to do it some other way
not just "maybe sometimes this might not happen".

- David

> Thanks,
> Manman
>>>>> We add addDIEEntry to DwarfDebug to be a wrapper around DIE->addValue.
>>>>> It checks
>>>>> whether we know the correct form, if not, we update the worklist
>>>>> (DIEEntryWorklist).
>>>>> A testing case is added to show that we only create a single DIE for a
>>>>> type
>>>>> MDNode and we use ref_addr to refer to the type DIE.
>>>>> Modified:
>>>>>     llvm/trunk/lib/CodeGen/AsmPrinter/DIE.cpp
>>>>>     llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h
>>>>>     llvm/trunk/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
>>>>>     llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
>>>>>     llvm/trunk/lib/CodeGen/AsmPrinter/DwarfDebug.h
>>>>>     llvm/trunk/test/Linker/type-unique-simple-a.ll
>>>>> Modified: llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h
>>>>> URL:
>>>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h?rev=191792&r1=191791&r2=191792&view=diff
>>>>> ==============================================================================
>>>>> --- llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h (original)
>>>>> +++ llvm/trunk/lib/CodeGen/AsmPrinter/DIE.h Tue Oct  1 14:52:23 2013
>>>>> @@ -152,6 +152,9 @@ namespace llvm {
>>>>>      /// Climb up the parent chain to get the compile unit DIE this
>>>>> DIE belongs
>>>>>      /// to.
>>>>>      DIE *getCompileUnit();
>>>>> +    /// Similar to getCompileUnit, returns null when DIE is not added
>>>>> to an
>>>>> +    /// owner yet.
>>>>> +    DIE *checkCompileUnit();
>>>> Why is this not just "getParent() == NULL"? Why do you need a walk for
>>>> the DIE?
>>> Yeah, that should work.
>>> Thanks,
>>> Manman
>>>>>      void setTag(uint16_t Tag) { Abbrev.setTag(Tag); }
>>>>>      void setOffset(unsigned O) { Offset = O; }
>>>>>      void setSize(unsigned S) { Size = S; }
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20131001/385a6f2c/attachment.html>

More information about the llvm-commits mailing list