[llvm] r191792 - Debug Info: remove duplication of DIEs when a DIE is part of the type system

Thu Oct 3 16:19:33 PDT 2013

On Thu, Oct 3, 2013 at 4:00 PM, Manman Ren <manman.ren at gmail.com> wrote:

>
>
>
> On Thu, Oct 3, 2013 at 1:48 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>>
>> On Thu, Oct 3, 2013 at 1:33 PM, Manman Ren <manman.ren at gmail.com> wrote:
>>
>>>
>>>
>>>
>>> On Thu, Oct 3, 2013 at 11:38 AM, David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>>>
>>>>
>>>>
>>>> On Thu, Oct 3, 2013 at 11:26 AM, Manman Ren <manman.ren at gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Oct 2, 2013 at 5:55 PM, Eric Christopher <echristo at gmail.com> wrote:
>>>>>
>>>>>> >
>>>>>> >>
>>>>>> >> I would like to revisit this (maybe revert this) later on when type units
>>>>>> >> are working.
>>>>>> >
>>>>>> >
>>>>>> > I'd rather revert this before type units (& then revisit this afterwards) as
>>>>>> > it may complicate the type unit work. I think the extra benefit (using
>>>>>> > direct ref_addr, rather than signatures) of this approach over type units
>>>>>> > may be implementable via an incremental improvement to type units.
>>>>>> >
>>>>>>
>>>>>> FWIW I've been following along with the thread and agree here. David
>>>>>> is actively working on type units and we should have something here
>>>>>> shortly. Let's let that work proceed and revert this for now.
>>>>>>
>>>>>
>>>>> I think we do agree that this approach has some extra benefit (using
>>>>> direct ref_addr, rather than signatures) over type units.
>>>>> The type-unit work can achieve what this approach does in removing DIE
>>>>> duplication, but using signatures rather than ref_addr.
>>>>> It may be possible to get the extra benefit via an incremental
>>>>> improvement to type units, but we are not certain about that.
>>>>>
>>>>
>>>> Agreed.
>>>>
>>>>
>>>>>
>>>>> I also want to emphasize that this patch, after wrapping up CU::getDIE as
>>>>> David suggested, is not big; the list of changes can be found at the end of
>>>>> this email.
>>>>>
>>>>
>>>> I don't quite understand this. The list of changes looks incomplete
>>>> (the insert*DIE functions in DwarfDebug and the maps they use, etc, are not
>>>> shown here) - are you describing this as "changes" as distinct from
>>>> "additions"? Even adding completely new code comes at a maintenance burden
>>>> that should be weighed against the value added.
>>>>
>>>
>>> Oops, I forgot to add the change to the header file: a single  map added
>>> to DwarfDebug and the corresponding access functions. (5 lines of code)
>>>
>>>
>>>> & this doesn't look like how I would expect the code to look after my
>>>> suggestion (for one thing, I think I was suggesting having one map in
>>>> DwarfDebug, rather than one per kind of thing - and simply having getDIE
>>>> (and insertDIE, for that matter) delegate based on a list of known tag
>>>> types that should be cross-CU versus CU-local)
>>>>
>>>
>>> Yes, that is what I am going to implement, the list I gave earlier is
>>> based on the current design, but the number of changes will be the same.
>>> Change to header file: 5 lines of code
>>> Change to CU::getDIE to delegate tags to DwarfDebug::getDIE: 5 lines of
>>> code
>>> The changes list in my earlier message: 20 lines of code
>>> I guess for type units, we need to wrap up addDIEEntry to use
>>> ref_signature, so part of the change can be reused by type units.
>>>
>>
>> I think it might be easier to discuss this in terms of reverting your
>> current patch and then looking at the complete new patch we're
>> discussing/proposing.
>>
>
> I am not proposing a complete new patch, I am going to change the current
> patch:
>

Yeah - I just think it'll be easier to discuss as a completely separate
patch, on top of Clang before r191792 (or, actually, with r191792
reverted), rather than as a mutation to your existing patch.

> 1> combine 3 maps into a single map
> 2> wrap up CU::getDIE so we don't need to modify from the existing
> CU::getDIE to DwarfDebug::getDIE
> These should not affect functionality.
>
>
>>
>>
>>>
>>>
>>>>
>>>>
>>>>> I don't quite get the complication caused by this approach to type
>>>>> units. When we start constructing a DIE for a type MDNode, we either use
>>>>> the type unit or we use
>>>>> the standard approach  of creating a DIE, my changes touch the
>>>>> standard approach a little by moving the map from CU to DwarfDebug.
>>>>>
>>>>
>>>> Yep, it might come out this cleanly, though I'm a little cautious about
>>>> putting all these things in together then trying to clean it up/evaluate
>>>> the changes appropriately.
>>>>
>>>>
>>>>> When referring to the type, we either use ref_signature or ref_addr or
>>>>> ref4.
>>>>>
>>>>> Even after type units start working, people can still choose to use
>>>>> either type units or ref_addr|ref4 for a period of time before their tool
>>>>> chains are updated to support type units.
>>>>> As for the tool chains, extra time may be required to set up a map
>>>>> from the type signature to the actual address in the Dwarf.
>>>>>
>>>>
>>>> Certainly - as I said, I don't doubt your change has some value over
>>>> type units. There's a question as to how much value and how best to fit it
>>>> in, if we do, to the existing code.
>>>>
>>>>
>>>>> The performance improvement of this patch is kind of straight-forward,
>>>>> if we have X CUs refer to the same type MDNode, X copies of type DIEs will
>>>>> be created
>>>>> without this patch.
>>>>>
>>>>
>>>> Yes, I agree that it has value.
>>>>
>>>>
>>>>>  I do have some numbers collected before, I will dig that out.
>>>>>
>>>>
>>>> Thanks, that'd be very helpful.
>>>>
>>> 3GB memory reduction for xalan built with lto -g (without removing DIE
>>> duplication, the memory usage is 7GB)
>>>
>>
>> OK, so you're looking at memory usage during LTO, not the size of the
>> resulting debug info? (that's fine, just trying to understand exactly what
>> metrics you're targeting)
>>
>> & this is peak total memory usage (how/what exactly are you measuring?)
>> during LTO?
>>
> Yes, the peak total memory usage when building xalan. As for the raw DWARF
> file size, it reduces the raw DWARF debug info from 58M to 7M.
>

Neat. (thanks for the numbers)

Does Clang self-host with LTO at the moment? I realize it's probably a
painfully big target to try to run, and perhaps without further improvement
it won't be achievable on reasonable hardware due to memory issues, but I
figured I'd ask because it's a nice common baseline that all the Clang/LLVM
developers can reproduce.

>
>
>>
>>>
>>>
>>>>
>>>> (but I'm also particularly interested in how much value this has over
>>>> type units - which is one reason I'd like to do this after type units,
>>>> either that or we'll need to add flags or somesuch to disable this
>>>> optimization so we can compare type units with and without this change)
>>>>
>>> The value will be the advantage of ref_addr over ref_signature after
>>> type units are fully working.
>>>
>>
>> Yes, I understand the mechanical benefit, when I speak of this I mean
>> looking at the actual numbers. The % benefit in whatever metrics we care
>> about.
>>
>>
>>> If I revert this change, we have a choice of modifying the standard
>>> route of ref4 to support ref_addr, as it is currently implemented;
>>>
>>
>> I don't quite understand what you're saying here.
>>
>
>>
>>>  or implementing this on top of type units (but we are not certain that
>>> we can implement this via an incremental improvement to type units, and
>>> what benefit did we gain compared
>>> to modifying the standard route).
>>>
>>
>> I'm not necessarily suggesting that this feature would be implemented in
>> a highly interrelated way with type units, but that seeing how it layers on
>> top of type units may be useful.
>>
>
>>
>>> But it provides us a huge memory reduction before type units are fully
>>> working and tool chains are updated.
>>>
>>
>> Tool chain support is a fair point, if lldb doesn't and won't support
>> type units any time soon, then the value of this feature over today's
>> behavior (or tomorrow's behavior with type units disabled, etc) is a
>> relevant metric to consider.
>>
>> I'm still inclined to prefer backing this patch out and discussing the
>> right design rather than trying to coerce this into the right design from
>> where it is now. It'll make the patch history and impact clearer.
>>
>
> Since we are already in discussion about this, let's reach a conclusion
> about what the right design for this feature is and what is the
> relationship of this feature (remove duplicated DIEs for a single MDNode
> and use ref_addr)
> to the type units stuff.
>
> If we are going to take a similar approach as this commit, I don't need to
> wait for the type units stuff to be working. I can revert the current patch
> and combine it with the changes suggested above into a single patch.
> 1> combine 3 maps into a single map
> 2> wrap up CU::getDIE so we don't need to modify from the existing
> CU::getDIE to DwarfDebug::getDIE
>
> I don't think it is worthwhile to wait for the type units working and see
> how this feature layers on top of type units.
>

I'm not entirely convinced of this, but I'm willing to discuss the design
of this feature in parallel as I work on type units - I don't want to hold
this patch back just to put type units in first for the sake of it, but I'm
still concerned about the interaction of these two patches (which is
usually a case of "whoever gets in first gets in first and the other guy
can deal with resolving the differences" - which I'm fine with, once we've
discussed and come to an appropriate conclusion on the design).

> I see this as complementary to the type units, especially if we want to
> support this feature with type units disabled.
> Do you have any suggestion on how to layer this feature on top of type
> units?
>

My thinking around layering this on top of type units was to move the
callback into DwarfDebug a little higher. In my prototype patch,
CompileUnit constructs the DIE for the type, then passes it up to
DwarfDebug to finish or reject - if DwarfDebug decides to create a type
unit, it builds the type unit, attaches the hash to the type, and returns
"accepted" to CompileUnit, which stops there, doing nothing more to the
type DIE. If DwarfDebug rejects the type (not creating a type unit),
CompileUnit continues on, building the full type description.
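
To make the accept/reject handshake above concrete, here is a minimal C++ sketch. It is not the prototype patch itself: every name in it (TypeDIESketch, DwarfDebugSketch, CompileUnitSketch, finishOrReject, constructType) is hypothetical, the "signature" is a simple counter standing in for a real type hash, and the decision to create a type unit is reduced to a single flag.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these names come from the LLVM sources.
struct TypeDIESketch {
  std::string Name;
  bool HasSignature = false;        // true only when a type unit owns the type
  std::uint64_t Signature = 0;      // placeholder for a real type hash
  std::vector<std::string> Members; // filled in only for the full description
};

class DwarfDebugSketch {
  bool UseTypeUnits;
  std::uint64_t NextSignature = 1; // assumption: a counter instead of a hash

public:
  explicit DwarfDebugSketch(bool UseTypeUnits) : UseTypeUnits(UseTypeUnits) {}

  // Returns true ("accepted") if a type unit was created for the DIE.
  bool finishOrReject(TypeDIESketch &D) {
    if (!UseTypeUnits)
      return false; // rejected: the caller builds the full type itself
    // A real implementation would build the type unit here.
    D.HasSignature = true;
    D.Signature = NextSignature++;
    return true;
  }
};

class CompileUnitSketch {
  DwarfDebugSketch &DD;

public:
  explicit CompileUnitSketch(DwarfDebugSketch &Owner) : DD(Owner) {}

  TypeDIESketch constructType(const std::string &Name,
                              const std::vector<std::string> &Members) {
    TypeDIESketch D;
    D.Name = Name;
    if (DD.finishOrReject(D))
      return D;          // accepted: stop, do nothing more to the type DIE
    D.Members = Members; // rejected: continue with the full description
    return D;
  }
};
```

In this sketch the CompileUnit never needs to know why DwarfDebug accepted or rejected a type; it only branches on the returned flag, which is the property the paragraph above relies on.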

For cross-CU cached types, essentially we'd just want to move the type
construction up a layer - rather than CompileUnit constructing an empty
DIE, we'd call DwarfDebug which would return us a DIE fully constructed, or
nothing. If DwarfDebug returned nothing, CompileUnit would construct a type
unit as normal.

The registration side of things might be more work (how does DwarfDebug
learn about existing types) - one way might be for DwarfDebug to always own
calling back into the CompileUnit (either the existing/current CU, or the
Type Unit it just constructed) to finish the work, then registering the
newly created type before returning back to CompileUnit.

This wouldn't solve the registration of functions and static member DIEs in
the referencing CUs.

I'm not sure how well that'll work. The alternative, which is more in line
with what we've already been discussing, as you say, may be
orthogonal/compatible with type units. Simply having "getDIE" query the tag
type and for some specific set of tags, delegate to DwarfDebug. Have
"insertDIE" do the same thing (factor out the test for the specific kind of
tag to a common function) and go with that.
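
A rough C++ sketch of that tag-based delegation, under stated assumptions: the class names, the Tag enum, and the particular set of tags treated as cross-CU are all made up for illustration and do not match the real DwarfDebug/CompileUnit interfaces. The point is only that getDIE and insertDIE share one predicate and that DwarfDebug holds a single map.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-ins for the real LLVM classes; names are illustrative.
enum class Tag : std::uint16_t {
  StructureType, Subprogram, Member, LexicalBlock, Variable
};

struct MDNode { Tag tag; }; // metadata node; only its tag matters here
struct DIE { Tag tag; };

// The one shared test: which tags are cached across CUs?
// The set chosen here is an assumption, not taken from the patch.
inline bool isShareableAcrossCUs(Tag T) {
  switch (T) {
  case Tag::StructureType:
  case Tag::Subprogram:
  case Tag::Member:
    return true;
  default:
    return false;
  }
}

class DwarfDebugSketch {
  // One map for every cross-CU kind, rather than one map per kind.
  std::unordered_map<const MDNode *, DIE *> GlobalMap;

public:
  DIE *getDIE(const MDNode *N) const {
    auto It = GlobalMap.find(N);
    return It == GlobalMap.end() ? nullptr : It->second;
  }
  void insertDIE(const MDNode *N, DIE *D) { GlobalMap[N] = D; }
};

class CompileUnitSketch {
  DwarfDebugSketch &DD;
  std::unordered_map<const MDNode *, DIE *> LocalMap; // CU-local entries

public:
  explicit CompileUnitSketch(DwarfDebugSketch &Owner) : DD(Owner) {}

  DIE *getDIE(const MDNode *N) const {
    if (isShareableAcrossCUs(N->tag))
      return DD.getDIE(N); // delegate cross-CU tags to DwarfDebug
    auto It = LocalMap.find(N);
    return It == LocalMap.end() ? nullptr : It->second;
  }

  void insertDIE(const MDNode *N, DIE *D) {
    assert(N && D && "cannot register a null node or DIE");
    if (isShareableAcrossCUs(N->tag))
      DD.insertDIE(N, D); // same predicate as getDIE
    else
      LocalMap[N] = D;
  }
};
```

With this shape, existing callers of CU::getDIE and CU::insertDIE would not need to change; only the two wrappers know that some tags live in the shared map.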

I haven't quite thought about the work list aspect of your patch yet. I'll
have to give that some consideration.

>
> As to the complication this can cause type units, I think we have some
> agreement:
>
>
>> I don't quite get the complication caused by this approach to type units.
>> When we start constructing a DIE for a type MDNode, we either use the type
>> unit or we use
>> the standard approach  of creating a DIE, my changes touch the standard
>> approach a little by moving the map from CU to DwarfDebug.
>>
>
>     Yep, it might come out this cleanly, though I'm a little cautious
> about putting all these things in together then trying to clean it
> up/evaluate the changes appropriately.
>
> Manman
>
>
>>
>> - David
>>
>>
>>>
>>> Manman
>>>
>>>
>>>>
>>>> The other problem is that removing these patches will be somewhat more
>>>> difficult after I've spent the next couple of weeks cleaning code up and
>>>> adding in type units. I think there's enough design discussion to have with
>>>> regards to how best to implement your feature that I'd rather have that
>>>> design discussion on top of type units rather than the other way around -
>>>> though it's possible we can come to a reasonable/good design in either
>>>> order.
>>>>
>>>>
>>>>>
>>>>> Thanks,
>>>>> Manman
>>>>>
>>>>> -  insertDIE(Ty, TyDIE);
>>>>> +  DD->insertTypeDIE(Ty, TyDIE);
>>>>>
>>>>> -  insertDIE(SP, SPDie);
>>>>> +  DD->insertSPDIE(SP, SPDie);
>>>>>
>>>>> -  insertDIE(DT, StaticMemberDIE);
>>>>> +  DD->insertStaticMemberDIE(DT, StaticMemberDIE);
>>>>>
>>>>> +  // Process the worklist to add attributes with the correct form (ref_addr or
>>>>> +  // ref4).
>>>>> +  for (unsigned I = 0, E = DIEEntryWorklist.size(); I < E; I++) {
>>>>> +    addDIEEntry(DIEEntryWorklist[I].Die, DIEEntryWorklist[I].Attribute,
>>>>> +                dwarf::DW_FORM_ref4, DIEEntryWorklist[I].Entry);
>>>>> +    assert(E == DIEEntryWorklist.size() &&
>>>>> +           "We should not add to the worklist during finalization.");
>>>>> +  }
>>>>> +
>>>>>
>>>>> +
>>>>> +/// When we don't know whether the correct form is ref4 or ref_addr, we create
>>>>> +/// a worklist item and insert it to DIEEntryWorklist.
>>>>> +void DwarfDebug::addDIEEntry(DIE *Die, uint16_t Attribute, uint16_t Form,
>>>>> +                             DIEEntry *Entry) {
>>>>> +  /// Early exit when we only have a single CU.
>>>>> +  if (GlobalCUIndexCount == 1 || Form != dwarf::DW_FORM_ref4) {
>>>>> +    Die->addValue(Attribute, Form, Entry);
>>>>> +    return;
>>>>> +  }
>>>>> +  DIE *DieCU = Die->checkCompileUnit();
>>>>> +  DIE *EntryCU = Entry->getEntry()->checkCompileUnit();
>>>>> +  if (!DieCU || !EntryCU) {
>>>>> +    // Die or Entry is not added to an owner yet.
>>>>> +    insertDIEEntryWorklist(Die, Attribute, Entry);
>>>>> +    return;
>>>>> +  }
>>>>> +  Die->addValue(Attribute,
>>>>> +         EntryCU == DieCU ? dwarf::DW_FORM_ref4 : dwarf::DW_FORM_ref_addr,
>>>>> +         Entry);
>>>>> +}
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> (and no, computing the hash isn't enough to matter time wise on
>>>>>> anything)
>>>>>>
>>>>>> -eric
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>