[LLVMdev] [Debug Info PATCH] for support of ref_addr and removal of DIE duplication

Wed Oct 16 22:32:12 PDT 2013

On Wed, Oct 16, 2013 at 9:10 PM, Manman Ren <manman.ren at gmail.com> wrote:

>
>
>
> On Wed, Oct 16, 2013 at 1:58 PM, David Blaikie <dblaikie at gmail.com> wrote:
>
>>
>>
>>
>> On Wed, Oct 16, 2013 at 1:21 PM, Manman Ren <manman.ren at gmail.com> wrote:
>>
>>>
>>> There are a few places where we break the assumption:
>>> 1> formal_parameter constructed in DwarfDebug when adding attribute type
>>>      we call SPCU->addType(Arg, ATy), where Arg does not belong to SPCU.
>>> 2> inlined_subroutine constructed in DwarfDebug when adding attribute
>>> abstract_origin
>>>      The inlined_subroutine does not belong to the CU we call
>>> addDIEEntry on.
>>>      We create the children of inlined_subroutine and call addDIEEntry
>>> before we add it to an owner.
>>> 3> ...
>>>
>>> When building xalan with "lto -g", I saw 3158 violations of the
>>> assumption. Should we try to fix all these violations to make the
>>> assumption true?
>>>
>>
>> As you've said, this assumption (that DIEs are always constructed by the
>> CU that will own them) isn't necessary for this patch - it will be
>> necessary for cu-at-a-time emission. Perhaps we should discuss it in a
>> separate thread rather than trying to make a stronger assumption than is
>> needed for this patch.
>>
>> Of course without your patch the assumption is trivially correct (well,
>> given your point (1) above, perhaps it wasn't - maybe we do do things out
>> of order and from outside the CU) and we might want to say that it's so
>> important that your patch shouldn't violate it - but I'm not sure that's
>> necessary (Eric?). I haven't given it a great deal of thought.
>>
>
>
> Patches are reattached for convenience: remove DIE duplication with a
> worklist (name it patch_with), remove DIE duplication without a worklist
> but an assertion when we emit a ref4, we make sure the DIE and the
> referenced DIE belong to the same CU (name it patch_without).
>

OK, we're going round in circles here and I'm not sure there are many other
ways I can communicate things.

> Without my patch, the assumption may be true, but it does not matter since
> we should always use ref4.
> I have provided some cases that the assertion fails with patch_without.
>

Yes, because it's a narrow assertion based on an assumption I've explained
is not true (due to adding DIEs to types after-the-fact, due to things like
implicit special members, member templates, and nested types).

The assertion you would need to check that a worklist is not required would
be a mapping listing the "assumptions" (ie: we found this parentless DIE
and assumed it was in the same DIE as this other DIE we're building) and
then, whenever a DIE is added to a parent that chains up to a CU we would
have to check that mapping to see if the assumption turned out to be true.

I don't mind if you implement this or not.

What I do mind and will not accept is committing the worklist version of
this patch without justification, and by that I mean a demonstrated example
where it is needed, not just where it causes an overly-narrow assertion to
fail, but where it causes us to produce the wrong DWARF. If you have such
an example, let's look at that and figure out how to fix that - it might
(probably isn't) not be the right answer to use a worklist, but simply to
rework the order of construction/registration/parenting.

>
> I didn't get a chance to implement another assertion we mentioned earlier
> (verify that inside addDIEEntry if a DIE does not have a parent,
> it should belong to "this" CU). If we want to check if the assumption
> holds without my patch, I can implement the assertion and check on
> Xalan.
>
> If the assumption does not hold even without my patch, are we okay with
> using a worklist?
>

No, because we're talking about the wrong assumption, so far as I can tell.
If I'm understanding you correctly you chose an overly narrow assumption
(that something without a parent belongs to the current CU, where it might
belong to another CU we're adding it to) that, while possibly sufficient,
isn't necessary to avoid the worklist.

> If the assumption does hold without my patch, but does not hold with my
> patch, are we okay with using a worklist? If not, we will need to fix the
> 3000+ violations (which may belong to <10 categories).
>

I'm still not following you here. If you take one of those failures, remove
the assertion, and run the non-worklist code, does it produce the wrong
DWARF?

>
>
>>
>>> As I stated earlier, this assumption, in my opinion, is not really
>>> general, and if we only make this assumption to remove a worklist, it may
>>> not worth it.
>>>
>>
>> There is still no discussion about "removing the worklist" - the only
>> discussion is about whether we need to /add/ the worklist. It's up to the
>> commiter to justify the addition of code, not for the community to justify
>> its removal. (I'd say this is /almost/ even true of committed code - if
>> something isn't tested, I'm likely to remove it and see what breaks - if
>> someone cared about it, they should've provided tests for it).
>>
> Agree. I was under the impression we are trying to remove the worklist.
>

Sorry, perhaps a phrasing problem. My point is that there is no worklist to
remove - we haven't added one yet. We must justify the addition, not the
removal. Which means you must justify why it should be included in your
patch - I don't have to justify why it shouldn't be included. The addition
of code/complexity must be justified by the demonstration, in the form of a
test, that shows that without that code we would produce the wrong answer.

Until we have such a test case in the patch, the worklist code should not
be committed.

>
>
>>
>>
>>> How can developers make sure this assumption is not later violated?
>>>
>>
>> We can add the kind of assertions we've discussed in this thread, keeping
>> a mapping every time we rely on the assumption and checking the mapping
>> whenever we resolve the parent chain. I don't think this is necessary, but
>> I wouldn't block such a change either.
>>
>>
>>> If we are going to make assumptions,
>>>
>>
>> We are always going to make assumptions - well, we're always going to
>> have invariants in our design where any violation of those invariants is a
>> bug.
>>
>>
>>> I will suggest making general assumptions such as
>>> 1> always use a CU to construct a DIE
>>> 2> Before constructing a DIE, figure out the owner CU first by chasing
>>> the context chain, then call construction on the owner CU.
>>> 3> Before calling addDIEEntry in DwarfDebug, find the owner CU first,
>>> then call addDIEEntry on the owner CU.
>>>
>>
>> I'm not sure how valuable this would be - perhaps highly, perhaps not.
>> It'd require more work for your patch, though.
>>
>
> Agree. These 3 will help emit-one-CU-at-a-time. I was thinking if we are
> going to enforce the single assumption, it may be easier to enforce 3
> general assumptions which can make emit-one-CU-at-a-time easier.
>

I'm just going to ignore the cu-at-a-time work/subthread for now, it may be
best to discuss that separately, unless you want to decide on a design for
the cross-CU-dies that is forward-looking enough to simplify cu-at-a-time
(ie: by never mutating a prior CU after it has been constructed) but I
don't think that would be helpful.

- David
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131016/e0e3acc9/attachment.html>