[PATCH] D70121: [DebugInfo][LDV] Teach LDV how to identify source variables and handle fragments
Vedant Kumar via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Jan 30 17:09:10 PST 2020
vsk added a comment.
@Orlando thanks for your patient and thorough explanation, I really appreciate it!
In D70121#1849741 <https://reviews.llvm.org/D70121#1849741>, @Orlando wrote:
> @vsk this is the situation as far as I understand it. I appreciate that this is
> a lot of text; there is a summary at the end. I hope this helps.
>
> > Apologies, I'm not up-to-date on this patch and would appreciate some
> > clarification / a few short examples of what overlapping fragments are. Is it
> > when you have adjacent dbg.value / DBG_VALUEs, s.t. there is some overlap in
> > the bits they describe?
>
> Yes, except adjacency is not part of that relationship.
>
> Example 5
> DBG_VALUE %0, $noreg, "x", DIExpression() ; DV1
> ...
> DBG_VALUE %0, $noreg, "y", DIExpression(DW_OP_LLVM_fragment 0 16) ; DV2
> ...
> DBG_VALUE %1, $noreg, "x", DIExpression() ; DV3
> ...
> DBG_VALUE %0, $noreg, "y", DIExpression(DW_OP_LLVM_fragment 8 8) ; DV4
>
>
> DV1, DV3, have fully overlapping fragments. DV2 and DV4 have partially
> overlapping fragments.
>
> So, in your example 4 the first and last DBG_VALUEs partially overlap.
>
> > are these valid examples?
>
>
>
> Example 1: Correct - I will explicitly state "full overlap" if I must refer
> to this case.
> Example 2: Correct.
> Example 3: Correct.
> Example 4: Not quite, that's still a partial overlap. See above.
Got it.
>
>
>> I'm fuzzy on the definition here so apologies for the probably naive question,
>> but: why are overlaps a problem?
>
> Having to consider fragment overlaps complicates value liveness queries for any
> particular bits of a source variable. 'value' here refers to the combination of
> a location (e.g. vreg, const) and DIExpression (because this may add a modifier
> to the location).
>
>> Would you mind following up on the review for D71478 <https://reviews.llvm.org/D71478> describing what it's
>> missing?
>
> Ah, looking closely at D71478 <https://reviews.llvm.org/D71478> the pass will never produce wrong debuginfo. It
> doesn't remove any dbg.values which _may overlap_ by ignoring the fragment part.
> It takes a conservative approach, but this has no effect on the final debuginfo
> so it turns out it isn't a very compelling example.
>
> In the Example 6 below it would be safe to remove DV3, but the pass
> implemented in D71478 <https://reviews.llvm.org/D71478> will not currently do so because DV3 and DV2 _may
> overlap_.
>
> Example 6
> dbg.value %1, "x", Fragment(0, 16) ; DV1
> ...
> dbg.value %2, "x", Fragment(16, 16) ; DV2
> ...
> dbg.value %1, "x", Fragment(0, 16) ; DV3
>
>
>> why are overlaps a problem? (continued)
>
> LiveDebugVariables (LDV) removes DBG_VALUEs before regalloc and reinserts after.
> It cannot afford to be so conservative because this reduces coverage. Before I
> explain this I have to show you what is wrong with LDV currently.
>
> LDV uses interval maps to explicitly represent the intervals that the removed
> DBG_VALUEs implicitly described. The DBG_VALUEs are filtered into an interval
> map based on their { Source Variable, DIExpression }. The interval map will
> coalesce adjacent entries that use the same location.
>
> Consider (interval maps before coalescing on the left):
>
> Example 7
> +------------DBG_VALUE %0, $noreg, "x", DIExpression() ; DV1
> | ...
> | +----DBG_VALUE %1, $noreg, "x", DIExpression(DW_OP_plus_uconst, 4, DW_OP_stack_value) ; DV2
> | | ...
> +-------|----DBG_VALUE %0, $noreg, "x", DIExpression() ; DV3
> | |
> DV1(%0) DV2(%1)
> DV3(%0)
>
>
> This approach generates incorrect debuginfo. The interval map coalesces DV3 into
> DV1 because they use the same location.
Let me see if I can restate that correctly. So, the problem here is: for the list of user-values tracked by the interval map for the range containing [DV1, DV3], there are two entries: one for {DV1, DV3} (coalesced), and another for {DV2}. This would result in "x" being interpreted as `%1 + 4` instead of `%0` in that interval (by the debugger)?
> Taking the same approach as D71478 <https://reviews.llvm.org/D71478> and filtering only on { Source Variable }
> would fix the bug described above but also may greatly reduce coverage. For
> example:
IIUC that would essentially make fragments not 'work', or appear fully, in debug sessions? As you'd only ever see the the last fragment described in an interval?
>
>
> Example 8
> +----DBG_VALUE %0, "x", (DW_OP_LLVM_fragment 0 8) ; DV1
> | ...
> +----DBG_VALUE %1, "x", (DW_OP_LLVM_fragment 8 8) ; DV2
> |
> +----DBG_VALUE %0, "x", (DW_OP_LLVM_fragment 0 8) ; DV3
> |
> DV1(%0)
> DV2(%1)
> DV3(%0)
>
>
> In Example 8 DV3 is not coalesced into DV1. Imagine that regalloc wants to
> spill %0:
>
> Example 9
> +----DBG_VALUE %0, "x", (DW_OP_LLVM_fragment 0 8) ; DV1
> | ...
> +----DBG_VALUE %1, "x", (DW_OP_LLVM_fragment 8 8) ; DV2
> | <SPILL %0 into STACK0>
> +----DBG_VALUE %0, "x", (DW_OP_LLVM_fragment 0 8) ; DV3
> |
> DV1(%0)
> DV2(%1) <----| %0 is not used in this interval in this map.
> DV3(%0) | Really, DV2 should not be in this map.
>
>
> Because the spill range isn't represented in the interval map we just lose that
> coverage.
>
> Patch D66415 <https://reviews.llvm.org/D66415> changes this so that the DBG_VALUEs are filtered into interval maps
> based on { Source Variable, Fragment } and will coalesce adjacent entries that
> use the same { Location, DIExpression }.
Ah, sgtm, then.
> This method correctly handles fully overlapping fragments and non-overlapping
> fragments of the same Source Variable and fixes case [0] and [1]. However, it
> still produces incorrect debuginfo when presented with partially overlapping
> fragments in much the same way:
>
> Example 10
> +-------------------DBG_VALUE %0, "x", (DW_OP_LLVM_fragment 0 16) ; DV1
> | ...
> | +----DBG_VALUE %1, "x", (DW_OP_LLVM_fragment 0 8) ; DV2
> | | ...
> +--------------|----DBG_VALUE %0, "x", (DW_OP_LLVM_fragment 0 16) ; DV3
> | |
> DV1(%0, Expr0) DV2(%1, Expr1)
> DV3(%0, Expr0)
>
>
> DV2 will get filtered into a separate interval map and therefore DV3 and DV1
> will be incorrectly coalesced.
I don't think we should ban these partial overlaps from showing up in IR, as I can imagine situations where this sort of IR is legitimately describing the known bits in a value. What we might consider doing to make LDV more conservative is a) identifying partial overlaps in a dbg.value pack using a backwards scan and b) deleting any earlier dbg.values which overlap with bits described later in the pack. That should be conservatively correct, and wouldn't require changing the interval map structure?
> So, the patch D66415 <https://reviews.llvm.org/D66415> may cause incorrect debuginfo, but in practice it is should
> cause much less than the current implementation because we are more building more
> accurate intervals for any given bits.
>
>> ^ Sorry, I didn't quite parse this definition.
>
> TL;DR The upper bound of DBG_VALUEs coming into LDV with partially overlapping
> fragments in a clang 3.4 RelWithDebInfo build is around 0.00005%.
>
> Summary and questions
>
> ---
>
> Partially overlapping fragments (between DBG_VALUEs) are allowed in IR and MIR
> but they are not handled everywhere. For a large C++ codebase I saw a
> negligible number of partial overlaps coming into LiveDebugVariablesq.
>
> This patch (in theory) fixes the problem of overlapping fragments for LDV. But
> it is large and difficult to review, so:
>
> 1. Please can we accept a patch that looks like D66415 <https://reviews.llvm.org/D66415> (plus FIXME: overlaps)? It will produce incorrect debuginfo given partially overlapping fragments, but it is still strictly an improvement on the current implementation, fixes [0] and [1], and is far simpler than this patch which does handle partial overlaps.
Sgtm, I very strongly feel that we should handle the 99.99995% case first and then handle the rest later.
> My numbers are only anecdotal and there may be languages / situations which
> necessitate partially overlapping fragments (e.g. C allows much more type
> punning than C++). With this in mind:
>
> 2. It seems difficult to actually get clang to produce these pesky overlaps. Do you know of any cases where overlapping fragments in either IR or MIR are _required_ to produce correct debuginfo?
Anything I'd write here would be highly contrived, but I don't believe that these overlaps are senseless. E.g. we see this in IR all the time:
dbg.value(i32 0, "x", DIExpression())
dbg.value(i32 1, "x", DIExpression())
and this is seen as well-defined.
> Hypothetically, if it were feasible to ban partial overlaps, then D66415
> would be fully correct and it would be very easy to make D71478 optimal.
>
>
> Thanks,
> Orlando
>
> [0] https://bugs.llvm.org/show_bug.cgi?id=41992
> [1] https://bugs.llvm.org/show_bug.cgi?id=43957
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D70121/new/
https://reviews.llvm.org/D70121
More information about the llvm-commits
mailing list