[PATCH] Ensure debug info for two calls to the same function from the same location are not merged

David Blaikie dblaikie at gmail.com
Wed Dec 3 13:12:16 PST 2014


On Thu, Nov 27, 2014 at 9:41 AM, Duncan P. N. Exon Smith <
dexonsmith at apple.com> wrote:

>
> > On 2014 Nov 24, at 12:32, David Blaikie <dblaikie at gmail.com> wrote:
> >
> > Hi dexonsmith, echristo, aprantl,
> >
> > Starting in r176895 Clang began including column information on function
> calls in an attempt to fix inline debug info - if two calls from the same
> location (prior to r176895 this meant the same line, after r176895 it meant
> the same line and column) were both inlined, they'd get coalesced and
> treated as a single inline call ({call location} x {original location} was
> the uniqueness used to differentiate callers, if this was the same, they
> were the same call).
> >
> > This solution (doubly so after it was narrowed in r177164) is incomplete
> (it wasn't applied to constructor calls (where it is still causing issues,
> such as the one worked around in r220923/PR21408) or member function calls)
> and insufficient: in the worst case, two calls from within a macro will be
> attributed to both the same line and column. It also produces possibly
> surprising and unintended side effects by adding column information to
> calls (increasing debug info size, surprising debuggers that aren't
> necessarily ready to cope with this, increasing the size of line tables,
> etc).
> >
> > So, I present herein, a more robust solution to the problem. When
> inlining, uniquely identify the metadata associated with the inlined
> location ("InlinedAt") by adding an extra field to the end with a circular
> reference. In this way no two callers can be conflated together.
> >
> > Once this is committed, I'll remove the hacks/workarounds from Clang. (&
> there's a test in Clang (test/CodeGenCXX/debug-info-same-line.cpp) that's
> currently testing this stuff in LLVM, so it'll need to be changed at the
> same time as this LLVM change (or removed/modified before it))
> >
> > http://reviews.llvm.org/D6390
> >
> > Files:
> >  lib/Transforms/Utils/InlineFunction.cpp
> >  test/DebugInfo/inline-debug-info-multiret.ll
> >  test/DebugInfo/inline-debug-info.ll
> >  test/DebugInfo/inline-no-debug-info.ll
> >  test/Transforms/Inline/debug-info-duplicate-calls.ll
> >  test/Transforms/Inline/debug-invoke.ll
> > <D6390.16575.patch>
>
>
> Have you looked at how many extra locations get created in an LTO
> bootstrap?


Nope - that'd be one of the reasons I sent this for review. I haven't
tried/got an LTO setup (open to instructions as to how) nor have much of an
idea of how you've been doing the stat gathering for your analysis here.


> Disabling uniquing for inlined locations seems like a pretty
> heavy hammer to me, and I worry that in contexts with heavy inlining it
> could cause an explosion.
>

Any case where this creates more nodes due to inlining are cases that are
buggy currently, so far as I know.

The point of adding the column info was to make every call site distinct
from every other call site - this uniquing just does the same thing but
entirely consistently, as far as I understand it.


> Assuming disabling uniquing is the right goal, I really don't like this
> self-reference pattern.  I'm hoping to have a patch out early next week
> (I've been on holiday this week) that splits metadata and value -- and
> one of the first follow-ups will be a change to MDNode that allows
> "opting-out" of uniquing.  Do you think it makes sense to wait and use
> that, rather than introducing and pulling out an extra field?
>

Another reason I sent this for review - I don't mind much waiting/how it
gets done, just that it gets done.


> Idea: assume we solved the "incomplete" problems, leaving just (?) the
> macro problem.


Not sure what you mean by this - the macro problem is sort of the example
of how the current solution is incomplete.


> Doesn't clang keep track of the locations from inside
> the macros?  Could we, in theory, treat macro invocations as inlined
> locations from the macro definitions, at least for the sake of defining
> DILocations/DebugLocs?
>

This, I think, would just create as many locations (and would make them
pre-emptively, as the current solution does - even when no inlining occurs)
as the uniquing approach I'm proposing here (essentially any attempt to
unique from the frontend produces unique locations for every call site in
the IR - doing it during inlining means we don't need to unique every call
site, only those that actually get inlined, which seems like a strict
improvement).


> Another idea (a little less optimistic) that might reduce the scope of
> the change:  can we detect call sites that won't be correctly uniqued,
> and turn off uniquing on them from the start?  This would be a way to
> limit the number of DILocation sites that lost uniquing.


It's only the "inlined at" ones that need uniquing (& the only ones I'm
adding it to with this patch). These /could/ be reused for two calls from
the same location to distinct functions, so we could be a bit fancier if we
can opt in/out to each debug location, but I'm not sure whether that's of
much/any benefit.


> (I guess I'm
> assuming here that turning off uniquing for DILocation is a bad thing.
> I'm almost certain that the way `DebugLoc` is currently implemented and
> used, this would cause a big problem.  I'm not sure, outside of
> `DebugLoc`, how much uniquing matters for `DILocation` though.  After
> the metadata-value split -- and `DebugLoc` is just a pointer to an
> `MDNode` -- it may not matter at all.)
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20141203/f37711f1/attachment.html>


More information about the llvm-commits mailing list