[PATCH] Ensure debug info for two calls to the same function from the same location are not merged

Thu Nov 27 09:41:34 PST 2014

> On 2014 Nov 24, at 12:32, David Blaikie <dblaikie at gmail.com> wrote:
> 
> Hi dexonsmith, echristo, aprantl,
> 
> Starting in r176895 Clang began including column information on function calls in an attempt to fix inline debug info - if two calls from the same location (prior to r176895 this meant the same line, after r176895 it meant the same line and column) were both inlined, they'd get coalesced and treated as a single inline call ({call location} x {original location} was the uniqueness used to differentiate callers, if this was the same, they were the same call).
> 
> This solution (doubly so after it was narrowed in r177164) is incomplete (it wasn't applied to constructor calls (where it is still causing issues, such as the one worked around in r220923/PR21408) or member function calls) and insufficient: in the worst case, two calls from within a macro will be attributed to both the same line and column. It also produces possibly surprising and unintended side effects by adding column information to calls (increasing debug info size, surprising debuggers that aren't necessarily ready to cope with this, increasing the size of line tables, etc).
> 
> So, I present herein, a more robust solution to the problem. When inlining, uniquely identify the metadata associated with the inlined location ("InlinedAt") by adding an extra field to the end with a circular reference. In this way no two callers can be conflated together.
> 
> Once this is committed, I'll remove the hacks/workarounds from Clang. (& there's a test in Clang (test/CodeGenCXX/debug-info-same-line.cpp) that's currently testing this stuff in LLVM, so it'll need to be changed at the same time as this LLVM change (or removed/modified before it))
> 
> http://reviews.llvm.org/D6390
> 
> Files:
>  lib/Transforms/Utils/InlineFunction.cpp
>  test/DebugInfo/inline-debug-info-multiret.ll
>  test/DebugInfo/inline-debug-info.ll
>  test/DebugInfo/inline-no-debug-info.ll
>  test/Transforms/Inline/debug-info-duplicate-calls.ll
>  test/Transforms/Inline/debug-invoke.ll
> <D6390.16575.patch>

Have you looked at how many extra locations get created in an LTO
bootstrap?  Disabling uniquing for inlined locations seems like a pretty
heavy hammer to me, and I worry that in contexts with heavy inlining it
could cause an explosion.

Assuming disabling uniquing is the right goal, I really don't like this
self-reference pattern.  I'm hoping to have a patch out early next week
(I've been on holiday this week) that splits metadata and value -- and
one of the first follow-ups will be a change to MDNode that allows
"opting-out" of uniquing.  Do you think it makes sense to wait and use
that, rather than introducing and pulling out an extra field?

Idea: assume we solved the "incomplete" problems, leaving just (?) the
macro problem.  Doesn't clang keep track of the locations from inside
the macros?  Could we, in theory, treat macro invocations as inlined
locations from the macro definitions, at least for the sake of defining
DILocations/DebugLocs?

Another idea (a little less optimistic) that might reduce the scope of
the change:  can we detect call sites that won't be correctly uniqued,
and turn off uniquing on them from the start?  This would be a way to
limit the number of DILocation sites that lost uniquing.  (I guess I'm
assuming here that turning off uniquing for DILocation is a bad thing.
I'm almost certain that the way `DebugLoc` is currently implemented and
used, this would cause a big problem.  I'm not sure, outside of
`DebugLoc`, how much uniquing matters for `DILocation` though.  After
the metadata-value split -- and `DebugLoc` is just a pointer to an
`MDNode` -- it may not matter at all.)