[PATCH] Ensure debug info for two calls to the same function from the same location are not merged

Duncan P. N. Exon Smith dexonsmith at apple.com
Fri Dec 19 17:13:57 PST 2014


> On 2014-Dec-19, at 15:10, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> 
> On Wed, Dec 3, 2014 at 2:18 PM, David Blaikie <dblaikie at gmail.com> wrote:
> 
> 
> On Wed, Dec 3, 2014 at 2:08 PM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> 
> > On 2014-Dec-03, at 13:12, David Blaikie <dblaikie at gmail.com> wrote:
> >
> >
> >
> > On Thu, Nov 27, 2014 at 9:41 AM, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote:
> >> Disabling uniquing for inlined locations seems like a pretty
> >> heavy hammer to me, and I worry that in contexts with heavy inlining it
> >> could cause an explosion.
> >>
> > Any case where this creates more nodes due to inlining are cases that are buggy currently, so far as I know.
> >
> > The point of adding the column info was to make every call site distinct from every other call site - this uniquing just does the same thing but entirely consistently, as far as I understand it.
> 
> Ah, okay.  This makes sense to me now.
> 
> I made a few testcases of my own and ran with/without your patch and I
> better understand what you've done.  I'd imagined a bunch of interactions
> with `DebugLoc` that your approach avoids.
> 
> >> Assuming disabling uniquing is the right goal, I really don't like this
> >> self-reference pattern.  I'm hoping to have a patch out early next week
> >> (I've been on holiday this week) that splits metadata and value -- and
> >> one of the first follow-ups will be a change to MDNode that allows
> >> "opting-out" of uniquing.  Do you think it makes sense to wait and use
> >> that, rather than introducing and pulling out an extra field?
> >
> > Another reason I sent this for review - I don't mind much waiting/how it gets done, just that it gets done.
> 
> Let's wait then.  I'm hoping to get things moving along soon.
> 
> Sure thing.
> 
> Ping on this - but expect it'll be some time in the new year,

Leaving in 2 minutes for some holiday :).

> so I'll try to remember to check back in a month or so.

Please do!

I've delayed a little bit introducing `distinct` because it's not clear
what behaviour is correct from `MapMetadata()`.

Right now, if you send a node into `MapMetadata` that's part of a cycle,
you get a completely new copy of the cycle (every node is duplicated).

I think this is bad behaviour for debug info, but that's been the
behaviour for a long time, and I suspect alias scopes and loops (non
debug info metadata schemas that use self-references) rely on this
behaviour during `CloneFunction()`.

My idea for `distinct` was that it's a more clear approach than using
a self-reference -- but I don't think it makes sense to get a new one
out of `MapMetadata` unless operands change... at least, not in general.
Some places you probably want a separate copy.

(Still thinking.)

(BTW, I haven't confirmed, but I suspect a lot of memory gets wasted
this way during `llvm-link`.  We still have plenty of type cycles for
types with local scope (defined in anonymous namespaces or functions).)

> Attaching a related clang patch for this just so I can get rid of it from my local stash & record it here with the related work.
>  
>  
> 
> >> Idea: assume we solved the "incomplete" problems, leaving just (?) the
> >> macro problem.
> >
> > Not sure what you mean by this - the macro problem is sort of the example of how the current solution is incomplete.
> 
> You split your description up into "incomplete" and "insufficient":
> >>> This solution (doubly so after it was narrowed in r177164) is incomplete (it wasn't applied to constructor calls (where it is still causing issues, such as the one worked around in r220923/PR21408) or member function calls) and insufficient: in the worst case, two calls from within a macro will be attributed to both the same line and column. It also produces possibly surprising and unintended side effects by adding column information to calls (increasing debug info size, surprising debuggers that aren't necessarily ready to cope with this, increasing the size of line tables, etc).
> 
> Ah, right right.
>  
> 
> 
> >> Doesn't clang keep track of the locations from inside
> >> the macros?  Could we, in theory, treat macro invocations as inlined
> >> locations from the macro definitions, at least for the sake of defining
> >> DILocations/DebugLocs?
> >
> > This, I think, would just create as many locations (and would make them pre-emptively, as the current solution does - even when no inlining occurs) as the uniquing approach I'm proposing here
> > (essentially any attempt to unique from the frontend produces unique locations for every call site in the IR - doing it during inlining means we don't need to unique every call site, only those that actually get inlined, which seems like a strict improvement).
> 
> *nod*
> 
> (Although if the debuggers could understand it, I think having debug info
> for macros would be pretty cool.)
> 
> Yep, could be rather handy. I think the full generality of that feature would probably look something like "multiple source locations" (maybe some kind of source location stack) - which could probably also be used to provide multiple levels of debug info - eg: allowing source location information for asm, LLVM IR, C(++) source, macro expansions, and maybe like a protobuf file or something - so you could step up & down through the abstractions.
> 
> - David
>  
> 
> <clang-call.diff>





More information about the llvm-commits mailing list