[cfe-dev] Fwd: Debug info for captured variables in a lambda

Richard Smith richard at metafoo.co.uk
Sun Apr 13 16:21:53 PDT 2014


My 2c:

 * 'this' inside a lambda should probably refer to the captured 'this', so
that the semantics match those of the source language. It would seem
reasonable to provide an easy way to access the closure object itself, but
I don't really have a preference for what name to give it.
 * I'm pretty sure that reference captures should have type [const] T&, not
T, within the lambda. Consider the case where the lambda is invoked (and is
crashing) after the enclosing function has returned. It's *much* more
useful to see that your reference capture is a reference (and that it's a
reference to something that no longer exists) than to transparently look
through it to the non-existent enclosing stack frame.
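
To make the second point concrete, here is a minimal sketch of a closure that
outlives the frame it captured from. The helper name call_then_leave_dangling
is made up for illustration and is not taken from any real code.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: stores a by-reference closure in `out`, calls it once
// while the captured local is still alive, and returns what it saw.
int call_then_leave_dangling(std::function<int()>& out) {
    int x = 3;
    out = [&x] { return x; };  // the closure stores an int&, not a copy of x
    assert(out);               // out now holds the closure
    return out();              // fine: x is still alive here
}   // x dies here; `out` still holds a reference into this dead frame
```

Calling `out` again after the helper returns would be undefined behaviour,
which is the crash scenario I have in mind. If the debugger shows the capture
as an int& together with its address, you can see that it points into a frame
that no longer exists.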


On Sun, Apr 6, 2014 at 8:00 PM, David Blaikie <dblaikie at gmail.com> wrote:

> (oops, add the list back...)
>
> Hmm - well there's a wrinkle.
>
> Seems GCC's approach allows GDB to print unqualified members of the
> captured 'this', i.e. in the example below, breaking in the body of the
> lambda and "print i" works as expected. I assume that this happens
> because GDB completely ignores object_pointer and does unqualified
> name lookup with whatever variable "this" it happens to find in
> scope.
>
> That would explain why, if I remove the name of the "this" parameter*
> GDB stops being able to find anything - either the locally captured
> 'x' or the lambda.
>
> So I guess we can't have it both ways - we can't expect GDB to treat a
> member named "this" the way it should treat the object_pointer
> (allowing unqualified names to find members of the captured object)
> while also expecting GDB to find the "this" member to begin with...
>
> Well, maybe, but that'd be somewhat heroic of GDB to juggle those
> competing attitudes/ideas about what the this pointer is.
>
> I still dislike that GCC's approach causes the captures to shadow
> parameters, which is inconsistent with the behavior of C++... so I'm
> not sure there's a "right" answer here, yet.
>
> * my previous observation was that removing this name did nothing -
> that wasn't the case, I'd just removed it from the wrong function...
>
> On Sun, Apr 6, 2014 at 6:18 PM, David Blaikie <dblaikie at gmail.com> wrote:
> > On Sun, Apr 6, 2014 at 4:38 PM, Eric Christopher <echristo at gmail.com> wrote:
> >> I was pretty sure all of this already worked outside of the captured this?
> >
> > Seems to work well enough (my comments in (2) were mostly pedantry if
> > we wanted to get the type exactly as written, rather than as a
> > reference), though it's not quite the same as how GCC implements it.
> >
> >> The captured this was the only problem. GCC deals with this by calling the
> >> captured this 'this' and the anonymous this as '__this' IIRC.
> >
> > Looks like it might be the other way around. GCC's captured member is
> > "__this", their lambda's "this" is called "__closure", but then they
> > introduce local variables into the op() function to describe all of
> > the captures with locations like:
> >
> > 91 68 06 (DW_OP_fbreg with operand 0x68, the same location as
> > "__closure", then 06 = DW_OP_deref) for the first capture
> > 91 68 06 23 08 (as above, then 23 08 = DW_OP_plus_uconst 8)
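
For anyone who wants to read those bytes by hand, here is a rough sketch of a
decoder that handles only the three opcodes that appear above. The opcode
values come from the DWARF specification; the function names (read_sleb128,
read_uleb128, describe) are invented for this example and are not part of any
real tool. Note that the fbreg operand is SLEB128-encoded, so the byte 0x68
means "frame base minus 24" rather than "plus 0x68".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Opcode values as defined by the DWARF specification.
constexpr std::uint8_t DW_OP_deref = 0x06;
constexpr std::uint8_t DW_OP_plus_uconst = 0x23;
constexpr std::uint8_t DW_OP_fbreg = 0x91;

// Reads an unsigned LEB128 value starting at `pos` and advances `pos`.
std::uint64_t read_uleb128(const std::vector<std::uint8_t>& b, std::size_t& pos) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    std::uint8_t byte = 0;
    do {
        assert(pos < b.size());
        byte = b[pos++];
        result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
    } while ((byte & 0x80) && shift < 64);
    return result;
}

// Reads a signed LEB128 value starting at `pos` and advances `pos`.
std::int64_t read_sleb128(const std::vector<std::uint8_t>& b, std::size_t& pos) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    std::uint8_t byte = 0;
    do {
        assert(pos < b.size());
        byte = b[pos++];
        result |= static_cast<std::uint64_t>(byte & 0x7f) << shift;
        shift += 7;
    } while ((byte & 0x80) && shift < 64);
    // Sign-extend when the sign bit (0x40) of the last byte is set.
    if (shift < 64 && (byte & 0x40))
        result |= ~static_cast<std::uint64_t>(0) << shift;
    return static_cast<std::int64_t>(result);
}

// Renders a location expression as text; unknown opcodes are printed as "?".
std::string describe(const std::vector<std::uint8_t>& expr) {
    std::string out;
    std::size_t pos = 0;
    while (pos < expr.size()) {
        if (!out.empty()) out += ' ';
        switch (expr[pos++]) {
        case DW_OP_fbreg:
            out += "fbreg(" + std::to_string(read_sleb128(expr, pos)) + ")";
            break;
        case DW_OP_deref:
            out += "deref";
            break;
        case DW_OP_plus_uconst:
            out += "plus_uconst(" + std::to_string(read_uleb128(expr, pos)) + ")";
            break;
        default:
            out += "?";
            break;
        }
    }
    return out;
}
```

With this sketch, describe({0x91, 0x68, 0x06, 0x23, 0x08}) yields
"fbreg(-24) deref plus_uconst(8)": load the closure pointer from the frame,
then step 8 bytes into the closure object to reach the second capture.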
> >
> > I suspect this would get the wrong behavior if you happened to name
> > your parameters the same as your captured variables (can you do that?
> > seems you can - if you capture a global by reference (by qualifying
> > the name)) - with GCC's scheme the name of the captured variable
> > shadows the name of the parameter making the parameter impossible to
> > access. With Clang's scheme (just naming the member variables without
> > introducing the local variable DIEs) the unqualified name finds the
> > parameter and "this->name" finds the captured variable.
> >
> > So it's not just a matter of renaming things - indeed, even with the
> > lambda's 'this' unnamed, GDB manages to make that happen (presumably
> > based on the subprogram's object_pointer attribute and the assumption
> > that that variable should be called "this" even if the name wasn't
> > provided).
> >
> > If GDB were to be 'fixed' (perhaps there's a good reason it treats
> > unnamed object_pointers as though they were named "this", and thus
> > cannot be changed/fixed) then Clang should need to do nothing to be as
> > right as GCC and more right in some ways (shadowing).
> >
> >
> >
> > And the pedantry I was referring to in (2) is just that if you describe
> > the variable as a T& (either as a member like Clang does, or as a
> > local variable as GCC does) then GDB prints it out like "= (int &)
> > @0x7fffffffd994: 3" whereas if we do the extra deref (probably only
> > worth doing if we're doing a bunch of work to do custom location
> > descriptions from the frontend anyway) you get "= 3" which is a bit
> > nicer.
> >
> >
> >> Could be wrong
> >> though, example?
> >
> > struct foo { int i; int func(); };
> >
> > int foo::func() {
> >   int x = 3;
> >   return [&]() { return x + i; }();
> > }
> >
> > int main() {
> >   foo f;
> >   f.i = 42;
> >   f.func();
> > }
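
As a reading aid for the two DWARF dumps, here is a hand-written approximation
of the closure type the compiler generates for that lambda. The names closure
and self are made up (a real member cannot be spelled `this` in source), and
the actual layout is up to the compiler. On a typical 64-bit target the two
members would sit at offsets 0 and 8, matching the data_member_location values
in the dumps.

```cpp
#include <cassert>

struct foo { int i; int func(); };

// Hypothetical stand-in for the compiler-generated closure type of [&]() {...}.
struct closure {
    int& x;     // reference capture of the local `x`
    foo* self;  // captured `this` of the enclosing member function

    // The lambda body: `x + i`, where `i` really means `captured_this->i`.
    int operator()() const {
        assert(self != nullptr);
        return x + self->i;
    }
};

int foo::func() {
    int x = 3;
    return closure{x, this}();  // same result as the lambda in the example
}
```

Inside operator() there are two object pointers in play: the closure's own
`this` (of type const closure*) and the captured `self` (of type foo*). That
is the ambiguity the debug info has to resolve.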
> >
> > GCC's lambda looks like this:
> >
> > structure_type
> >   name = "<lambda()>"
> >
> >   member
> >     name = "__x"
> >     type = int&
> >     data_member_location = 0
> >
> >   member
> >     name = "__this"
> >     type = foo*
> >     data_member_location = 8
> >
> >   subprogram
> >     name = "~<lambda>"
> >     ...
> >
> >   subprogram
> >     name = "operator()"
> >     frame_base = call_frame_cfa
> >     object_pointer = the __closure parameter DIE
> >     GNU_all_call_sites = true (attribute 0x2117, shown as Unknown_2117 by the dumper)
> >
> >     formal_parameter
> >       name = "__closure"
> >       type = const lambda* const
> >       artificial
> >       location = fbreg(0x68)
> >
> >     variable
> >       name = "x"
> >       type = int&
> >       artificial
> >       location = fbreg(0x68) deref
> >
> >     variable
> >       name = "this"
> >       type = foo * const
> >       artificial
> >       location = fbreg(0x68) deref plus_uconst(8)
> >
> >
> > Clang's output looks like this:
> >
> > class_type
> >   member
> >     name = "x"
> >     type = int&
> >     data_member_location = 0
> >     accessibility = private
> >
> >   member
> >     name = "this"
> >     type = foo*
> >     data_member_location = 8
> >     accessibility = private
> >
> >   subprogram
> >     name = "operator()"
> >     accessibility = public
> >     declaration
> >
> >     formal_parameter
> >       type = const lambda*
> >       artificial
> >
> > ...
> >
> > subprogram
> >   specification = lambda::operator()
> >   frame_base = DW_OP_reg6
> >   object_pointer = the 'this' parameter DIE
> >
> >   formal_parameter
> >     name = "this"
> >     type = const lambda*
> >     artificial
> >     location = DW_OP_fbreg(0x78)
> >
> >
> > So, the implementations differ: GCC munges the names of the member
> > variables that hold the captures, but introduces local variables into
> > the lambda's function call operator (these just use location
> > expressions to point at the member variables), whereas Clang simply
> > gives the member variables the captured names in the first place. The
> > result is similar, except for 'this'.
> >
> > Still pretty sure that GDB should let you name the object_pointer
> > something other than "this" and respect that, finding the member
> > variable "this" instead. (with a modification to Clang not to name the
> > parameter "this" - either no name, or "__closure" like GCC, etc)
> >
> >>
> >> Figured I'd start the discussion here, while it's on my mind/I've
> >> built up some state.
> >>
> >> The theory is that, at a bare minimum, we need to describe the
> >> location of the captured "this" variable (so the lambda's own "this"
> >> doesn't hide it).
> >>
> >> 1) Counterpoint: Clang correctly describes the member variable as
> >> being named "this". Perhaps we could argue that the debugger, upon
> >> seeing a member variable called "this" should treat that as what the
> >> user means when they say "this". But that's probably not actually a
> >> reasonable thing to do - the lambda's "this" is in a closer scope (an
> >> implicit argument to the member function) and should probably override
> >> (though clearly the compiler has to treat the object pointer as
> >> special in some way... )
> >>
> >> ooh, that gets me thinking: what if we just didn't provide a name for
> >> the object pointer? Since we want "this" to refer to the member
> >> variable, it seems like it should be possible to remove the name and
> >> then the debugger should fall back to finding the member variable
> >> 'this' instead.
> >>
> >> Hacking up LLVM's IR to do this produced the expected debug info, but
> >> it did not produce the expected debugger behavior, "ptype this" still
> >> found the lambda's "this" rather than the capture. I'm inclined to
> >> file that as a gdb bug. Thoughts?
> >>
> >> 2) If we wanted to be pedantically correct, we should actually create
> >> special variables to describe the location of all reference captures -
> >> currently with both Clang and GCC, the type of a reference capture is
> >> T&, but arguably we should have a variable of type "T" (that was the
> >> actual type in the user's code after all, right?) and we could do this
> >> by introducing a DW_TAG_variable for each captured variable that
> >> indirects through the lambda's this pointer to the member and
> >> indirects again through the pointer that lives there. But this is
> >> perhaps a bit more than is valuable.
> >>
> >> But if we do end up needing to describe the location of captured
> >> "this" rather than using my suggestion above, that would probably mean
> >> introducing a sufficiently generic location expression system into
> >> Clang that we could address (2) too.
> >>
> >> The main thing I think we need is the ability to do "indirect +
> >> offset". Currently, for C++ non-trivial pass-by-value parameters, we
> >> have "indirect", and if the captured this were always the first member
> >> of the lambda, that feature alone would be sufficient (I haven't
> >> tested this, but might): just use the same location for the captured
> >> "this" as for the lambda's "this", but add the indirect flag. Except
> >> we need an offset as well, and that might fall outside the sort of
> >> stuff we want to stuff in the DW_TAG_auto_variable metadata, and
> >> instead incline us towards generalizing the dbg.declare intrinsic to
> >> describe more complex location information?
> >>
> >> Any other thoughts/ideas?