[cfe-dev] Debug info for captured variables in a lambda

David Blaikie dblaikie at gmail.com
Sun Apr 6 18:18:08 PDT 2014


On Sun, Apr 6, 2014 at 4:38 PM, Eric Christopher <echristo at gmail.com> wrote:
> I was pretty sure all of this already worked outside of the captured this?

Seems to work well enough (my comments in (2) were mostly pedantry if
we wanted to get the type exactly as written, rather than as a
reference), though it's not quite the same as how GCC implements it.

> The captured this was the only problem. GCC deals with this by calling the
> captured this 'this' and the anonymous this as '__this' IIRC.

Looks like it might be the other way around. GCC's captured member is
"__this", their lambda's "this" is called "__closure", but then they
introduce local variables into the op() function to describe all of
the captures with locations like:

91 68 06 (DW_OP_fbreg with operand byte 0x68, i.e. SLEB128 -24, the
same location as "__closure"; then 06 = DW_OP_deref) for the first
capture
91 68 06 23 08 (as above, then 23 08 = DW_OP_plus_uconst 8) for the
second
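
For anyone who wants to check those bytes by hand, here is a minimal sketch of a decoder that handles only the three opcodes that show up above (DW_OP_fbreg = 0x91, DW_OP_deref = 0x06, DW_OP_plus_uconst = 0x23). The helper names (readULEB128, readSLEB128, describe) are made up for illustration and are not taken from GCC, Clang or GDB.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Decode an unsigned LEB128 value starting at bytes[pos]; advances pos.
std::uint64_t readULEB128(const std::vector<std::uint8_t>& bytes, std::size_t& pos) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    while (pos < bytes.size()) {
        std::uint8_t b = bytes[pos++];
        if (shift < 64) result |= static_cast<std::uint64_t>(b & 0x7f) << shift;
        shift += 7;
        if ((b & 0x80) == 0) break;  // high bit clear: last byte of the value
    }
    return result;
}

// Decode a signed LEB128 value starting at bytes[pos]; advances pos.
std::int64_t readSLEB128(const std::vector<std::uint8_t>& bytes, std::size_t& pos) {
    std::uint64_t result = 0;
    unsigned shift = 0;
    std::uint8_t b = 0;
    while (pos < bytes.size()) {
        b = bytes[pos++];
        if (shift < 64) result |= static_cast<std::uint64_t>(b & 0x7f) << shift;
        shift += 7;
        if ((b & 0x80) == 0) break;
    }
    // Sign-extend when bit 6 of the final byte is set.
    if (shift < 64 && (b & 0x40)) result |= ~std::uint64_t{0} << shift;
    return static_cast<std::int64_t>(result);
}

// Render a location expression as text. Only the three opcodes used in
// this thread are recognised; anything else is reported as unknown.
std::string describe(const std::vector<std::uint8_t>& expr) {
    std::string out;
    std::size_t pos = 0;
    while (pos < expr.size()) {
        if (!out.empty()) out += "; ";
        std::uint8_t op = expr[pos++];
        switch (op) {
        case 0x91:
            out += "DW_OP_fbreg " + std::to_string(readSLEB128(expr, pos));
            break;
        case 0x06:
            out += "DW_OP_deref";
            break;
        case 0x23:
            out += "DW_OP_plus_uconst " + std::to_string(readULEB128(expr, pos));
            break;
        default:
            out += "unknown(" + std::to_string(op) + ")";
            break;
        }
    }
    return out;
}
```

Feeding it the two byte strings above gives "DW_OP_fbreg -24; DW_OP_deref" and "DW_OP_fbreg -24; DW_OP_deref; DW_OP_plus_uconst 8", so the single byte 0x68 is a frame-base offset of -24 rather than +68.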

I suspect this would get the wrong behavior if you happened to name
your parameters the same as your captured variables. (Can you do that?
Seems you can, if you capture a global by reference by qualifying
the name.) With GCC's scheme the name of the captured variable
shadows the name of the parameter, making the parameter impossible to
access. With Clang's scheme (just naming the member variables without
introducing the local variable DIEs) the unqualified name finds the
parameter and "this->name" finds the captured variable.

So it's not just a matter of renaming things. Indeed, even with the
lambda's 'this' left unnamed, GDB still resolves "this" to the closure
object (presumably based on the subprogram's object_pointer attribute
and the assumption that that variable should be called "this" even if
no name was provided).

If GDB were to be 'fixed' (perhaps there's a good reason it treats
unnamed object_pointers as though they were named "this", and thus
cannot be changed/fixed) then Clang should need to do nothing to be as
right as GCC and more right in some ways (shadowing).



And the pedantry I was referring to in (2) is just this: if you
describe the variable as a T& (either as a member, as Clang does, or
as a local variable, as GCC does), then GDB prints it as "= (int &)
@0x7fffffffd994: 3", whereas if we do the extra deref you get "= 3",
which is a bit nicer. (That's probably only worth doing if we're
already doing a bunch of work to build custom location descriptions in
the frontend anyway.)
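
To make the "extra deref" concrete, here is a rough sketch that walks the same steps over ordinary memory. Everything in it is a stand-in I made up for illustration: Closure models the lambda object with the reference capture stored as a pointer, frameSlot plays the role of the fbreg location that holds the closure pointer, and deref mimics DW_OP_deref. It is not how GDB or either compiler implements this.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

static_assert(sizeof(std::uintptr_t) == sizeof(void*),
              "sketch assumes pointer-sized uintptr_t");

struct foo { int i; };

// Hypothetical stand-in for the closure type: the reference capture of x
// is stored as a pointer at offset 0, followed by the captured 'this'.
struct Closure {
    int* x;
    foo* captured_this;
};

// Mimics DW_OP_deref: load a pointer-sized value from addr.
std::uintptr_t deref(std::uintptr_t addr) {
    std::uintptr_t value;
    std::memcpy(&value, reinterpret_cast<const void*>(addr), sizeof value);
    return value;
}

// "fbreg deref": the address of the member holding the reference.
// A variable described this way has type int&.
std::uintptr_t locationAsReference(const Closure* const* frameSlot) {
    return deref(reinterpret_cast<std::uintptr_t>(frameSlot));
}

// "fbreg deref deref": the address of the referenced int itself.
// A variable described this way could have type int.
std::uintptr_t locationAsValue(const Closure* const* frameSlot) {
    return deref(locationAsReference(frameSlot));
}

// Read an int stored at addr.
int readInt(std::uintptr_t addr) {
    int value;
    std::memcpy(&value, reinterpret_cast<const void*>(addr), sizeof value);
    return value;
}
```

With a Closure built over a local "int x = 3", locationAsReference lands on the member that stores the reference, while locationAsValue lands on x itself, which is what would let the debugger print a plain "= 3".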


> Could be wrong
> though, example?

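struct foo { int i; int func(); };

int foo::func() {
  int x = 3;
  // [&] captures x by reference; using i implicitly captures 'this'.
  return [&]() { return x + i; }();
}

int main() {
  foo f;
  f.i = 42;
  f.func();  // result (45) is discarded; only the debug info matters here
}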

GCC's lambda looks like this:

structure_type
  name = "<lambda()>"

  member
    name = "__x"
    type = int&
    data_member_location = 0

  member
    name = "__this"
    type = foo*
    data_member_location = 8

  subprogram
    name = "~<lambda>"
    ...

  subprogram
    name = "operator()"
    frame_base = call_frame_cfa
    object_pointer = the __closure parameter DIE
    GNU_all_call_sites = true  (attribute 0x2117, shown as Unknown_2117 by the dumper)

    formal_parameter
      name = "__closure"
      type = const lambda* const
      artificial
      location = fbreg(0x68)

    variable
      name = "x"
      type = int&
      artificial
      location = fbreg(0x68) deref

    variable
      name = "this"
      type = foo * const
      artificial
      location = fbreg(0x68) deref plus_uconst(8)


Clang's output looks like this:

class_type
  member
    name = "x"
    type = int&
    data_member_location = 0
    accessibility = private

  member
    name = "this"
    type = foo*
    data_member_location = 8
    accessibility = private

  subprogram
    name = "operator()"
    accessibility = public
    declaration

    formal_parameter
      type = const lambda*
      artificial

...

subprogram
  specification = lambda::operator()
  frame_base = DW_OP_reg6
  object_pointer = the 'this' parameter DIE

  formal_parameter
    name = "this"
    type = const lambda*
    artificial
    location = DW_OP_fbreg(0x78)


So, different implementations but a similar result, except for 'this'.
GCC munges the names of the actual member variables that do the
capturing, but introduces local variables (which just use location
expressions to point to the member variables) into the lambda's
function call operator. Clang just gives the member variables their
actual names in the first place.
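
For reference, this is roughly the class both compilers are describing, written out by hand. It is only a sketch: Closure and captured_this are names I picked for illustration (a real member can't be called "this", and GCC's "__x"/"__this" spellings are reserved identifiers), and neither compiler literally emits this source.

```cpp
#include <cassert>

struct foo {
    int i;
    int func();
};

// Hand-written equivalent of the lambda in foo::func.
struct Closure {
    int& x;              // the by-reference capture of the local x
    foo* captured_this;  // the captured 'this' of foo::func

    int operator()() const {
        // In here, 'this' is the closure object (const Closure*), which
        // is what object_pointer refers to. The 'this' the user wrote
        // in the lambda body corresponds to captured_this.
        return x + captured_this->i;
    }
};

int foo::func() {
    int x = 3;
    return Closure{x, this}();  // same effect as [&]() { return x + i; }()
}
```

With f.i = 42, f.func() returns 45, the same as the lambda version. The two different "this" pointers inside operator() are the whole source of the naming problem.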

Still pretty sure that GDB should let you name the object_pointer
something other than "this" and respect that, finding the member
variable "this" instead. (That would go along with a modification to
Clang not to name the parameter "this": either no name, or "__closure"
like GCC, etc.)

>
> Figured I'd start the discussion here, while it's on my mind/I've
> built up some state.
>
> The theory is that, at a bare minimum, we need to describe the
> location of the captured "this" variable (so the lambda's own "this"
> doesn't hide it).
>
> 1) Counterpoint: Clang correctly describes the member variable as
> being named "this". Perhaps we could argue that the debugger, upon
> seeing a member variable called "this" should treat that as what the
> user means when they say "this". But that's probably not actually a
> reasonable thing to do - the lambda's "this" is in a closer scope (an
> implicit argument to the member function) and should probably override
> (though clearly the compiler has to treat the object pointer as
> special in some way... )
>
> ooh, that gets me thinking: what if we just didn't provide a name for
> the object pointer? Since we want "this" to refer to the member
> variable, it seems like it should be possible to remove the name and
> then the debugger should fall back to finding the member variable
> 'this' instead.
>
> Hacking up LLVM's IR to do this produced the expected debug info, but
> it did not produce the expected debugger behavior, "ptype this" still
> found the lambda's "this" rather than the capture. I'm inclined to
> file that as a gdb bug. Thoughts?
>
> 2) If we wanted to be pedantically correct, we should actually create
> special variables to describe the location of all reference captures -
> currently with both Clang and GCC, the type of a reference capture is
> T&, but arguably we should have a variable of type "T" (that was the
> actual type in the user's code after all, right?) and we could do this
> by introducing a DW_TAG_variable for each captured variable that
> indirects through the lambda's this pointer to the member and
> indirects again through the pointer that lives there. But this is
> perhaps a bit more than is valuable.
>
> But if we do end up needing to describe the location of captured
> "this" rather than using my suggestion above, that would probably mean
> introducing a sufficiently generic location expression system into
> Clang that we could address (2) too.
>
> The main thing I think we need is the ability to do "indirect +
> offset" (currently for C++ non-trivial pass by value parameters we
> have "indirect", and if the captured this were always the first member
> of the lambda, that feature alone would be sufficient (I haven't
> tested this, but might) - just use the same location for captured
> "this" as for the lambda's "this", but add the indirect flag... -
> except we need an offset as well - and that might fall outside the
> sort of stuff we want to stuff in the DW_TAG_auto_variable metadata,
> and instead incline us towards generalizing the dbg.declare intrinsic
> to describe more complex location information?)
>
> Any other thoughts/ideas?


