[lldb-dev] [RFC][PATCH] Keep un-canonicalized template types in the debug information

Robinson, Paul Paul_Robinson at playstation.sony.com
Wed Sep 17 11:54:49 PDT 2014


> From: David Blaikie [mailto:dblaikie at gmail.com] 
> > On Wed, Sep 17, 2014 at 7:45 AM, Robinson, Paul <Paul_Robinson at playstation.sony.com> wrote:
> > > From: David Blaikie [mailto:dblaikie at gmail.com]
> > > On Sat, Sep 13, 2014 at 6:54 PM, Robinson, Paul <Paul_Robinson at playstation.sony.com> wrote:
> > > > > > On 09 Sep 2014, at 00:01, jingham at apple.com wrote:
> > > > > > >
> > > > > > > From the debugger's standpoint, the functional concern is that if you do
> > > > > > something more real, like:
> > > > > > >
> > > > > > > typedef int A;
> > > > > > > template <typename T>
> > > > > > > struct S
> > > > > > > {
> > > > > > >  T my_t;
> > > > > > > };
> > > > > > >
> > > > > > > I want to make sure that the type of my_t is given as "A" not as "int".
> > > > > > The reason for that is that it is not uncommon to have data formatters
> > > > > > that trigger off the typedef name.  This happens when you use some common
> > > > > > underlying type like "int" but the value has some special meaning when it
> > > > > > is formally an "A", and you want to use the data formatters to give it an
> > > > > > appropriate presentation. Since the data formatters work by matching type
> > > > > > name, starting from the most specific on down, it is important that the
> > > > > > typedef name be preserved.
> > > > > > >
> > > > > > > However, it would be really odd to see:
> > > > > > >
> > > > > > > (lldb) expr -T -- my_s
> > > > > > > (S<int>) $1 = {
> > > > > > >  (A) my_t = 5
> > > > > > > }
> > > > > > >
> > > > > > > instead of:
> > > > > > >
> > > > > > > (lldb) expr -T -- my_s
> > > > > > > (S<A>) $1 = {
> > > > > > >  (A) my_t = 5
> > > > > > > }
> > > > > > >
> > > > > > > so I am in favor of presenting the template parameter type with the most
> > > > > > specific name it was given in the overall template type name.
> > > > > >
> > > > > > OK, we get this wrong today. I’ll try to look into it.
> > > > > >
> > > > > > What’s your take on the debug info representation for the templated class
> > > > > > type? The tentative patch introduces a typedef that declares S<A> as a
> > > > > > typedef for S<int>. The typedef doesn’t exist in the code, thus I find it
> > > > > > a bit of a lie to the debugger. I was more in favour of something like :
> > > > > >
> > > > > > DW_TAG_variable
> > > > > > DW_AT_type: -> DW_TAG_structure_type
> > > > > >                DW_AT_name: S<A>
> > > > > >                DW_AT_specification: -> DW_TAG_structure_type
> > > > > >                                          DW_AT_name: S<int>
> > > > > >
> > > > > > This way the canonical type is kept in the debug information, and the
> > > > > > declaration type is a real class type aliasing the canonical type. But I’m
> > > > > > not sure debuggers can digest this kind of aliasing.
> > > > > >
> > > > > > Fred
> > > > >
> > > > > Why introduce the extra typedef? S<A> should have a template parameter
> > > > > entry pointing to A which points to int.  The info should all be there
> > > > > without any extra stuff.  Or if you think something is missing, please
> > > > > provide a more complete example.
> > > > My immediate concern here would be either loss of information or bloat
> > > > when using that with type units (either bloat because each instantiation
> > > > with differently spelled (but identical) parameters is treated as a separate
> > > > type - or loss when the types are considered the same and all but one are
> > > > dropped at link time)
> > > You'll need to unpack that more because I'm not following the concern.
> > > If the typedefs are spelled differently, don't they count as different types?
> > > DWARF wants to describe the program as-written, and there's no S<int> written
> > > in the program.
> > >
> > > Maybe not in this TU, but possibly in another TU? Or by the user.
> > >
> > >  void func(S<int>);
> > >  ...
> > >  typedef int A;
> > >  S<A> s;
> > >  func(s); // calls the same function
> > >
> > > The user probably wants to be able to call void func with S<int> or S<A>
> > Sure.
> >
> > > (and, actually, in theory, with S<B> where B is another typedef of int, but
> > > that'll /really/ require DWARF consumer support and/or new DWARF wording).
> >
> > Not DWARF wording. DWARF doesn't say when you can and can't call something;
> > that's a debugger feature and therefore a debugger decision.
> >
> What I mean is we'd need some new DWARF to help explain which types are 
> equivalent (or the debugger would have to do a lot of spelunking to try 
> to find structurally equivalent types - "S<B>" and "S<A>", go look through 
> their DW_TAG_template_type_params, see if they are typedefs to the same 
> underlying type, etc... )
> >
> >
> > > We can't emit these as completely independent types - it would be verbose
> > > (every instantiation with different typedefs would be a whole separate type
> > > in the DWARF, not deduplicated by type units, etc) and wrong
> >
> > Yes, "typedef int A;" creates a synonym/alias not a new type, so S<A> and S<int>
> > describe the same type from the C++ perspective, so you don't want two complete
> > descriptions with different names, because that really would be describing them
> > as separate types.  What wrinkles my brow is having S<int> be the "real"
> > description even though it isn't instantiated that way in the program.  I wonder
> > if it should be marked artificial... but if you do instantiate S<int> in another
> > TU then you don't want that.  Huh.  It also seems weird to have this:
> >  DW_TAG_typedef
> >    DW_AT_name "S<A>"
> >    DW_AT_type -> S<int>
> > but I seem to be coming around to thinking that's the most viable way to have
> > a single actual instantiated type, and still have the correct names of things

*mostly* correct; this still loses "A" as the type of the data member. Crud.
But I haven't come up with a way to get that back without basically instantiating
S<A> and S<int> separately.

> >
> Yep - it's the only way I can think of giving this information in a way that's 
> likely to work with existing consumers. It would probably be harmless to add 
> DW_AT_artificial to the DW_TAG_typedef, if that's any help to any debug info 
> consumer.

Hmmm no, S<A> is not the artificial name; some debuggers treat DW_AT_artificial
as meaning "don't show this to the user." But S<A> is the name we *do* want to 
show to the user.

> That said, I'm not opposed to proposing something to DWARF to define some more 
> 'proper' way to describe this.

Yah.  I've been thinking about the DW_AT_specification idea too, which would be
something like this:
    DW_TAG_class_type
      DW_AT_name "S<A>"
      DW_AT_specification -> S<int>

      DW_TAG_template_type_parameter
        DW_AT_name "T"
        DW_AT_type -> A

The problem with this is you don't know where T is used in the template, so
you *still* don't know when to use A as the type of "field". Also it's kind
of an abuse of DW_AT_specification.  If we can't get A as the type of "field"
then the typedef is more straightforward and understandable.

Maybe I'll pop a note to the DWARF committee for a broader set of opinions.

>
> One other open question is then, when, if ever, to reference the DW_TAG_typedef 
> rather than the underlying type? Do we just reference it whenever the user 
> writes using that name?
>
>  void f(S<A>);
>  ...
>  void f(S<B>) { ... }
>
> etc... (this would be just as possible/we could maybe treat it the same as if 
> the user wrote "void f(A); ... void f(B) { ... }")

That's what I would do, and I think is more conformant to the DWARF spec.
--paulr

> 
> > (because DWARF is all about the name "as it appears in the source program.")
> >
> > > (the debugger wouldn't know these are actually the same type so wouldn't
> > > allow function calls, etc).
> > >
> > > - David
> > >
> > > >
> > > >
> > > > > Jim
> > > > >
> > > >> On Sep 8, 2014, at 12:38 PM, Frédéric Riss <friss at apple.com> wrote:
> > > >>
> > > >>
> > > >>> On 08 Sep 2014, at 19:31, Greg Clayton <gclayton at apple.com> wrote:
> > > >>>
> > > >>> This means you will see "S<A>" as the type for your variables in the
> > > debugger when you view variables or children of structs/unions/classes. I
> > > think this is not what the user would want to see. I would rather see
> > > "S<int>" as the type for my variable than see "S<A>”.
> > > >>
> > > >> I find it more accurate for the debugger to report what has actually
> > > been put in the code. Moreover when a typedef is used, it’s usually to
> > > make things more readable not to hide information, thus I guess it would
> > > usually be as informative while being more compact. The debugger needs to
> > > have a way to describe the real type behind the abbreviated name though,
> > > we must not have less information compared to what we have today.
> > > >>
> > > >> Another point: this allows the debugger to know what S<A> actually is.
> > > Without it, the debugger only knows the canonical type. This means that
> > > currently you can’t copy/paste a piece of code that references that kind
> > > of template names and have it parse correctly. I /think/ that having this
> > > information in the debug info will allow more of this to work.
> > > >>
> > > >> But we can agree to disagree :-) It would be great to have more people
> > > chime and give their opinion.
> > > >>
> > > >> Fred
> > > >>
> > > >>>> On Sep 5, 2014, at 4:00 PM, Adrian Prantl <aprantl at apple.com> wrote:
> > > >>>>
> > > >>>>
> > > >>>>> On Sep 5, 2014, at 3:49 PM, Eric Christopher <echristo at gmail.com>
> > > wrote:
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>>
> > > >>>>> On Fri, Sep 5, 2014 at 3:43 PM, Duncan P. N. Exon Smith
> > > <dexonsmith at apple.com> wrote:
> > > >>>>>
> > > >>>>>> On 2014 Sep 5, at 16:01, Frédéric Riss <friss at apple.com> wrote:
> > > >>>>>>
> > > >>>>>> I couldn’t even find a subject expressing exactly what this patch
> > > is about… First of all, it’s meant to start a discussion, and I’m not
> > > proposing it for inclusion right now.
> > > >>>>>>
> > > >>>>>> The issue I’m trying to address is that template types are always
> > > canonicalized when emitted in the debug information (this is the desugar()
> > > call in UnwrapTypeForDebugInformation).
> > > >>>>>>
> > > >>>>>> This means that if the developer writes:
> > > >>>>>>
> > > >>>>>> typedef int A;
> > > >>>>>> template <typename T>
> > > >>>>>> struct S {};
> > > >>>>>>
> > > >>>>>> S<A> var;
> > > >>>>>>
> > > >>>>>> The variable var will have type S<int> and not S<A>. In this simple
> > > example, it’s not that much of an issue, but for heavily templated code,
> > > the full expansion might be really different from the original
> > > declaration.
> > > >>>>>>
> > > >>>>>> The attached patch makes us emit an intermediate typedef for the
> > > variable’s type:
> > > >>>>>>
> > > >>>>>> 0x0000002a:   DW_TAG_variable [2]
> > > >>>>>>             DW_AT_name [DW_FORM_strp]       (
> > > .debug_str[0x00000032] = “var")
> > > >>>>>>             DW_AT_type [DW_FORM_ref4]       (cu + 0x0040 =>
> > > {0x00000040})
> > > >>>>>>             DW_AT_external [DW_FORM_flag]   (0x01)
> > > >>>>>>             DW_AT_decl_file [DW_FORM_data1] (0x01)
> > > >>>>>>             DW_AT_decl_line [DW_FORM_data1] (8)
> > > >>>>>>             DW_AT_location [DW_FORM_block1] (<0x09> 03 70 6c 00 00
> > > 00 00 00 00 )
> > > >>>>>>
> > > >>>>>> 0x00000040:   DW_TAG_typedef [3]
> > > >>>>>>             DW_AT_type [DW_FORM_ref4]       (cu + 0x004b =>
> > > {0x0000004b})
> > > >>>>>>             DW_AT_name [DW_FORM_strp]       (
> > >.debug_str[0x00000035] = “S<A>")
> > > >>>>>>             DW_AT_decl_file [DW_FORM_data1] (0x01)
> > > >>>>>>             DW_AT_decl_line [DW_FORM_data1] (6)
> > > >>>>>>
> > > >>>>>> 0x0000004b:   DW_TAG_structure_type [4] *
> > > >>>>>>             DW_AT_name [DW_FORM_strp]       (
> > >.debug_str[0x0000003e] = “S<int>")
> > > >>>>>>             DW_AT_byte_size [DW_FORM_data1] (0x01)
> > > >>>>>>             DW_AT_decl_file [DW_FORM_data1] (0x01)
> > > >>>>>>             DW_AT_decl_line [DW_FORM_data1] (6)
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> Which basically is what I want, although I don’t like that it
> > > introduces a typedef where there is none in the code. I’d prefer that to
> > > be:
> > > >>>>>>
> > > >>>>>> DW_TAG_variable
> > > >>>>>> DW_AT_type: -> DW_TAG_structure_type
> > > >>>>>>                DW_AT_name: S<A>
> > > >>>>>>                DW_AT_specification: -> DW_TAG_structure_type
> > > >>>>>>                                          DW_AT_name: S<int>
> > > >>>>>>                                          …
> > > >>>>>>
> > > >>>>>> The patch also has the nice property of omitting the defaulted
> > > template arguments in the first level typedef. For example you get
> > > vector<A> instead of vector<int, std::__1::allocator<int> >.
> > > >>>>>
> > > >>>>> If you specify `vector<int>` in C++ do you get that instead of
> > > >>>>> `vector<int, std::__1::allocator<int>>`?
> > > >>>>>
> > > >>>>> Yeah, I mentioned this as possibly causing problems with debuggers
> > > or other consumers, but I don't have any proof past "ooooo scary!”.
> > > >>>>
> > > >>>> Well, [+lldb-dev], could this confuse debuggers? :-)
> > > >>>>
> > > >>>> -- adrian
> > > >>>>>
> > > >>>>> That said, I like the idea personally :)
> > > >>>>>
> > > >>>>> -eric
> > > >>>>>
> > > >>>>>
> > > >>>>>> Now there is one thing I really don’t like about the patch. In
> > > order not to emit typedefs for types that don’t need it, I use string
> > > comparison between the desugared and the original type. I haven’t
> > > quantified anything, but doing the construction of the type name for every
> > > template type and then comparing it to decide to use it or not seems like
> > > a big waste.
> > > >>>>>
> > > >>>>> Maybe someone on cfe-dev knows a better way.
> > > >>>>>
> > > >>>>>>
> > > >>>>>> Thoughts?
> > > >>>>>>
> > > >>>>>> <template-arg-typedefs.diff>
> > > >>>>>>
> > > >>>>>> Fred
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>>
> > > >>>>>> _______________________________________________
> > > >>>>>> llvm-commits mailing list
> > > >>>>>> llvm-commits at cs.uiuc.edu
> > > >>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > > >>>>>
> > > >>>>>
> > > >>>>
> > > >>>> _______________________________________________
> > > >>>> lldb-dev mailing list
> > > >>>> lldb-dev at cs.uiuc.edu
> > > >>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> > > >>>
> > > >>
> > > >>
> > > >> _______________________________________________
> > > >> lldb-dev mailing list
> > > >> lldb-dev at cs.uiuc.edu
> > > >> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> > > >
> > >
> > >
> > > _______________________________________________
> > > llvm-commits mailing list
> > > llvm-commits at cs.uiuc.edu
> > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >
> > _______________________________________________
> > llvm-commits mailing list
> > llvm-commits at cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits





More information about the llvm-commits mailing list