[lldb-dev] [RFC][PATCH] Keep un-canonicalized template types in the debug information

Thu Sep 18 13:05:57 PDT 2014

On Thu, Sep 18, 2014 at 1:00 PM, Robinson, Paul <
Paul_Robinson at playstation.sony.com> wrote:

> >From: David Blaikie [mailto:dblaikie at gmail.com]
> >On Wed, Sep 17, 2014 at 11:54 AM, Robinson, Paul <
> Paul_Robinson at playstation.sony.com> wrote:
> >> From: David Blaikie [mailto:dblaikie at gmail.com]
> >> > On Wed, Sep 17, 2014 at 7:45 AM, Robinson, Paul <
> Paul_Robinson at playstation.sony.com> wrote:
> >> > > From: David Blaikie [mailto:dblaikie at gmail.com]
> >> > > On Sat, Sep 13, 2014 at 6:54 PM, Robinson, Paul <
> Paul_Robinson at playstation.sony.com> wrote:
> >> > > > > > On 09 Sep 2014, at 00:01, jingham at apple.com wrote:
> >> > > > > > >
> >> > > > > > > From the debugger's standpoint, the functional concern is
> that if you do
> >> > > > > > something more real, like:
> >> > > > > > >
> >> > > > > > > typedef int A;
> >> > > > > > > template <typename T>
> >> > > > > > > struct S
> >> > > > > > > {
> >> > > > > > >  T my_t;
> >> > > > > > > };
> >> > > > > > >
> >> > > > > > > I want to make sure that the type of my_t is given as "A"
> not as "int".
> >> > > > > > The reason for that is that it is not uncommon to have data
> formatters
> >> > > > > > that trigger off the typedef name.  This happens when you use
> some common
> >> > > > > > underlying type like "int" but the value has some special
> meaning when it
> >> > > > > > is formally an "A", and you want to use the data formatters
> to give it an
> >> > > > > > appropriate presentation. Since the data formatters work by
> matching type
> >> > > > > > name, starting from the most specific on down, it is
> important that the
> >> > > > > > typedef name be preserved.
> >> > > > > > >
> >> > > > > > > However, it would be really odd to see:
> >> > > > > > >
> >> > > > > > > (lldb) expr -T -- my_s
> >> > > > > > > (S<int>) $1 = {
> >> > > > > > >  (A) my_t = 5
> >> > > > > > > }
> >> > > > > > >
> >> > > > > > > instead of:
> >> > > > > > >
> >> > > > > > > (lldb) expr -T -- my_s
> >> > > > > > > (S<A>) $1 = {
> >> > > > > > >  (A) my_t = 5
> >> > > > > > > }
> >> > > > > > >
> >> > > > > > > so I am in favor of presenting the template parameter type
> with the most
> >> > > > > > specific name it was given in the overall template type name.
> >> > > > > >
> >> > > > > > OK, we get this wrong today. I’ll try to look into it.
> >> > > > > >
> >> > > > > > What’s your take on the debug info representation for the
> templated class
> >> > > > > > type? The tentative patch introduces a typedef that declares
> S<A> as a
> >> > > > > > typedef for S<int>. The typedef doesn’t exist in the code,
> thus I find it
> >> > > > > > a bit of a lie to the debugger. I was more in favour of
> something like :
> >> > > > > >
> >> > > > > > DW_TAG_variable
> >> > > > > > DW_AT_type: -> DW_TAG_structure_type
> >> > > > > >                DW_AT_name: S<A>
> >> > > > > >                DW_AT_specification: -> DW_TAG_structure_type
> >> > > > > >                                          DW_AT_name: S<int>
> >> > > > > >
> >> > > > > > This way the canonical type is kept in the debug information,
> and the
> >> > > > > > declaration type is a real class type aliasing the canonical
> type. But I’m
> >> > > > > > not sure debuggers can digest this kind of aliasing.
> >> > > > > >
> >> > > > > > Fred
> >> > > > >
> >> > > > > Why introduce the extra typedef? S<A> should have a template
> parameter
> >> > > > > entry pointing to A which points to int.  The info should all
> be there
> >> > > > > without any extra stuff.  Or if you think something is missing,
> please
> >> > > > > provide a more complete example.
> >> > > > My immediate concern here would be either loss of information or
> bloat
> >> > > > when using that with type units (either bloat because each
> instantiation
> >> > > > with differently spelled (but identical) parameters is treated as
> a separate
> >> > > > type - or loss when the types are considered the same and all but
> one are
> >> > > > dropped at link time)
> >> > > You'll need to unpack that more because I'm not following the
> concern.
> >> > > If the typedefs are spelled differently, don't they count as
> different types?
> >> > > DWARF wants to describe the program as-written, and there's no
> S<int> written
> >> > > in the program.
> >> > >
> >> > > Maybe not in this TU, but possibly in another TU? Or by the user.
> >> > >
> >> > >  void func(S<int>);
> >> > >  ...
> >> > >  typedef int A;
> >> > >  S<A> s;
> >> > >  func(s); // calls the same function
> >> > >
> >> > > The user probably wants to be able to call void func with S<int> or
> S<A>
> >> > Sure.
> >> >
> >> > > (and, actually, in theory, with S<B> where B is another typedef of
> int, but
> >> > > that'll /really/ require DWARF consumer support and/or new DWARF
> wording).
> >> >
> >> > Not DWARF wording. DWARF doesn't say when you can and can't call
> something;
> >> > that's a debugger feature and therefore a debugger decision.
> >> >
> >> What I mean is we'd need some new DWARF to help explain which types are
> >> equivalent (or the debugger would have to do a lot of spelunking to try
> >> to find structurally equivalent types - "S<B>" and "S<A>", go look
> through
> >> their DW_TAG_template_type_params, see if they are typedefs to the same
> >> underlying type, etc... )
> >> >
> >> >
> >> > > We can't emit these as completely independent types - it would be
> verbose
> >> > > (every instantiation with different typedefs would be a whole
> separate type
> >> > > in the DWARF, not deduplicated by type units, etc) and wrong
> >> >
> >> > Yes, "typedef int A;" creates a synonym/alias not a new type, so S<A>
> and S<int>
> >> > describe the same type from the C++ perspective, so you don't want
> two complete
> >> > descriptions with different names, because that really would be
> describing them
> >> > as separate types.  What wrinkles my brow is having S<int> be the
> "real"
> >> > description even though it isn't instantiated that way in the
> program.  I wonder
> >> > if it should be marked artificial... but if you do instantiate S<int>
> in another
> >> > TU then you don't want that.  Huh.  It also seems weird to have this:
> >> >  DW_TAG_typedef
> >> >    DW_AT_name "S<A>"
> >> >    DW_AT_type -> S<int>
> >> > but I seem to be coming around to thinking that's the most viable way
> to have
> >> > a single actual instantiated type, and still have the correct names
> of things
> >*mostly* correct; this still loses "A" as the type of the data member.
> >
> >For the DW_TAG_template_type_parameter, you mean? No, it wouldn't.
> >
> > (as a side note, if you do actually have a data member (or any other
> mention) of
> >the template parameter type, neither Clang nor GCC really get that
> 'right' -
> >"template<typename T> struct foo { T t; }; foo<int> f;" - in both Clang
> and GCC,
> >the type of the 't' member of foo<int> is a direct reference to the "int"
> DIE, not
> >to the DW_TAG_template_type_parameter for "T" -> int)
>
> Huh. And DWARF doesn't say you should point to the
> template_type_parameter...
> I thought it did, but no.  Okay, so nothing is lost, but it feels desirable
> to me, that uses of the template parameter should cite it in the DWARF as
> well.
> But I guess we can leave that part of the debate for another time.
>
> >
> >Crud.
> >But I haven't come up with a way to get that back without basically
> instantiating
> >S<A> and S<int> separately.
> >
> >> >
> >> Yep - it's the only way I can think of giving this information in a way
> that's
> >> likely to work with existing consumers. It would probably be harmless
> to add
> >> DW_AT_artificial to the DW_TAG_typedef, if that's any help to any debug
> info
> >> consumer.
> >
> >Hmmm no, S<A> is not the artificial name;
> >
> >It's not the artificial name, but it is an artificial typedef.
>
> If the source only says S<A>, then the entire S<int> description is
> artificial,
> because *that's not what the user wrote*.  So both the typedef and the
> class type
> are artificial.  Gah.  Let's forget artificial here.
>
> >
> >some debuggers treat DW_AT_artificial
> >as meaning "don't show this to the user."
> >
> >In some sense that's what I want - we never wrote the typedef in the
> source
> >so I wouldn't want to see it rendered in the "list of typedefs" (or even
> >probably in the list of types, maybe).
> >
> >But S<A> is the name we *do* want to
> >show to the user.
> >
> >Maybe. Sometimes. But there could be many such aliases for the type. (&
> many
> >more that were never written in the source code, but are still valid in
> the
> >source language (every other typedef of int, every other way to name the
> int
> >type (decltype, etc)))
>
> But you *lose* cases where the typedef is the *same* *everywhere*.  And in
> many cases that typedef is a valuable thing, not the trivial rename we've
> been bandying about.  This is a more real example:
>
> typedef int int4 __attribute__((ext_vector_type(4)));
> template<typename T> struct TypeTraits {};
> template<>
> struct TypeTraits<int4> {
>   static unsigned MysteryNumber;
> };
> unsigned TypeTraits<int4>::MysteryNumber = 3U;
>
> Displaying "TypeTraits<int __attribute__((ext_vector_type(4)))>" is much
> worse than "TypeTraits<int4>" (and not just because it's shorter).
> More to the point, having the debugger *complain* when the user says
> something like "ptype TypeTraits<int4>" is a problem.
>
> Reducing debug-info size is a worthy goal, but don't degrade the debugging
> experience to get there.
>

I'm not sure which part of what I've said seemed like a suggestion to
degrade the debugging experience to minimize debug info size (the
proposition that we should use a typedef or other alias on top of the
canonical type? It wouldn't cause "ptype TypeTraits<int4>" to complain -
indeed for GDB ptyping a typedef gives /exactly/ the same output as if you
ptype the underlying type - it doesn't even mention that there's a typedef
involved:

typedef fooA foo<int>;

> --paulr
>
> >
> >
> >> That said, I'm not opposed to proposing something to DWARF to define
> some more
> >> 'proper' way to describe this.
> >
> >Yah.  I've been thinking about the DW_AT_specification idea too, which
> would be
> >something like this:
> >    DW_TAG_class_type
> >      DW_AT_name "S<A>"
> >      DW_AT_specification -> S<int>
> >
> >      DW_TAG_template_type_parameter
> >        DW_AT_name "T"
> >        DW_AT_type -> A
> >
> >The problem with this is you don't know where T is used in the template,
> so
> >you *still* don't know when to use A as the type of "field". Also it's
> kind
> >of an abuse of DW_AT_specification.  If we can't get A as the type of
> "field"
> >then the typedef is more straightforward and understandable.
> >
> >It's still a lot of DWARF to emit for every way the user has named the
> template
> >& I'm not sure how much value it provides - are there use cases you have
> in mind
> >that would benefit from the increased fidelity of knowing which template
> argument
> >corresponds to the way the user wrote the type.
> >
> > (& what would we emit if the user named the type in some other more
> exotic way:
> >int func(); template<typename T> struct S { }; ... S<decltype(func())> s;
> )
> >
> >
> >Maybe I'll pop a note to the DWARF committee for a broader set of
> opinions.
> >
> >>
> >> One other open question is then, when, if ever, to reference the
> DW_TAG_typedef
> >> rather than the underlying type? Do we just reference it whenever the
> user
> >> writes using that name?
> >>
> >>  void f(S<A>);
> >>  ...
> >>  void f(S<B>) { ... }
> >>
> >> etc... (this would be just as possible/we could maybe treat it the same
> as if
> >> the user wrote "void f(A); ... void f(B) { ... }")
> >
> >That's what I would do, and I think is more conformant to the DWARF spec.
> >--paulr
> >
> >>
> >> > (because DWARF is all about the name "as it appears in the source
> program.")
> >> >
> >> > > (the debugger wouldn't know these are actually the same type so
> wouldn't
> >> > > allow function calls, etc).
> >> > >
> >> > > - David
> >> > >
> >> > > >
> >> > > >
> >> > > > > Jim
> >> > > > >
> >> > > >> On Sep 8, 2014, at 12:38 PM, Frédéric Riss <friss at apple.com>
> wrote:
> >> > > >>
> >> > > >>
> >> > > >>> On 08 Sep 2014, at 19:31, Greg Clayton <gclayton at apple.com>
> wrote:
> >> > > >>>
> >> > > >>> This means you will see "S<A>" as the type for your variables
> in the
> >> > > debugger when you view variables or children of
> structs/unions/classes. I
> >> > > think this is not what the user would want to see. I would rather
> see
> >> > > "S<int>" as the type for my variable than see "S<A>”.
> >> > > >>
> >> > > >> I find it more accurate for the debugger to report what has
> actually
> >> > > been put in the code. Moreover when a typedef is used, it’s usually
> to
> >> > > make things more readable not to hide information, thus I guess it
> would
> >> > > usually be as informative while being more compact. The debugger
> needs to
> >> > > have a way to describe the real type behind the abbreviated name
> though,
> >> > > we must not have less information compared to what we have today.
> >> > > >>
> >> > > >> Another point: this allows the debugger to know what S<A>
> actually is.
> >> > > Without it, the debugger only knows the canonical type. This means
> that
> >> > > currently you can’t copy/paste a piece of code that references that
> kind
> >> > > of template names and have it parse correctly. I /think/ that
> having this
> >> > > information in the debug info will allow more of this to work.
> >> > > >>
> >> > > >> But we can agree to disagree :-) It would be great to have more
> people
> >> > > chime and give their opinion.
> >> > > >>
> >> > > >> Fred
> >> > > >>
> >> > > >>>> On Sep 5, 2014, at 4:00 PM, Adrian Prantl <aprantl at apple.com>
> wrote:
> >> > > >>>>
> >> > > >>>>
> >> > > >>>>> On Sep 5, 2014, at 3:49 PM, Eric Christopher <
> echristo at gmail.com>
> >> > > wrote:
> >> > > >>>>>
> >> > > >>>>>
> >> > > >>>>>
> >> > > >>>>>
> >> > > >>>>> On Fri, Sep 5, 2014 at 3:43 PM, Duncan P. N. Exon Smith
> >> > > <dexonsmith at apple.com> wrote:
> >> > > >>>>>
> >> > > >>>>>> On 2014 Sep 5, at 16:01, Frédéric Riss <friss at apple.com>
> wrote:
> >> > > >>>>>>
> >> > > >>>>>> I couldn’t even find a subject expressing exactly what this
> patch
> >> > > is about… First of all, it’s meant to start a discussion, and I’m
> not
> >> > > proposing it for inclusion right now.
> >> > > >>>>>>
> >> > > >>>>>> The issue I’m trying to address is that template types are
> always
> >> > > canonicalized when emitted in the debug information (this is the
> desugar()
> >> > > call in UnwrapTypeForDebugInformation).
> >> > > >>>>>>
> >> > > >>>>>> This means that if the developer writes:
> >> > > >>>>>>
> >> > > >>>>>> typedef int A;
> >> > > >>>>>> template <typename T>
> >> > > >>>>>> struct S {};
> >> > > >>>>>>
> >> > > >>>>>> S<A> var;
> >> > > >>>>>>
> >> > > >>>>>> The variable var will have type S<int> and not S<A>. In this
> simple
> >> > > example, it’s not that much of an issue, but for heavily templated
> code,
> >> > > the full expansion might be really different from the original
> >> > > declaration.
> >> > > >>>>>>
> >> > > >>>>>> The attached patch makes us emit an intermediate typedef for
> the
> >> > > variable’s type:
> >> > > >>>>>>
> >> > > >>>>>> 0x0000002a:   DW_TAG_variable [2]
> >> > > >>>>>>             DW_AT_name [DW_FORM_strp]       (
> >> > > .debug_str[0x00000032] = “var")
> >> > > >>>>>>             DW_AT_type [DW_FORM_ref4]       (cu + 0x0040 =>
> >> > > {0x00000040})
> >> > > >>>>>>             DW_AT_external [DW_FORM_flag]   (0x01)
> >> > > >>>>>>             DW_AT_decl_file [DW_FORM_data1] (0x01)
> >> > > >>>>>>             DW_AT_decl_line [DW_FORM_data1] (8)
> >> > > >>>>>>             DW_AT_location [DW_FORM_block1] (<0x09> 03 70 6c
> 00 00
> >> > > 00 00 00 00 )
> >> > > >>>>>>
> >> > > >>>>>> 0x00000040:   DW_TAG_typedef [3]
> >> > > >>>>>>             DW_AT_type [DW_FORM_ref4]       (cu + 0x004b =>
> >> > > {0x0000004b})
> >> > > >>>>>>             DW_AT_name [DW_FORM_strp]       (
> >> > >.debug_str[0x00000035] = “S<A>")
> >> > > >>>>>>             DW_AT_decl_file [DW_FORM_data1] (0x01)
> >> > > >>>>>>             DW_AT_decl_line [DW_FORM_data1] (6)
> >> > > >>>>>>
> >> > > >>>>>> 0x0000004b:   DW_TAG_structure_type [4] *
> >> > > >>>>>>             DW_AT_name [DW_FORM_strp]       (
> >> > >.debug_str[0x0000003e] = “S<int>")
> >> > > >>>>>>             DW_AT_byte_size [DW_FORM_data1] (0x01)
> >> > > >>>>>>             DW_AT_decl_file [DW_FORM_data1] (0x01)
> >> > > >>>>>>             DW_AT_decl_line [DW_FORM_data1] (6)
> >> > > >>>>>>
> >> > > >>>>>>
> >> > > >>>>>> Which basically is what I want, although I don’t like that it
> >> > > introduces a typedef where there is none in the code. I’d prefer
> that to
> >> > > be:
> >> > > >>>>>>
> >> > > >>>>>> DW_TAG_variable
> >> > > >>>>>> DW_AT_type: -> DW_TAG_structure_type
> >> > > >>>>>>                DW_AT_name: S<A>
> >> > > >>>>>>                DW_AT_specification: -> DW_TAG_structure_type
> >> > > >>>>>>                                          DW_AT_name: S<int>
> >> > > >>>>>>                                          …
> >> > > >>>>>>
> >> > > >>>>>> The patch also has the nice property of omitting the
> defaulted
> >> > > template arguments in the first level typedef. For example you get
> >> > > vector<A> instead of vector<int, std::__1::allocator<int> >.
> >> > > >>>>>
> >> > > >>>>> If you specify `vector<int>` in C++ do you get that instead of
> >> > > >>>>> `vector<int, std::__1::allocator<int>>`?
> >> > > >>>>>
> >> > > >>>>> Yeah, I mentioned this as possibly causing problems with
> debuggers
> >> > > or other consumers, but I don't have any proof past "ooooo scary!”.
> >> > > >>>>
> >> > > >>>> Well, [+lldb-dev], could this confuse debuggers? :-)
> >> > > >>>>
> >> > > >>>> -- adrian
> >> > > >>>>>
> >> > > >>>>> That said, I like the idea personally :)
> >> > > >>>>>
> >> > > >>>>> -eric
> >> > > >>>>>
> >> > > >>>>>
> >> > > >>>>>> Now there is one thing I really don’t like about the patch.
> In
> >> > > order not to emit typedefs for types that don’t need it, I use
> string
> >> > > comparison between the desugared and the original type. I haven’t
> >> > > quantified anything, but doing the construction of the type name
> for every
> >> > > template type and then comparing it to decide to use it or not
> seems like
> >> > > a big waste.
> >> > > >>>>>
> >> > > >>>>> Maybe someone on cfe-dev knows a better way.
> >> > > >>>>>
> >> > > >>>>>>
> >> > > >>>>>> Thoughts?
> >> > > >>>>>>
> >> > > >>>>>> <template-arg-typedefs.diff>
> >> > > >>>>>>
> >> > > >>>>>> Fred
> >> > > >>>>>>
> >> > > >>>>>>
> >> > > >>>>>>
> >> > > >>>>>>
> >> > > >>>>>>
> >> > > >>>>>> _______________________________________________
> >> > > >>>>>> llvm-commits mailing list
> >> > > >>>>>> llvm-commits at cs.uiuc.edu
> >> > > >>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >> > > >>>>>
> >> > > >>>>>
> >> > > >>>>
> >> > > >>>> _______________________________________________
> >> > > >>>> lldb-dev mailing list
> >> > > >>>> lldb-dev at cs.uiuc.edu
> >> > > >>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >> > > >>>
> >> > > >>
> >> > > >>
> >> > > >> _______________________________________________
> >> > > >> lldb-dev mailing list
> >> > > >> lldb-dev at cs.uiuc.edu
> >> > > >> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> >> > > >
> >> > >
> >> > >
> >> > > _______________________________________________
> >> > > llvm-commits mailing list
> >> > > llvm-commits at cs.uiuc.edu
> >> > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> >> >
> >> > _______________________________________________
> >> > llvm-commits mailing list
> >> > llvm-commits at cs.uiuc.edu
> >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/lldb-dev/attachments/20140918/51e57347/attachment.html>