[lldb-dev] [RFC][PATCH] Keep un-canonicalized template types in the debug information

David Blaikie dblaikie at gmail.com
Wed Sep 17 12:04:11 PDT 2014


On Wed, Sep 17, 2014 at 11:54 AM, Robinson, Paul <
Paul_Robinson at playstation.sony.com> wrote:

> > From: David Blaikie [mailto:dblaikie at gmail.com]
> > > On Wed, Sep 17, 2014 at 7:45 AM, Robinson, Paul <
> Paul_Robinson at playstation.sony.com> wrote:
> > > > From: David Blaikie [mailto:dblaikie at gmail.com]
> > > > On Sat, Sep 13, 2014 at 6:54 PM, Robinson, Paul <
> Paul_Robinson at playstation.sony.com> wrote:
> > > > > > > On 09 Sep 2014, at 00:01, jingham at apple.com wrote:
> > > > > > > >
> > > > > > > > From the debugger's standpoint, the functional concern is
> that if you do
> > > > > > > something more real, like:
> > > > > > > >
> > > > > > > > typedef int A;
> > > > > > > > template <typename T>
> > > > > > > > struct S
> > > > > > > > {
> > > > > > > >  T my_t;
> > > > > > > > };
> > > > > > > >
> > > > > > > > I want to make sure that the type of my_t is given as "A"
> not as "int".
> > > > > > > The reason for that is that it is not uncommon to have data
> formatters
> > > > > > > that trigger off the typedef name.  This happens when you use
> some common
> > > > > > > underlying type like "int" but the value has some special
> meaning when it
> > > > > > > is formally an "A", and you want to use the data formatters to
> give it an
> > > > > > > appropriate presentation. Since the data formatters work by
> matching type
> > > > > > > name, starting from the most specific on down, it is important
> that the
> > > > > > > typedef name be preserved.
> > > > > > > >
> > > > > > > > However, it would be really odd to see:
> > > > > > > >
> > > > > > > > (lldb) expr -T -- my_s
> > > > > > > > (S<int>) $1 = {
> > > > > > > >  (A) my_t = 5
> > > > > > > > }
> > > > > > > >
> > > > > > > > instead of:
> > > > > > > >
> > > > > > > > (lldb) expr -T -- my_s
> > > > > > > > (S<A>) $1 = {
> > > > > > > >  (A) my_t = 5
> > > > > > > > }
> > > > > > > >
> > > > > > > > so I am in favor of presenting the template parameter type
> with the most
> > > > > > > specific name it was given in the overall template type name.
> > > > > > >
> > > > > > > OK, we get this wrong today. I’ll try to look into it.
> > > > > > >
> > > > > > > What’s your take on the debug info representation for the
> templated class
> > > > > > > type? The tentative patch introduces a typedef that declares
> S<A> as a
> > > > > > > typedef for S<int>. The typedef doesn’t exist in the code,
> thus I find it
> > > > > > > a bit of a lie to the debugger. I was more in favour of
> something like :
> > > > > > >
> > > > > > > DW_TAG_variable
> > > > > > > DW_AT_type: -> DW_TAG_structure_type
> > > > > > >                DW_AT_name: S<A>
> > > > > > >                DW_AT_specification: -> DW_TAG_structure_type
> > > > > > >                                          DW_AT_name: S<int>
> > > > > > >
> > > > > > > This way the canonical type is kept in the debug information,
> and the
> > > > > > > declaration type is a real class type aliasing the canonical
> type. But I’m
> > > > > > > not sure debuggers can digest this kind of aliasing.
> > > > > > >
> > > > > > > Fred
> > > > > >
> > > > > > Why introduce the extra typedef? S<A> should have a template
> parameter
> > > > > > entry pointing to A which points to int.  The info should all be
> there
> > > > > > without any extra stuff.  Or if you think something is missing,
> please
> > > > > > provide a more complete example.
> > > > > My immediate concern here would be either loss of information or
> bloat
> > > > > when using that with type units (either bloat because each
> instantiation
> > > > > with differently spelled (but identical) parameters is treated as
> a separate
> > > > > type - or loss when the types are considered the same and all but
> one are
> > > > > dropped at link time)
> > > > You'll need to unpack that more because I'm not following the
> concern.
> > > > If the typedefs are spelled differently, don't they count as
> different types?
> > > > DWARF wants to describe the program as-written, and there's no
> S<int> written
> > > > in the program.
> > > >
> > > > Maybe not in this TU, but possibly in another TU? Or by the user.
> > > >
> > > >  void func(S<int>);
> > > >  ...
> > > >  typedef int A;
> > > >  S<A> s;
> > > >  func(s); // calls the same function
> > > >
> > > > The user probably wants to be able to call void func with S<int> or
> S<A>
> > > Sure.
> > >
> > > > (and, actually, in theory, with S<B> where B is another typedef of
> int, but
> > > > that'll /really/ require DWARF consumer support and/or new DWARF
> wording).
> > >
> > > Not DWARF wording. DWARF doesn't say when you can and can't call
> something;
> > > that's a debugger feature and therefore a debugger decision.
> > >
> > What I mean is we'd need some new DWARF to help explain which types are
> > equivalent (or the debugger would have to do a lot of spelunking to try
> > to find structurally equivalent types - "S<B>" and "S<A>", go look
> through
> > their DW_TAG_template_type_params, see if they are typedefs to the same
> > underlying type, etc... )
> > >
> > >
> > > > We can't emit these as completely independent types - it would be
> verbose
> > > > (every instantiation with different typedefs would be a whole
> separate type
> > > > in the DWARF, not deduplicated by type units, etc) and wrong
> > >
> > > Yes, "typedef int A;" creates a synonym/alias not a new type, so S<A>
> and S<int>
> > > describe the same type from the C++ perspective, so you don't want two
> complete
> > > descriptions with different names, because that really would be
> describing them
> > > as separate types.  What wrinkles my brow is having S<int> be the
> "real"
> > > description even though it isn't instantiated that way in the
> program.  I wonder
> > > if it should be marked artificial... but if you do instantiate S<int>
> in another
> > > TU then you don't want that.  Huh.  It also seems weird to have this:
> > >  DW_TAG_typedef
> > >    DW_AT_name "S<A>"
> > >    DW_AT_type -> S<int>
> > > but I seem to be coming around to thinking that's the most viable way
> to have
> > > a single actual instantiated type, and still have the correct names of
> things
>
> *mostly* correct; this still loses "A" as the type of the data member.


For the DW_TAG_template_type_parameter, you mean? No, it wouldn't.

(as a side note, if you do actually have a data member (or any other
mention) of the template parameter type, neither Clang nor GCC really get
that 'right' - "template<typename T> struct foo { T t; }; foo<int> f;" - in
both Clang and GCC, the type of the 't' member of foo<int> is a direct
reference to the "int" DIE, not to the DW_TAG_template_type_parameter for
"T" -> int)


> Crud.
> But I haven't come up with a way to get that back without basically
> instantiating
> S<A> and S<int> separately.
>
> > >
> > Yep - it's the only way I can think of giving this information in a way
> that's
> > likely to work with existing consumers. It would probably be harmless to
> add
> > DW_AT_artificial to the DW_TAG_typedef, if that's any help to any debug
> info
> > consumer.
>
> Hmmm no, S<A> is not the artificial name;


It's not the artificial name, but it is an artificial typedef.


> some debuggers treat DW_AT_artificial
> as meaning "don't show this to the user."


In some sense that's what I want - we never wrote the typedef in the source
so I wouldn't want to see it rendered in the "list of typedefs" (or even
probably in the list of types, maybe).


> But S<A> is the name we *do* want to
> show to the user.
>

Maybe. Sometimes. But there could be many such aliases for the type. (&
many more that were never written in the source code, but are still valid
in the source language (every other typedef of int, every other way to name
the int type (decltype, etc)))


>
> > That said, I'm not opposed to proposing something to DWARF to define
> some more
> > 'proper' way to describe this.
>
> Yah.  I've been thinking about the DW_AT_specification idea too, which
> would be
> something like this:
>     DW_TAG_class_type
>       DW_AT_name "S<A>"
>       DW_AT_specification -> S<int>
>
>       DW_TAG_template_type_parameter
>         DW_AT_name "T"
>         DW_AT_type -> A
>
> The problem with this is you don't know where T is used in the template, so
> you *still* don't know when to use A as the type of "field". Also it's kind
> of an abuse of DW_AT_specification.  If we can't get A as the type of
> "field"
> then the typedef is more straightforward and understandable.
>

It's still a lot of DWARF to emit for every way the user has named the
template & I'm not sure how much value it provides - are there use cases
you have in mind that would benefit from the increased fidelity of knowing
which template argument corresponds to the way the user wrote the type.

(& what would we emit if the user named the type in some other more exotic
way: int func(); template<typename T> struct S { }; ... S<decltype(func())>
s; )


>
> Maybe I'll pop a note to the DWARF committee for a broader set of opinions.
>
> >
> > One other open question is then, when, if ever, to reference the
> DW_TAG_typedef
> > rather than the underlying type? Do we just reference it whenever the
> user
> > writes using that name?
> >
> >  void f(S<A>);
> >  ...
> >  void f(S<B>) { ... }
> >
> > etc... (this would be just as possible/we could maybe treat it the same
> as if
> > the user wrote "void f(A); ... void f(B) { ... }")
>
> That's what I would do, and I think is more conformant to the DWARF spec.
> --paulr
>
> >
> > > (because DWARF is all about the name "as it appears in the source
> program.")
> > >
> > > > (the debugger wouldn't know these are actually the same type so
> wouldn't
> > > > allow function calls, etc).
> > > >
> > > > - David
> > > >
> > > > >
> > > > >
> > > > > > Jim
> > > > > >
> > > > >> On Sep 8, 2014, at 12:38 PM, Frédéric Riss <friss at apple.com>
> wrote:
> > > > >>
> > > > >>
> > > > >>> On 08 Sep 2014, at 19:31, Greg Clayton <gclayton at apple.com>
> wrote:
> > > > >>>
> > > > >>> This means you will see "S<A>" as the type for your variables in
> the
> > > > debugger when you view variables or children of
> structs/unions/classes. I
> > > > think this is not what the user would want to see. I would rather see
> > > > "S<int>" as the type for my variable than see "S<A>”.
> > > > >>
> > > > >> I find it more accurate for the debugger to report what has
> actually
> > > > been put in the code. Moreover when a typedef is used, it’s usually
> to
> > > > make things more readable not to hide information, thus I guess it
> would
> > > > usually be as informative while being more compact. The debugger
> needs to
> > > > have a way to describe the real type behind the abbreviated name
> though,
> > > > we must not have less information compared to what we have today.
> > > > >>
> > > > >> Another point: this allows the debugger to know what S<A>
> actually is.
> > > > Without it, the debugger only knows the canonical type. This means
> that
> > > > currently you can’t copy/paste a piece of code that references that
> kind
> > > > of template names and have it parse correctly. I /think/ that having
> this
> > > > information in the debug info will allow more of this to work.
> > > > >>
> > > > >> But we can agree to disagree :-) It would be great to have more
> people
> > > > chime and give their opinion.
> > > > >>
> > > > >> Fred
> > > > >>
> > > > >>>> On Sep 5, 2014, at 4:00 PM, Adrian Prantl <aprantl at apple.com>
> wrote:
> > > > >>>>
> > > > >>>>
> > > > >>>>> On Sep 5, 2014, at 3:49 PM, Eric Christopher <
> echristo at gmail.com>
> > > > wrote:
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>
> > > > >>>>> On Fri, Sep 5, 2014 at 3:43 PM, Duncan P. N. Exon Smith
> > > > <dexonsmith at apple.com> wrote:
> > > > >>>>>
> > > > >>>>>> On 2014 Sep 5, at 16:01, Frédéric Riss <friss at apple.com>
> wrote:
> > > > >>>>>>
> > > > >>>>>> I couldn’t even find a subject expressing exactly what this
> patch
> > > > is about… First of all, it’s meant to start a discussion, and I’m not
> > > > proposing it for inclusion right now.
> > > > >>>>>>
> > > > >>>>>> The issue I’m trying to address is that template types are
> always
> > > > canonicalized when emitted in the debug information (this is the
> desugar()
> > > > call in UnwrapTypeForDebugInformation).
> > > > >>>>>>
> > > > >>>>>> This means that if the developer writes:
> > > > >>>>>>
> > > > >>>>>> typedef int A;
> > > > >>>>>> template <typename T>
> > > > >>>>>> struct S {};
> > > > >>>>>>
> > > > >>>>>> S<A> var;
> > > > >>>>>>
> > > > >>>>>> The variable var will have type S<int> and not S<A>. In this
> simple
> > > > example, it’s not that much of an issue, but for heavily templated
> code,
> > > > the full expansion might be really different from the original
> > > > declaration.
> > > > >>>>>>
> > > > >>>>>> The attached patch makes us emit an intermediate typedef for
> the
> > > > variable’s type:
> > > > >>>>>>
> > > > >>>>>> 0x0000002a:   DW_TAG_variable [2]
> > > > >>>>>>             DW_AT_name [DW_FORM_strp]       (
> > > > .debug_str[0x00000032] = “var")
> > > > >>>>>>             DW_AT_type [DW_FORM_ref4]       (cu + 0x0040 =>
> > > > {0x00000040})
> > > > >>>>>>             DW_AT_external [DW_FORM_flag]   (0x01)
> > > > >>>>>>             DW_AT_decl_file [DW_FORM_data1] (0x01)
> > > > >>>>>>             DW_AT_decl_line [DW_FORM_data1] (8)
> > > > >>>>>>             DW_AT_location [DW_FORM_block1] (<0x09> 03 70 6c
> 00 00
> > > > 00 00 00 00 )
> > > > >>>>>>
> > > > >>>>>> 0x00000040:   DW_TAG_typedef [3]
> > > > >>>>>>             DW_AT_type [DW_FORM_ref4]       (cu + 0x004b =>
> > > > {0x0000004b})
> > > > >>>>>>             DW_AT_name [DW_FORM_strp]       (
> > > >.debug_str[0x00000035] = “S<A>")
> > > > >>>>>>             DW_AT_decl_file [DW_FORM_data1] (0x01)
> > > > >>>>>>             DW_AT_decl_line [DW_FORM_data1] (6)
> > > > >>>>>>
> > > > >>>>>> 0x0000004b:   DW_TAG_structure_type [4] *
> > > > >>>>>>             DW_AT_name [DW_FORM_strp]       (
> > > >.debug_str[0x0000003e] = “S<int>")
> > > > >>>>>>             DW_AT_byte_size [DW_FORM_data1] (0x01)
> > > > >>>>>>             DW_AT_decl_file [DW_FORM_data1] (0x01)
> > > > >>>>>>             DW_AT_decl_line [DW_FORM_data1] (6)
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> Which basically is what I want, although I don’t like that it
> > > > introduces a typedef where there is none in the code. I’d prefer
> that to
> > > > be:
> > > > >>>>>>
> > > > >>>>>> DW_TAG_variable
> > > > >>>>>> DW_AT_type: -> DW_TAG_structure_type
> > > > >>>>>>                DW_AT_name: S<A>
> > > > >>>>>>                DW_AT_specification: -> DW_TAG_structure_type
> > > > >>>>>>                                          DW_AT_name: S<int>
> > > > >>>>>>                                          …
> > > > >>>>>>
> > > > >>>>>> The patch also has the nice property of omitting the defaulted
> > > > template arguments in the first level typedef. For example you get
> > > > vector<A> instead of vector<int, std::__1::allocator<int> >.
> > > > >>>>>
> > > > >>>>> If you specify `vector<int>` in C++ do you get that instead of
> > > > >>>>> `vector<int, std::__1::allocator<int>>`?
> > > > >>>>>
> > > > >>>>> Yeah, I mentioned this as possibly causing problems with
> debuggers
> > > > or other consumers, but I don't have any proof past "ooooo scary!”.
> > > > >>>>
> > > > >>>> Well, [+lldb-dev], could this confuse debuggers? :-)
> > > > >>>>
> > > > >>>> -- adrian
> > > > >>>>>
> > > > >>>>> That said, I like the idea personally :)
> > > > >>>>>
> > > > >>>>> -eric
> > > > >>>>>
> > > > >>>>>
> > > > >>>>>> Now there is one thing I really don’t like about the patch. In
> > > > order not to emit typedefs for types that don’t need it, I use string
> > > > comparison between the desugared and the original type. I haven’t
> > > > quantified anything, but doing the construction of the type name for
> every
> > > > template type and then comparing it to decide to use it or not seems
> like
> > > > a big waste.
> > > > >>>>>
> > > > >>>>> Maybe someone on cfe-dev knows a better way.
> > > > >>>>>
> > > > >>>>>>
> > > > >>>>>> Thoughts?
> > > > >>>>>>
> > > > >>>>>> <template-arg-typedefs.diff>
> > > > >>>>>>
> > > > >>>>>> Fred
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>>
> > > > >>>>>> _______________________________________________
> > > > >>>>>> llvm-commits mailing list
> > > > >>>>>> llvm-commits at cs.uiuc.edu
> > > > >>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > > > >>>>>
> > > > >>>>>
> > > > >>>>
> > > > >>>> _______________________________________________
> > > > >>>> lldb-dev mailing list
> > > > >>>> lldb-dev at cs.uiuc.edu
> > > > >>>> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> > > > >>>
> > > > >>
> > > > >>
> > > > >> _______________________________________________
> > > > >> lldb-dev mailing list
> > > > >> lldb-dev at cs.uiuc.edu
> > > > >> http://lists.cs.uiuc.edu/mailman/listinfo/lldb-dev
> > > > >
> > > >
> > > >
> > > > _______________________________________________
> > > > llvm-commits mailing list
> > > > llvm-commits at cs.uiuc.edu
> > > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
> > >
> > > _______________________________________________
> > > llvm-commits mailing list
> > > llvm-commits at cs.uiuc.edu
> > > http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20140917/74e3b42e/attachment.html>


More information about the llvm-commits mailing list