[PATCH] D39622: Fix type name generation in DWARF for template instantiations with enum types and template specializations

Tue Dec 19 13:04:01 PST 2017

There was a discussion on the lldb-dev mailing list on this topic, and I
believe a reliable solution was suggested [1]: generate
DW_AT_linkage_name for the vtable DIE of a class and provide an
additional accelerator table. I am going to try to implement this
approach (it will require some work on both the clang and lldb sides),
but I would also like to understand whether I should discard or complete
the current patch. I would certainly prefer to complete it if it could be
applied (I suppose that, at least, tests should be added), because even
with the long-term solution implemented in clang/lldb, gdb still won't
resolve dynamic types properly for the described cases.
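
To make the idea concrete, here is a minimal sketch of the kind of lookup such an accelerator table would allow. It is only an illustration under my own assumptions: VTableIndex, Add and Lookup are hypothetical names, not existing clang or lldb APIs, and the DIE offsets are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical accelerator table: linkage name of a class's vtable
// -> offset of the DIE describing that class. Not an lldb data structure.
class VTableIndex {
public:
  void Add(const std::string &vtable_linkage_name, std::uint64_t die_offset) {
    table_[vtable_linkage_name] = die_offset;
  }

  // Returns true and fills die_offset if the vtable symbol is known.
  bool Lookup(const std::string &vtable_linkage_name,
              std::uint64_t &die_offset) const {
    auto it = table_.find(vtable_linkage_name);
    if (it == table_.end())
      return false;
    die_offset = it->second;
    return true;
  }

private:
  std::unordered_map<std::string, std::uint64_t> table_;
};
```

With a table like this, the debugger would read the vtable pointer of an object, find the symbol it points into, and look the class up by that symbol's name, so the printed form of the type name would not matter.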

[1] - http://lists.llvm.org/pipermail/lldb-dev/2017-December/013048.html

15.12.2017 21:25, David Blaikie via cfe-commits wrote:
>
>
> On Fri, Dec 15, 2017 at 8:09 AM xgsa <xgsa at yandex.ua 
> <mailto:xgsa at yandex.ua>> wrote:
>
>     David, thank you for the detailed answer and corner cases.
>     Just to clarify: everywhere in my mail where I mentioned
>     "debugger", I meant LLDB, not GDB (except where I mentioned
>     GDB explicitly). Currently I have no plans to work on GDB;
>     however, I would like to make the clang+LLDB pair work in such
>     cases.
>
>
> *nod* My concern is making sure, if possible, we figure out a design 
> that seems viable long-term/in general. (& if we figure out what that 
> design is, but decide it's not achievable immediately, we can make 
> deliberate tradeoffs, document the long term goal & what the short 
> term solutions cost relative to that goal, etc)
>
>     Thus, I have described your idea on the lldb-dev mailing list [1].
>     Still, I have some concerns about the performance of such
>     semantically aware matching. Currently, with accelerator tables
>     (e.g. apple_types), matching is as fast as a hash map lookup,
>     and the hash map is loaded almost without postprocessing.
>     Semantically aware matching will require either processing during
>     startup or an almost linear lookup.
>
>
> Yep, I agree - that seems like a reasonable concern. I wonder whether 
> it'd be reasonable to put accelerator table entries containing the 
> base name of the template to ease such lookup?
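
A rough sketch of how base-name entries could keep the lookup close to a hash map lookup. Everything here is an assumption for illustration: BaseName and BaseNameIndex are hypothetical helpers, not part of lldb or of the apple_types format, and the semantic comparison of template arguments is left to the caller.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: strip the template argument list,
// e.g. "foo<X::Y>" -> "foo". Names without '<' are returned unchanged.
inline std::string BaseName(const std::string &type_name) {
  return type_name.substr(0, type_name.find('<'));
}

// Hypothetical index: template base name -> offsets of candidate type DIEs.
class BaseNameIndex {
public:
  void Add(const std::string &type_name, std::uint64_t die_offset) {
    table_[BaseName(type_name)].push_back(die_offset);
  }

  // Returns every DIE whose type shares the base name. The caller still has
  // to compare template arguments, but only for this short candidate list.
  std::vector<std::uint64_t> Candidates(const std::string &type_name) const {
    auto it = table_.find(BaseName(type_name));
    return it == table_.end() ? std::vector<std::uint64_t>{} : it->second;
  }

private:
  std::unordered_map<std::string, std::vector<std::uint64_t>> table_;
};
```

The expensive comparison then runs only over the instantiations of one template rather than over all types.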
>
>     Still, should this topic be raised on cfe-dev, or are all the
>     interested people already here?
>
>
> Yeah, might be worth moving this to a thread there. Though we probably 
> have all the right people here, it's a better spot for the 
> conversation even for spectators, history (finding this later when we 
> have similar questions, etc), etc.
>
>     [1] -
>     http://lists.llvm.org/pipermail/lldb-dev/2017-December/013038.html
>     14.12.2017, 22:40, "David Blaikie" <dblaikie at gmail.com
>     <mailto:dblaikie at gmail.com>>:
>>     On Thu, Dec 14, 2017 at 2:21 AM Anton via Phabricator
>>     <reviews at reviews.llvm.org <mailto:reviews at reviews.llvm.org>> wrote:
>>
>>         xgsa added a comment.
>>
>>         In https://reviews.llvm.org/D39622#954585, @probinson wrote:
>>
>>         > Philosophically, mangled names and DWARF information serve
>>         different purposes, and I don't think you will find one true
>>         solution where both of them can yield the same name that
>>         everyone will be happy with.  Mangled names exist to provide
>>         unique and reproducible identifiers for the "same" entity
>>         across compilation units.  They are carefully specified (for
>>         example) to allow a linker to associate a reference in one
>>         object file to a definition in a different object file, and
>>         be guaranteed that the association is correct.  A demangled
>>         name is a necessarily context-free translation of the mangled
>>         name into something that has a closer relationship to how a
>>         human would think of or write the name of the thing, but
>>         isn't necessarily the only way to write the name of the thing.
>>         >
>>         > DWARF names are (deliberately not carefully specified)
>>         strings that ought to bear some relationship to how source
>>         code would name the thing, but you probably don't want to
>>         attach semantic significance to those names.  This is rather
>>         emphatically true for names containing template parameters. 
>>         Typedefs (and their recent offspring, 'using' aliases) are
>>         your sworn enemy here.  Enums, as you have found, are also a
>>         problem.
>>         >
>>         > Basically, the type of an entity does not have a unique
>>         name, and trying to coerce different representations of the
>>         type into having the same unique name is a losing battle.
>>
>>
>>     I'm actually going back and forth on this ^. It seems to me,
>>     regardless of mangled names, etc, it'd be good if LLVM used the
>>     same name for a type in DWARF across different translation units.
>>     And, to a large extent, we do (the case of typedefs in template
>>     parameters doesn't seem to present a problem for the current
>>     implementation - the underlying type is used), enums being one
>>     place where we don't - and we don't actually make it that much
>>     closer to the source/based on what the user wrote.
>>
>>     Even if the user had: "enum X { Y = 0, Z = 0 }; ... template<enum
>>     X> struct foo; ... foo<Z>" LLVM still describes that type as
>>     "foo<X::Y>". Also if you have "enum X: int; ... foo<(X)0>" you
>>     get "foo<0>" whereas in another translation unit with a
>>     definition of X you'd get "foo<X::Y>".
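
A minimal, compilable form of the example above, assuming a C++11 compiler. X, Y, Z and foo come from the quoted text; the static_asserts are only an illustration that all three spellings denote a single type, which is why the printed name has to pick one of them.

```cpp
#include <cassert>
#include <type_traits>

enum X { Y = 0, Z = 0 };

template <X> struct foo {};

// Y and Z have the same value, so both spellings name the same type.
static_assert(std::is_same<foo<Y>, foo<Z>>::value,
              "foo<Y> and foo<Z> are one type");

// The cast spelling names that type too; it is the only spelling available
// in a translation unit that sees just an opaque declaration of the enum.
static_assert(std::is_same<foo<Y>, foo<(X)0>>::value,
              "foo<(X)0> is the same type as foo<Y>");
```

The opaque declaration case ("enum X: int;") needs a separate translation unit, so it is not reproduced here.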
>>
>>     So for consistency there, I kind of think maybe a change like
>>     this isn't bad.
>>
>>     But of course the specific way a template name is written may
>>     easily still vary between compilers, so relying on it being
>>     exactly the same might not be a great idea anyway...
>>
>>         Thank you for the clarification, Paul! Nevertheless, I think
>>         showing the actual type of a dynamic variable is very
>>         important for projects where RTTI is used. Moreover, it works
>>         properly with the gcc+gdb pair, so I am extremely interested
>>         in fixing it in clang+lldb.
>>
>>         I understand that the suggested solution possibly does not
>>         cover all the cases, but it improves the situation and
>>         covers all the cases I have found (I have just rechecked:
>>         typedefs/usings seem to work fine when displaying the real
>>         type of a variable). If more cases are found in the future,
>>         they could be fixed similarly. Moreover, debuggers already
>>         rely on the type name looking the same in RTTI and DWARF,
>>         and I suppose they have no choice, because there is no
>>         other source of information for them (or am I missing
>>         something?).
>>
>>
>>     I think they would have a choice, actually - let's walk through
>>     it...
>>
>>     It sounds like you're thinking of two other possibilities:
>>
>>     1) "I suppose, we cannot extend RTTI with the debug type name (is
>>     it correct?)" - yeah, that's probably correct, extending the RTTI
>>     format probably isn't desirable and we'd still need a
>>     singular/canonical DWARF name which we don't seem to have (& the
>>     RTTI might go in another object file that may not have debug
>>     info, or debug info generated by a different compiler with a
>>     different type printing format, etc... )
>>
>>     2) Extending DWARF to include the mangled name
>>     Sort of possible, DW_AT_linkage_name on a DW_TAG_class_type could
>>     be used for this just fine - no DWARF extension required.
>>
>>     But an alternative would be to have debuggers use a more
>>     semantically aware matching here. The debugger does have enough
>>     information in the DWARF to semantically match "foo<(X)0>" with
>>     "foo<X::Y>". enum X is in the DWARF, and the enumerator Y is
>>     present with its value 0.
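
A small sketch of what such semantically aware matching could look like for one enum template argument. It assumes the debugger has already read the enumerators of X from DWARF into a map; Enumerators, ResolveEnumArg and SameEnumArg are hypothetical names, and the parsing only handles the two spellings discussed here.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical model of what DWARF records for one enumeration:
// enumerator name -> value.
using Enumerators = std::map<std::string, long long>;

// Resolve one template argument spelling to its integer value. Handles the
// cast form "(X)0" and the enumerator form "X::Y"; anything else fails.
// Note: std::stoll throws if the cast form is not followed by a number.
inline bool ResolveEnumArg(const std::string &arg, const std::string &enum_name,
                           const Enumerators &enumerators, long long &value) {
  const std::string cast_prefix = "(" + enum_name + ")";
  const std::string scope_prefix = enum_name + "::";
  if (arg.compare(0, cast_prefix.size(), cast_prefix) == 0) {
    value = std::stoll(arg.substr(cast_prefix.size()));
    return true;
  }
  if (arg.compare(0, scope_prefix.size(), scope_prefix) == 0) {
    auto it = enumerators.find(arg.substr(scope_prefix.size()));
    if (it == enumerators.end())
      return false;
    value = it->second;
    return true;
  }
  return false;
}

// Two spellings match if both resolve and denote the same value.
inline bool SameEnumArg(const std::string &a, const std::string &b,
                        const std::string &enum_name,
                        const Enumerators &enumerators) {
  long long va = 0, vb = 0;
  return ResolveEnumArg(a, enum_name, enumerators, va) &&
         ResolveEnumArg(b, enum_name, enumerators, vb) && va == vb;
}
```

With enumerators {Y = 0, Z = 0}, the arguments "(X)0", "X::Y" and "X::Z" all compare equal, which is the match a plain string comparison of the type names misses.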
>>
>>     Another case where Clang's DWARF type printing differs from a
>>     common demangling is an unsigned parameter: "template<unsigned>
>>     struct foo; foo<0>". The common demangling for this is "foo<0u>",
>>     but Clang will happily render the type as "foo<0>". This one
>>     seems harder to justify changing than the enum case, because it
>>     is at least self-consistent. The enum case, given the
>>     declared-but-not-defined enum example, seems more compelling:
>>     having Clang give a consistent name to the type seems somewhat
>>     desirable, even if incomplete (differing compilers could still
>>     use different printings).
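
For anyone who wants to see the demangled side of this mismatch, here is a small sketch assuming an Itanium-ABI toolchain (GCC or Clang with libstdc++ or libc++abi) and RTTI enabled. abi::__cxa_demangle is the real runtime entry point; the Demangle wrapper is only a convenience written for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>
#include <typeinfo>

template <unsigned> struct foo {};

// Demangle an Itanium-ABI mangled name; fall back to the input on failure.
inline std::string Demangle(const char *mangled) {
  int status = 0;
  char *buffer = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  std::string result = (status == 0 && buffer) ? buffer : mangled;
  std::free(buffer); // __cxa_demangle allocates with malloc
  return result;
}
```

Calling Demangle(typeid(foo<0>).name()) shows the spelling the demangler chooses for the unsigned argument, which can then be compared with the DW_AT_name Clang emits for the same type.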
>>
>>     Again, in this case, a debugger could handle this.
>>
>>     All that said, GDB is the elephant in the room and I imagine
>>     might have no interest in adopting a more complex name
>>     lookup/comparison strategy & we might just have to bow to their
>>     demangling printing and naming scheme... but might be worth
>>     asking GDB folks first? Not sure.
>>
>>         Another advantage of this solution is that it doesn't require
>>         any format extension and will probably work out of the box in
>>         gdb and other debuggers. Moreover, I have just rechecked, gcc
>>         generates exactly the same type names in DWARF for examples
>>         in the description.
>>
>>         On the other hand, I understand the idea you have described,
>>         but I am not sure how to implement this lookup in another
>>         way. I suppose, we cannot extend RTTI with the debug type
>>         name (is it correct?). Thus, the only way I see is to add
>>         additional information about the mangled type name into
>>         DWARF. It could be either a separate section (like
>>         apple_types) or a special node for
>>         TAG_structure_type/TAG_class_type, which should be indexed
>>         into map for fast lookup. Anyway, this will be an extension
>>         to DWARF and will require special support in a debugger.
>>         Furthermore, such a solution will be much more complicated
>>         (still, I don't mind working on it).
>>
>>         So what do you think? Is the suggested solution incomplete
>>         or unacceptable? Do you have other ideas about how this
>>         feature should be implemented?
>>
>>         P.S. Should this question be raised on a mailing list? If
>>         so, on which one (clang or lldb), since it seems related to
>>         both?
>>
>>
>>         https://reviews.llvm.org/D39622
>>
>>
>
>