[cfe-dev] [llvm-dev] Emiting linkage names for Types to Debuginfo (C++ RTTI support in GDB/LLDB)

David Blaikie via cfe-dev cfe-dev at lists.llvm.org
Tue Mar 6 09:22:28 PST 2018


On Tue, Mar 6, 2018 at 8:39 AM Daniel Berlin <dberlin at dberlin.org> wrote:

> On Mon, Mar 5, 2018 at 11:55 PM, Roman Popov <ripopov at gmail.com> wrote:
>
>> I don't understand how an extra vtable ref DIE will help in the case of
>> non-polymorphic classes. If you remove the virtual destructor from the
>> example, no vtable will be generated for the class, but DWARF will still
>> have incorrect, ambiguous names for types.
>>
> 1. Calling them incorrect is ... not right.  As Andrew quoted on the gdb
> mailing list, this is what DWARF specifies should happen,
>

Might be helpful to point to/include any details cited here for the purpose
of this conversation - a bit hard for the rest of us to follow along.


> so they are correct by spec. If you believe the spec is wrong, file an
> issue on the DWARF web site and discuss it on the mailing list, and bring
> back the consensus of the committee as to what to do :)
>

The ambiguous names are probably incorrect - having two distinct types with
the same name isn't really going to work out well for a consumer. (So
having the distinct types foo<11u> and foo<11> in source both produce a
DWARF type named "foo<11>" is, I'd say, a bug that ought to be fixed - as is
any other case where the names become ambiguous. Otherwise matching up
types between TUs would become impossible, which would not be good.)
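
To make the ambiguity concrete, here is a minimal sketch. It is not the
example from earlier in the thread: the template `foo` with a deduced
non-type parameter (C++17) and the helper names `same_type` and `same_rtti`
are assumptions made up for illustration. It only shows that the language
and RTTI treat the two instantiations as different types, which is why
giving both the same DW_AT_name loses information.

```cpp
#include <cassert>
#include <type_traits>
#include <typeinfo>

// Hypothetical template; 'auto' lets the argument's type take part in
// the identity of the instantiation (requires C++17).
template <auto N>
struct foo {};

// Compile-time check: are foo<11u> and foo<11> the same type?
constexpr bool same_type = std::is_same_v<foo<11u>, foo<11>>;

// Run-time check through RTTI: do the two type_info objects compare equal?
inline bool same_rtti() {
    return typeid(foo<11u>) == typeid(foo<11>);
}
```

Both checks report "different", so a consumer that only sees the name
"foo<11>" for each has no way to tell them apart.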


> 2. The failure that was cited on the gdb mailing list only occurs on
> polymorphic classes.   If you have it occurring on non-polymorphic classes,
> that seems like a very different problem, and probably related to the fact
> that GDB does not know how to assemble or parse C++ names properly in some
> cases.  Otherwise, this would occur on literally every class you saw in
> GDB, and that's definitely not the case:)
>

Sounds like Roman's talking about other use cases apart from GDB.


> The only reason linkage names would fix that issue is because they provide
> an exact match to GDB's parsing failure.
>

Not sure I follow this - providing linkage names would provide a reliable
mechanism to match the vtable symbol. There wouldn't need to be any
parsing, or any failure of parsing involved.

But, yes, addresses would be potentially a better description rather than
having to match names in the object's symbol table.
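
A rough sketch of the two lookup strategies, to show what each one needs
from the producer. Everything here is a toy stand-in, not GDB or LLDB code:
`SymbolTable`, `VTableIndex`, `VTableRange` and both function names are
hypothetical, and the sketch assumes the type's linkage name is its Itanium
mangled encoding (e.g. "3Foo"), so the vtable symbol is "_ZTV" followed by
that encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>

using Address = std::uint64_t;

// Toy symbol table: linkage name -> address (no debug info involved).
using SymbolTable = std::map<std::string, Address>;

// Toy address index: vtable start -> (one-past-end, name of the type DIE).
struct VTableRange {
    Address end;
    std::string type_die;
};
using VTableIndex = std::map<Address, VTableRange>;

// Name-based: build the Itanium vtable symbol from the type's linkage
// name and look it up exactly; no demangling or fuzzy matching.
std::optional<Address> vtable_by_linkage_name(const SymbolTable& syms,
                                              const std::string& type_linkage_name) {
    auto it = syms.find("_ZTV" + type_linkage_name);
    if (it == syms.end()) return std::nullopt;
    return it->second;
}

// Address-based: find the vtable range containing the object's vptr value
// and return the type it refers to. A range lookup is used because the
// vptr points into the vtable, not at its first byte.
std::optional<std::string> type_by_vtable_address(const VTableIndex& index,
                                                  Address vptr) {
    auto it = index.upper_bound(vptr);          // first range starting after vptr
    if (it == index.begin()) return std::nullopt;
    --it;                                       // last range starting at or before vptr
    if (vptr >= it->second.end) return std::nullopt;
    return it->second.type_die;
}
```

The first function depends on names being present and unambiguous in the
debug info; the second depends only on a location and a type reference for
each vtable.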


> You should just fix GDB.
> (GDB already knows how to collect and print out multiple symbols in the
> case they have the same name, FWIW)
>
>
>
>> It will become a problem when you need to use debuginfo as a C++ runtime
>> reflection (I've already seen this in a couple of projects).
>>
>> Or when you need to go back from LLVM IR to Clang AST (I've already
>> encountered this problem).
>>
>
> I don't understand these use cases well enough to help, but if you think
> it's a serious issue, again, i'd take it up with the DWARF folks.
>
>
>> I wonder if abi::__cxa_demangle guarantees unambiguous names?
>>
>
> No, it does not.
>
>
>> If so, then I can just replace current incorrect names that Clang
>> generates, with names from demangler. In this case I don't even need to
>> patch gdb, it will work as is.
>>
>> -Roman
>>
>> 2018-03-05 10:46 GMT-08:00 Daniel Berlin <dberlin at dberlin.org>:
>>
>>>
>>>
>>> On Mon, Mar 5, 2018, 9:26 AM David Blaikie <dblaikie at gmail.com> wrote:
>>>
>>>> On Mon, Mar 5, 2018 at 9:09 AM Daniel Berlin <dberlin at dberlin.org>
>>>> wrote:
>>>>
>>>>> On Mon, Mar 5, 2018 at 8:37 AM, David Blaikie <dblaikie at gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Mar 3, 2018 at 8:20 PM Daniel Berlin via llvm-dev <
>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>
>>>>>>> On Fri, Mar 2, 2018 at 3:58 PM, Roman Popov via llvm-dev <
>>>>>>> llvm-dev at lists.llvm.org> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> As you may know, modern C++ debuggers (GDB and LLDB) support dynamic
>>>>>>>> type identification for polymorphic objects by utilizing C++ RTTI.
>>>>>>>> Unfortunately this feature does not work with Clang and GDB >= 7.x.
>>>>>>>> The last compiler that worked well was G++ 6.x.
>>>>>>>>
>>>>>>>> I've asked about this issue on both the GDB and LLDB mailing lists.
>>>>>>>> Unfortunately it's hard or impossible to fix it on the debugger side.
>>>>>>>>
>>>>>>>
>>>>>>> Errr, i posited a solution on the gdb mailing list that i haven't
>>>>>>> seen shot down so far, that doesn't require linkage names, it only requires
>>>>>>> one new attribute that is a DW_FORM_ref, and very cheap.
>>>>>>>
>>>>>>
>>>>>> FWIW, for C++ at least, neither Clang nor GCC (6.3) produces any DWARF
>>>>>> to describe the vtable itself (they describe the vtable pointer inside the
>>>>>> struct, but not the constant vtable array) - so it'll be a bit more than
>>>>>> one attribute: first the bytes to describe the vtable (as a global
>>>>>> variable? Do we give it a name? (if so, we're back to paying that cost)),
>>>>>> then the bytes to add the reference from that to the type.
>>>>>>
>>>>>
>>>>> Right, they produce a named symbol but not debug info.
>>>>>
>>>>> The only thing you need is a single DIE for that symbol, with a single
>>>>> ref.
>>>>>
>>>>
>>>> When you say "a single DIE" what attributes are you picturing that DIE
>>>> having? If it has a single attribute, a ref_addr to the type, that doesn't
>>>> seem to provide anything useful. Presumably this DIE would need a
>>>> DW_AT_location with the address of the vtable (with a relocation to resolve
>>>> that address, etc).
>>>>
>>>
>>> Location and concrete type it belongs to.  That's the minimum you should
>>> need here.
>>> You don't need the name, though it doesn't hurt.
>>>
>>>
>>>> No name? No other identifying features? I don't think we've ever really
>>>> produced DIEs like that, though it sounds OK to me.
>>>>
>>>>
>>>>>
>>>>> (IE they just need to be able to say "find me the DIE for this address
>>>>> range", have it get to the vtable DIE, and get to the concrete type die)
>>>>>
>>>>>
>>>>>>
>>>>>> & I'm not sure what Apple would do or anyone else that has libraries
>>>>>> without debug info shipped & users have to debug them (this is what broke
>>>>>> -fno-standalone-debug for Apple - their driver API which ships without
>>>>>> debug info of its own, has strong vtables in it).
>>>>>>
>>>>>
>>>>> I'm confused.
>>>>> This already seems to have the same issue?
>>>>> Just because it uses one linker symbol, it still requires full debug
>>>>> info to print the type anyway.
>>>>>
>>>>> So if it's gone, nothing changes.
>>>>>
>>>>
>>>> Sorry, I don't quite understand your comment here - could you explain
>>>> it in more detail - the steps/issues you're seeing?
>>>>
>>>
>>> I think we are starting from different positions here, so let me add a
>>> few pieces of data and see how it helps.
>>>
>>> Let's assume the below is true and it won't work on OSX as described
>>> (i'm certainly in no place to disagree).
>>>
>>> Some data points:
>>>
>>> 1. LLDB works just fine on Darwin (it appears to do the same thing we
>>> did in gdb, staring
>>> at source/Plugins/LanguageRuntime/CPlusPlus/ItaniumABI/ItaniumABILanguageRuntime.cpp)
>>>
>>> 2. GDB does not work on Darwin at all for any real debugging right now
>>> (You can't debug llvm with it, for example).  There are barely working
>>> versions here and there.  The startup time to debug an "opt" binary from
>>> llvm is well over 2 minutes alone to get to a prompt just from typing "gdb
>>> bin/opt". It requires 4 gigs of ram.  It usually fails to print most
>>> symbols/types/crashes calling functions, blah blah blah.
>>> You can't even quit most of the time without hitting an assert.
>>> (gdb) q
>>> thread.c:93: internal-error: struct thread_info *inferior_thread():
>>> Assertion `tp' failed.
>>> A problem internal to GDB has been detected,
>>> further debugging may prove unreliable.
>>> Quit this debugging session? (y or n) y
>>>
>>>
>>>
>>> 3. On every platform, GDB will have to continue to use what it does now
>>> as a fallback anyway, as all existing binaries will not be rebuilt.
>>> 4. Ditto LLDB
>>>
>>> So for GDB, it doesn't really matter whether it breaks OSX, to start.
>>> Even if it did, it will still work as well or as not well as it has in the
>>> past :)
>>>
>>> LLDB works, and should work as well as it did with or without this as
>>> well.
>>>
>>> Given all that: No matter what we do, LLDB and GDB will continue to work
>>> exactly as well or as broken as they have before on OSX. Nothing will
>>> change.
>>>
>>> So i wouldn't call it broken, i'd call it, at worst, inapplicable to
>>> certain situations on OSX, and triggering a fallback :)
>>>
>>>
>>>
>>>> I'll try to do the same:
>>>> Currently the DWARF type information (the actual DW_TAG_class_type DIE
>>>> with the full definition of the class - its members, etc) on OSX goes
>>>> everywhere the type is used (rather than only in the object files where the
>>>> vtable is defined) to ensure that types defined in objects built without
>>>> debug info, but used in objects built with debug info can still be
>>>> debugged. (whereas on other platforms, like Linux, the assumption is made
>>>> that the whole program is built with debug info - OSX is different because
>>>> it has these system libraries for drivers that break this convention (&
>>>> because LLDB can't handle this situation) - so, because the system itself
>>>> breaks the assumption, the default is to turn off the assumption)
>>>>
>>>> I assumed your proposal would only add this debug info to describe the
>>>> vtable constant where the vtable is defined. Which would break OSX.
>>>>
>>>> If the idea would be, in OSX (& other -fstandalone-debug
>>>> situations/platforms/users), to include this vtable DIE even where
>>>> the vtable is not defined - that adds a bit more debug info & also it means
>>>> debug info describing the declaration of a variable, also something we
>>>> haven't really done in LLVM before - again, technically possible, but a
>>>> nuance I'd call out/want to be aware of/think about/talk about (hence this
>>>> conversation), etc.
>>>>
>>>>
>>>>>
>>>>>> I can go into more detail there - but there are certainly some
>>>>>> annoying edge cases/questions I have here :/
>>>>>>
>>>>>
>>>>> Constructive alternative?
>>>>>
>>>>
>>>> Not sure - not saying what you're proposing isn't workable - but I do
>>>> want to understand the practical/implementation details a bit to see how it
>>>> plays out - hence the conversation above.
>>>>
>>>
>>> FWIW, i don't have a lot of time/energy to push this, so i'm pretty much
>>> going to bow out at this point and leave folks to their own devices. I just
>>> wanted to point out there are other solutions that would likely work a lot
>>> better over time.
>>>
>>>
>>>>
>>>>
>>>>> Right now, relying on *more* names, besides being huge in a lot of
>>>>> binaries,  relies on the demangler producing certain text (which is not
>>>>> guaranteed)
>>>>> That text has to exactly match the text of some other symbol (which is
>>>>> not guaranteed).
>>>>>
>>>>
>>>> *nod* I agree that the name matching based on demangling is a bad idea.
>>>>
>>>>
>>>>> That 10 second delay you get sometimes with going to print a C++
>>>>> symbol in a large binary?
>>>>>
>>>>> That's this lookup.
>>>>>
>>>>> So right now it:
>>>>> 1. Uses a ton of memory
>>>>> 2. Uses a ton of time
>>>>> 3. Doesn't work all the time (depends on demanglers, and there are
>>>>> very weird edge cases here).
>>>>>
>>>>> Adding linkage names will not change any of these, whereas adding a
>>>>> DWARF extension fixes all three, forever.
>>>>>
>>>>
>>>> Not sure I follow this - debuggers do lots of name lookups, I would've
>>>> thought linkage name<>linkage name lookup could be somewhat practical
>>>> (without all the fuzzy matching logic).
>>>>
>>>
>>> You'd think it would be optimized for this, but for GDB, it will now
>>> pull in every symbol table looking for the name, until it finds it.  It
>>> does not, for example, build a global index of names so it knows what CU to
>>> go read from or anything smart like that.
>>> (it's a little more nuanced than this, but in practice, not)
>>>
>>>>
>>>>
>>>>> I don't even care about the details of the extension, my overriding
>>>>> constraint is "please don't extend this hack further given the above".
>>>>>
>>>>
>>>> Mangled to demangled name matching seems like a hack - matching the
>>>> mangled names doesn't seem like such a hack to me - but, yeah, I'm totally
>>>> open to an address based solution as you're suggesting, just trying to
>>>> figure out the details/issues.
>>>>
>>>
>>> At the time, the mangled name was not available anywhere.
>>> It looks like name() is supposed to now return the mangled name in the
>>> itanium ABI.
>>> So theoretically, you could just change GDB to call the name function(),
>>> look that up in the minimal symbol tables (name->address mappings, without
>>> debug info), and go to the full symbol table info for that address.  This
>>> avoids needing the DW_AT_name in the debuginfo to match, only the name in
>>> the symbol table.
>>>
>>> This will break if you use -fno-rtti, whereas the vtable way (either
>>> existing or proposed) would still work.
>>>
>>> G++ actually *had* linkage names for types for a long time in the debug
>>> info, and deliberately removed them due to space usage.
>>>
>>>
>>>> Have you got a link/steps to a sample/way to get GCC to produce this
>>>> sort of debug info? (at least with 6.3 using C++ I don't see any debug info
>>>> like this describing a vtable)
>>>>
>>>
>>> Yeah, nothing does it yet.
>>> Bug tom tromey,  who did it for Rust, not C++
>>>
>>>
>>>> - Dave
>>>>
>>>>
>>>
>>