[lldb-dev] Resolving dynamic type based on RTTI fails in case of type names inequality in DWARF and mangled symbols
Anton Gorenkov via lldb-dev
lldb-dev at lists.llvm.org
Tue Dec 19 13:20:42 PST 2017
19.12.2017 23:12, Greg Clayton wrote:
>
>> On Dec 19, 2017, at 12:33 PM, Anton Gorenkov <xgsa at yandex.ru> wrote:
>>
>> Tamas, Greg, thank you, I got the idea how it should work without
>> accelerator tables, but I still cannot figure out how to use/update
>> the existing accelerator tables. So let me walk through it once again:
>> 1. It is necessary to perform lookup by mangled name (as all we
>> initially have is mangled "vtable for ClassName"-symbol).
>> 2. All the existing apple accelerator tables (e.g. apple_types)
>> have demangled and unqualified names as a key.
>> 3. It is not always possible to get the original demangled type name
>> by the mangled one (e.g. for templates parametrized with enums the
>> demangled one is Impl<(TagType)0> vs original Impl<TagType::Tag1>,
>> but there are more complex cases).
>>
>> Thus, I don't see how adding DW_AT_linkage_name to vtable member of
>> class (or even to class itself) could help, as it still won't be
>> possible to resolve DIE by the mangled type name. However possible
>> solutions are:
>> 1. To generate a separate accelerator table: mangled name for
>> vtable member of a class => DIE;
>> 2. Build index on startup iterating through the apple_types and
>> gather the map mangled name => DIE;
>>
>> Greg, did you mean some of these or something else?
>
> I didn't realize that the mangled name differs in certain cases and
> that it wouldn't suffice for a lookup. Can you give an example of the
> name we try looking up versus what is actually in the symbol table?
Case 1:

enum class TagType : bool {
    Tag1
};

struct I {
    virtual ~I() = default;
};

template <TagType Tag>
struct Impl : public I {
private:
    int v = 123;
};

int main(int argc, const char * argv[]) {
    Impl<TagType::Tag1> impl;
    I& i = impl;
    return 0;
}

lldb demangles the name to "Impl<(TagType)0>", whereas the DWARF generated by clang contains "Impl<TagType::Tag1>".

Case 2:

struct I
{
    virtual ~I(){}
};

template <int Tag>
struct Impl : public I
{
    int v = 123;
};

template <>
struct Impl<1+1+1> : public I // Note the expression used for this specialization
{
    int v = 124;
};

template <class T>
struct TT {
    I* i = new T();
};

int main(int argc, const char * argv[]) {
    TT<Impl<3>> tt;
    return 0; // [*]
}

lldb demangles the name to "Impl<3>", whereas clang generates "Impl<1+1+1>" in DWARF.
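
To make the mismatch concrete, here is a minimal sketch (not lldb's actual code) of deriving a class name from a vtable symbol in the way described in this thread: demangle the symbol, then strip the "vtable for " prefix. It assumes an Itanium-ABI toolchain that provides abi::__cxa_demangle in <cxxabi.h>. The helper name ClassNameFromVTableSymbol is hypothetical, and the symbol _ZTV4ImplIL7TagType0EE used in the note below is an assumption derived from the typeinfo-name symbol quoted in the original message.

```cpp
#include <cxxabi.h>

#include <cassert>
#include <cstdlib>
#include <memory>
#include <string>

// Frees the malloc'ed buffer returned by __cxa_demangle.
struct FreeDeleter {
  void operator()(char *p) const { std::free(p); }
};

// Hypothetical helper, not lldb's API: returns the class name encoded in an
// Itanium vtable symbol, or an empty string if the symbol cannot be
// demangled or is not a vtable symbol.
std::string ClassNameFromVTableSymbol(const char *mangled) {
  assert(mangled != nullptr);

  int status = 0;
  std::unique_ptr<char, FreeDeleter> demangled(
      abi::__cxa_demangle(mangled, nullptr, nullptr, &status));
  if (status != 0 || !demangled)
    return std::string();

  const std::string name(demangled.get());
  const std::string prefix = "vtable for ";
  // The demangled vtable symbol has the form "vtable for CLASSNAME".
  if (name.compare(0, prefix.size(), prefix) != 0)
    return std::string();
  return name.substr(prefix.size());
}
```

For Case 1, ClassNameFromVTableSymbol("_ZTV4ImplIL7TagType0EE") should give a name of the form Impl<(TagType)0>, which does not compare equal to the DWARF spelling Impl<TagType::Tag1>, so a lookup keyed on the DWARF name misses.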
> IIUC right now we lookup the address of the first pointer within a
> class if it is virtual and find the symbol name that this corresponds
> to, and in the failing cases you have we don't find anything in the
> DWARF that matches. Is that right?
Exactly, for the cases above and some others.
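
The lookup flow discussed in this thread (try the exact name first, then fall back to a table keyed by the mangled vtable symbol) could be sketched roughly as follows. Everything here is hypothetical: DieOffset, TypeIndex and its members only stand in for real index structures and are not lldb's API, and the DIE offset is a made-up value.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a reference to a DWARF DIE.
using DieOffset = std::uint64_t;

// Hypothetical index, not lldb's data structures.
struct TypeIndex {
  // Existing kind of table: type name as spelled in DWARF -> DIE.
  std::unordered_map<std::string, DieOffset> by_dwarf_name;
  // Proposed addition: mangled vtable symbol -> DIE of the class.
  std::unordered_map<std::string, DieOffset> by_vtable_symbol;

  // Returns a pointer to the DIE offset, or nullptr if nothing matches.
  const DieOffset *Find(const std::string &vtable_symbol,
                        const std::string &demangled_class) const {
    // Fast path: the demangled name matches the DWARF spelling exactly.
    auto exact = by_dwarf_name.find(demangled_class);
    if (exact != by_dwarf_name.end())
      return &exact->second;
    // Fallback: the spellings differ, so consult the mangled-name table.
    auto fallback = by_vtable_symbol.find(vtable_symbol);
    if (fallback != by_vtable_symbol.end())
      return &fallback->second;
    return nullptr;
  }
};

// Usage example with made-up values for Case 1.
inline void Example() {
  TypeIndex index;
  index.by_dwarf_name["Impl<TagType::Tag1>"] = 0x40;
  index.by_vtable_symbol["_ZTV4ImplIL7TagType0EE"] = 0x40;
  const DieOffset *die =
      index.Find("_ZTV4ImplIL7TagType0EE", "Impl<(TagType)0>");
  assert(die != nullptr && *die == 0x40);
}
```

Without the second table, the same Find call returns nullptr for Case 1, which is the failure described above. The open question in this thread is where such a mangled-name table would come from: emitted by the compiler, or built by the debugger at startup.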
>>
>> Thanks,
>> Anton.
>>
>> 19.12.2017 19:39, Greg Clayton wrote:
>>> I agree with Tamas. The right way to do this is to add the
>>> DW_AT_linkage_name to the class. Apple accelerator tables have many
>>> different forms, but one is a mapping of type name to exact DIE
>>> offset (in the __DWARF_ segment in the __apple_types section). If
>>> the mangled name was added to the class, then the apple accelerator
>>> tables would have it. So when a lookup happens with these tables
>>> around, we do a very quick hash lookup, and we find the exact DIE
>>> (or DIEs) we need. Entries for classes in the Apple accelerator
>>> tables have both the mangled and raw class name as entries pointing
>>> to the same DIE since lookups don't usually happen via mangled
>>> names. LLDB also knows how to pull names apart and search correctly,
>>> so if someone tries to lookup a type with "a::b::MyClass", we will
>>> chop that up into "MyClass" and do a lookup on that. We might get
>>> many many different "MyClass" results back (a::c::MyClass,
>>> ::MyClass, b::MyClass), but then we cull those down by making sure
>>> any matches have a matching decl context of "a::b::". For mangled
>>> names, it is easy and just a direct lookup.
>>>
>>> The apple accelerator tables are only enabled for Darwin target, but
>>> there is nothing to say we couldn't enable these for other targets
>>> in ELF files. It would be a quick way to gauge the performance
>>> improvement that these accelerator tables provide for linux.
>>> Currently linux will completely index the DWARF, but it will load
>>> the DWARF, index it, and unload the DWARF so we don't hog memory for
>>> things we don't need loaded yet. We must manually index the DWARF
>>> because the DWARF accelerator tables are really not accelerator
>>> tables, they are random indexes of related data (names in no
>>> particular order, addresses in no particular order). These tables
>>> are also not complete so no debugger can rely on them. For example
>>> ".debug_pubtypes" is for "public" types only. ".debug_pubnames" is a
>>> random name table with only public functions (no static functions or
>>> functions in anonymous namespaces). So the DWARF accelerator tables
>>> can't be used by debuggers.
>>>
>>> There is now a modified version of the Apple accelerator tables in
>>> the DWARF standard that can provide the same data as the Apple
>>> versions, but I don't believe anyone has added this support to any
>>> compilers yet. So for simplicity, we can try things out with the
>>> Apple accelerator tables and see how things go.
>>>
>>> Another solution involves using llvm-dsymutil, a DWARF linker that
>>> is used on Apple platforms. It is a tool that is normally run on
>>> executables where the DWARF is left in the .o files and linked later
>>> into final DWARF files. This tool also has a "--update" option that
>>> takes a linked dSYM file and updates the accelerator tables in case
>>> they change over time, or in case an older version of llvm-dsymutil
>>> didn't add everything that was needed to the tables due to a bug. So
>>> another way we can try this out is to modify the llvm-dsymutil to
>>> work with ELF files and have it generate and add the Apple
>>> accelerator tables to the ELF files. This is nice because it allows
>>> us to use DWARF that is generated by any compiler (no need for the
>>> compiler to support making the accelerator tables). This would be a
>>> great way to try out the accelerator tables without requiring
>>> compiler changes.
>>>
>>> The short term solution is to validate that the Apple accelerator
>>> tables work and do speed debugging up by a large amount. The long
>>> term solution is to have clang start emitting the new DWARF
>>> accelerator tables and modify LLDB to support and use those tables.
>>>
>>> Let me know if there are any questions on any of this.
>>>
>>> Greg Clayton
>>>
>>>> On Dec 19, 2017, at 5:35 AM, Tamas Berghammer via lldb-dev
>>>> <lldb-dev at lists.llvm.org> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I thought most compilers still emit DW_AT_MIPS_linkage_name instead
>>>> of the standard DW_AT_linkage_name but I agree that if we can we
>>>> should use the standard one.
>>>>
>>>> Regarding performance we have 2 different scenarios. On Apple
>>>> platforms we have the apple accelerator tables to improve load time
>>>> (might work on FreeBSD as well) while on other platforms we index
>>>> the DWARF data (DWARFCompileUnit::Index) to effectively generate
>>>> accelerator tables in memory, which is a faster process than fully
>>>> parsing the DWARF (currently we only parse function DIEs and we
>>>> don't build the clang types). I think an ideal solution would be to
>>>> have the vtable name stored in DWARF so the DWARF data is
>>>> standalone and then have some accelerator tables to be able to do
>>>> fast lookup from mangled symbol name to DIE offset. I am not too
>>>> familiar with the apple accelerator tables but if we have anything
>>>> that maps from mangled name to DIE offset then we can add a few
>>>> entries to it to map from mangled vtable name to type DIE or vtable DIE.
>>>>
>>>> Tamas
>>>>
>>>> On Mon, Dec 18, 2017 at 9:02 PM xgsa <xgsa at yandex.ru> wrote:
>>>>
>>>> Hi Tamas,
>>>> First, why DW_AT_MIPS_linkage_name, but not just
>>>> DW_AT_linkage_name? The latter is standardized and currently
>>>> generated by clang at least on x64.
>>>> Second, this doesn't help to solve the issue, because this will
>>>> require parsing all the DWARF types during startup to build a map,
>>>> which breaks the lazy DWARF loading performed by lldb. Or am I missing
>>>> something?
>>>> Thanks,
>>>> Anton.
>>>> 18.12.2017, 22:59, "Tamas Berghammer" <tberghammer at google.com>:
>>>>>
>>>>> Hi Anton and Jim,
>>>>>
>>>>> What do you think about storing the mangled type name or the
>>>>> mangled vtable symbol name somewhere in DWARF in the
>>>>> DW_AT_MIPS_linkage_name attribute? We are already doing it for
>>>>> the mangled names of functions so extending it to types
>>>>> shouldn't be too controversial.
>>>>>
>>>>> Tamas
>>>>>
>>>>> On Mon, 18 Dec 2017, 17:29 xgsa via lldb-dev,
>>>>> <lldb-dev at lists.llvm.org> wrote:
>>>>>
>>>>> Thank you for the clarification, Jim, you are right, I
>>>>> misunderstood a little bit what lldb actually does.
>>>>>
>>>>> It is not that the compiler can't be fixed; the point is that
>>>>> relying on the correspondence between mangled and demangled
>>>>> forms is not reliable enough, so we are looking for more
>>>>> robust alternatives. Moreover, I am not sure that such fuzzy
>>>>> matching could be done based on the class name alone, so it will
>>>>> require reading more DIEs. Taking into account that, for
>>>>> instance, our project has quite a lot of such types, it
>>>>> could noticeably slow down the debugger.
>>>>>
>>>>> Thus, I'd like to mention one more alternative and get your
>>>>> feedback, if possible. Actually, what is necessary is the
>>>>> correspondence between the mangled and demangled vtable symbols.
>>>>> Possibly, it is worth preparing a separate section during
>>>>> compilation (like e.g. apple_types), which would store this
>>>>> correspondence? It will work fast and be more reliable than
>>>>> the current approach, but it will certainly increase debug
>>>>> info size (however, I cannot estimate the exact increase,
>>>>> e.g. in percent).
>>>>>
>>>>> What do you think? Which solution is preferable?
>>>>>
>>>>> Thanks,
>>>>> Anton.
>>>>>
>>>>> 15.12.2017, 23:34, "Jim Ingham" <jingham at apple.com>:
>>>>> > First off, just a technical point. lldb doesn't use RTTI
>>>>> to find dynamic types, and in fact works for projects like
>>>>> lldb & clang that turn off RTTI. It just uses the fact that
>>>>> the vtable symbol for an object demangles to:
>>>>> >
>>>>> > vtable for CLASSNAME
>>>>> >
>>>>> > That's not terribly important, but I just wanted to make
>>>>> sure people didn't think lldb was doing something fancy with
>>>>> RTTI... Note, gdb does (or at least used to do) dynamic
>>>>> detection the same way.
>>>>> >
>>>>> > If the compiler can't be fixed, then it seems like your
>>>>> solution [2] is what we'll have to try.
>>>>> >
>>>>> > As it works now, we get the CLASSNAME from the vtable
>>>>> symbol and look it up in the list of types. That is
>>>>> pretty quick because the type names are indexed, so we can
>>>>> find it with a quick search in the index. Changing this over
>>>>> to a method where we do some additional string matching
>>>>> rather than just using the table's hashing is going to be a
>>>>> fair bit slower because you have to run over EVERY type
>>>>> name. But this might not be that bad. You would first look
>>>>> it up by exact CLASSNAME and only fall back on your fuzzy
>>>>> match if this fails, so most dynamic type lookups won't see
>>>>> any slowdown. And if you know the cases where you get into
>>>>> this problem you can probably further restrict when you need
>>>>> to do this work so you don't suffer this penalty for every
>>>>> lookup where we don't have debug info for the dynamic type.
>>>>> And you could keep a side-table of mangled-name -> DWARF
>>>>> name, and maybe a black-list for unfound names, so you only
>>>>> have to do this once.
>>>>> >
>>>>> > This estimation is based on the assumption that you can do
>>>>> your work just on the type names, without having to get more
>>>>> type information out of the DWARF for each candidate match.
>>>>> A solution that relies on realizing every class in lldb so
>>>>> you can get more information out of the type information to
>>>>> help with the match will defeat all our attempts at lazy
>>>>> DWARF reading. This can cause quite long delays in big
>>>>> programs. So I would be much more worried about a solution
>>>>> that requires this kind of work. Again, if you can reject
>>>>> most potential candidates by looking at the name, and only
>>>>> have to realize a few likely types, the approach might not
>>>>> be that slow.
>>>>> >
>>>>> > Jim
>>>>> >
>>>>> >> On Dec 15, 2017, at 7:11 AM, xgsa via lldb-dev
>>>>> <lldb-dev at lists.llvm.org> wrote:
>>>>> >>
>>>>> >> Sorry, I probably shouldn't have used HTML for that
>>>>> message. Converted to plain text.
>>>>> >>
>>>>> >> -------- Original message --------
>>>>> >> 15.12.2017, 18:01, "xgsa" <xgsa at yandex.ru>:
>>>>> >>
>>>>> >> Hi,
>>>>> >>
>>>>> >> I am working on an issue where, for some complex cases with
>>>>> templates in C++ programs, showing the dynamic type based on
>>>>> RTTI in lldb doesn't work properly. Consider the following
>>>>> example:
>>>>> >> enum class TagType : bool
>>>>> >> {
>>>>> >>     Tag1
>>>>> >> };
>>>>> >>
>>>>> >> struct I
>>>>> >> {
>>>>> >>     virtual ~I() = default;
>>>>> >> };
>>>>> >>
>>>>> >> template <TagType Tag>
>>>>> >> struct Impl : public I
>>>>> >> {
>>>>> >> private:
>>>>> >>     int v = 123;
>>>>> >> };
>>>>> >>
>>>>> >> int main(int argc, const char * argv[]) {
>>>>> >>     Impl<TagType::Tag1> impl;
>>>>> >>     I& i = impl;
>>>>> >>     return 0;
>>>>> >> }
>>>>> >>
>>>>> >> For this example clang generates the type name
>>>>> "Impl<TagType::Tag1>" in DWARF and "__ZTS4ImplIL7TagType0EE"
>>>>> when mangling symbols (which lldb demangles to
>>>>> Impl<(TagType)0>). Thus when in
>>>>> ItaniumABILanguageRuntime::GetTypeInfoFromVTableAddress()
>>>>> lldb tries to resolve the type, it is unable to find it.
>>>>> More cases and the detailed description why lldb fails here
>>>>> can be found in this clang review, which tries to fix this
>>>>> in clang [1].
>>>>> >>
>>>>> >> However, during the discussion around this review [2],
>>>>> it was pointed out that DWARF names are expected to be close
>>>>> to sources, which clang does perfectly, whereas mangling
>>>>> algorithm is strictly defined. Thus matching them on
>>>>> equality could sometimes fail. The suggested idea in [2] was
>>>>> to implement more semantically aware matching. There is
>>>>> enough information in the DWARF to semantically match
>>>>> "Impl<(TagType)0>" with "Impl<TagType::Tag1>", as enum
>>>>> TagType is in the DWARF, and the enumerator Tag1 is present
>>>>> with its value 0. I have some concerns about the performance
>>>>> of such a solution, but I'd like to know your opinion about
>>>>> this idea in general. In case it is approved, I'm going to
>>>>> work on implementing it.
>>>>> >>
>>>>> >> So what do you think about type names inequality and the
>>>>> suggested solution?
>>>>> >
>>>>> >> [1] - https://reviews.llvm.org/D39622
>>>>> >> [2] -
>>>>> http://lists.llvm.org/pipermail/cfe-commits/Week-of-Mon-20171211/212859.html
>>>>> >>
>>>>> >> Thank you,
>>>>> >> Anton.
>>>>> >> _______________________________________________
>>>>> >> lldb-dev mailing list
>>>>> >>lldb-dev at lists.llvm.org
>>>>> <mailto:lldb-dev at lists.llvm.org><mailto:lldb-dev at lists.llvm.org>
>>>>> >>http://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-dev