[lldb-dev] [llvm-dev] Adding DWARF5 accelerator table support to llvm

Fri Jun 15 07:42:13 PDT 2018

To elaborate a bit more on the issue that is detailed in https://reviews.llvm.org/rL260308:

There are many clang AST contexts that are used in LLDB:
- one for each lldb_private::Module that contains type definitions as we know them in the module and its symbol vendor
- one for each expression
- one for results of expressions in the lldb_private::Target

As we run expressions we end up copying classes around between the Module ASTs and the expression and Target ASTs. If a class has templated functions, they will only be in the DWARF if a specialization was created and used. If you have a class that looks like:

class A {
   A();
   template <typename T> void Foo(T t);
};

And then you have main.cpp that has a "double" and "int" specialization, the class definition in DWARF looks like:

class A {
   A();
   <int> void Foo(int t);
   <double> void Foo(double t);
};

In another source file, say foo.cpp, if its use of class A doesn't specialize Foo, we have a class definition in DWARF that looks like:

class A {
   A();
};
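
As a concrete illustration, the scenario above could be written as the following C++. This is only a minimal sketch: the file split is shown with comments, the function names UseFromMain and UseFromFoo are made up for this example, and Foo returns its argument (instead of void) only so that the effect is observable.

```cpp
// a.h (hypothetical header shared by both source files)
class A {
public:
  A() {}
  // Member function template: the compiler emits DWARF for it only
  // where it is actually instantiated.
  template <typename T> T Foo(T t) { return t; }
};

// main.cpp: instantiates A::Foo<int> and A::Foo<double>, so the DWARF
// for this CU describes "class A" with both specializations.
double UseFromMain() {
  A a;
  return a.Foo(1) + a.Foo(2.5);
}

// foo.cpp: uses A but never calls Foo, so the DWARF for this CU
// describes "class A" with only the constructor.
int UseFromFoo() {
  A a;
  (void)a;
  return 0;
}
```

Both CUs describe the same "::A" with the same decl file and decl line, yet the member lists in their debug info differ.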

With the C++ ODR rules, we can pick any of the "class A" definitions whose qualified name matches ("::A") and has the same decl file + decl line. So when parsing "class A", the DWARF parser will use the accelerator tables to find all instances of "class A", pick one, and use it; this becomes the one true definition for "class A". Because DWARF is only emitted for template functions when there is a specialization, any given definition of "class A" might or might not include a definition for "template <typename T> void A::Foo(T t);". When we copy types between ASTs, everything is fine if the classes match (no copy needs to be made), but things go wrong if they don't match, and this causes expression errors.

Some ways to fix this:
1 - anytime we need _any_ C++ class, we must dig up all definitions and check _all_ DW_TAG_subprogram DIEs within the class to check if any functions have templates and use the class with the most specializations
2 - have DWARF actually emit the template function info all the time as a type T, not a specialization, so we always have the full definition
3 - have some accelerator table that explicitly points us to all specializations of class methods given a class name

Solution #1 would cause us to dig through all definitions of all C++ classes all the time when parsing DWARF to check if definitions of the classes had template methods. And we would need to find the class that has the most template methods. This would cause us to parse much more of the debug info all of the time and cause increased memory consumption and performance regressions.
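
To make the cost of solution #1 concrete, here is a rough sketch of the selection step. Everything in it is hypothetical: ClassDef, the "Name<Args>" spelling used to spot specializations, and PickRichestDefinition are stand-ins for illustration, not LLDB code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for one DWARF definition of a class; none of
// these names are LLDB types.
struct ClassDef {
  std::string cu_name;                        // CU that holds this definition
  std::vector<std::string> member_functions;  // DW_TAG_subprogram children
};

// Assumption for this sketch: a specialization is spelled "Name<Args>".
// (Real code could not rely on this; think of "operator<".)
bool IsTemplateSpecialization(const std::string &name) {
  return name.find('<') != std::string::npos;
}

std::size_t CountSpecializations(const ClassDef &def) {
  std::size_t n = 0;
  for (const std::string &fn : def.member_functions)
    if (IsTemplateSpecialization(fn))
      ++n;
  return n;
}

// Every definition has to be visited before one can be chosen, which is
// where the extra parsing cost comes from. Returns nullptr if defs is empty.
const ClassDef *PickRichestDefinition(const std::vector<ClassDef> &defs) {
  const ClassDef *best = nullptr;
  std::size_t best_count = 0;
  for (const ClassDef &def : defs) {
    std::size_t count = CountSpecializations(def);
    if (best == nullptr || count > best_count) {
      best = &def;
      best_count = count;
    }
  }
  return best;
}
```

Note that even the "richest" definition can still miss specializations that only appear in some other CU, so this approach pays the full scanning cost without guaranteeing a complete class.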

Solution #2: I'm not sure DWARF even supports generic template definitions where the template isn't specialized. And how would we tell DWARF that emits only specialized templates apart from DWARF that has generic definitions?

Solution #3 will require compiler changes.

So this is another vote to support the ability for a given class to be able to locate all of its functions, kind of like we need for Objective C, where the class definition doesn't contain all of its methods, so we have the .apple_objc section that provides this mapping for us. We would need something similar for C++.

So maybe a possible solution is some sort of section that can specify all of the DIEs related to a class that are not contained in the class hierarchy itself. This would work for Objective C and for C++.
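
Purely as a sketch of what a consumer of such a section might look like: the ClassMemberIndex type below is hypothetical (it is not an existing DWARF or LLDB structure), and the key format and offsets are assumptions made up for the example.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical in-memory model of the proposed section; this is not an
// existing DWARF or LLDB format.
using DIEOffset = std::uint64_t;

class ClassMemberIndex {
public:
  // Record that the DIE at `offset` belongs to class `qualified_name`
  // even though it is not a child of the chosen class definition.
  void Add(const std::string &qualified_name, DIEOffset offset) {
    table_[qualified_name].push_back(offset);
  }

  // All out-of-class DIEs for a class, or an empty list if none are known.
  std::vector<DIEOffset> Lookup(const std::string &qualified_name) const {
    auto it = table_.find(qualified_name);
    if (it == table_.end())
      return {};
    return it->second;
  }

private:
  std::unordered_map<std::string, std::vector<DIEOffset>> table_;
};
```

With something like this, completing "::A" would be one lookup by class name instead of a scan over every definition, and the same shape would cover Objective C methods and C++ member template specializations.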

Thoughts?

Greg

> On Jun 15, 2018, at 3:34 AM, Pavel Labath <labath at google.com> wrote:
> 
> I wasn't using type units (those don't work at all right now).
> 
> I've done a bit of digging, and i found this patch
> <https://reviews.llvm.org/rL260308> which explicitly disables template
> member function parsing (though it seems it didn't really work before
> either). The patch contains a quite long explanation of why this is
> not working. I can't say I understand all of it (this is getting a bit
> out of my league), but the core of the issue seems to be that when we
> start to mix classes from two CU which have different sets of
> instantiations in a single expression, things quickly go south because
> the recycled clang ASTs from the two dwarf versions do not match.
> 
> For better or worse, it seems gdb is having similar issues as well, as
> I couldn't get it to grok my member template expressions either..
> 
> On Thu, 14 Jun 2018 at 19:47, David Blaikie <dblaikie at gmail.com> wrote:
>> 
>> oh, awesome.
>> 
>> Were you using type units? (I imagine that'd make the situation worse - since the way clang emits DWARF for a type with a member function template implicit specialization is to emit the type unit without any mention of this, and to emit the implicit specialization declaration into the stub type in the CU (that references the type unit)) Without type units I'd be pretty surprised if you couldn't call the implicit specialization at least from the CU in which it was instantiated.
>> 
>> On Thu, Jun 14, 2018 at 11:41 AM Pavel Labath <labath at google.com> wrote:
>>> 
>>> On Thu, 14 Jun 2018 at 19:29, Pavel Labath <labath at google.com> wrote:
>>>> 
>>>> On Thu, 14 Jun 2018 at 19:26, David Blaikie <dblaikie at gmail.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>> On Thu, Jun 14, 2018 at 11:24 AM Pavel Labath <labath at google.com> wrote:
>>>>>> 
>>>>>> On Thu, 14 Jun 2018 at 17:58, Greg Clayton <clayborg at gmail.com> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Jun 14, 2018, at 9:36 AM, Adrian Prantl <aprantl at apple.com> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Jun 14, 2018, at 7:01 AM, Pavel Labath via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>>> 
>>>>>>> Thank you all. I am going to try to reply to all comments in a single email.
>>>>>>> 
>>>>>>> Regarding the  .apple_objc idea, I am afraid the situation is not as
>>>>>>> simple as just flipping a switch.
>>>>>>> 
>>>>>>> 
>>>>>>> Jonas is currently working on adding the support for DWARF5-style Objective-C accelerator tables to LLVM/LLDB/dsymutil. Based on the assumption that DWARF 4 and earlier are unaffected by any of this, I don't think it's necessary to spend any effort on making the transition smooth. I'm fine with having Objective-C on DWARF 5 broken on trunk for two weeks until Jonas is done adding Objective-C support to the DWARF 5 implementation.
>>>>>> 
>>>>>> Ideally, I would like to enable the accelerator tables (possibly with
>>>>>> a different version number or something) on DWARF 4 too (on non-apple
>>>>>> targets only). The reason for this is that their absence is causing
>>>>>> large slowdowns when debugging on non-apple platforms, and I wouldn't
>>>>>> want to wait for dwarf 5 for that to go away (I mean no disrespect to
>>>>>> Paul and DWARF 5 effort in general, but even if all of DWARF 5 in llvm
>>>>>> was done tomorrow, there would still be lldb, which hasn't even begun
>>>>>> to look at this version).
>>>>>> 
>>>>>> That said, if you are working on the Objective C support right now,
>>>>>> then I am happy to wait two weeks or so that we have a full
>>>>>> implementation from the get-go.
>>>>>> 
>>>>>>> But, other options may be possible as well. What's not clear to me is
>>>>>>> whether these tables couldn't be replaced by extra information in the
>>>>>>> .debug_info section. It seems to me that these tables are trying to
>>>>>>> work around the issue that there is no straight way to go from a
>>>>>>> DW_TAG_structure_type DIE describing an ObjC class to its methods. If
>>>>>>> these methods (their forward declarations) were present as children
>>>>>>> of the type DIE (as they are for c++ classes), then these tables may
>>>>>>> not be necessary. But maybe (probably) that has already been
>>>>>>> considered and deemed infeasible for some reason. In any case this
>>>>>>> seemed like a thing best left for people who actually work on ObjC
>>>>>>> support to figure out.
>>>>>>> 
>>>>>>> 
>>>>>>> That's really a question for Greg or Jim — I don't know why the current representation has the Objective-C methods outside of the structs. One reason might be that an interface's implementation can define more methods than are visible in its public interface in the header file, but we already seem to be aware of this and mark the implementation with DW_AT_APPLE_objc_complete_type. I also am not sure that this is the *only* reason for the objc accelerator table. But I'd like to learn.
>>>>>> 
>>>>>> My observation was based on studying lldb code. The only place where
>>>>>> the objc table is used is in the AppleDWARFIndex::GetObjCMethods
>>>>>> function, which is called from
>>>>>> SymbolFileDWARF::GetObjCMethodDIEOffsets, whose only caller is
>>>>>> DWARFASTParserClang::CompleteTypeFromDWARF, which seems to have a
>>>>>> class DIE as an argument. However, if not all declarations of a
>>>>>> class/interface have access to the full list of methods then this
>>>>>> might be a problem for the approach I suggested.
>>>>> 
>>>>> 
>>>>> Maybe, but the same is actually true for C++ classes too (see my comments in another reply about implicit specializations of class member templates (and there are a couple of other examples)) - so might be worth considering how those are handled/could be improved, and maybe in fixing those we could improve/normalize the ObjC representation and avoid the need for ObjC tables... maybe.
>>>>> 
>>>> 
>>>> That's a good point! I need to check out how we handle that right now.
>>> 
>>> Apparently we handle that very poorly. :/ I wasn't even able to call
>>> the instantiation which was present in the CU I was stopped in. I
>>> didn't even get to the part about trying an instantiation from a
>>> different CU.
