[lldb-dev] RFC: -flimit-debug-info + frame variable

Mon Jul 20 14:04:34 PDT 2020

Thanks for the write up!

I agree that the existing APIs are useful for exploring the types as they appear (and are completed within) in the module they came from. Now we are asking more complex questions from them. As with all software, things started out simple and have gotten quite a bit more complex as we went along and added interfaces and new requirements.

My first thought was that as soon as you dive into a CompilerType with a question about the type itself, or anything contained within, where any parts of the type can be incomplete, you can't just use a CompilerType on its own. Each CompilerType might be able to tell you the module, and possibly only a target for expressions ASTs or target ASTs for expression results, but it might not have access to a target if we have a type from a module. 

So we need new API calls where you must supply a target so that you can find the type in other modules when you find something that we know is incomplete and need the real type. We can add new APIs to the TypeSystem class to take care of this and these APIs will need to take a target to allow finding types outside of the current module's types. If we add new APIs and switch any code that requires resolving of types on the fly over to using the new APIs, we might be able to leave the old APIs in place or remove them if they are no longer used after the refactor.

Or the other option is to try and leave the TypeSystem and CompilerType stuff alone and add a new "TargetType" class that has a target + CompilerType. And lookups on those types can be smart about where they grab types? They could still call through to new TypeSystem virtual functions that use the target for resolving types. 

More comments inlined below between your paragraphs.

> On Jul 20, 2020, at 5:34 AM, Pavel Labath <pavel at labath.sk> wrote:
> 
> Hello all,
> 
> With the expression evaluation part of -flimit-debug-info nearly
> completed, I started looking at doing the same for the "frame variable"
> command.
> 
> I have thought this would be simpler than the expression evaluator,
> since we control most of that code. However, I have hit one snag, hence
> this RFC.
> 
> The problem centers around how to implement
> ValueObject::GetChildMemberWithName, which is the engine of the
> subobject resultion in the "frame variable" command. Currently, this
> function delegates most of the work to
> CompilerType::GetIndexOfChildMemberWithName, which returns a list of (!)
> indexes needed to access the relevant subobject. The list aspect is
> important, because the desired object can be in a base class or in a C11
> anonymous struct member.
> 
> The CompilerType instance in question belongs to the type system of the
> module from which we retrieved the original variable. Therein lies the
> problem -- this type system does not have complete information about the
> contents of the base class subobjects.

yes, and this requires a target (or a module list from the target to be more precise) in order to answer the questions.

> 
> Now, my question is what to do about it. At the moment, it seems to me
> that the easiest solution to this problem would be to replace
> CompilerType::GetIndexOfChildMemberWithName, with two new interfaces:
> - Get(IndexOf)**Direct**ChildMemberWithName -- return any direct
> children with the given name
> - IsTransparent -- whether to descend into the type during name lookups
> (i.e., is this an anonymous struct member)
> 
> The idea is that these two functions (in conjunction with existing
> methods) can provide their answers even in a -flimit-debug-info setting,
> and they also provide enough information for the caller to perform the
> full name lookup himself. It would first check for direct members, and
> if no matches are found, (recursively) proceed to look in all the
> transparent members and base classes, switching type systems if the
> current one does not contain the full type definition.
> 
> The downside of that is that this would hardcode a specific, c++-based,
> algorithm which may not be suitable for all languages. Swift has a
> fairly simple inheritance model, so I don't think this should be a
> problem there, but for example python uses a slightly different method
> to resolve ambiguities. The second downside is that a faithful
> implementation of the c++ model, including the virtual inheritance
> dominance is going to be fairly complicated.

Sounds like that can work easily for C/C++. I would prefer to leave things up to the type systems for name lookups so they can each do the lookup however they can by using the type itself and or looking up completed types in the target's modules. The fact that the current solution for name lookup relies on indexes was just a convenience and happened to work for C/C++ and the static typing we have had up until your new support. The index solution isn't required in any new solution.

> 
> The first issue could be solved by moving this logic into the clang
> plugin, but making it independent of any specific type system instance.
> The second issue is unavoidable, except by creating a unified view of
> the full type in some scratch ast context, as we do for expression
> evaluation.

Creating a complete new type in an AST would be a possible solution that avoids all of this, but then do we have ValueObject objects that each have a scratch AST? or do we fill up the target scratch AST up and hope for no collisions? We can avoid needing to copy the type over into the scratch AST if we know it has all completed information. So maybe ValueObjects can do a quick check on the type to see if anything requires completion and only copy the type over to the target scratch AST if needed. The main downfall there is we might end up completing a lot of the type when we don't need to when we make this copy. But it has the benefit of sharing code with the expression parser, so we know it would be kept up to date.
> 
> That said, it's not clear to me how faithful do we need the "frame
> variable" algorithm to be. The frame variable syntax does not precisely
> follow the c++ semantics anyway. And a simple "recurse into subclasses"
> algorithm is going to be understandable and be "close enough" under
> nearly all circumstances. Virtual inheritance is used very seldomly, and
> shadowing of members defined in a base class is even rarer.
> 
> While analysing this code I've found much more serious bugs (e.g.,
> accessing a transparent member fetches a random other value if the class
> it is in also has base cases; fetching a transparent member in a base
> class does not work at all), which seem to have existed for quite some
> time without being discovered.
> 
> For that reason I am tempted to just implement a basic "recurse into
> subclasses" algorithm inside ValueObject::GetChildMemberWithName, and
> not bother with the virtual inheritance details, nor with being able to
> customize this algorithm to the needs of a specific language.
> 
> What do you think?

IMHO it would be nice to let the TypeSystem class handle the GetChildMemberWithName() in a language specific kind of way. There could possibly be a default implementation in the TypeSystem code that matches what we are currently doing. This new API could have a target pointer as an argument to allow it to search in other modules in the target is specified. Swift always have complete definitions of types, so there are no worries for Swift that I know of. 

I am not tied to my suggestion and really want to hear what others think on this as this are my thoughts after briefly thinking about the issue.

Greg
> 
> regards,
> Pavel