[lldb-dev] RFC: -flimit-debug-info + frame variable

Thu Jul 23 11:14:00 PDT 2020

> On Jul 23, 2020, at 5:15 AM, Pavel Labath <pavel at labath.sk> wrote:
> 
> On 22/07/2020 01:31, Jim Ingham wrote:
>> 
>> 
>>> On Jul 21, 2020, at 9:27 AM, Pavel Labath <pavel at labath.sk
>>> <mailto:pavel at labath.sk>> wrote:
>>> I do see the attractiveness of constructing a full compiler type. The
>>> reason I am hesitant to go that way is that it seems to me that this
>>> would negate the two main benefits of the frame variable command over
>>> the expression evaluator: a) it's fast; b) it's less likely to crash.
>>> 
>>> And while I don't think it will be as slow or as crashy as the
>>> expression evaluator, the usage of the ast importer will force a lot
>>> more types to be parsed than are strictly needed for this functionality.
>>> And the insertion of all potentially conflicting types from different
>>> modules into a single ast context is also somewhat worrying.
>> 
>> Importation should be incremental as well, so this shouldn’t make things
>> that much slower.  And you shouldn’t ever be looking things up by name
>> in this AST so you wouldn’t be led astray that way.  You also are going
>> to have to do pretty much the same job for “expr”, right?  So you
>> wouldn’t be opening new dangerous pathways.
> 
> The import is not as incremental as we might want, and it actually sort
> of depends on what is the state of the source ast. Let's say the source AST
> has types A and B, and A depends on B in some way (say as a method
> argument). Let's say that A is complete (parsed) and B isn't. While
> importing A, the ast importer will import the method which has the B
> argument, but it will not descend into B (and cause us to parse it).
> If however, B happens to be already parsed then it will import B and
> all of its base classes (but not fields and methods).
> 
> On top of that we also have our own additions -- whenever we encounter a
> method returning a pointer, we import the pointer target type (this has
> to do with covariant return types). These things compound and so even a
> simple import can end up importing quite a lot.
> 
> I actually tried making the ast importer more lazy -- I have a proof of
> concept, but it required adding more explicit lookups into clang's Sema,
> so that's why I haven't pursued it yet.

Anything we can do along these lines will help folks with large projects.  We have been getting slower in this area over the years.  But I understand the need to tread with caution here.
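
To make the quoted A/B scenario concrete, here is a toy model of the behaviour Pavel describes. It is only a sketch: ToyDecl, ToyAST, ImportShell and ImportComplete are hypothetical names, and nothing here is clang's ASTImporter API. The only rule it encodes is the one from the text: an argument type that is still incomplete in the source AST stays a forward declaration, while one that is already parsed gets pulled in together with its base classes.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy declarations; not clang or LLDB types.
struct ToyDecl {
  bool complete;                             // already parsed in the source AST?
  std::vector<std::string> bases;            // names of base classes
  std::vector<std::string> method_arg_types; // types used as method arguments
};
using ToyAST = std::map<std::string, ToyDecl>;

// Import a type and its bases, but only if the source already has it parsed.
// An incomplete type is left as a forward declaration and is not recorded.
void ImportShell(const ToyAST &src, const std::string &name,
                 std::set<std::string> &out) {
  auto it = src.find(name);
  if (it == src.end() || !it->second.complete)
    return; // nothing is parsed on behalf of this import
  if (!out.insert(name).second)
    return; // already imported
  for (const std::string &base : it->second.bases)
    ImportShell(src, base, out);
}

// Import a complete type; its methods drag in whatever argument types
// happen to be complete in the source AST at that moment.
std::set<std::string> ImportComplete(const ToyAST &src,
                                     const std::string &name) {
  std::set<std::string> out;
  ImportShell(src, name, out);
  auto it = src.find(name);
  if (it == src.end() || !it->second.complete)
    return out;
  for (const std::string &arg : it->second.method_arg_types)
    ImportShell(src, arg, out);
  return out;
}
```

In this model, importing A while B is unparsed yields only A. If something else has parsed B in the meantime, the same import also yields B and B's base class, which is the state dependence described above.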

> 
> I could also try to disable some of these things for these frame
> variable imports (they don't need methods at all), but then I would be
> opening new dangerous pathways...
> 
> 
>> 
>> OTOH, the AST’s are complex beasts, so I am not unmoved by your worries...
> 
> Yeah... :)
> 
>>> The dlclose issue is an interesting one. Presumably, we could ensure
>>> that the module does not go away by storing a module shared (or weak?)
>>> pointer somewhere inside the value object. BTW, how does this work with
>>> ValueObject casts right now? If I cast a ValueObject to a CompilerType
>>> belonging to a different module, does anything ensure this module does
>>> not go away? Or when dereferencing a pointer to a type which is not
>>> complete in the current module?
>> 
>> I don’t think at present we do anything smart about this.  It’s just
>> always bugged me at the back of my brain that we could get into trouble
>> with this, and so I don’t want to do something that would make it worse,
>> especially in a systemic way.
> 
> Is there a reason we don't store a pointer to the module where the
> TypeSystem came from? We could either do that for all ValueObjects,
> or just when the type system changes (casts, dereferences of incomplete
> types, and now -flimit-debug-info) ?
> 

ValueObjects currently treat their types as a computed, not stored, entity.  There’s no “CompilerType m_type” ivar, only a pure virtual “CompilerType *GetCompilerType”.  But I don’t know whether we’re making use of that fact or not.  But we could broadcast a “ModulesChanged” to the ValueObjects as well as to the Breakpoints and have them react to that.
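
A rough sketch of what reacting to such a broadcast could look like, assuming the value object keeps only a weak reference to the module that owns a borrowed type. Every name below (ToyModule, ToyType, ToyValueObject, ToyTarget) is made up for illustration; this is not how LLDB's ValueObject or Target are written.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are real LLDB classes.
struct ToyModule {
  std::string name;
};

struct ToyType {
  std::string name;
  std::weak_ptr<ToyModule> owner; // module whose type system owns this type
};

class ToyValueObject {
public:
  explicit ToyValueObject(ToyType declared) : m_declared(std::move(declared)) {}

  // Simulates a cast (or a completed type) borrowed from another module.
  void CastTo(ToyType type) { m_override = std::move(type); }

  // The type is computed on each call instead of being stored as "the" type.
  ToyType GetType() const { return m_override ? *m_override : m_declared; }

  // Reaction to the broadcast: drop a borrowed type whose module is gone.
  void ModulesChanged() {
    if (m_override && m_override->owner.expired())
      m_override.reset();
  }

private:
  ToyType m_declared;
  std::optional<ToyType> m_override;
};

class ToyTarget {
public:
  std::weak_ptr<ToyModule> Load(const std::string &name) {
    m_modules.push_back(std::make_shared<ToyModule>(ToyModule{name}));
    return m_modules.back();
  }

  void Listen(ToyValueObject &value) { m_listeners.push_back(&value); }

  // Unloading destroys the module, then notifies every listener.
  void Unload(const std::string &name) {
    m_modules.erase(
        std::remove_if(m_modules.begin(), m_modules.end(),
                       [&](const std::shared_ptr<ToyModule> &m) {
                         return m->name == name;
                       }),
        m_modules.end());
    for (ToyValueObject *value : m_listeners)
      value->ModulesChanged();
  }

private:
  std::vector<std::shared_ptr<ToyModule>> m_modules;
  std::vector<ToyValueObject *> m_listeners;
};
```

The weak pointer is the design choice worth noting: the value object never keeps a module alive, it just notices when the module has gone and falls back to the type it can still compute on its own.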

>> 
>>> 
>>> I'm hoping that this stuff won't be "hard work". I haven't prototyped
>>> the code yet, but I am hoping to keep this lookup code in under 200 LOC.
>>> And as Greg points out, there are ways to put this stuff into the type
>>> system -- I'm just not sure whether that is needed given that the
>>> ValueObject class is the only user of the GetIndexOfChildMemberWithName
>>> interface. The whole function is pretty clearly designed with
>>> ValueObject::GetChildMemberWithName in mind.
>> 
>> It seems fine to me to proceed along the lines you propose.  If it ends
>> up being smooth sailing, I can’t see any reason not to do it this way.
>>  When/If you end up having lots of corner cases to manage, would be the
>> time to consider cutting back to using the real type system to back
>> these computations.
> 
> Ok, sounds good. Let me create a prototype for this, and we'll see how
> it goes from there. It may take a while because I'm now entangled in
> some line table stuff.

Excellent, I look forward to seeing what you come up with!

> 
> 
> On 21/07/2020 23:23, Greg Clayton wrote:
>>> On Jul 21, 2020, at 9:27 AM, Pavel Labath <pavel at labath.sk> wrote:
>>> The dlclose issue is an interesting one. Presumably, we could ensure
>>> that the module does not go away by storing a module shared (or weak?)
>>> pointer somewhere inside the value object. BTW, how does this work with
>>> ValueObject casts right now? If I cast a ValueObject to a CompilerType
>>> belonging to a different module, does anything ensure this module does
>>> not go away? Or when dereferencing a pointer to a type which is not
>>> complete in the current module?
>> 
>> I am not sure dlclose is a problem, the module won't usually be
>> cleaned up. And that shared library shouldn't have the definition we
>> need and be able to be unloaded IIUC how the -flimit-debug-info stuff works.
>> 
> 
> In a well-behaved application, I think it shouldn't be possible to
> dlclose a library if a library inheriting a type from it is still
> loaded. However, there's no way to really guarantee that.
> 
> For example, an application might have two libraries with different
> definitions of a class A, which don't cause conflict because the relevant
> symbols are hidden. But when searching for a base class A from a third
> library, we end up picking the wrong one. Or the same (odr) class is
> defined in two libraries, and we pick the one which gets unloaded,
> although the application actually uses the code from the other library.
> 
> Now we could try to be fancy and analyze module dependencies, symbol
> visibility, etc. but it would still be pretty hard to guarantee that
> this really always is the case.

Note, also, that C has opaque structs that we want to find cross module just like with C++ classes, but it doesn’t have ODR.  So I don’t think we can count on ODR to help us out here.
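
As a toy illustration of why a by-name cross-module search can't just take the first hit: the sketch below collects every module that carries a complete definition for a given name. SketchTypeDef, SketchModule and FindCompleteDefinitions are hypothetical names, and the "modules" are plain structs rather than anything from LLDB.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy data; not LLDB's Module or Type classes.
struct SketchTypeDef {
  std::string name;
  bool complete;                   // false models an opaque/forward declaration
  std::vector<std::string> fields; // empty when the type is opaque
};

struct SketchModule {
  std::string name;
  std::vector<SketchTypeDef> types;
};

// Return every module that has a complete definition of type_name.
// More than one result means the name alone cannot pick the right one.
std::vector<const SketchModule *>
FindCompleteDefinitions(const std::vector<SketchModule> &modules,
                        const std::string &type_name) {
  std::vector<const SketchModule *> found;
  for (const SketchModule &module : modules)
    for (const SketchTypeDef &type : module.types)
      if (type.name == type_name && type.complete) {
        found.push_back(&module);
        break; // one hit per module is enough
      }
  return found;
}
```

With a client library that only sees an opaque "handle" and two other libraries that each define their own, different "handle", the search returns two candidates, and nothing in the name tells us which one the client actually uses.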

Jim

> 
> pl