[Lldb-commits] [lldb] [lldb][DWARFASTParserClang] Adjust language type for conflicting Objective-C++ forward declarations (PR #130768)

Wed Mar 12 02:47:56 PDT 2025

labath wrote:

> > So what would happen if we treated `class NSString` and `@class NSString` as two distinct types? I guess that would be a problem because then we could end up with duplicate versions of everything that depends on that type (two `void foo(NSString)`s, depending on whether it's parsed in a C++ or ObjC CU)? (I'm guessing the reason that it uses this hack is because it wants to refer to NSString from generic code.)
> 
> I'll try giving this a shot. If we can pull this off that'd be great. We'll probably need to modify the `CompleteRecordType` logic to realize that the definition DIE we found for a forward declaration is actually objective-c, and if so, create a new type from it. In theory I'd prefer this much more over what I proposed. Though can't say how feasible it is to do yet, let me try

I'm not sure how creating a new type would help, since at that point the C++ type already exists and could be referred to from a bunch of places. I was imagining we would just recognise that we did not find a ***C++*** definition for this type and then do whatever we do when we don't find a definition (leave the type incomplete? forcefully complete it?...). Later, if we reach the type through an ObjC declaration, we find the ObjC definition and complete it using that.

I think the tricky part is what happens with types like your `struct Request { NSString * m_request;};`. If we parse the definition of `Request` from a C++ unit, it will end up referring to the (incomplete) C++ version of `NSString`, which will prevent us from inspecting it, even if we happen to be in ObjC code. It might be possible to do some switcheroos in the AST importer, similar to how we replace a "forcefully completed" type from one module with an actual definition from another one. Or we could say people get what they deserve (I don't know how much you care about actually supporting this use case vs. simply not crashing).

> 
> > Since this makes lldb to do type searching more aggressively, I want to get a sense of performance impact on simplified template names from this PR
> 
> If we were to go down the route I proposed, I would only do a `FindTypes` call for DIEs at the root namespace. And since this is only searching within the current module/library, not across all modules in the target, I suspect the performance cost wouldn't be an issue. But I'll try to get some numbers (if the approach Pavel suggested doesn't work out)

For our use cases, ~all of the code is in a single module anyway, so not searching in other modules doesn't help much. Limiting it to the global namespace definitely helps, but due to how the accelerator tables work, it still means that you have to iterate through all DIEs with the given name anywhere in the program, just to find out whether they are at the top level or not. I don't know how much of a performance hit that would be in practice.
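
To make the cost concrete, here is a toy sketch of a name-keyed index lookup. Everything in it (`ToyIndexEntry`, `ToyNameIndex`, `FindTopLevel`) is hypothetical and is not lldb's accelerator table API. It also assumes the "is this at the top level" bit is already stored with each entry, which is more generous than the real situation, where answering that may require looking at the DIE itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical index entry: one DIE that carries the looked-up name.
struct ToyIndexEntry {
  std::uint64_t die_offset;
  bool parent_is_compile_unit; // true when the DIE sits at the top level
};

// Hypothetical name index: the key is the bare name, with no scope in it.
using ToyNameIndex =
    std::unordered_map<std::string, std::vector<ToyIndexEntry>>;

struct ToyLookupResult {
  std::vector<std::uint64_t> top_level_offsets;
  std::size_t entries_examined = 0;
};

// Every entry with the name is visited, even though only the top-level
// ones are kept. The filter does not reduce the amount of iteration.
ToyLookupResult FindTopLevel(const ToyNameIndex &index,
                             const std::string &name) {
  ToyLookupResult result;
  auto it = index.find(name);
  if (it == index.end())
    return result;
  for (const ToyIndexEntry &entry : it->second) {
    ++result.entries_examined;
    if (entry.parent_is_compile_unit)
      result.top_level_offsets.push_back(entry.die_offset);
  }
  return result;
}
```

In this model, a name that appears in many nested scopes costs one visit per occurrence regardless of how few of them are at the root.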

What would help is if we could short-circuit this code in the (common, on Linux) situation where you don't have any ObjC code in your module. I'm not sure if we have a way to do that. (I don't know if we're currently parsing all CU DIEs, but if we are, we could check if any of them is an ObjC unit.)
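
A minimal sketch of that check, assuming the language of every compile unit is already available. `ToyCompileUnit` and `ModuleHasObjCUnit` are hypothetical names, not lldb classes or functions; the `DW_LANG_*` values are the standard DWARF language codes.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// DW_AT_language values as defined by the DWARF standard.
constexpr std::uint16_t DW_LANG_C_plus_plus = 0x0004;
constexpr std::uint16_t DW_LANG_ObjC = 0x0010;
constexpr std::uint16_t DW_LANG_ObjC_plus_plus = 0x0011;

// Hypothetical stand-in for a parsed compile unit DIE.
struct ToyCompileUnit {
  std::uint16_t language;
};

// Returns true if at least one unit is Objective-C or Objective-C++.
// A caller could skip the extra type search entirely when this is false.
bool ModuleHasObjCUnit(const std::vector<ToyCompileUnit> &units) {
  return std::any_of(units.begin(), units.end(),
                     [](const ToyCompileUnit &cu) {
                       return cu.language == DW_LANG_ObjC ||
                              cu.language == DW_LANG_ObjC_plus_plus;
                     });
}
```

The result would only need to be computed once per module, so the scan over the units would not be repeated for each type lookup.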

I also think the FindTypes call is unnecessarily heavy (and dangerously recursive) for this. Since, in this world, we *want* to treat the two types as equivalent, I don't think we need to find all possible definitions of that type. I think it should be sufficient to settle on *a* canonical definition for the type (per the ODR and all), so we can just pick the first one we find. And we can do the search at the DWARF DIE level -- and then also record the definition DIE in the unique type map, so that we pick the same one when it's time to complete the type.
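
Roughly, the "pick the first definition and remember it" idea could look like the toy sketch below. `ToyDIE` and `ToyUniqueTypeMap` are hypothetical and only model the behaviour; they are not lldb's `UniqueDWARFASTTypeMap` or its interface, and the linear scan stands in for whatever DIE-level lookup would really be used.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, flattened view of a type DIE.
struct ToyDIE {
  std::uint64_t offset;
  std::string name;
  bool is_declaration; // true for a forward declaration
};

// Hypothetical map that remembers which definition DIE was chosen per name.
class ToyUniqueTypeMap {
public:
  // Returns the canonical definition for `name`. The first definition
  // found wins and is recorded, so later calls return the same DIE even
  // if the candidates are enumerated in a different order.
  std::optional<std::uint64_t>
  GetOrFindDefinition(const std::string &name,
                      const std::vector<ToyDIE> &dies) {
    auto it = m_definitions.find(name);
    if (it != m_definitions.end())
      return it->second;
    for (const ToyDIE &die : dies) {
      if (die.name == name && !die.is_declaration) {
        m_definitions.emplace(name, die.offset);
        return die.offset;
      }
    }
    return std::nullopt; // only declarations (or nothing) were found
  }

private:
  std::unordered_map<std::string, std::uint64_t> m_definitions;
};
```

The point of recording the choice is that parsing the forward declaration and completing the type later both resolve to the same DIE, without enumerating every definition of the name.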

https://github.com/llvm/llvm-project/pull/130768