[clang] [libcxxabi] [lldb] [llvm] [lldb][Expression] Add structor variant to LLDB's function call labels (PR #149827)

Michael Buch via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 6 09:29:06 PDT 2025


================
@@ -2482,6 +2485,134 @@ bool SymbolFileDWARF::ResolveFunction(const DWARFDIE &orig_die,
   return false;
 }
 
+static int ClangToItaniumCtorKind(clang::CXXCtorType kind) {
+  switch (kind) {
+  case clang::CXXCtorType::Ctor_Complete:
+    return 1;
+  case clang::CXXCtorType::Ctor_Base:
+    return 2;
+  case clang::CXXCtorType::Ctor_CopyingClosure:
+  case clang::CXXCtorType::Ctor_DefaultClosure:
+  case clang::CXXCtorType::Ctor_Comdat:
+    llvm_unreachable("Unexpected constructor kind.");
+  }
+}
+
+static int ClangToItaniumDtorKind(clang::CXXDtorType kind) {
+  switch (kind) {
+  case clang::CXXDtorType::Dtor_Deleting:
+    return 0;
+  case clang::CXXDtorType::Dtor_Complete:
+    return 1;
+  case clang::CXXDtorType::Dtor_Base:
+    return 2;
+  case clang::CXXDtorType::Dtor_Comdat:
+    llvm_unreachable("Unexpected destructor kind.");
+  }
+}
+
+static std::optional<int>
+GetItaniumCtorDtorVariant(llvm::StringRef discriminator) {
+  const bool is_ctor = discriminator.consume_front("C");
+  if (!is_ctor && !discriminator.consume_front("D"))
+    return std::nullopt;
+
+  uint64_t structor_kind;
+  if (!llvm::to_integer(discriminator, structor_kind))
+    return std::nullopt;
+
+  if (is_ctor) {
+    if (structor_kind > clang::CXXCtorType::Ctor_DefaultClosure)
+      return std::nullopt;
+
+    return ClangToItaniumCtorKind(
+        static_cast<clang::CXXCtorType>(structor_kind));
+  }
+
+  if (structor_kind > clang::CXXDtorType::Dtor_Comdat)
+    return std::nullopt;
+
+  return ClangToItaniumDtorKind(static_cast<clang::CXXDtorType>(structor_kind));
+}
+
+DWARFDIE SymbolFileDWARF::FindFunctionDefinition(const FunctionCallLabel &label,
+                                                 const DWARFDIE &declaration) {
+  DWARFDIE definition;
+  llvm::DenseMap<int, DWARFDIE> structor_variant_to_die;
+
+  // eFunctionNameTypeFull for mangled name lookup.
+  // eFunctionNameTypeMethod is required for structor lookups (since we look
+  // those up by DW_AT_name).
+  Module::LookupInfo info(ConstString(label.lookup_name),
+                          lldb::eFunctionNameTypeFull |
+                              lldb::eFunctionNameTypeMethod,
+                          lldb::eLanguageTypeUnknown);
+
+  m_index->GetFunctions(info, *this, {}, [&](DWARFDIE entry) {
+    if (entry.GetAttributeValueAsUnsigned(llvm::dwarf::DW_AT_declaration, 0))
+      return IterationAction::Continue;
+
+    auto spec = entry.GetAttributeValueAsReferenceDIE(DW_AT_specification);
+    if (!spec)
+      return IterationAction::Continue;
+
+    if (spec != declaration)
----------------
Michael137 wrote:

> > is actually redundant because we can just rely on the fallback lookup
> 
> Is it? I can see how the fallback lookup can work for functions where you have mangled names, as those are more-or-less unique. However, I don't see how we could find the correct constructor if all we have is the name of the class.

Yup that's correct. I meant it should work for mangled name lookup. For constructors we just would behave as we do today. Not ideal, but not a regression. But yea, ideally we'd fix this as you suggest. All that is to say, the type-unit check in this PR is redundant (regardless of which route we take).

> > Happy to try and change it to a structural match though.
> 
> I think that is necessary for this to work. Otherwise, this will fail whenever we parse a type from a CU which does not contain the definition of the methods (at least constructors) of that type. And I think that can happen pretty easily in a big binary.

Would a structural match here help though? If the declaration DIE for a ctor got encoded into the `AsmLabel` but the definition lives in a different CU (and thus `SymbolFile`), we wouldn't be able to find it anyway using our approach right? AFAIU, the case where a structural match would help is if:
1. constructor declaration DIE (and its module) got encoded into the `AsmLabel`
2. the definition lives in the same module but the `DW_AT_specification` points outside of the module

I have been having a hard time coming up with an example of (2) happening. But I'm sure compilers are allowed to do this (and probably already do?). With type-units this can happen because the declaration lives in `.debug_types` but the definition (and its specification) lives in `.debug_info. So the `SymbolFile` lookup will find the definition but the specification DIE is not the same as the declaration DIE. So there a structural match would help.

> > Here a in lib.o would point to the DW_TAG_structure_type of main.o, but the subprogram definitions for the constructors are different (and local to each CU).
> 
> Hm... That's confusing. I was under the impression that ld64 doesn't touch debug info (that it just leaves breadcrumbs so that the debug info can be linked later. Did that change? If not, how can lib.o point to a different file. Surely clang does not inspect main.o when compiling lib.o.
> 
> Or did you mean that this happens after running dsymutil over the binary? Because this matches what I would expect dsymutil to do.

Ah yes, sorry, forgot about `dsymutil` here.

> > Do you prefer us looking into alternatives to the approach in this PR? E.g., make DWARF do some of the heavy lifting? We could put e.g., the structor variant info into DWARF, alleviating the need for demangling, etc.
> 
> No, I still think this is the best approach for this. I'm just disappointed that it's turning out to be more complicated than I expected. I view the introduction of abi tags as an orthogonal issue -- if it gets included in DWARF, we can include it in the structural match.

Ah I see, agreed

> I would love if DWARF contained more information on structor variants. I find it disappointing that the only way to locate the complete object constructor is to remangle the name of the subobject constructor and look it up in the symbol table. I think that's bad because DWARF should be usable without a symbol table. However, I suspect that's not going to be easy...

I wonder how GDB makes use of the placeholder mangling (i.e., `C4`) that GCC puts on declarations. Or if not for GDB, why does GCC do it? I guess if we had such a placeholder mangling, our lookup could become "find all DIEs with name Foo whose specification has _Z3FooC4 linkage name (regardless of which module it lives in)". That does make our lives a bit easier. But maybe that's not a blocker for this PR (unless it turns out the structural match is a big problem).

https://github.com/llvm/llvm-project/pull/149827


More information about the llvm-commits mailing list