[llvm] [llvm][dsymutil] Use the DW_AT_name of the uniqued DIE for insertion into .debug_names (PR #168513)
Michael Buch via llvm-commits
llvm-commits at lists.llvm.org
Tue Nov 18 03:10:49 PST 2025
https://github.com/Michael137 created https://github.com/llvm/llvm-project/pull/168513
The root cause of the issue is that we have a DW_TAG_subprogram definition whose DW_AT_specification DIE got deduplicated, but the DW_AT_name of the original specification differs from that of the DIE it got uniqued to. That’s technically fine because dsymutil uniques by linkage name, which uniquely identifies any function with non-internal linkage.
However, we insert the definition DIE into the debug-names table using the DW_AT_name of the original specification (we call getDIENames(InputDIE…)). What we really want is the name of the adjusted DW_AT_specification (i.e., the DW_AT_specification of the output DIE). That’s not as simple as it sounds because we can’t just get hold of the DIE in the output CU. Instead, we have to grab the ODR DeclContext of the input DIE’s specification, which is the only link back to the canonical specification DIE. For that to be of any use, we have to stash the DW_AT_name in the DeclContext so we can use it in getDIENames.
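To make the "stash the DW_AT_name in the DeclContext" idea concrete, here is a minimal standalone sketch. It is not the LLVM code: `ToyDeclContext`, `ToyDeclContextTree`, and `getOrCreate` are hypothetical names. The only assumption is that lookup uses the uniquing key alone (linkage name if present, otherwise the short name), while the short name of the first DIE seen is kept alongside as the canonical name.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Toy stand-in for the linker's DeclContext (hypothetical, not the LLVM class).
struct ToyDeclContext {
  std::string NameForUniquing; // linkage name if present, else DW_AT_name
  std::string Name;            // DW_AT_name of the first (canonical) DIE seen
};

class ToyDeclContextTree {
public:
  // Returns the context for a DIE, creating it on first sight. Only the
  // uniquing key takes part in the lookup, so a later DIE with the same
  // linkage name but a different DW_AT_name maps to the existing context
  // and therefore reports the canonical DIE's name.
  const ToyDeclContext &getOrCreate(const std::string &LinkageName,
                                    const std::string &ShortName) {
    const std::string &Key = LinkageName.empty() ? ShortName : LinkageName;
    assert(!Key.empty() && "unnamed DIEs are not uniqued in this sketch");
    auto It = Contexts.try_emplace(Key, ToyDeclContext{Key, ShortName}).first;
    return It->second;
  }

private:
  std::unordered_map<std::string, ToyDeclContext> Contexts;
};
```

With this shape, two declarations that share a linkage name resolve to one context, and reading `Name` from that context yields the name the output DIE actually carries, regardless of which input declaration the definition pointed at.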
We also have to account for the possibility of multiple levels of DW_AT_specification/DW_AT_abstract_origin. So my proposed solution is to follow the referenced DIEs level by level, look up the canonical DIE for each, and take the name from its DeclContext (if no canonical DIE exists, use the DW_AT_name of the DIE itself).
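The lookup order described above can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the patch itself: `ToyDie` and `getCanonicalName` are hypothetical, DIEs are indices into a vector, and a single `Ref` field stands in for either DW_AT_specification or DW_AT_abstract_origin. The step bound that guards against reference cycles is an addition of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Toy DIE (hypothetical model, not llvm::DWARFDie).
struct ToyDie {
  std::string Name;               // DW_AT_name, empty if absent
  std::optional<std::size_t> Ref; // target of specification/abstract_origin
  std::string CanonicalName;      // name stashed in the DIE's DeclContext,
                                  // empty if the DIE was not uniqued
};

// Returns the name to use for the accelerator table entry of Dies[Idx].
std::string getCanonicalName(const std::vector<ToyDie> &Dies, std::size_t Idx) {
  assert(Idx < Dies.size());
  // A DIE that carries its own DW_AT_name needs no further lookup.
  if (!Dies[Idx].Name.empty())
    return Dies[Idx].Name;

  // Follow references one level at a time. The step bound is this sketch's
  // guard against cycles in malformed input.
  for (std::size_t Steps = 0; Steps < Dies.size() && Dies[Idx].Ref; ++Steps) {
    Idx = *Dies[Idx].Ref;
    assert(Idx < Dies.size());
    const ToyDie &Target = Dies[Idx];
    // Prefer the name recorded for the canonical (uniqued) DIE ...
    if (!Target.CanonicalName.empty())
      return Target.CanonicalName;
    // ... and fall back to the referenced DIE's own DW_AT_name.
    if (!Target.Name.empty())
      return Target.Name;
  }
  return {};
}
```

For a definition that reaches a uniqued declaration through two levels of references, the function returns the declaration's canonical name rather than the name the input declaration happened to carry.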
From 1b4e1911223386edfe4b69932e5e696c8a87a68a Mon Sep 17 00:00:00 2001
From: Michael Buch <michaelbuch12 at gmail.com>
Date: Tue, 18 Nov 2025 11:07:27 +0000
Subject: [PATCH] [llvm][dsymutil] Use the DW_AT_name of the uniqued DIE for
insertion into .debug_names
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
The root cause of the issue is that we have a DW_TAG_subprogram definition whose DW_AT_specification DIE got deduplicated, but the DW_AT_name of the original specification differs from that of the DIE it got uniqued to. That’s technically fine because dsymutil uniques by linkage name, which uniquely identifies any function with non-internal linkage.
However, we insert the definition DIE into the debug-names table using the DW_AT_name of the original specification (we call getDIENames(InputDIE…)). What we really want is the name of the adjusted DW_AT_specification (i.e., the DW_AT_specification of the output DIE). That’s not as simple as it sounds because we can’t just get hold of the DIE in the output CU. Instead, we have to grab the ODR DeclContext of the input DIE’s specification, which is the only link back to the canonical specification DIE. For that to be of any use, we have to stash the DW_AT_name in the DeclContext so we can use it in getDIENames.
We also have to account for the possibility of multiple levels of DW_AT_specification/DW_AT_abstract_origin. So my proposed solution is to follow the referenced DIEs level by level, look up the canonical DIE for each, and take the name from its DeclContext (if no canonical DIE exists, use the DW_AT_name of the DIE itself).
---
.../llvm/DWARFLinker/Classic/DWARFLinker.h | 6 +-
.../Classic/DWARFLinkerDeclContext.h | 8 ++-
llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp | 62 ++++++++++++++++++-
.../Classic/DWARFLinkerDeclContext.cpp | 28 +++++----
4 files changed, 85 insertions(+), 19 deletions(-)
diff --git a/llvm/include/llvm/DWARFLinker/Classic/DWARFLinker.h b/llvm/include/llvm/DWARFLinker/Classic/DWARFLinker.h
index 5b9535380aebf..65935ad841181 100644
--- a/llvm/include/llvm/DWARFLinker/Classic/DWARFLinker.h
+++ b/llvm/include/llvm/DWARFLinker/Classic/DWARFLinker.h
@@ -708,7 +708,9 @@ class LLVM_ABI DWARFLinker : public DWARFLinkerBase {
/// already there.
/// \returns is a name was found.
bool getDIENames(const DWARFDie &Die, AttributesInfo &Info,
- OffsetsStringPool &StringPool, bool StripTemplate = false);
+ OffsetsStringPool &StringPool,
+ const DWARFFile &File, CompileUnit &Unit,
+ bool StripTemplate = false);
uint32_t hashFullyQualifiedName(DWARFDie DIE, CompileUnit &U,
const DWARFFile &File,
@@ -725,6 +727,8 @@ class LLVM_ABI DWARFLinker : public DWARFLinkerBase {
/// Translate directories and file names if necessary.
/// Relocate address ranges.
void generateLineTableForUnit(CompileUnit &Unit);
+
+ llvm::StringRef getCanonicalDIEName(DWARFDie Die, const DWARFFile &File, CompileUnit *Unit);
};
/// Assign an abbreviation number to \p Abbrev
diff --git a/llvm/include/llvm/DWARFLinker/Classic/DWARFLinkerDeclContext.h b/llvm/include/llvm/DWARFLinker/Classic/DWARFLinkerDeclContext.h
index 9fb1b3f80e2ff..feccd89b74bd9 100644
--- a/llvm/include/llvm/DWARFLinker/Classic/DWARFLinkerDeclContext.h
+++ b/llvm/include/llvm/DWARFLinker/Classic/DWARFLinkerDeclContext.h
@@ -84,10 +84,10 @@ class DeclContext {
DeclContext() : DefinedInClangModule(0), Parent(*this) {}
DeclContext(unsigned Hash, uint32_t Line, uint32_t ByteSize, uint16_t Tag,
- StringRef Name, StringRef File, const DeclContext &Parent,
+ StringRef Name, StringRef NameForUniquing, StringRef File, const DeclContext &Parent,
DWARFDie LastSeenDIE = DWARFDie(), unsigned CUId = 0)
: QualifiedNameHash(Hash), Line(Line), ByteSize(ByteSize), Tag(Tag),
- DefinedInClangModule(0), Name(Name), File(File), Parent(Parent),
+ DefinedInClangModule(0), Name(Name), NameForUniquing(NameForUniquing), File(File), Parent(Parent),
LastSeenDIE(LastSeenDIE), LastSeenCompileUnitID(CUId) {}
uint32_t getQualifiedNameHash() const { return QualifiedNameHash; }
@@ -100,6 +100,7 @@ class DeclContext {
uint32_t getCanonicalDIEOffset() const { return CanonicalDIEOffset; }
void setCanonicalDIEOffset(uint32_t Offset) { CanonicalDIEOffset = Offset; }
+ llvm::StringRef getCanonicalName() const { return Name; }
bool isDefinedInClangModule() const { return DefinedInClangModule; }
void setDefinedInClangModule(bool Val) { DefinedInClangModule = Val; }
@@ -115,6 +116,7 @@ class DeclContext {
uint16_t Tag = dwarf::DW_TAG_compile_unit;
unsigned DefinedInClangModule : 1;
StringRef Name;
+ StringRef NameForUniquing;
StringRef File;
const DeclContext &Parent;
DWARFDie LastSeenDIE;
@@ -180,7 +182,7 @@ struct DeclMapInfo : private DenseMapInfo<DeclContext *> {
return RHS == LHS;
return LHS->QualifiedNameHash == RHS->QualifiedNameHash &&
LHS->Line == RHS->Line && LHS->ByteSize == RHS->ByteSize &&
- LHS->Name.data() == RHS->Name.data() &&
+ LHS->NameForUniquing.data() == RHS->NameForUniquing.data() &&
LHS->File.data() == RHS->File.data() &&
LHS->Parent.QualifiedNameHash == RHS->Parent.QualifiedNameHash;
}
diff --git a/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp b/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
index 8637b55c78f9c..0d0cf02684003 100644
--- a/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
+++ b/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
@@ -151,9 +151,58 @@ static bool isTypeTag(uint16_t Tag) {
return false;
}
+/// Recurse through the input DIE's canonical references until we find a
+/// DW_AT_name.
+llvm::StringRef DWARFLinker::DIECloner::getCanonicalDIEName(DWARFDie Die, const DWARFFile &File, CompileUnit *Unit) {
+ std::optional<DWARFFormValue> Ref;
+
+ auto GetDieName = [](const DWARFDie &D) -> llvm::StringRef {
+ auto NameForm = D.find(llvm::dwarf::DW_AT_name);
+ if (!NameForm)
+ return {};
+
+ auto NameOrErr = NameForm->getAsCString();
+ if (!NameOrErr) {
+ llvm::consumeError(NameOrErr.takeError());
+ return {};
+ }
+
+ return *NameOrErr;
+ };
+
+ llvm::StringRef Name = GetDieName(Die);
+ if (!Name.empty())
+ return Name;
+
+ while (true) {
+ if (!(Ref = Die.find(llvm::dwarf::DW_AT_specification))
+ && !(Ref = Die.find(llvm::dwarf::DW_AT_abstract_origin)))
+ break;
+
+ Die = Linker.resolveDIEReference(File, CompileUnits, *Ref, Die, Unit);
+ assert (Unit);
+
+ unsigned SpecIdx = Unit->getOrigUnit().getDIEIndex(Die);
+ CompileUnit::DIEInfo &SpecInfo = Unit->getInfo(SpecIdx);
+ if (SpecInfo.Ctxt && SpecInfo.Ctxt->hasCanonicalDIE()) {
+ if (!SpecInfo.Ctxt->getCanonicalName().empty()) {
+ Name = SpecInfo.Ctxt->getCanonicalName();
+ break;
+ }
+ }
+
+ Name = GetDieName(Die);
+ if (!Name.empty())
+ break;
+ }
+
+ return Name;
+}
+
bool DWARFLinker::DIECloner::getDIENames(const DWARFDie &Die,
AttributesInfo &Info,
OffsetsStringPool &StringPool,
+ const DWARFFile &File, CompileUnit &Unit,
bool StripTemplate) {
// This function will be called on DIEs having low_pcs and
// ranges. As getting the name might be more expansive, filter out
@@ -161,12 +210,19 @@ bool DWARFLinker::DIECloner::getDIENames(const DWARFDie &Die,
if (Die.getTag() == dwarf::DW_TAG_lexical_block)
return false;
+ // The mangled name of a specification DIE will by virtue of the
+ // uniquing algorithm be the same as the one it got uniqued into.
+ // So just use the input DIE's linkage name.
if (!Info.MangledName)
if (const char *MangledName = Die.getLinkageName())
Info.MangledName = StringPool.getEntry(MangledName);
+ // For subprograms with linkage names, we unique on the linkage name,
+ // so DW_AT_name's may differ between the input and canonical DIEs.
+ // Use the name of the canonical DIE.
if (!Info.Name)
- if (const char *Name = Die.getShortName())
+ if (llvm::StringRef Name = getCanonicalDIEName(Die, File, &Unit);
+ !Name.empty())
Info.Name = StringPool.getEntry(Name);
if (!Info.MangledName)
@@ -1939,7 +1995,7 @@ DIE *DWARFLinker::DIECloner::cloneDIE(const DWARFDie &InputDIE,
// accelerator tables too. For now stick with dsymutil's behavior.
if ((Info.InDebugMap || AttrInfo.HasLowPc || AttrInfo.HasRanges) &&
Tag != dwarf::DW_TAG_compile_unit &&
- getDIENames(InputDIE, AttrInfo, DebugStrPool,
+ getDIENames(InputDIE, AttrInfo, DebugStrPool, File, Unit,
Tag != dwarf::DW_TAG_inlined_subroutine)) {
if (AttrInfo.MangledName && AttrInfo.MangledName != AttrInfo.Name)
Unit.addNameAccelerator(Die, AttrInfo.MangledName,
@@ -1962,7 +2018,7 @@ DIE *DWARFLinker::DIECloner::cloneDIE(const DWARFDie &InputDIE,
} else if (Tag == dwarf::DW_TAG_imported_declaration && AttrInfo.Name) {
Unit.addNamespaceAccelerator(Die, AttrInfo.Name);
} else if (isTypeTag(Tag) && !AttrInfo.IsDeclaration) {
- bool Success = getDIENames(InputDIE, AttrInfo, DebugStrPool);
+ bool Success = getDIENames(InputDIE, AttrInfo, DebugStrPool, File, Unit);
uint64_t RuntimeLang =
dwarf::toUnsigned(InputDIE.find(dwarf::DW_AT_APPLE_runtime_class))
.value_or(0);
diff --git a/llvm/lib/DWARFLinker/Classic/DWARFLinkerDeclContext.cpp b/llvm/lib/DWARFLinker/Classic/DWARFLinkerDeclContext.cpp
index c9c8dddce9c44..d9356bd71a55f 100644
--- a/llvm/lib/DWARFLinker/Classic/DWARFLinkerDeclContext.cpp
+++ b/llvm/lib/DWARFLinker/Classic/DWARFLinkerDeclContext.cpp
@@ -84,24 +84,28 @@ DeclContextTree::getChildDeclContext(DeclContext &Context, const DWARFDie &DIE,
break;
}
- StringRef NameRef;
+ StringRef NameForUniquing;
+ StringRef Name;
StringRef FileRef;
+ if (const char *ShortName = DIE.getShortName())
+ Name = StringPool.internString(ShortName);
+
if (const char *LinkageName = DIE.getLinkageName())
- NameRef = StringPool.internString(LinkageName);
- else if (const char *ShortName = DIE.getShortName())
- NameRef = StringPool.internString(ShortName);
+ NameForUniquing = StringPool.internString(LinkageName);
+ else if (!Name.empty())
+ NameForUniquing = Name;
- bool IsAnonymousNamespace = NameRef.empty() && Tag == dwarf::DW_TAG_namespace;
+ bool IsAnonymousNamespace = NameForUniquing.empty() && Tag == dwarf::DW_TAG_namespace;
if (IsAnonymousNamespace) {
// FIXME: For dsymutil-classic compatibility. I think uniquing within
// anonymous namespaces is wrong. There is no ODR guarantee there.
- NameRef = "(anonymous namespace)";
+ NameForUniquing = "(anonymous namespace)";
}
if (Tag != dwarf::DW_TAG_class_type && Tag != dwarf::DW_TAG_structure_type &&
Tag != dwarf::DW_TAG_union_type &&
- Tag != dwarf::DW_TAG_enumeration_type && NameRef.empty())
+ Tag != dwarf::DW_TAG_enumeration_type && NameForUniquing.empty())
return PointerIntPair<DeclContext *, 1>(nullptr);
unsigned Line = 0;
@@ -140,10 +144,10 @@ DeclContextTree::getChildDeclContext(DeclContext &Context, const DWARFDie &DIE,
}
}
- if (!Line && NameRef.empty())
+ if (!Line && NameForUniquing.empty())
return PointerIntPair<DeclContext *, 1>(nullptr);
- // We hash NameRef, which is the mangled name, in order to get most
+ // We hash NameForUniquing, which is the mangled name, in order to get most
// overloaded functions resolve correctly.
//
// Strictly speaking, hashing the Tag is only necessary for a
@@ -153,7 +157,7 @@ DeclContextTree::getChildDeclContext(DeclContext &Context, const DWARFDie &DIE,
// FIXME: dsymutil-classic won't unique the same type presented
// once as a struct and once as a class. Using the Tag in the fully
// qualified name hash to get the same effect.
- unsigned Hash = hash_combine(Context.getQualifiedNameHash(), Tag, NameRef);
+ unsigned Hash = hash_combine(Context.getQualifiedNameHash(), Tag, NameForUniquing);
// FIXME: dsymutil-classic compatibility: when we don't have a name,
// use the filename.
@@ -161,14 +165,14 @@ DeclContextTree::getChildDeclContext(DeclContext &Context, const DWARFDie &DIE,
Hash = hash_combine(Hash, FileRef);
// Now look if this context already exists.
- DeclContext Key(Hash, Line, ByteSize, Tag, NameRef, FileRef, Context);
+ DeclContext Key(Hash, Line, ByteSize, Tag, Name, NameForUniquing, FileRef, Context);
auto ContextIter = Contexts.find(&Key);
if (ContextIter == Contexts.end()) {
// The context wasn't found.
bool Inserted;
DeclContext *NewContext =
- new (Allocator) DeclContext(Hash, Line, ByteSize, Tag, NameRef, FileRef,
+ new (Allocator) DeclContext(Hash, Line, ByteSize, Tag, Name, NameForUniquing, FileRef,
Context, DIE, U.getUniqueID());
std::tie(ContextIter, Inserted) = Contexts.insert(NewContext);
assert(Inserted && "Failed to insert DeclContext");