[Lldb-commits] [lldb] [lldb] Fix a crash when using .dwp files and make type lookup reliable with the index cache (PR #79544)

Greg Clayton via lldb-commits lldb-commits at lists.llvm.org
Thu Feb 1 13:57:50 PST 2024


https://github.com/clayborg updated https://github.com/llvm/llvm-project/pull/79544

>From 90cfa4700d8590d58cecd678ce107ba3ec8481c7 Mon Sep 17 00:00:00 2001
From: Greg Clayton <clayborg at gmail.com>
Date: Thu, 25 Jan 2024 19:21:25 -0800
Subject: [PATCH 1/4] Fix a crash when using .dwp files and make type lookup
 reliable with the index cache.

When using split DWARF with .dwp files we had an issue where sometimes the DWO file within the .dwp file would be parsed _before_ the skeleton compile unit. The DWO file expects to be able to always be able to get a link back to the skeleton compile unit. Prior to this fix, the only time the skeleton compile unit backlink would get set, was if the unit headers for the main executable have been parsed _and_ if the unit DIE was parsed in that DWARFUnit. This patch ensures that we can always get the skeleton compile unit for a DWO file by adding a function:

```
DWARFCompileUnit *DWARFUnit::GetSkeletonUnit();
```

Prior to this fix DWARFUnit had some unsafe accessors that were used to store two different things:

```
  void *DWARFUnit::GetUserData() const;
  void DWARFUnit::SetUserData(void *d);
```

This was used by SymbolFileDWARF to cache the `lldb_private::CompileUnit *` for a SymbolFileDWARF and was also used to store the `DWARFUnit *` for SymbolFileDWARFDwo. This patch clears up this unsafe usage by adding two separate accessors and ivars for this:
```
lldb_private::CompileUnit *DWARFUnit::GetLLDBCompUnit() const { return m_lldb_cu; }
void DWARFUnit::SetLLDBCompUnit(lldb_private::CompileUnit *cu) { m_lldb_cu = cu; }
DWARFCompileUnit *DWARFUnit::GetSkeletonUnit();
void DWARFUnit::SetSkeletonUnit(DWARFUnit *skeleton_unit);
```
This will stop anyone from calling `void *DWARFUnit::GetUserData() const;` and casting the value to an incorrect value.

A crash could occur in `SymbolFileDWARF::GetCompUnitForDWARFCompUnit()` when the `non_dwo_cu`, which is a backlink to the skeleton compile unit, was not set and was NULL. There is an assert() in the code, and then the code just will kill the program if the assert isn't enabled because the code looked like:
```
  if (dwarf_cu.IsDWOUnit()) {
    DWARFCompileUnit *non_dwo_cu =
        static_cast<DWARFCompileUnit *>(dwarf_cu.GetUserData());
    assert(non_dwo_cu);
    return non_dwo_cu->GetSymbolFileDWARF().GetCompUnitForDWARFCompUnit(
        *non_dwo_cu);
  }
```
This is now fixed by calling the `DWARFUnit::GetSkeletonUnit()` which will correctly always get the skeleton compile uint for a DWO file regardless of if the skeleton unit headers have been parse or if the skeleton unit DIE wasn't parsed yet.

To implement the ability to get the skeleton compile units, I added code the DWARFDebugInfo.cpp/.h that make a map of DWO ID -> skeleton DWARFUnit * that gets filled in for DWARF5 when the unit headers are parsed. The `DWARFUnit::GetSkeletonUnit()` will end up parsing the unit headers of the main executable to fill in this map if it already hasn't been done. For DWARF4 and earlier we maintain a separate map that gets filled in only for any DWARF4 compile units that have a DW_AT_dwo_id or DW_AT_gnu_dwo_id attributes. This is more expensive, so this is done lazily and in a thread safe manor. This allows us to be as efficient as possible when using DWARF5 and also be backward compatible with DWARF4 + split DWARF.

There was also an issue that stopped type lookups from succeeding in `DWARFDIE SymbolFileDWARF::GetDIE(const DIERef &die_ref)` where it directly was accessing the `m_dwp_symfile` ivar without calling the accessor function that could end up needing to locate and load the .dwp file. This was fixed by calling the `SymbolFileDWARF::GetDwpSymbolFile()` accessor to ensure we always get a valid value back if we can find the .dwp file. Prior to this fix it was down which APIs were called and if any APIs were called that loaded the .dwp file, it worked fine, but it might not if no APIs were called that did cause it to get loaded.

When we have valid debug info indexes and when the lldb index cache was enabled, this would cause this issue to show up more often.

I modified an existing test case to test that all of this works correctly and doesn't crash.
---
 .../SymbolFile/DWARF/DWARFDebugInfo.cpp       | 77 +++++++++++++++++--
 .../Plugins/SymbolFile/DWARF/DWARFDebugInfo.h |  4 +
 .../Plugins/SymbolFile/DWARF/DWARFUnit.cpp    | 27 ++++++-
 .../Plugins/SymbolFile/DWARF/DWARFUnit.h      | 28 ++++++-
 .../SymbolFile/DWARF/SymbolFileDWARF.cpp      | 32 ++++----
 .../SymbolFile/DWARF/SymbolFileDWARF.h        | 11 +++
 .../SymbolFile/DWARF/SymbolFileDWARFDwo.cpp   | 19 +++++
 .../SymbolFile/DWARF/SymbolFileDWARFDwo.h     | 11 ++-
 .../DWARF/x86/dwp-separate-debug-file.cpp     | 38 +++++++++
 9 files changed, 216 insertions(+), 31 deletions(-)

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
index 340b9acf80d02..e59e328c6e7cb 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -81,27 +81,88 @@ void DWARFDebugInfo::ParseUnitsFor(DIERef::Section section) {
                                 : m_context.getOrLoadDebugInfoData();
   lldb::offset_t offset = 0;
   while (data.ValidOffset(offset)) {
-    llvm::Expected<DWARFUnitSP> unit_sp = DWARFUnit::extract(
+    llvm::Expected<DWARFUnitSP> expected_unit_sp = DWARFUnit::extract(
         m_dwarf, m_units.size(), data, section, &offset);
 
-    if (!unit_sp) {
+    if (!expected_unit_sp) {
       // FIXME: Propagate this error up.
-      llvm::consumeError(unit_sp.takeError());
+      llvm::consumeError(expected_unit_sp.takeError());
       return;
     }
 
+    DWARFUnitSP unit_sp = *expected_unit_sp;
+
     // If it didn't return an error, then it should be returning a valid Unit.
-    assert(*unit_sp);
-    m_units.push_back(*unit_sp);
-    offset = (*unit_sp)->GetNextUnitOffset();
+    assert((bool)unit_sp);
+
+    // Keep a map of DWO ID back to the skeleton units. Sometimes accelerator
+    // table lookups can cause the DWO files to be accessed before the skeleton
+    // compile unit is parsed, so we keep a map to allow us to match up the DWO
+    // file to the back to the skeleton compile units.
+    if (unit_sp->GetUnitType() == lldb_private::dwarf::DW_UT_skeleton) {
+      if (std::optional<uint64_t> unit_dwo_id = unit_sp->GetHeaderDWOId())
+        m_dwarf5_dwo_id_to_skeleton_unit[*unit_dwo_id] = unit_sp.get();
+    }
 
-    if (auto *type_unit = llvm::dyn_cast<DWARFTypeUnit>(unit_sp->get())) {
+    m_units.push_back(unit_sp);
+    offset = unit_sp->GetNextUnitOffset();
+
+    if (auto *type_unit = llvm::dyn_cast<DWARFTypeUnit>(unit_sp.get())) {
       m_type_hash_to_unit_index.emplace_back(type_unit->GetTypeHash(),
-                                             unit_sp.get()->GetID());
+                                             unit_sp->GetID());
     }
   }
 }
 
+DWARFUnit *DWARFDebugInfo::GetSkeletonUnit(DWARFUnit *dwo_unit) {
+  // If this isn't a DWO unit, don't try and find the skeleton unit.
+  if (!dwo_unit->IsDWOUnit())
+    return nullptr;
+
+  auto dwo_id = dwo_unit->GetDWOId();
+  if (!dwo_id.has_value())
+    return nullptr;
+
+  // Parse the unit headers so that m_dwarf5_dwo_id_to_skeleton_unit is filled
+  // in with all of the DWARF5 skeleton compile units DWO IDs since it is easy
+  // to access the DWO IDs in the DWARFUnitHeader for each DWARFUnit.
+  ParseUnitHeadersIfNeeded();
+
+  // Find the value in our cache and return it we we find it. This cache may
+  // only contain DWARF5 units.
+  auto iter = m_dwarf5_dwo_id_to_skeleton_unit.find(*dwo_id);
+  if (iter != m_dwarf5_dwo_id_to_skeleton_unit.end())
+    return iter->second;
+
+  // DWARF5 unit headers have the DWO ID and should have already been in the map
+  // so if it wasn't found in the above find() call, then we didn't find it and
+  // don't need to do the more expensive DWARF4 search.
+  if (dwo_unit->GetVersion() >= 5)
+    return nullptr;
+
+  // Parse all DWO IDs from all DWARF4 and earlier compile units that have DWO
+  // IDs. It is more expensive to get the DWO IDs from DWARF4 compile units as
+  // we need to parse the unit DIE and extract the DW_AT_dwo_id or
+  // DW_AT_GNU_dwo_id attribute values, so do this only if we didn't find our
+  // match above search and only for DWARF4 and earlier compile units.
+  llvm::call_once(m_dwarf4_dwo_id_to_skeleton_unit_once_flag, [this]() {
+    for (uint32_t i = 0, num = GetNumUnits(); i < num; ++i) {
+      if (DWARFUnit *unit = GetUnitAtIndex(i)) {
+        if (unit->GetVersion() < 5) {
+          if (std::optional<uint64_t> unit_dwo_id = unit->GetDWOId())
+            m_dwarf4_dwo_id_to_skeleton_unit[*unit_dwo_id] = unit;
+        }
+      }
+    }
+  });
+
+  // Search the DWARF4 DWO results that we parsed lazily.
+  iter = m_dwarf4_dwo_id_to_skeleton_unit.find(*dwo_id);
+  if (iter != m_dwarf4_dwo_id_to_skeleton_unit.end())
+    return iter->second;
+  return nullptr;
+}
+
 void DWARFDebugInfo::ParseUnitHeadersIfNeeded() {
   llvm::call_once(m_units_once_flag, [&] {
     ParseUnitsFor(DIERef::Section::DebugInfo);
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
index a8b5abc3beed2..c1f0cb0203fb7 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.h
@@ -39,6 +39,7 @@ class DWARFDebugInfo {
   DWARFUnit *GetUnitContainingDIEOffset(DIERef::Section section,
                                         dw_offset_t die_offset);
   DWARFUnit *GetUnit(const DIERef &die_ref);
+  DWARFUnit *GetSkeletonUnit(DWARFUnit *dwo_unit);
   DWARFTypeUnit *GetTypeUnitForHash(uint64_t hash);
   bool ContainsTypeUnits();
   DWARFDIE GetDIE(const DIERef &die_ref);
@@ -70,6 +71,9 @@ class DWARFDebugInfo {
       m_cu_aranges_up; // A quick address to compile unit table
 
   std::vector<std::pair<uint64_t, uint32_t>> m_type_hash_to_unit_index;
+  llvm::DenseMap<uint64_t, DWARFUnit *> m_dwarf5_dwo_id_to_skeleton_unit;
+  llvm::DenseMap<uint64_t, DWARFUnit *> m_dwarf4_dwo_id_to_skeleton_unit;
+  llvm::once_flag m_dwarf4_dwo_id_to_skeleton_unit_once_flag;
 
 private:
   // All parsing needs to be done partially any managed by this class as
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
index 7a40361cdede7..23e0b8a7f2c06 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.cpp
@@ -97,7 +97,12 @@ void DWARFUnit::ExtractUnitDIEIfNeeded() {
         *m_dwo_id, m_first_die.GetOffset()));
     return; // Can't fetch the compile unit from the dwo file.
   }
-  dwo_cu->SetUserData(this);
+  // If the skeleton compile unit gets its unit DIE parsed first, then this
+  // will fill in the DWO file's back pointer to this skeleton compile unit.
+  // If the DWO files get parsed on their own first the skeleton back link
+  // can be done manually in DWARFUnit::GetSkeletonCompileUnit() which will
+  // do a reverse lookup and cache the result.
+  dwo_cu->SetSkeletonUnit(this);
 
   DWARFBaseDIE dwo_cu_die = dwo_cu->GetUnitDIEOnly();
   if (!dwo_cu_die.IsValid()) {
@@ -702,9 +707,25 @@ uint8_t DWARFUnit::GetAddressByteSize(const DWARFUnit *cu) {
 
 uint8_t DWARFUnit::GetDefaultAddressSize() { return 4; }
 
-void *DWARFUnit::GetUserData() const { return m_user_data; }
+DWARFCompileUnit *DWARFUnit::GetSkeletonUnit() {
+  if (m_skeleton_unit == nullptr && IsDWOUnit()) {
+    SymbolFileDWARFDwo *dwo =
+        llvm::dyn_cast_or_null<SymbolFileDWARFDwo>(&GetSymbolFileDWARF());
+    // Do a reverse lookup if the skeleton compile unit wasn't set.
+    if (dwo)
+      m_skeleton_unit = dwo->GetBaseSymbolFile().GetSkeletonUnit(this);
+  }
+  return llvm::dyn_cast_or_null<DWARFCompileUnit>(m_skeleton_unit);
+}
 
-void DWARFUnit::SetUserData(void *d) { m_user_data = d; }
+void DWARFUnit::SetSkeletonUnit(DWARFUnit *skeleton_unit) {
+  // If someone is re-setting the skeleton compile unit backlink, make sure
+  // it is setting it to a valid value when it wasn't valid, or if the
+  // value in m_skeleton_unit was valid, it should be the same value.
+  assert(skeleton_unit);
+  assert(m_skeleton_unit == nullptr || m_skeleton_unit == skeleton_unit);
+  m_skeleton_unit = skeleton_unit;
+}
 
 bool DWARFUnit::Supports_DW_AT_APPLE_objc_complete_type() {
   return GetProducer() != eProducerLLVMGCC;
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
index bc225a52e1d03..9f6d127056fa5 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFUnit.h
@@ -98,8 +98,14 @@ class DWARFUnit : public UserID {
   virtual ~DWARFUnit();
 
   bool IsDWOUnit() { return m_is_dwo; }
+  /// Get the DWO ID from the DWARFUnitHeader for DWARF5, or from the unit DIE's
+  /// DW_AT_dwo_id or DW_AT_GNU_dwo_id for DWARF4 and earlier.
   std::optional<uint64_t> GetDWOId();
-
+  /// Get the DWO ID from the DWARFUnitHeader only. DWARF5 skeleton units have
+  /// the DWO ID in the compile unit header and we sometimes only want to access
+  /// this cheap value without causing the more expensive attribute fetches that
+  /// GetDWOId() uses.
+  std::optional<uint64_t> GetHeaderDWOId() { return m_header.GetDWOId(); }
   void ExtractUnitDIEIfNeeded();
   void ExtractUnitDIENoDwoIfNeeded();
   void ExtractDIEsIfNeeded();
@@ -198,9 +204,21 @@ class DWARFUnit : public UserID {
 
   static uint8_t GetDefaultAddressSize();
 
-  void *GetUserData() const;
+  lldb_private::CompileUnit *GetLLDBCompUnit() const { return m_lldb_cu; }
+
+  void SetLLDBCompUnit(lldb_private::CompileUnit *cu) { m_lldb_cu = cu; }
+
+  /// Get the skeleton compile unit for a DWO file.
+  ///
+  /// We need to keep track of the skeleton compile unit for a DWO file so
+  /// we can access it. Sometimes this value is cached when the skeleton
+  /// compile unit is first parsed, but if a .dwp file parses all of the
+  /// DWARFUnits in the file, the skeleton compile unit might not have been
+  /// parsed yet, to there might not be a backlink. This accessor handles
+  /// both cases correctly and avoids crashes.
+  DWARFCompileUnit *GetSkeletonUnit();
 
-  void SetUserData(void *d);
+  void SetSkeletonUnit(DWARFUnit *skeleton_unit);
 
   bool Supports_DW_AT_APPLE_objc_complete_type();
 
@@ -336,7 +354,9 @@ class DWARFUnit : public UserID {
   std::shared_ptr<DWARFUnit> m_dwo;
   DWARFUnitHeader m_header;
   const llvm::DWARFAbbreviationDeclarationSet *m_abbrevs = nullptr;
-  void *m_user_data = nullptr;
+  lldb_private::CompileUnit *m_lldb_cu = nullptr;
+  // If this is a DWO file, we have a backlink to our skeleton compile unit.
+  DWARFUnit *m_skeleton_unit = nullptr;
   // The compile unit debug information entry item
   DWARFDebugInfoEntry::collection m_die_array;
   mutable llvm::sys::RWMutex m_die_array_mutex;
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
index 7ab75a9ce2c6b..781f5c5a43677 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
@@ -722,8 +722,8 @@ DWARFCompileUnit *SymbolFileDWARF::GetDWARFCompileUnit(CompileUnit *comp_unit) {
 
   // The compile unit ID is the index of the DWARF unit.
   DWARFUnit *dwarf_cu = DebugInfo().GetUnitAtIndex(comp_unit->GetID());
-  if (dwarf_cu && dwarf_cu->GetUserData() == nullptr)
-    dwarf_cu->SetUserData(comp_unit);
+  if (dwarf_cu && dwarf_cu->GetLLDBCompUnit() == nullptr)
+    dwarf_cu->SetLLDBCompUnit(comp_unit);
 
   // It must be DWARFCompileUnit when it created a CompileUnit.
   return llvm::cast_or_null<DWARFCompileUnit>(dwarf_cu);
@@ -771,7 +771,7 @@ static const char *GetDWOName(DWARFCompileUnit &dwarf_cu,
 
 lldb::CompUnitSP SymbolFileDWARF::ParseCompileUnit(DWARFCompileUnit &dwarf_cu) {
   CompUnitSP cu_sp;
-  CompileUnit *comp_unit = (CompileUnit *)dwarf_cu.GetUserData();
+  CompileUnit *comp_unit = dwarf_cu.GetLLDBCompUnit();
   if (comp_unit) {
     // We already parsed this compile unit, had out a shared pointer to it
     cu_sp = comp_unit->shared_from_this();
@@ -779,7 +779,7 @@ lldb::CompUnitSP SymbolFileDWARF::ParseCompileUnit(DWARFCompileUnit &dwarf_cu) {
     if (GetDebugMapSymfile()) {
       // Let the debug map create the compile unit
       cu_sp = m_debug_map_symfile->GetCompileUnit(this, dwarf_cu);
-      dwarf_cu.SetUserData(cu_sp.get());
+      dwarf_cu.SetLLDBCompUnit(cu_sp.get());
     } else {
       ModuleSP module_sp(m_objfile_sp->GetModule());
       if (module_sp) {
@@ -792,7 +792,7 @@ lldb::CompUnitSP SymbolFileDWARF::ParseCompileUnit(DWARFCompileUnit &dwarf_cu) {
               *GetDWARFUnitIndex(dwarf_cu.GetID()), cu_language,
               eLazyBoolCalculate, std::move(support_files));
 
-          dwarf_cu.SetUserData(cu_sp.get());
+          dwarf_cu.SetLLDBCompUnit(cu_sp.get());
 
           SetCompileUnitAtIndex(dwarf_cu.GetID(), cu_sp);
         };
@@ -1675,20 +1675,20 @@ Type *SymbolFileDWARF::ResolveType(const DWARFDIE &die,
 
 CompileUnit *
 SymbolFileDWARF::GetCompUnitForDWARFCompUnit(DWARFCompileUnit &dwarf_cu) {
+
   if (dwarf_cu.IsDWOUnit()) {
-    DWARFCompileUnit *non_dwo_cu =
-        static_cast<DWARFCompileUnit *>(dwarf_cu.GetUserData());
+    DWARFCompileUnit *non_dwo_cu = dwarf_cu.GetSkeletonUnit();
     assert(non_dwo_cu);
     return non_dwo_cu->GetSymbolFileDWARF().GetCompUnitForDWARFCompUnit(
         *non_dwo_cu);
   }
   // Check if the symbol vendor already knows about this compile unit?
-  if (dwarf_cu.GetUserData() == nullptr) {
-    // The symbol vendor doesn't know about this compile unit, we need to parse
-    // and add it to the symbol vendor object.
-    return ParseCompileUnit(dwarf_cu).get();
-  }
-  return static_cast<CompileUnit *>(dwarf_cu.GetUserData());
+  CompileUnit *lldb_cu = dwarf_cu.GetLLDBCompUnit();
+  if (lldb_cu)
+    return lldb_cu;
+  // The symbol vendor doesn't know about this compile unit, we need to parse
+  // and add it to the symbol vendor object.
+  return ParseCompileUnit(dwarf_cu).get();
 }
 
 void SymbolFileDWARF::GetObjCMethods(
@@ -1750,7 +1750,7 @@ SymbolFileDWARF::GetDIE(const DIERef &die_ref) {
     }
 
     if (*file_index == DIERef::k_file_index_mask)
-      symbol_file = m_dwp_symfile.get(); // DWP case
+      symbol_file = GetDwpSymbolFile().get(); // DWP case
     else
       symbol_file = this->DebugInfo()
                         .GetUnitAtIndex(*die_ref.file_index())
@@ -1785,6 +1785,10 @@ std::optional<uint64_t> SymbolFileDWARF::GetDWOId() {
   return {};
 }
 
+DWARFUnit *SymbolFileDWARF::GetSkeletonUnit(DWARFUnit *dwo_unit) {
+  return DebugInfo().GetSkeletonUnit(dwo_unit);
+}
+
 std::shared_ptr<SymbolFileDWARFDwo>
 SymbolFileDWARF::GetDwoSymbolFileForCompileUnit(
     DWARFUnit &unit, const DWARFDebugInfoEntry &cu_die) {
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
index 6d87530acf833..60baf694b463e 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
@@ -252,6 +252,17 @@ class SymbolFileDWARF : public SymbolFileCommon {
   /// If this is a DWARF object with a single CU, return its DW_AT_dwo_id.
   std::optional<uint64_t> GetDWOId();
 
+  /// Given a DWO DWARFUnit, find the corresponding skeleton DWARFUnit
+  /// in the main symbol file. DWP files can have their DWARFUnits
+  /// parsed without the skeleton compile units having been parsed, so
+  /// sometimes we need to find the skeleton compile unit for a DWO
+  /// DWARFUnit so we can fill in this link. Currently unless the
+  /// skeleton compile unit has been parsed _and_ the Unit DIE has been
+  /// parsed, the DWO unit will not have a backward link setup correctly
+  /// which was causing crashes due to an assertion that was firing
+  /// in SymbolFileDWARF::GetCompUnitForDWARFCompUnit().
+  DWARFUnit *GetSkeletonUnit(DWARFUnit *dwo_unit);
+
   static bool DIEInDeclContext(const CompilerDeclContext &parent_decl_ctx,
                                const DWARFDIE &die,
                                bool only_root_namespaces = false);
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
index b52cb514fb190..ea23b75c3d708 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
@@ -160,3 +160,22 @@ void SymbolFileDWARFDwo::FindGlobalVariables(
   GetBaseSymbolFile().FindGlobalVariables(name, parent_decl_ctx, max_matches,
                                           variables);
 }
+
+bool SymbolFileDWARFDwo::GetDebugInfoIndexWasLoadedFromCache() const {
+  return GetBaseSymbolFile().GetDebugInfoIndexWasLoadedFromCache();
+}
+void SymbolFileDWARFDwo::SetDebugInfoIndexWasLoadedFromCache() {
+  GetBaseSymbolFile().SetDebugInfoIndexWasLoadedFromCache();
+}
+bool SymbolFileDWARFDwo::GetDebugInfoIndexWasSavedToCache() const {
+  return GetBaseSymbolFile().GetDebugInfoIndexWasSavedToCache();
+}
+void SymbolFileDWARFDwo::SetDebugInfoIndexWasSavedToCache() {
+  GetBaseSymbolFile().SetDebugInfoIndexWasSavedToCache();
+}
+bool SymbolFileDWARFDwo::GetDebugInfoHadFrameVariableErrors() const {
+  return GetBaseSymbolFile().GetDebugInfoHadFrameVariableErrors();
+}
+void SymbolFileDWARFDwo::SetDebugInfoHadFrameVariableErrors() {
+  return GetBaseSymbolFile().SetDebugInfoHadFrameVariableErrors();
+}
diff --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h
index 5c4b36328cbac..d5f48f2a8ed4e 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h
@@ -58,6 +58,15 @@ class SymbolFileDWARFDwo : public SymbolFileDWARF {
                            uint32_t max_matches,
                            VariableList &variables) override;
 
+  SymbolFileDWARF &GetBaseSymbolFile() const { return m_base_symbol_file; }
+
+  bool GetDebugInfoIndexWasLoadedFromCache() const override;
+  void SetDebugInfoIndexWasLoadedFromCache() override;
+  bool GetDebugInfoIndexWasSavedToCache() const override;
+  void SetDebugInfoIndexWasSavedToCache() override;
+  bool GetDebugInfoHadFrameVariableErrors() const override;
+  void SetDebugInfoHadFrameVariableErrors() override;
+
 protected:
   DIEToTypePtr &GetDIEToType() override;
 
@@ -77,8 +86,6 @@ class SymbolFileDWARFDwo : public SymbolFileDWARF {
                                        ConstString type_name,
                                        bool must_be_implementation) override;
 
-  SymbolFileDWARF &GetBaseSymbolFile() const { return m_base_symbol_file; }
-
   /// If this file contains exactly one compile unit, this function will return
   /// it. Otherwise it returns nullptr.
   DWARFCompileUnit *FindSingleCompileUnit();
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp b/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp
index cda2992604511..78476ae8d99cd 100644
--- a/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp
@@ -8,8 +8,46 @@
 // RUN: llvm-objcopy --strip-all --add-gnu-debuglink=%t.debug %t
 // RUN: %lldb %t -o "target variable a" -b | FileCheck %s
 
+// Run one time with the index cache enabled to populate the index cache. When
+// we populate the index cache we have to parse all of the DWARF debug info
+// and it is always available.
+// RUN: rm -rf %T/lldb-index-cache
+// RUN: %lldb \
+// RUN:   -O 'settings set symbols.enable-lldb-index-cache true' \
+// RUN:   -O 'settings set symbols.lldb-index-cache-path %T/lldb-index-cache' \
+// RUN:   -O 'settings set target.preload-symbols false' \
+// RUN:   -o "script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)" \
+// RUN:   -o "statistics dump" \
+// RUN:   %t -b | FileCheck %s -check-prefix=CACHE
+
+// Run again after index cache was enabled, which load the index cache. When we
+// load the index cache from disk, we don't have any DWARF parsed yet and this
+// can cause us to try and access information in the .dwp directly without
+// parsing the .debug_info, but this caused crashes when the DWO files didn't
+// have a backlink to the skeleton compile unit. This test verifies that we
+// don't crash and that we can find types when using .dwp files.
+// RUN: %lldb \
+// RUN:   -O 'settings set symbols.enable-lldb-index-cache true' \
+// RUN:   -O 'settings set symbols.lldb-index-cache-path %T/lldb-index-cache' \
+// RUN:   -O 'settings set target.preload-symbols false' \
+// RUN:   -o "script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)" \
+// RUN:   -o "statistics dump" \
+// RUN:   %t -b | FileCheck %s -check-prefix=CACHED
+
 // CHECK: (A) a = (x = 47)
 
+// CACHE: script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)
+// CACHE: struct A {
+// CACHE-NEXT: int x;
+// CACHE-NEXT: }
+// CACHE: "totalDebugInfoIndexSavedToCache": 1
+
+// CACHED: script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)
+// CACHED: struct A {
+// CACHED-NEXT: int x;
+// CACHED-NEXT: }
+// CACHED: "totalDebugInfoIndexLoadedFromCache": 1
+
 struct A {
   int x = 47;
 };

>From 6307628fb18770ebdb87f36d0213170cdd2d7a55 Mon Sep 17 00:00:00 2001
From: Greg Clayton <clayborg at gmail.com>
Date: Tue, 30 Jan 2024 19:15:16 -0800
Subject: [PATCH 2/4] Added a dwarf4 test.

---
 .../DWARF/x86/dwp-separate-debug-file.cpp     | 53 +++++++++++++++----
 1 file changed, 44 insertions(+), 9 deletions(-)

diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp b/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp
index 78476ae8d99cd..327597975d1dd 100644
--- a/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp
@@ -1,12 +1,12 @@
 // REQUIRES: lld
 
-// RUN: %clang -target x86_64-pc-linux -gsplit-dwarf -g -c %s -o %t.o
-// RUN: ld.lld %t.o -o %t
-// RUN: llvm-dwp %t.dwo -o %t.dwp
-// RUN: rm %t.dwo
-// RUN: llvm-objcopy --only-keep-debug %t %t.debug
-// RUN: llvm-objcopy --strip-all --add-gnu-debuglink=%t.debug %t
-// RUN: %lldb %t -o "target variable a" -b | FileCheck %s
+// RUN: %clang -target x86_64-pc-linux -gsplit-dwarf -gdwarf-5 -c %s -o %t.dwarf5.o
+// RUN: ld.lld %t.dwarf5.o -o %t.dwarf5
+// RUN: llvm-dwp %t.dwarf5.dwo -o %t.dwarf5.dwp
+// RUN: rm %t.dwarf5.dwo
+// RUN: llvm-objcopy --only-keep-debug %t.dwarf5 %t.dwarf5.debug
+// RUN: llvm-objcopy --strip-all --add-gnu-debuglink=%t.dwarf5.debug %t.dwarf5
+// RUN: %lldb %t.dwarf5 -o "target variable a" -b | FileCheck %s
 
 // Run one time with the index cache enabled to populate the index cache. When
 // we populate the index cache we have to parse all of the DWARF debug info
@@ -18,7 +18,7 @@
 // RUN:   -O 'settings set target.preload-symbols false' \
 // RUN:   -o "script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)" \
 // RUN:   -o "statistics dump" \
-// RUN:   %t -b | FileCheck %s -check-prefix=CACHE
+// RUN:   %t.dwarf5 -b | FileCheck %s -check-prefix=CACHE
 
 // Run again after index cache was enabled, which load the index cache. When we
 // load the index cache from disk, we don't have any DWARF parsed yet and this
@@ -32,7 +32,42 @@
 // RUN:   -O 'settings set target.preload-symbols false' \
 // RUN:   -o "script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)" \
 // RUN:   -o "statistics dump" \
-// RUN:   %t -b | FileCheck %s -check-prefix=CACHED
+// RUN:   %t.dwarf5 -b | FileCheck %s -check-prefix=CACHED
+
+// Now test with DWARF4
+// RUN: %clang -target x86_64-pc-linux -gsplit-dwarf -gdwarf-4 -c %s -o %t.dwarf4.o
+// RUN: ld.lld %t.dwarf4.o -o %t.dwarf4
+// RUN: llvm-dwp %t.dwarf4.dwo -o %t.dwarf4.dwp
+// RUN: rm %t.dwarf4.dwo
+// RUN: llvm-objcopy --only-keep-debug %t.dwarf4 %t.dwarf4.debug
+// RUN: llvm-objcopy --strip-all --add-gnu-debuglink=%t.dwarf4.debug %t.dwarf4
+// RUN: %lldb %t.dwarf4 -o "target variable a" -b | FileCheck %s
+
+// Run one time with the index cache enabled to populate the index cache. When
+// we populate the index cache we have to parse all of the DWARF debug info
+// and it is always available.
+// RUN: rm -rf %T/lldb-index-cache
+// RUN: %lldb \
+// RUN:   -O 'settings set symbols.enable-lldb-index-cache true' \
+// RUN:   -O 'settings set symbols.lldb-index-cache-path %T/lldb-index-cache' \
+// RUN:   -O 'settings set target.preload-symbols false' \
+// RUN:   -o "script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)" \
+// RUN:   -o "statistics dump" \
+// RUN:   %t.dwarf4 -b | FileCheck %s -check-prefix=CACHE
+
+// Run again after index cache was enabled, which load the index cache. When we
+// load the index cache from disk, we don't have any DWARF parsed yet and this
+// can cause us to try and access information in the .dwp directly without
+// parsing the .debug_info, but this caused crashes when the DWO files didn't
+// have a backlink to the skeleton compile unit. This test verifies that we
+// don't crash and that we can find types when using .dwp files.
+// RUN: %lldb \
+// RUN:   -O 'settings set symbols.enable-lldb-index-cache true' \
+// RUN:   -O 'settings set symbols.lldb-index-cache-path %T/lldb-index-cache' \
+// RUN:   -O 'settings set target.preload-symbols false' \
+// RUN:   -o "script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)" \
+// RUN:   -o "statistics dump" \
+// RUN:   %t.dwarf4 -b | FileCheck %s -check-prefix=CACHED
 
 // CHECK: (A) a = (x = 47)
 

>From 039a8bf5f35282946b9b4217fba6366b49912f00 Mon Sep 17 00:00:00 2001
From: Greg Clayton <clayborg at gmail.com>
Date: Tue, 30 Jan 2024 20:07:29 -0800
Subject: [PATCH 3/4] Ran clang format.

---
 lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
index e59e328c6e7cb..ef936b203305c 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -81,8 +81,8 @@ void DWARFDebugInfo::ParseUnitsFor(DIERef::Section section) {
                                 : m_context.getOrLoadDebugInfoData();
   lldb::offset_t offset = 0;
   while (data.ValidOffset(offset)) {
-    llvm::Expected<DWARFUnitSP> expected_unit_sp = DWARFUnit::extract(
-        m_dwarf, m_units.size(), data, section, &offset);
+    llvm::Expected<DWARFUnitSP> expected_unit_sp =
+        DWARFUnit::extract(m_dwarf, m_units.size(), data, section, &offset);
 
     if (!expected_unit_sp) {
       // FIXME: Propagate this error up.

>From 025eca94790a100dab1f17297c39590bf9658517 Mon Sep 17 00:00:00 2001
From: Greg Clayton <clayborg at gmail.com>
Date: Wed, 31 Jan 2024 14:03:30 -0800
Subject: [PATCH 4/4] Address review feedback

Changes:
- make the llvm-index-cache directory based off of %t to ensure uniqueness
- log to DWARF debug info channel when we fail to parse DWARFUnitHeaders
---
 .../Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp      |  7 +++++--
 .../SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp | 12 ++++++------
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
index ef936b203305c..6bcf95c11a65c 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp
@@ -24,6 +24,7 @@
 #include "DWARFDebugInfoEntry.h"
 #include "DWARFFormValue.h"
 #include "DWARFTypeUnit.h"
+#include "LogChannelDWARF.h"
 
 using namespace lldb;
 using namespace lldb_private;
@@ -81,12 +82,14 @@ void DWARFDebugInfo::ParseUnitsFor(DIERef::Section section) {
                                 : m_context.getOrLoadDebugInfoData();
   lldb::offset_t offset = 0;
   while (data.ValidOffset(offset)) {
+    const lldb::offset_t unit_header_offset = offset;
     llvm::Expected<DWARFUnitSP> expected_unit_sp =
         DWARFUnit::extract(m_dwarf, m_units.size(), data, section, &offset);
 
     if (!expected_unit_sp) {
-      // FIXME: Propagate this error up.
-      llvm::consumeError(expected_unit_sp.takeError());
+      Log *log = GetLog(DWARFLog::DebugInfo);
+      LLDB_LOG(log, "Unable to extract DWARFUnitHeader at {0:x}: {1}",
+               unit_header_offset, llvm::toString(expected_unit_sp.takeError()));
       return;
     }
 
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp b/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp
index 327597975d1dd..a47209931c384 100644
--- a/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/dwp-separate-debug-file.cpp
@@ -11,10 +11,10 @@
 // Run one time with the index cache enabled to populate the index cache. When
 // we populate the index cache we have to parse all of the DWARF debug info
 // and it is always available.
-// RUN: rm -rf %T/lldb-index-cache
+// RUN: rm -rf %t.lldb-index-cache
 // RUN: %lldb \
 // RUN:   -O 'settings set symbols.enable-lldb-index-cache true' \
-// RUN:   -O 'settings set symbols.lldb-index-cache-path %T/lldb-index-cache' \
+// RUN:   -O 'settings set symbols.lldb-index-cache-path %t.lldb-index-cache' \
 // RUN:   -O 'settings set target.preload-symbols false' \
 // RUN:   -o "script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)" \
 // RUN:   -o "statistics dump" \
@@ -28,7 +28,7 @@
 // don't crash and that we can find types when using .dwp files.
 // RUN: %lldb \
 // RUN:   -O 'settings set symbols.enable-lldb-index-cache true' \
-// RUN:   -O 'settings set symbols.lldb-index-cache-path %T/lldb-index-cache' \
+// RUN:   -O 'settings set symbols.lldb-index-cache-path %t.lldb-index-cache' \
 // RUN:   -O 'settings set target.preload-symbols false' \
 // RUN:   -o "script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)" \
 // RUN:   -o "statistics dump" \
@@ -46,10 +46,10 @@
 // Run one time with the index cache enabled to populate the index cache. When
 // we populate the index cache we have to parse all of the DWARF debug info
 // and it is always available.
-// RUN: rm -rf %T/lldb-index-cache
+// RUN: rm -rf %t.lldb-index-cache
 // RUN: %lldb \
 // RUN:   -O 'settings set symbols.enable-lldb-index-cache true' \
-// RUN:   -O 'settings set symbols.lldb-index-cache-path %T/lldb-index-cache' \
+// RUN:   -O 'settings set symbols.lldb-index-cache-path %t.lldb-index-cache' \
 // RUN:   -O 'settings set target.preload-symbols false' \
 // RUN:   -o "script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)" \
 // RUN:   -o "statistics dump" \
@@ -63,7 +63,7 @@
 // don't crash and that we can find types when using .dwp files.
 // RUN: %lldb \
 // RUN:   -O 'settings set symbols.enable-lldb-index-cache true' \
-// RUN:   -O 'settings set symbols.lldb-index-cache-path %T/lldb-index-cache' \
+// RUN:   -O 'settings set symbols.lldb-index-cache-path %t.lldb-index-cache' \
 // RUN:   -O 'settings set target.preload-symbols false' \
 // RUN:   -o "script lldb.target.modules[0].FindTypes('::A').GetTypeAtIndex(0)" \
 // RUN:   -o "statistics dump" \



More information about the lldb-commits mailing list