[Lldb-commits] [lldb] 3db7cc1 - Fix a double debug info size counting in top level stats for "statistics dump".

Greg Clayton via lldb-commits lldb-commits at lists.llvm.org
Thu Feb 10 10:55:25 PST 2022


Author: Greg Clayton
Date: 2022-02-10T10:55:18-08:00
New Revision: 3db7cc1ba41f192d9eeb33cdaf06f4e95abf6bdc

URL: https://github.com/llvm/llvm-project/commit/3db7cc1ba41f192d9eeb33cdaf06f4e95abf6bdc
DIFF: https://github.com/llvm/llvm-project/commit/3db7cc1ba41f192d9eeb33cdaf06f4e95abf6bdc.diff

LOG: Fix a double debug info size counting in top level stats for "statistics dump".

This mainly affects Darwin targets (macOS, iOS, tvOS and watchOS) when these targets don't use dSYM files and the debug info was in the .o files. All modules, including the .o files that are loaded by the debug maps, were in the global module list. This was great because it allows us to see each .o file and how much it contributes. There were virtual functions on the SymbolFile class to fetch the symtab/debug info parse and index times, and also the total debug info size. So the main executable would add all of the .o file's stats together and report them as its own data. Then the "totalDebugInfoSize" and many other "totalXXX" top level totals were all being added together. This stems from the fact that my original patch only emitted the modules for a target at the start of the patch, but as comments from the reviews came in, we switched to emitting all of the modules from the global module list.

So this patch fixes it so when we have a SymbolFileDWARFDebugMap that loads .o files, the main executable will have no debug info size or symtab/debug info parse/index times, but each .o file will have its own data as a separate module. Also, to be able to tell when/if we have a dSYM file I have added a "symbolFilePath" if the SymbolFile for the main modules path doesn't match that of the main executable. We also include a "symbolFileModuleIdentifiers" key in each module if the module does have multiple lldb_private::Module objects that contain debug info so that you can track down the information for a module and add up the contributions of all of the .o files.

Tests were added that are labeled with @skipUnlessDarwin and @no_debug_info_test that test all of this functionality so it doesn't regress.

For a module with a dSYM file, we can see the "symbolFilePath" is included:
```
  "modules": [
    {
      "debugInfoByteSize": 1070,
      "debugInfoIndexLoadedFromCache": false,
      "debugInfoIndexSavedToCache": false,
      "debugInfoIndexTime": 0,
      "debugInfoParseTime": 0,
      "identifier": 4873280600,
      "path": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_dsym_binary_has_symfile_in_stats/a.out",
      "symbolFilePath": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_dsym_binary_has_symfile_in_stats/a.out.dSYM/Contents/Resources/DWARF/a.out",
      "symbolTableIndexTime": 7.9999999999999996e-06,
      "symbolTableLoadedFromCache": false,
      "symbolTableParseTime": 7.8999999999999996e-05,
      "symbolTableSavedToCache": false,
      "triple": "arm64-apple-macosx12.0.0",
      "uuid": "E1F7D85B-3A42-321E-BF0D-29B103F5F2E3"
    },
```
And for the DWARF in .o file case we can see the "symbolFileModuleIdentifiers" in the executable's module stats:
```
  "modules": [
    {
      "debugInfoByteSize": 0,
      "debugInfoIndexLoadedFromCache": false,
      "debugInfoIndexSavedToCache": false,
      "debugInfoIndexTime": 0,
      "debugInfoParseTime": 0,
      "identifier": 4603526968,
      "path": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_no_dsym_binary_has_symfile_identifiers_in_stats/a.out",
      "symbolFileModuleIdentifiers": [
        4604429832
      ],
      "symbolTableIndexTime": 7.9999999999999996e-06,
      "symbolTableLoadedFromCache": false,
      "symbolTableParseTime": 0.000112,
      "symbolTableSavedToCache": false,
      "triple": "arm64-apple-macosx12.0.0",
      "uuid": "57008BF5-A726-3DE9-B1BF-3A9AD3EE8569"
    },
```
And the .o file for 4604429832 looks like:
```
    {
      "debugInfoByteSize": 1028,
      "debugInfoIndexLoadedFromCache": false,
      "debugInfoIndexSavedToCache": false,
      "debugInfoIndexTime": 0,
      "debugInfoParseTime": 6.0999999999999999e-05,
      "identifier": 4604429832,
      "path": "/Users/gclayton/Documents/src/lldb/main/Debug/lldb-test-build.noindex/commands/statistics/basic/TestStats.test_no_dsym_binary_has_symfile_identifiers_in_stats/main.o",
      "symbolTableIndexTime": 0,
      "symbolTableLoadedFromCache": false,
      "symbolTableParseTime": 0,
      "symbolTableSavedToCache": false,
      "triple": "arm64-apple-macosx"
    }
```

Differential Revision: https://reviews.llvm.org/D119400

Added: 
    

Modified: 
    lldb/include/lldb/Symbol/SymbolFile.h
    lldb/include/lldb/Target/Statistics.h
    lldb/source/Core/Section.cpp
    lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp
    lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.h
    lldb/source/Target/Statistics.cpp
    lldb/test/API/commands/statistics/basic/TestStats.py

Removed: 
    


################################################################################
diff  --git a/lldb/include/lldb/Symbol/SymbolFile.h b/lldb/include/lldb/Symbol/SymbolFile.h
index 6cdb2bfb686b..c7bb6c16f039 100644
--- a/lldb/include/lldb/Symbol/SymbolFile.h
+++ b/lldb/include/lldb/Symbol/SymbolFile.h
@@ -9,6 +9,7 @@
 #ifndef LLDB_SYMBOL_SYMBOLFILE_H
 #define LLDB_SYMBOL_SYMBOLFILE_H
 
+#include "lldb/Core/ModuleList.h"
 #include "lldb/Core/PluginInterface.h"
 #include "lldb/Core/SourceLocationSpec.h"
 #include "lldb/Symbol/CompilerDecl.h"
@@ -325,6 +326,15 @@ class SymbolFile : public PluginInterface {
   /// hasn't been indexed yet, or a valid duration if it has.
   virtual StatsDuration::Duration GetDebugInfoIndexTime() { return {}; }
 
+  /// Get the additional modules that this symbol file uses to parse debug info.
+  ///
+  /// Some debug info is stored in stand alone object files that are represented
+  /// by unique modules that will show up in the statistics module list. Return
+  /// a list of modules that are not in the target module list that this symbol
+  /// file is currently using so that they can be tracked and assoicated with
+  /// the module in the statistics.
+  virtual ModuleList GetDebugInfoModules() { return ModuleList(); }
+
   /// Accessors for the bool that indicates if the debug info index was loaded
   /// from, or saved to the module index cache.
   ///

diff  --git a/lldb/include/lldb/Target/Statistics.h b/lldb/include/lldb/Target/Statistics.h
index d2b8f746a38c..d5c15bc57e6f 100644
--- a/lldb/include/lldb/Target/Statistics.h
+++ b/lldb/include/lldb/Target/Statistics.h
@@ -100,6 +100,13 @@ struct ModuleStats {
   std::string path;
   std::string uuid;
   std::string triple;
+  // Path separate debug info file, or empty if none.
+  std::string symfile_path;
+  // If the debug info is contained in multiple files where each one is
+  // represented as a separate lldb_private::Module, then these are the
+  // identifiers of these modules in the global module list. This allows us to
+  // track down all of the stats that contribute to this module.
+  std::vector<intptr_t> symfile_modules;
   double symtab_parse_time = 0.0;
   double symtab_index_time = 0.0;
   double debug_parse_time = 0.0;

diff  --git a/lldb/source/Core/Section.cpp b/lldb/source/Core/Section.cpp
index 1660e3c92f2c..cb9d41f19b50 100644
--- a/lldb/source/Core/Section.cpp
+++ b/lldb/source/Core/Section.cpp
@@ -423,9 +423,15 @@ bool Section::ContainsOnlyDebugInfo() const {
   case eSectionTypeGoSymtab:
   case eSectionTypeAbsoluteAddress:
   case eSectionTypeOther:
+  // Used for "__dof_cache" in mach-o or ".debug" for COFF which isn't debug
+  // information that we parse at all. This was causing system files with no
+  // debug info to show debug info byte sizes in the "statistics dump" output
+  // for each module. New "eSectionType" enums should be created for dedicated
+  // debug info that has a predefined format if we wish for these sections to
+  // show up as debug info.
+  case eSectionTypeDebug:
     return false;
 
-  case eSectionTypeDebug:
   case eSectionTypeDWARFDebugAbbrev:
   case eSectionTypeDWARFDebugAbbrevDwo:
   case eSectionTypeDWARFDebugAddr:

diff  --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp
index 08bfe37fd92f..31224ecb18c6 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.cpp
@@ -1430,53 +1430,16 @@ SymbolFileDWARFDebugMap::AddOSOARanges(SymbolFileDWARF *dwarf2Data,
   return num_line_entries_added;
 }
 
-uint64_t SymbolFileDWARFDebugMap::GetDebugInfoSize() {
-  uint64_t debug_info_size = 0;
-  ForEachSymbolFile([&](SymbolFileDWARF *oso_dwarf) -> bool {
-    ObjectFile *oso_objfile = oso_dwarf->GetObjectFile();
-    if (!oso_objfile)
-      return false; // Keep iterating
-    ModuleSP module_sp = oso_objfile->GetModule();
-    if (!module_sp)
-      return false; // Keep iterating
-    SectionList *section_list = module_sp->GetSectionList();
-    if (section_list)
-      debug_info_size += section_list->GetDebugInfoSize();
-    return false; // Keep iterating
-  });
-  return debug_info_size;
-}
-
-StatsDuration::Duration SymbolFileDWARFDebugMap::GetDebugInfoParseTime() {
-  StatsDuration::Duration elapsed(0.0);
+ModuleList SymbolFileDWARFDebugMap::GetDebugInfoModules() {
+  ModuleList oso_modules;
   ForEachSymbolFile([&](SymbolFileDWARF *oso_dwarf) -> bool {
     ObjectFile *oso_objfile = oso_dwarf->GetObjectFile();
     if (oso_objfile) {
       ModuleSP module_sp = oso_objfile->GetModule();
-      if (module_sp) {
-        SymbolFile *symfile = module_sp->GetSymbolFile();
-        if (symfile)
-          elapsed += symfile->GetDebugInfoParseTime();
-      }
-    }
-    return false; // Keep iterating
-  });
-  return elapsed;
-}
-
-StatsDuration::Duration SymbolFileDWARFDebugMap::GetDebugInfoIndexTime() {
-  StatsDuration::Duration elapsed(0.0);
-  ForEachSymbolFile([&](SymbolFileDWARF *oso_dwarf) -> bool {
-    ObjectFile *oso_objfile = oso_dwarf->GetObjectFile();
-    if (oso_objfile) {
-      ModuleSP module_sp = oso_objfile->GetModule();
-      if (module_sp) {
-        SymbolFile *symfile = module_sp->GetSymbolFile();
-        if (symfile)
-          elapsed += symfile->GetDebugInfoIndexTime();
-      }
+      if (module_sp)
+        oso_modules.Append(module_sp);
     }
     return false; // Keep iterating
   });
-  return elapsed;
+  return oso_modules;
 }

diff  --git a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.h b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.h
index 2a6232a501b4..7bff978941a5 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDebugMap.h
@@ -142,9 +142,8 @@ class SymbolFileDWARFDebugMap : public lldb_private::SymbolFile {
   // PluginInterface protocol
   llvm::StringRef GetPluginName() override { return GetPluginNameStatic(); }
 
-  uint64_t GetDebugInfoSize() override;
-  lldb_private::StatsDuration::Duration GetDebugInfoParseTime() override;
-  lldb_private::StatsDuration::Duration GetDebugInfoIndexTime() override;
+  // Statistics overrides.
+  lldb_private::ModuleList GetDebugInfoModules() override;
 
 protected:
   enum { kHaveInitializedOSOs = (1 << 0), kNumFlags };

diff  --git a/lldb/source/Target/Statistics.cpp b/lldb/source/Target/Statistics.cpp
index ebddad837d14..28800f953553 100644
--- a/lldb/source/Target/Statistics.cpp
+++ b/lldb/source/Target/Statistics.cpp
@@ -62,6 +62,15 @@ json::Value ModuleStats::ToJSON() const {
                      debug_info_index_loaded_from_cache);
   module.try_emplace("debugInfoIndexSavedToCache",
                      debug_info_index_saved_to_cache);
+  if (!symfile_path.empty())
+    module.try_emplace("symbolFilePath", symfile_path);
+
+  if (!symfile_modules.empty()) {
+    json::Array symfile_ids;
+    for (const auto symfile_id: symfile_modules)
+      symfile_ids.emplace_back(symfile_id);
+    module.try_emplace("symbolFileModuleIdentifiers", std::move(symfile_ids));
+  }
   return module;
 }
 
@@ -200,6 +209,10 @@ llvm::json::Value DebuggerStats::ReportStatistics(Debugger &debugger,
     }
     SymbolFile *sym_file = module->GetSymbolFile();
     if (sym_file) {
+
+      if (sym_file->GetObjectFile() != module->GetObjectFile())
+        module_stat.symfile_path =
+            sym_file->GetObjectFile()->GetFileSpec().GetPath();
       module_stat.debug_index_time = sym_file->GetDebugInfoIndexTime().count();
       module_stat.debug_parse_time = sym_file->GetDebugInfoParseTime().count();
       module_stat.debug_info_size = sym_file->GetDebugInfoSize();
@@ -211,6 +224,9 @@ llvm::json::Value DebuggerStats::ReportStatistics(Debugger &debugger,
           sym_file->GetDebugInfoIndexWasSavedToCache();
       if (module_stat.debug_info_index_saved_to_cache)
         ++debug_index_saved;
+      ModuleList symbol_modules = sym_file->GetDebugInfoModules();
+      for (const auto &symbol_module: symbol_modules.Modules())
+        module_stat.symfile_modules.push_back((intptr_t)symbol_module.get());
     }
     symtab_parse_time += module_stat.symtab_parse_time;
     symtab_index_time += module_stat.symtab_index_time;

diff  --git a/lldb/test/API/commands/statistics/basic/TestStats.py b/lldb/test/API/commands/statistics/basic/TestStats.py
index 99940ed5061f..77df1cda6358 100644
--- a/lldb/test/API/commands/statistics/basic/TestStats.py
+++ b/lldb/test/API/commands/statistics/basic/TestStats.py
@@ -1,5 +1,6 @@
 import lldb
 import json
+import os
 from lldbsuite.test.decorators import *
 from lldbsuite.test.lldbtest import *
 from lldbsuite.test import lldbutil
@@ -8,10 +9,6 @@ class TestCase(TestBase):
 
     mydir = TestBase.compute_mydir(__file__)
 
-    def setUp(self):
-        TestBase.setUp(self)
-        self.build()
-
     NO_DEBUG_INFO_TESTCASE = True
 
     def test_enable_disable(self):
@@ -22,6 +19,7 @@ def test_enable_disable(self):
         of LLDB and test that enabling and disabling stops expesive information
         from being gathered.
         """
+        self.build()
         target = self.createTestTarget()
 
         self.expect("statistics disable", substrs=['need to enable statistics before disabling'], error=True)
@@ -89,6 +87,7 @@ def get_target_stats(self, debug_stats):
         return None
 
     def test_expressions_frame_var_counts(self):
+        self.build()
         lldbutil.run_to_source_breakpoint(self, "// break here",
                                           lldb.SBFileSpec("main.c"))
 
@@ -158,6 +157,7 @@ def test_default_no_run(self):
           "totalSymbolTableIndexTime": 0.234,
         }
         """
+        self.build()
         target = self.createTestTarget()
         debug_stats = self.get_stats()
         debug_stat_keys = [
@@ -225,6 +225,7 @@ def test_default_with_run(self):
         }
 
         """
+        self.build()
         target = self.createTestTarget()
         lldbutil.run_to_source_breakpoint(self, "// break here",
                                           lldb.SBFileSpec("main.c"))
@@ -262,6 +263,7 @@ def test_memory(self):
         """
             Test "statistics dump" and the memory information.
         """
+        self.build()
         exe = self.getBuildArtifact("a.out")
         target = self.createTestTarget(file_path=exe)
         debug_stats = self.get_stats()
@@ -303,10 +305,18 @@ def find_module_in_metrics(self, path, stats):
                 return module
         return None
 
+    def find_module_by_id_in_metrics(self, id, stats):
+        modules = stats['modules']
+        for module in modules:
+            if module['identifier'] == id:
+                return module
+        return None
+
     def test_modules(self):
         """
             Test "statistics dump" and the module information.
         """
+        self.build()
         exe = self.getBuildArtifact("a.out")
         target = self.createTestTarget(file_path=exe)
         debug_stats = self.get_stats()
@@ -394,6 +404,7 @@ def test_breakpoints(self):
         }
 
         """
+        self.build()
         target = self.createTestTarget()
         self.runCmd("b main.cpp:7")
         self.runCmd("b a_function")
@@ -436,3 +447,108 @@ def test_breakpoints(self):
         for breakpoint in breakpoints:
             self.verify_keys(breakpoint, 'target_stats["breakpoints"]',
                              bp_keys_exist, None)
+
+
+    @skipUnlessDarwin
+    @no_debug_info_test
+    def test_dsym_binary_has_symfile_in_stats(self):
+        """
+            Test that if our executable has a stand alone dSYM file containing
+            debug information, that the dSYM file path is listed as a key/value
+            pair in the "a.out" binaries module stats. Also verify the the main
+            executable's module statistics has a debug info size that is greater
+            than zero as the dSYM contains debug info.
+        """
+        self.build(debug_info="dsym")
+        exe_name = 'a.out'
+        exe = self.getBuildArtifact(exe_name)
+        dsym = self.getBuildArtifact(exe_name + ".dSYM")
+        # Make sure the executable file exists after building.
+        self.assertEqual(os.path.exists(exe), True)
+        # Make sure the dSYM file exists after building.
+        self.assertEqual(os.path.isdir(dsym), True)
+
+        # Create the target
+        target = self.createTestTarget(file_path=exe)
+
+        debug_stats = self.get_stats()
+
+        exe_stats = self.find_module_in_metrics(exe, debug_stats)
+        # If we have a dSYM file, there should be a key/value pair in the module
+        # statistics and the path should match the dSYM file path in the build
+        # artifacts.
+        self.assertIn('symbolFilePath', exe_stats)
+        stats_dsym = exe_stats['symbolFilePath']
+
+        # Make sure main executable's module info has debug info size that is
+        # greater than zero as the dSYM file and main executable work together
+        # in the lldb.SBModule class to provide the data.
+        self.assertGreater(exe_stats['debugInfoByteSize'], 0)
+
+        # The "dsym" variable contains the bundle directory for the dSYM, while
+        # the "stats_dsym" will have the
+        self.assertIn(dsym, stats_dsym)
+        # Since we have a dSYM file, we should not be loading DWARF from the .o
+        # files and the .o file module identifiers should NOT be in the module
+        # statistics.
+        self.assertNotIn('symbolFileModuleIdentifiers', exe_stats)
+
+    @skipUnlessDarwin
+    @no_debug_info_test
+    def test_no_dsym_binary_has_symfile_identifiers_in_stats(self):
+        """
+            Test that if our executable loads debug info from the .o files,
+            that the module statistics contains a 'symbolFileModuleIdentifiers'
+            key which is a list of module identifiers, and verify that the
+            module identifier can be used to find the .o file's module stats.
+            Also verify the the main executable's module statistics has a debug
+            info size that is zero, as the main executable itself has no debug
+            info, but verify that the .o files have debug info size that is
+            greater than zero. This test ensures that we don't double count
+            debug info.
+        """
+        self.build(debug_info="dwarf")
+        exe_name = 'a.out'
+        exe = self.getBuildArtifact(exe_name)
+        dsym = self.getBuildArtifact(exe_name + ".dSYM")
+        print("carp: dsym = '%s'" % (dsym))
+        # Make sure the executable file exists after building.
+        self.assertEqual(os.path.exists(exe), True)
+        # Make sure the dSYM file doesn't exist after building.
+        self.assertEqual(os.path.isdir(dsym), False)
+
+        # Create the target
+        target = self.createTestTarget(file_path=exe)
+
+        # Force the 'main.o' .o file's DWARF to be loaded so it will show up
+        # in the stats.
+        self.runCmd("b main.cpp:7")
+
+        debug_stats = self.get_stats()
+
+        exe_stats = self.find_module_in_metrics(exe, debug_stats)
+        # If we don't have a dSYM file, there should not be a key/value pair in
+        # the module statistics.
+        self.assertNotIn('symbolFilePath', exe_stats)
+
+        # Make sure main executable's module info has debug info size that is
+        # zero as there is no debug info in the main executable, only in the
+        # .o files. The .o files will also only be loaded if something causes
+        # them to be loaded, so we set a breakpoint to force the .o file debug
+        # info to be loaded.
+        self.assertEqual(exe_stats['debugInfoByteSize'], 0)
+
+        # When we don't have a dSYM file, the SymbolFileDWARFDebugMap class
+        # should create modules for each .o file that contains DWARF that the
+        # symbol file creates, so we need to verify that we have a valid module
+        # identifier for main.o that is we should not be loading DWARF from the .o
+        # files and the .o file module identifiers should NOT be in the module
+        # statistics.
+        self.assertIn('symbolFileModuleIdentifiers', exe_stats)
+
+        symfileIDs = exe_stats['symbolFileModuleIdentifiers']
+        for symfileID in symfileIDs:
+            o_module = self.find_module_by_id_in_metrics(symfileID, debug_stats)
+            self.assertNotEqual(o_module, None)
+            # Make sure each .o file has some debug info bytes.
+            self.assertGreater(o_module['debugInfoByteSize'], 0)


        


More information about the lldb-commits mailing list