[Lldb-commits] [lldb] 7cfa74f - [lldb/DWARF] Trust CU DW_AT_low/high_pc information when building address tables

Pavel Labath via lldb-commits lldb-commits at lists.llvm.org
Thu Apr 23 07:12:54 PDT 2020


Author: Pavel Labath
Date: 2020-04-23T16:12:41+02:00
New Revision: 7cfa74fc6948cc1969244a4e800de6a728951897

URL: https://github.com/llvm/llvm-project/commit/7cfa74fc6948cc1969244a4e800de6a728951897
DIFF: https://github.com/llvm/llvm-project/commit/7cfa74fc6948cc1969244a4e800de6a728951897.diff

LOG: [lldb/DWARF] Trust CU DW_AT_low/high_pc information when building address tables

Summary:
The code in DWARFCompileUnit::BuildAddressRangeTable tries hard to avoid
relying on DW_AT_low/high_pc for compile unit range information, and
this logic is a big cause of llvm/lldb divergence in the lowest layers
of dwarf parsing code.

The implicit assumption in that code is that this information (as opposed to
DW_AT_ranges) is unreliable. However, I have not been able to verify
that assumption. It is definitely not true for all present-day
compilers (gcc, clang, icc), and it was also not the case for the
historic compilers that I have been able to get a hold of (thanks Matt
Godbolt).

All compiler included in my research either produced correct
DW_AT_ranges or .debug_aranges entries, or they produced no DW_AT_hi/lo
pc at all. The detailed findings are:
- gcc >= 4.4: produces DW_AT_ranges and .debug_aranges
- 4.1 <= gcc < 4.4: no DW_AT_ranges, no DW_AT_high_pc, .debug_aranges
  present. The upper version range here is uncertain as godbolt.org does
  not have intermediate versions.
- gcc < 4.1: no versions on godbolt.org
- clang >= 3.5: produces DW_AT_ranges, and (optionally) .debug_aranges
- 3.4 <= clang < 3.5: no DW_AT_ranges, no DW_AT_high_pc, .debug_aranges
  present.
- clang <= 3.3: no DW_AT_ranges, no DW_AT_high_pc, no .debug_aranges
- icc >= 16.0.1: produces DW_AT_ranges
- icc < 16.0.1: no functional versions on godbolt.org (some are present
  but fail to compile)

Based on this analysis, I believe it is safe to start trusting
DW_AT_low/high_pc information in dwarf as well as remove the code for
manually reconstructing range information by traversing the DIE
structure, and just keep the line table fallback. The only compilers
where this will change behavior are pre-3.4 clangs, which are almost 7
years old now. However, the functionality should remain unchanged
because we will be able to reconstruct this information from the line
table, which seems to be needed for some line-tables-only scenarios
anyway (haven't researched this too much, but at least some compilers
seem to emit DW_AT_ranges even in these situations).

In addition, benchmarks showed that for these compilers computing the
ranges via line tables is noticably faster than doing so via the DIE
tree.

Other advantages include simplifying the code base, removing some
untested code (the only test changes are recent tests with overly
reduced synthetic dwarf), and increasing llvm convergence.

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D78489

Added: 
    

Modified: 
    lldb/source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.cpp
    lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
    lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.h
    lldb/test/Shell/SymbolFile/DWARF/DW_AT_loclists_base.s
    lldb/test/Shell/SymbolFile/DWARF/debug_loc.s

Removed: 
    


################################################################################
diff  --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.cpp
index 66fcbfdcc0ec..f54fe0662aa2 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFCompileUnit.cpp
@@ -33,41 +33,27 @@ void DWARFCompileUnit::BuildAddressRangeTable(
 
   size_t num_debug_aranges = debug_aranges->GetNumRanges();
 
-  // First get the compile unit DIE only and check if it has a DW_AT_ranges
+  // First get the compile unit DIE only and check contains ranges information.
   const DWARFDebugInfoEntry *die = GetUnitDIEPtrOnly();
 
   const dw_offset_t cu_offset = GetOffset();
   if (die) {
     DWARFRangeList ranges;
     const size_t num_ranges =
-        die->GetAttributeAddressRanges(this, ranges, false);
+        die->GetAttributeAddressRanges(this, ranges, /*check_hi_lo_pc=*/true);
     if (num_ranges > 0) {
-      // This compile unit has DW_AT_ranges, assume this is correct if it is
-      // present since clang no longer makes .debug_aranges by default and it
-      // emits DW_AT_ranges for DW_TAG_compile_units. GCC also does this with
-      // recent GCC builds.
       for (size_t i = 0; i < num_ranges; ++i) {
         const DWARFRangeList::Entry &range = ranges.GetEntryRef(i);
         debug_aranges->AppendRange(cu_offset, range.GetRangeBase(),
                                    range.GetRangeEnd());
       }
 
-      return; // We got all of our ranges from the DW_AT_ranges attribute
+      return;
     }
   }
-  // We don't have a DW_AT_ranges attribute, so we need to parse the DWARF
-
-  // If the DIEs weren't parsed, then we don't want all dies for all compile
-  // units to stay loaded when they weren't needed. So we can end up parsing
-  // the DWARF and then throwing them all away to keep memory usage down.
-  ScopedExtractDIEs clear_dies(ExtractDIEsScoped());
-
-  die = DIEPtr();
-  if (die)
-    die->BuildAddressRangeTable(this, debug_aranges);
 
   if (debug_aranges->GetNumRanges() == num_debug_aranges) {
-    // We got nothing from the functions, maybe we have a line tables only
+    // We got nothing from the debug info, maybe we have a line tables only
     // situation. Check the line tables and build the arange table from this.
     SymbolContext sc;
     sc.comp_unit = m_dwarf.GetCompUnitForDWARFCompUnit(*this);

diff  --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
index 1c1acd9dd61a..3b5224cae7b2 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
@@ -819,34 +819,9 @@ const char *DWARFDebugInfoEntry::GetPubname(const DWARFUnit *cu) const {
   return name;
 }
 
-// BuildAddressRangeTable
-void DWARFDebugInfoEntry::BuildAddressRangeTable(
-    const DWARFUnit *cu, DWARFDebugAranges *debug_aranges) const {
-  if (m_tag) {
-    if (m_tag == DW_TAG_subprogram) {
-      dw_addr_t lo_pc = LLDB_INVALID_ADDRESS;
-      dw_addr_t hi_pc = LLDB_INVALID_ADDRESS;
-      if (GetAttributeAddressRange(cu, lo_pc, hi_pc, LLDB_INVALID_ADDRESS)) {
-        /// printf("BuildAddressRangeTable() 0x%8.8x: %30s: [0x%8.8x -
-        /// 0x%8.8x)\n", m_offset, DW_TAG_value_to_name(tag), lo_pc, hi_pc);
-        debug_aranges->AppendRange(cu->GetOffset(), lo_pc, hi_pc);
-      }
-    }
-
-    const DWARFDebugInfoEntry *child = GetFirstChild();
-    while (child) {
-      child->BuildAddressRangeTable(cu, debug_aranges);
-      child = child->GetSibling();
-    }
-  }
-}
-
-// BuildFunctionAddressRangeTable
-//
-// This function is very similar to the BuildAddressRangeTable function except
-// that the actual DIE offset for the function is placed in the table instead
-// of the compile unit offset (which is the way the standard .debug_aranges
-// section does it).
+/// This function is builds a table very similar to the standard .debug_aranges
+/// table, except that the actual DIE offset for the function is placed in the
+/// table instead of the compile unit offset.
 void DWARFDebugInfoEntry::BuildFunctionAddressRangeTable(
     const DWARFUnit *cu, DWARFDebugAranges *debug_aranges) const {
   if (m_tag) {
@@ -854,8 +829,6 @@ void DWARFDebugInfoEntry::BuildFunctionAddressRangeTable(
       dw_addr_t lo_pc = LLDB_INVALID_ADDRESS;
       dw_addr_t hi_pc = LLDB_INVALID_ADDRESS;
       if (GetAttributeAddressRange(cu, lo_pc, hi_pc, LLDB_INVALID_ADDRESS)) {
-        //  printf("BuildAddressRangeTable() 0x%8.8x: [0x%16.16" PRIx64 " -
-        //  0x%16.16" PRIx64 ")\n", m_offset, lo_pc, hi_pc); // DEBUG ONLY
         debug_aranges->AppendRange(GetOffset(), lo_pc, hi_pc);
       }
     }

diff  --git a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.h b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.h
index e12a19c13d1c..ca2dbd8a6bc0 100644
--- a/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.h
+++ b/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.h
@@ -41,9 +41,6 @@ class DWARFDebugInfoEntry {
   bool operator==(const DWARFDebugInfoEntry &rhs) const;
   bool operator!=(const DWARFDebugInfoEntry &rhs) const;
 
-  void BuildAddressRangeTable(const DWARFUnit *cu,
-                              DWARFDebugAranges *debug_aranges) const;
-
   void BuildFunctionAddressRangeTable(const DWARFUnit *cu,
                                       DWARFDebugAranges *debug_aranges) const;
 

diff  --git a/lldb/test/Shell/SymbolFile/DWARF/DW_AT_loclists_base.s b/lldb/test/Shell/SymbolFile/DWARF/DW_AT_loclists_base.s
index 831db3077abd..ca32e9930a76 100644
--- a/lldb/test/Shell/SymbolFile/DWARF/DW_AT_loclists_base.s
+++ b/lldb/test/Shell/SymbolFile/DWARF/DW_AT_loclists_base.s
@@ -58,6 +58,10 @@ lookup_loclists:
         .byte   5                       # DW_FORM_data2
         .uleb128 0x8c                   # DW_AT_loclists_base
         .byte   0x17                    # DW_FORM_sec_offset
+        .byte   17                      # DW_AT_low_pc
+        .byte   1                       # DW_FORM_addr
+        .byte   18                      # DW_AT_high_pc
+        .byte   6                       # DW_FORM_data4
         .byte   0                       # EOM(1)
         .byte   0                       # EOM(2)
         .byte   2                       # Abbreviation Code
@@ -109,6 +113,8 @@ lookup_loclists:
         .asciz  "Hand-written DWARF"    # DW_AT_producer
         .short  12                      # DW_AT_language
         .long   .Lloclists_table_base   # DW_AT_loclists_base
+        .quad   loclists                # DW_AT_low_pc
+        .long   .Lloclists_end-loclists # DW_AT_high_pc
         .byte   2                       # Abbrev [2] 0x2a:0x29 DW_TAG_subprogram
         .quad   loclists                # DW_AT_low_pc
         .long   .Lloclists_end-loclists # DW_AT_high_pc

diff  --git a/lldb/test/Shell/SymbolFile/DWARF/debug_loc.s b/lldb/test/Shell/SymbolFile/DWARF/debug_loc.s
index 0068b63744e9..c1ee8efdc9c4 100644
--- a/lldb/test/Shell/SymbolFile/DWARF/debug_loc.s
+++ b/lldb/test/Shell/SymbolFile/DWARF/debug_loc.s
@@ -153,6 +153,10 @@ f:                                      # @f
         .byte   14                      # DW_FORM_strp
         .byte   19                      # DW_AT_language
         .byte   5                       # DW_FORM_data2
+        .byte   17                      # DW_AT_low_pc
+        .byte   1                       # DW_FORM_addr
+        .byte   18                      # DW_AT_high_pc
+        .byte   6                       # DW_FORM_data4
         .byte   0                       # EOM(1)
         .byte   0                       # EOM(2)
         .byte   2                       # Abbreviation Code
@@ -210,6 +214,8 @@ f:                                      # @f
         .byte   1                       # Abbrev [1] 0xb:0x50 DW_TAG_compile_unit
         .long   .Linfo_string0          # DW_AT_producer
         .short  12                      # DW_AT_language
+        .quad   .Lfunc_begin0           # DW_AT_low_pc
+        .long   .Lfunc_end0-.Lfunc_begin0 # DW_AT_high_pc
         .byte   2                       # Abbrev [2] 0x2a:0x29 DW_TAG_subprogram
         .quad   .Lfunc_begin0           # DW_AT_low_pc
         .long   .Lfunc_end0-.Lfunc_begin0 # DW_AT_high_pc


        


More information about the lldb-commits mailing list