[llvm] 20948df - [DWARFVerifier] Fix debug_str_offsets DWARF version detection (#81303)

via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 12 09:32:14 PST 2024


Author: Felipe de Azevedo Piovezan
Date: 2024-02-12T09:32:10-08:00
New Revision: 20948df25d0401f248ec754dfa22422e20377662

URL: https://github.com/llvm/llvm-project/commit/20948df25d0401f248ec754dfa22422e20377662
DIFF: https://github.com/llvm/llvm-project/commit/20948df25d0401f248ec754dfa22422e20377662.diff

LOG: [DWARFVerifier] Fix debug_str_offsets DWARF version detection (#81303)

The DWARF 5 debug_str_offsets section starts with a header, which must
be skipped in order to access the underlying `strp`s.

However, the verifier supports some pre-standardization version of this
section (with the same section name), which does not have a header. In
this case, the offsets start on the first byte of the section. More in
[1] and [2] about this legacy section.

How does The DWARF verifier figure out which version to use? It manually
reads the **first** header in debug_info and uses that. This is wrong
when multiple debug_str_offset sections have been linked together, in
particular it is wrong in the following two cases:

1. A standard DWARF 4 object file (i.e. no debug_str_offsets) linked
with a standard DWARF 5 object file.
2. A non-standard DWARF 4 object file (i.e. containing the header-less
debug_str_offsets section) linked with a standard DWARF 5 object file.

Based on discussions in https://github.com/llvm/llvm-project/pull/81210,
the legacy version is only possible with dwo files, and dwo files cannot
mix the legacy version with the dwarf 5 version. As such, we change the
verifier to only check the debug_info header in the case of dwo files.
If it sees a dwarf 4 version, it handles it the legacy way.

Note: the modified test was technically testing an unsupported
combination of dwarf version + non-dwo sections. To see why, simply note
that the test contained no `debug_info.dwo` sections, so the call to
DWARFObject::forEachInfoDWOSections was doing nothing. We were finding
the error through the "standard version", which shouldn't happen.

[1]: https://gcc.gnu.org/wiki/DebugFission 
[2]: https://gcc.gnu.org/wiki/DebugFissionDWP

Added: 
    llvm/test/tools/llvm-dwarfdump/X86/debug-str-offsets-mixed-dwarf-4-5.yaml

Modified: 
    llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h
    llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
    llvm/test/tools/llvm-dwarfdump/X86/verify_invalid_str_offsets.yaml

Removed: 
    


################################################################################
diff  --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h
index ea73664b1e46ca..c2365a4c7cf647 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h
@@ -360,9 +360,9 @@ class DWARFVerifier {
   ///
   /// \returns true if the .debug_line verifies successfully, false otherwise.
   bool handleDebugStrOffsets();
-  bool verifyDebugStrOffsets(
-      StringRef SectionName, const DWARFSection &Section, StringRef StrData,
-      void (DWARFObject::*)(function_ref<void(const DWARFSection &)>) const);
+  bool verifyDebugStrOffsets(std::optional<dwarf::DwarfFormat> LegacyFormat,
+                             StringRef SectionName, const DWARFSection &Section,
+                             StringRef StrData);
 
   /// Emits any aggregate information collected, depending on the dump options
   void summarize();

diff  --git a/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp b/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
index b5235765c77b14..20ef59e7b4422e 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
@@ -1880,33 +1880,38 @@ bool DWARFVerifier::handleDebugStrOffsets() {
   OS << "Verifying .debug_str_offsets...\n";
   const DWARFObject &DObj = DCtx.getDWARFObj();
   bool Success = true;
+
+  // dwo sections may contain the legacy debug_str_offsets format (and they
+  // can't be mixed with dwarf 5's format). This section format contains no
+  // header.
+  // As such, check the version from debug_info and, if we are in the legacy
+  // mode (Dwarf <= 4), extract Dwarf32/Dwarf64.
+  std::optional<DwarfFormat> DwoLegacyDwarf4Format;
+  DObj.forEachInfoDWOSections([&](const DWARFSection &S) {
+    if (DwoLegacyDwarf4Format)
+      return;
+    DWARFDataExtractor DebugInfoData(DObj, S, DCtx.isLittleEndian(), 0);
+    uint64_t Offset = 0;
+    DwarfFormat InfoFormat = DebugInfoData.getInitialLength(&Offset).second;
+    if (uint16_t InfoVersion = DebugInfoData.getU16(&Offset); InfoVersion <= 4)
+      DwoLegacyDwarf4Format = InfoFormat;
+  });
+
   Success &= verifyDebugStrOffsets(
-      ".debug_str_offsets.dwo", DObj.getStrOffsetsDWOSection(),
-      DObj.getStrDWOSection(), &DWARFObject::forEachInfoDWOSections);
+      DwoLegacyDwarf4Format, ".debug_str_offsets.dwo",
+      DObj.getStrOffsetsDWOSection(), DObj.getStrDWOSection());
   Success &= verifyDebugStrOffsets(
-      ".debug_str_offsets", DObj.getStrOffsetsSection(), DObj.getStrSection(),
-      &DWARFObject::forEachInfoSections);
+      /*LegacyFormat=*/std::nullopt, ".debug_str_offsets",
+      DObj.getStrOffsetsSection(), DObj.getStrSection());
   return Success;
 }
 
 bool DWARFVerifier::verifyDebugStrOffsets(
-    StringRef SectionName, const DWARFSection &Section, StringRef StrData,
-    void (DWARFObject::*VisitInfoSections)(
-        function_ref<void(const DWARFSection &)>) const) {
+    std::optional<DwarfFormat> LegacyFormat, StringRef SectionName,
+    const DWARFSection &Section, StringRef StrData) {
   const DWARFObject &DObj = DCtx.getDWARFObj();
-  uint16_t InfoVersion = 0;
-  DwarfFormat InfoFormat = DwarfFormat::DWARF32;
-  (DObj.*VisitInfoSections)([&](const DWARFSection &S) {
-    if (InfoVersion)
-      return;
-    DWARFDataExtractor DebugInfoData(DObj, S, DCtx.isLittleEndian(), 0);
-    uint64_t Offset = 0;
-    InfoFormat = DebugInfoData.getInitialLength(&Offset).second;
-    InfoVersion = DebugInfoData.getU16(&Offset);
-  });
 
   DWARFDataExtractor DA(DObj, Section, DCtx.isLittleEndian(), 0);
-
   DataExtractor::Cursor C(0);
   uint64_t NextUnit = 0;
   bool Success = true;
@@ -1914,8 +1919,8 @@ bool DWARFVerifier::verifyDebugStrOffsets(
     DwarfFormat Format;
     uint64_t Length;
     uint64_t StartOffset = C.tell();
-    if (InfoVersion == 4) {
-      Format = InfoFormat;
+    if (LegacyFormat) {
+      Format = *LegacyFormat;
       Length = DA.getData().size();
       NextUnit = C.tell() + Length;
     } else {

diff  --git a/llvm/test/tools/llvm-dwarfdump/X86/debug-str-offsets-mixed-dwarf-4-5.yaml b/llvm/test/tools/llvm-dwarfdump/X86/debug-str-offsets-mixed-dwarf-4-5.yaml
new file mode 100644
index 00000000000000..d10460896171d6
--- /dev/null
+++ b/llvm/test/tools/llvm-dwarfdump/X86/debug-str-offsets-mixed-dwarf-4-5.yaml
@@ -0,0 +1,57 @@
+# RUN: yaml2obj %s -o %t.o
+# RUN: llvm-dwarfdump -debug-str-offsets -verify %t.o | FileCheck %s
+
+# CHECK: Verifying .debug_str_offsets...
+# CHECK: No errors
+
+# Check that when mixing standard DWARF 4 debug information with standard DWARF
+# 5 debug information, the verifier correctly interprets the debug_str_offsets
+# section as a standards-conforming DWARF 5 section.
+
+--- !ELF
+FileHeader:
+  Class: ELFCLASS64
+  Data:  ELFDATA2LSB
+  Type:  ET_EXEC
+DWARF:
+  debug_str:
+    - 'cu1'
+    - 'cu2'
+  debug_str_offsets:
+    - Offsets:
+        - 0x0
+  debug_abbrev:
+    - Table:
+        - Code:            0x1
+          Tag:             DW_TAG_compile_unit
+          Children:        DW_CHILDREN_no
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+        - Code:            0x2
+          Tag:             DW_TAG_compile_unit
+          Children:        DW_CHILDREN_no
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strx1
+            - Attribute:       DW_AT_str_offsets_base
+              Form:            DW_FORM_sec_offset
+  debug_info:
+    - Version:         4
+      AbbrevTableID:   0
+      AbbrOffset:      0x0
+      AddrSize:        8
+      Entries:
+        - AbbrCode:        0x1
+          Values:
+            - Value:           0x4
+    - Version:         5
+      UnitType:        DW_UT_compile
+      AbbrOffset:      0x0
+      AddrSize:        8
+      AbbrevTableID:   0
+      Entries:
+        - AbbrCode:        0x2
+          Values:
+            - Value:           0x0
+            - Value:           0x8 # str offsets base

diff  --git a/llvm/test/tools/llvm-dwarfdump/X86/verify_invalid_str_offsets.yaml b/llvm/test/tools/llvm-dwarfdump/X86/verify_invalid_str_offsets.yaml
index 37f376352a19a4..1bdc640acae6dc 100644
--- a/llvm/test/tools/llvm-dwarfdump/X86/verify_invalid_str_offsets.yaml
+++ b/llvm/test/tools/llvm-dwarfdump/X86/verify_invalid_str_offsets.yaml
@@ -13,7 +13,7 @@
 # CHECK-NEXT: error: .debug_str_offsets: contribution 0x29: length exceeds available space (contribution offset (0x29) + length field space (0x4) + length (0x5000000) == 0x500002D > section size 0x30)
 # Errors detected.
 
-# V4: error: .debug_str_offsets: contribution 0x0: index 0x2: invalid string offset *0x8 == 0x2, is neither zero nor immediately following a null character
+# V4: error: .debug_str_offsets.dwo: contribution 0x0: index 0x2: invalid string offset *0x8 == 0x2, is neither zero nor immediately following a null character
 
 
 #--- v4.yaml
@@ -23,16 +23,17 @@ FileHeader:
   Data:  ELFDATA2LSB
   Type:  ET_EXEC
 DWARF:
-  debug_str:
-    - 'foo'
-    - 'bar'
-  debug_info:
-    - Version: 4
-      AddrSize: 4
 Sections:
-  - Name: '.debug_str_offsets'
+  - Name: '.debug_info.dwo'
+    Type: SHT_PROGBITS
+    Content: "0700000004000000000004"
+  - Name: '.debug_str_offsets.dwo'
     Type: SHT_PROGBITS
     Content: "000000000400000002000000"
+  - Name: 'debug_str.dwo'
+    Type: SHT_PROGBITS
+    Content: "666F6F0062617200"
+
 
 #--- v5.yaml
 --- !ELF


        


More information about the llvm-commits mailing list