[llvm] [DWARFVerifier] Fix debug_str_offsets DWARF version detection (PR #81210)

via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 8 17:01:57 PST 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-debuginfo

Author: Felipe de Azevedo Piovezan (felipepiovezan)

<details>
<summary>Changes</summary>

The DWARF 5 debug_str_offsets section starts with a header, which must be skipped in order to access the underlying `strp`s.

However, the verifier supports some pre-standardization version of this section (with the same section name), which does not have a header. In this case, the offsets start on the first byte of the section, although it's not clear where this is documented.

How does The DWARF verifier figure out which version to use? It manually reads the **first** header in debug_info and uses that. This is wrong when multiple debug_str_offset sections have been linked together, in particular it is wrong in the following two cases:

1. A standard DWARF 4 object file (i.e. no debug_str_offsets) linked with a standard DWARF 5 object file.
2. A non-standard DWARF 4 object file (i.e. containing the header-less debug_str_offsets section) linked with a standard DWARF 5 object file.

This patch provides a quick to fix case (1): we use the `MaxVersion` from the DWARFContext instead of reading it from the debug_info section manually. Since this is dealing with standard-conforming formats, which should be linked together without issues, the verifier must handle it.

Fixing case 2 would require a lot of rework, restructuring how each piece of the debug_str_offsets is visited and, since this is dealing with a non-standard format, is left for the future in case anyone cares enough about this case.

---
Full diff: https://github.com/llvm/llvm-project/pull/81210.diff


3 Files Affected:

- (modified) llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h (+2-1) 
- (modified) llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp (+14-11) 
- (added) llvm/test/tools/llvm-dwarfdump/X86/debug-str-offsets-mixed-dwarf-4-5.yaml (+57) 


``````````diff
diff --git a/llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h b/llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h
index ea73664b1e46ca..6c5df409fe6de8 100644
--- a/llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h
+++ b/llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h
@@ -361,7 +361,8 @@ class DWARFVerifier {
   /// \returns true if the .debug_line verifies successfully, false otherwise.
   bool handleDebugStrOffsets();
   bool verifyDebugStrOffsets(
-      StringRef SectionName, const DWARFSection &Section, StringRef StrData,
+      uint8_t MaxVersion, StringRef SectionName, const DWARFSection &Section,
+      StringRef StrData,
       void (DWARFObject::*)(function_ref<void(const DWARFSection &)>) const);
 
   /// Emits any aggregate information collected, depending on the dump options
diff --git a/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp b/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
index 2124ff835c5727..805f40af217e1e 100644
--- a/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
+++ b/llvm/lib/DebugInfo/DWARF/DWARFVerifier.cpp
@@ -1882,29 +1882,31 @@ bool DWARFVerifier::handleDebugStrOffsets() {
   const DWARFObject &DObj = DCtx.getDWARFObj();
   bool Success = true;
   Success &= verifyDebugStrOffsets(
-      ".debug_str_offsets.dwo", DObj.getStrOffsetsDWOSection(),
-      DObj.getStrDWOSection(), &DWARFObject::forEachInfoDWOSections);
+      DCtx.getMaxDWOVersion(), ".debug_str_offsets.dwo",
+      DObj.getStrOffsetsDWOSection(), DObj.getStrDWOSection(),
+      &DWARFObject::forEachInfoDWOSections);
   Success &= verifyDebugStrOffsets(
-      ".debug_str_offsets", DObj.getStrOffsetsSection(), DObj.getStrSection(),
-      &DWARFObject::forEachInfoSections);
+      DCtx.getMaxVersion(), ".debug_str_offsets", DObj.getStrOffsetsSection(),
+      DObj.getStrSection(), &DWARFObject::forEachInfoSections);
   return Success;
 }
 
 bool DWARFVerifier::verifyDebugStrOffsets(
-    StringRef SectionName, const DWARFSection &Section, StringRef StrData,
+    uint8_t MaxVersion, StringRef SectionName, const DWARFSection &Section,
+    StringRef StrData,
     void (DWARFObject::*VisitInfoSections)(
         function_ref<void(const DWARFSection &)>) const) {
   const DWARFObject &DObj = DCtx.getDWARFObj();
-  uint16_t InfoVersion = 0;
-  DwarfFormat InfoFormat = DwarfFormat::DWARF32;
+
+  std::optional<DwarfFormat> MaybeInfoFormat;
   (DObj.*VisitInfoSections)([&](const DWARFSection &S) {
-    if (InfoVersion)
+    if (MaybeInfoFormat)
       return;
     DWARFDataExtractor DebugInfoData(DObj, S, DCtx.isLittleEndian(), 0);
     uint64_t Offset = 0;
-    InfoFormat = DebugInfoData.getInitialLength(&Offset).second;
-    InfoVersion = DebugInfoData.getU16(&Offset);
+    MaybeInfoFormat = DebugInfoData.getInitialLength(&Offset).second;
   });
+  DwarfFormat InfoFormat = MaybeInfoFormat.value_or(DwarfFormat::DWARF32);
 
   DWARFDataExtractor DA(DObj, Section, DCtx.isLittleEndian(), 0);
 
@@ -1915,7 +1917,8 @@ bool DWARFVerifier::verifyDebugStrOffsets(
     DwarfFormat Format;
     uint64_t Length;
     uint64_t StartOffset = C.tell();
-    if (InfoVersion == 4) {
+    if (MaxVersion == 4) {
+      // Pre-standardization debug_str_offsets had no header.
       Format = InfoFormat;
       Length = DA.getData().size();
       NextUnit = C.tell() + Length;
diff --git a/llvm/test/tools/llvm-dwarfdump/X86/debug-str-offsets-mixed-dwarf-4-5.yaml b/llvm/test/tools/llvm-dwarfdump/X86/debug-str-offsets-mixed-dwarf-4-5.yaml
new file mode 100644
index 00000000000000..d10460896171d6
--- /dev/null
+++ b/llvm/test/tools/llvm-dwarfdump/X86/debug-str-offsets-mixed-dwarf-4-5.yaml
@@ -0,0 +1,57 @@
+# RUN: yaml2obj %s -o %t.o
+# RUN: llvm-dwarfdump -debug-str-offsets -verify %t.o | FileCheck %s
+
+# CHECK: Verifying .debug_str_offsets...
+# CHECK: No errors
+
+# Check that when mixing standard DWARF 4 debug information with standard DWARF
+# 5 debug information, the verifier correctly interprets the debug_str_offsets
+# section as a standards-conforming DWARF 5 section.
+
+--- !ELF
+FileHeader:
+  Class: ELFCLASS64
+  Data:  ELFDATA2LSB
+  Type:  ET_EXEC
+DWARF:
+  debug_str:
+    - 'cu1'
+    - 'cu2'
+  debug_str_offsets:
+    - Offsets:
+        - 0x0
+  debug_abbrev:
+    - Table:
+        - Code:            0x1
+          Tag:             DW_TAG_compile_unit
+          Children:        DW_CHILDREN_no
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strp
+        - Code:            0x2
+          Tag:             DW_TAG_compile_unit
+          Children:        DW_CHILDREN_no
+          Attributes:
+            - Attribute:       DW_AT_name
+              Form:            DW_FORM_strx1
+            - Attribute:       DW_AT_str_offsets_base
+              Form:            DW_FORM_sec_offset
+  debug_info:
+    - Version:         4
+      AbbrevTableID:   0
+      AbbrOffset:      0x0
+      AddrSize:        8
+      Entries:
+        - AbbrCode:        0x1
+          Values:
+            - Value:           0x4
+    - Version:         5
+      UnitType:        DW_UT_compile
+      AbbrOffset:      0x0
+      AddrSize:        8
+      AbbrevTableID:   0
+      Entries:
+        - AbbrCode:        0x2
+          Values:
+            - Value:           0x0
+            - Value:           0x8 # str offsets base

``````````

</details>


https://github.com/llvm/llvm-project/pull/81210


More information about the llvm-commits mailing list