[Lldb-commits] [lldb] [lldb] Handle improperly nested blocks differently (PR #117725)

Pavel Labath via lldb-commits lldb-commits at lists.llvm.org
Tue Nov 26 07:31:50 PST 2024


https://github.com/labath created https://github.com/llvm/llvm-project/pull/117725

In 6c7f56192fa6e689ef14d32e43a785de5692e9c0 (Sep 2011) we added code to extend the range of the parent block if the child block is not contained within it.  Currently, this code doesn't work for (at least) two reasons:
- we don't re-sort the ranges of the parent block, which means that the lookup may not find the range we added (and possibly other ranges as well)
- The main address lookup mechanism (SymbolFileDWARF::ResolveSymbolContext(Address)) performs the lookup using DWARF DIE information, which doesn't contain this extra range.

The motivation for the change was bad compiler output. The commit message does not go into details, so it's hard to say if this has been fixed, but given that was 13 years ago, I think we can assume that to be the case. In fact, as far as I can tell (I haven't tried building an lldb this old) this code stopped working in
ea3e7d5ccf4f00741e4b106978bd8dab5cece3a1 (~two weeks later), when we added the requirement for the ranges to be sorted.

Given that this isn't working, and that not changing the ranges of other blocks has other nice properties (the ranges can be immutable after construction), I'm removing the parent range changing code. I'm adding a test case that makes sure we don't do something egregious (e.g., crash) in this case. Having a child block not be a subset of the parent block doesn't seem to cause problems now, but if that ever turns out to be an issue, we can always intersect the child range with that of the parent.

I'm also upgrading this message to a proper warning so we can see if this ever occurs in practice, and simplifying it so it doesn't get in the way of understanding the function.

>From 12ddfe8273339608e00946bd30218b8a407d8524 Mon Sep 17 00:00:00 2001
From: Pavel Labath <pavel at labath.sk>
Date: Tue, 26 Nov 2024 16:12:40 +0100
Subject: [PATCH] [lldb] Handle improperly nested blocks differently

In 6c7f56192fa6e689ef14d32e43a785de5692e9c0 (Sep 2011) we added code to
extend the range of the parent block if the child block is not contained
within it.  Currently, this code doesn't work for (at least) two
reasons:
- we don't re-sort the ranges of the parent block, which means that the
  lookup may not find the range we added (and possibly other ranges as
  well)
- The main address lookup mechanism
  (SymbolFileDWARF::ResolveSymbolContext(Address)) performs the lookup
  using DWARF DIE information, which doesn't contain this extra range.

The motivation for the change was bad compiler output. The commit
message does not go into details, so it's hard to say if this has been
fixed, but given that was 13 years ago, I think we can assume that to be
the case. In fact, as far as I can tell (I haven't tried building an
lldb this old) this code stopped working in
ea3e7d5ccf4f00741e4b106978bd8dab5cece3a1 (~two weeks later), when we
added the requirement for the ranges to be sorted.

Given that this isn't working, and that not changing the ranges of other
blocks has other nice properties (the ranges can be immutable after
construction), I'm removing the parent range changing code. I'm adding a
test case that makes sure we don't do something egregious (e.g., crash)
in this case. Having a child block not be a subset of the parent block
doesn't seem to cause problems now, but if that ever turns out to be an
issue, we can always intersect the child range with that of the parent.

I'm also upgrading this message to a proper warning so we can see if
this ever occurs in practice, and simplifying it so it doesn't get in
the way of understanding the function.
---
 lldb/source/Symbol/Block.cpp                  |  38 +------
 .../DWARF/x86/improperly_nested_blocks.s      | 100 ++++++++++++++++++
 2 files changed, 105 insertions(+), 33 deletions(-)
 create mode 100644 lldb/test/Shell/SymbolFile/DWARF/x86/improperly_nested_blocks.s

diff --git a/lldb/source/Symbol/Block.cpp b/lldb/source/Symbol/Block.cpp
index 8746a25e3fad5a..27fce043dfc5ed 100644
--- a/lldb/source/Symbol/Block.cpp
+++ b/lldb/source/Symbol/Block.cpp
@@ -352,39 +352,11 @@ void Block::FinalizeRanges() {
 void Block::AddRange(const Range &range) {
   Block *parent_block = GetParent();
   if (parent_block && !parent_block->Contains(range)) {
-    Log *log = GetLog(LLDBLog::Symbols);
-    if (log) {
-      ModuleSP module_sp(m_parent_scope->CalculateSymbolContextModule());
-      Function *function = m_parent_scope->CalculateSymbolContextFunction();
-      const addr_t function_file_addr =
-          function->GetAddressRange().GetBaseAddress().GetFileAddress();
-      const addr_t block_start_addr = function_file_addr + range.GetRangeBase();
-      const addr_t block_end_addr = function_file_addr + range.GetRangeEnd();
-      Type *func_type = function->GetType();
-
-      const Declaration &func_decl = func_type->GetDeclaration();
-      if (func_decl.GetLine()) {
-        LLDB_LOGF(log,
-                  "warning: %s:%u block {0x%8.8" PRIx64
-                  "} has range[%u] [0x%" PRIx64 " - 0x%" PRIx64
-                  ") which is not contained in parent block {0x%8.8" PRIx64
-                  "} in function {0x%8.8" PRIx64 "} from %s",
-                  func_decl.GetFile().GetPath().c_str(), func_decl.GetLine(),
-                  GetID(), (uint32_t)m_ranges.GetSize(), block_start_addr,
-                  block_end_addr, parent_block->GetID(), function->GetID(),
-                  module_sp->GetFileSpec().GetPath().c_str());
-      } else {
-        LLDB_LOGF(log,
-                  "warning: block {0x%8.8" PRIx64 "} has range[%u] [0x%" PRIx64
-                  " - 0x%" PRIx64
-                  ") which is not contained in parent block {0x%8.8" PRIx64
-                  "} in function {0x%8.8" PRIx64 "} from %s",
-                  GetID(), (uint32_t)m_ranges.GetSize(), block_start_addr,
-                  block_end_addr, parent_block->GetID(), function->GetID(),
-                  module_sp->GetFileSpec().GetPath().c_str());
-      }
-    }
-    parent_block->AddRange(range);
+    m_parent_scope->CalculateSymbolContextModule()->ReportWarning(
+        "block {0:x} has a range [{1:x}, {2:x}) which is not contained in the "
+        "parent block {3:x}",
+        GetID(), range.GetRangeBase(), range.GetRangeEnd(),
+        parent_block->GetID());
   }
   m_ranges.Append(range);
 }
diff --git a/lldb/test/Shell/SymbolFile/DWARF/x86/improperly_nested_blocks.s b/lldb/test/Shell/SymbolFile/DWARF/x86/improperly_nested_blocks.s
new file mode 100644
index 00000000000000..af744385993f28
--- /dev/null
+++ b/lldb/test/Shell/SymbolFile/DWARF/x86/improperly_nested_blocks.s
@@ -0,0 +1,100 @@
+## This test checks that lldb handles (corrupt?) debug info which has improperly
+## nested blocks. The behavior here is not prescriptive. We only want to check
+## that we do something "reasonable".
+
+
+# RUN: llvm-mc -triple=x86_64-pc-linux -filetype=obj %s > %t
+# RUN: %lldb %t -o "image lookup -v -s look_me_up1" \
+# RUN:   -o "image lookup -v -s look_me_up2" -o exit 2>&1 | FileCheck %s
+
+# CHECK-LABEL: image lookup -v -s look_me_up1
+# CHECK: warning: {{.*}} block 0x55 has a range [0x2, 0x4) which is not contained in the parent block 0x44
+# CHECK:    Function: id = {0x00000030}, name = "fn", range = [0x0000000000000000-0x0000000000000005)
+# CHECK:      Blocks: id = {0x00000030}, range = [0x00000000-0x00000005)
+# CHECK-NEXT:         id = {0x00000044}, range = [0x00000001-0x00000003)
+# CHECK-NEXT:         id = {0x00000055}, range = [0x00000002-0x00000004)
+# CHECK-NEXT: Symbol:
+
+# CHECK-LABEL: image lookup -v -s look_me_up2
+# CHECK:    Function: id = {0x00000030}, name = "fn", range = [0x0000000000000000-0x0000000000000005)
+# CHECK:      Blocks: id = {0x00000030}, range = [0x00000000-0x00000005)
+# CHECK-NEXT: Symbol:
+
+        .text
+        .p2align 12
+fn:
+        nop
+.Lblock1_begin:
+        nop
+.Lblock2_begin:
+look_me_up1:
+        nop
+.Lblock1_end:
+look_me_up2:
+        nop
+.Lblock2_end:
+        nop
+.Lfn_end:
+
+
+        .section        .debug_abbrev,"", at progbits
+        .byte   1                       # Abbreviation Code
+        .byte   17                      # DW_TAG_compile_unit
+        .byte   1                       # DW_CHILDREN_yes
+        .byte   37                      # DW_AT_producer
+        .byte   8                       # DW_FORM_string
+        .byte   17                      # DW_AT_low_pc
+        .byte   1                       # DW_FORM_addr
+        .byte   18                      # DW_AT_high_pc
+        .byte   1                       # DW_FORM_addr
+        .byte   0                       # EOM(1)
+        .byte   0                       # EOM(2)
+        .byte   2                       # Abbreviation Code
+        .byte   46                      # DW_TAG_subprogram
+        .byte   1                       # DW_CHILDREN_yes
+        .byte   17                      # DW_AT_low_pc
+        .byte   1                       # DW_FORM_addr
+        .byte   18                      # DW_AT_high_pc
+        .byte   1                       # DW_FORM_addr
+        .byte   3                       # DW_AT_name
+        .byte   8                       # DW_FORM_string
+        .byte   0                       # EOM(1)
+        .byte   0                       # EOM(2)
+        .byte   3                       # Abbreviation Code
+        .byte   11                      # DW_TAG_lexical_block
+        .byte   1                       # DW_CHILDREN_yes
+        .byte   17                      # DW_AT_low_pc
+        .byte   1                       # DW_FORM_addr
+        .byte   18                      # DW_AT_high_pc
+        .byte   1                       # DW_FORM_addr
+        .byte   0                       # EOM(1)
+        .byte   0                       # EOM(2)
+        .byte   0                       # EOM(3)
+
+        .section        .debug_info,"", at progbits
+.Lcu_begin0:
+        .long   .Ldebug_info_end0-.Ldebug_info_start0 # Length of Unit
+.Ldebug_info_start0:
+        .short  5                       # DWARF version number
+        .byte   1                       # DWARF Unit Type
+        .byte   8                       # Address Size (in bytes)
+        .long   .debug_abbrev           # Offset Into Abbrev. Section
+        .byte   1                       # Abbrev DW_TAG_compile_unit
+        .asciz  "Hand-written DWARF"    # DW_AT_producer
+        .quad   fn                      # DW_AT_low_pc
+        .quad   .Lfn_end                # DW_AT_high_pc
+        .byte   2                       # Abbrev DW_TAG_subprogram
+        .quad   fn                      # DW_AT_low_pc
+        .quad   .Lfn_end                # DW_AT_high_pc
+        .asciz  "fn"                    # DW_AT_name
+        .byte   3                       # Abbrev DW_TAG_lexical_block
+        .quad   .Lblock1_begin          # DW_AT_low_pc
+        .quad   .Lblock1_end            # DW_AT_high_pc
+        .byte   3                       # Abbrev DW_TAG_lexical_block
+        .quad   .Lblock2_begin          # DW_AT_low_pc
+        .quad   .Lblock2_end            # DW_AT_high_pc
+        .byte   0                       # End Of Children Mark
+        .byte   0                       # End Of Children Mark
+        .byte   0                       # End Of Children Mark
+        .byte   0                       # End Of Children Mark
+.Ldebug_info_end0:



More information about the lldb-commits mailing list