[Lldb-commits] [lldb] [lldb] Fixing edge cases in "command source" (PR #126526)
Pavel Labath via lldb-commits
lldb-commits at lists.llvm.org
Mon Feb 10 06:44:51 PST 2025
https://github.com/labath created https://github.com/llvm/llvm-project/pull/126526
While looking at how to make Function::GetEndLineSourceInfo (which is only used in "command source") work with discontinuous functions, I realized there are other corner cases that this function doesn't handle.
The code assumed that the last line entry in the function will also correspond to the last source line. This is probably true for unoptimized code, but I don't think we can rely on the optimizer to preserve this property. What's worse, the code didn't check that the last line entry belonged to the same file as the first one, so if this line entry was the result of inlining, we could end up using a line from a completely different file.
To fix this, I change the algorithm to iterate over all line entries in the function (which belong to the same file) and find the max line number out of those. This way we can naturally handle the discontinuous case as well.
This implementations is going to be slower than the previous one, but I don't think that matters, because:
- this command is only used rarely, and interactively
- we have plenty of other code which iterates through the line table
I added some basic tests for the function operation. I don't claim the tests to be comprehensive, or that the function handles all edge cases, but test framework created here could be used for testing other fixes/edge cases as well.
>From a4745b1361390140125947c63cd0afa9cc0f57e9 Mon Sep 17 00:00:00 2001
From: Pavel Labath <pavel at labath.sk>
Date: Mon, 10 Feb 2025 15:20:05 +0100
Subject: [PATCH] [lldb] Fixing edge cases in "command source"
While looking at how to make Function::GetEndLineSourceInfo (which is
only used in "command source") work with discontinuous functions, I
realized there are other corner cases that this function doesn't handle.
The code assumed that the last line entry in the function will also
correspond to the last source line. This is probably true for
unoptimized code, but I don't think we can rely on the optimizer to
preserve this property. What's worse, the code didn't check that the
last line entry belonged to the same file as the first one, so if this
line entry was the result of inlining, we could end up using a line from
a completely different file.
To fix this, I change the algorithm to iterate over all line entries in
the function (which belong to the same file) and find the max line
number out of those. This way we can naturally handle the discontinuous
case as well.
This implementations is going to be slower than the previous one, but I
don't think that matters, because:
- this command is only used rarely, and interactively
- we have plenty of other code which iterates through the line table
I added some basic tests for the function operation. I don't claim the
tests to be comprehensive, or that the function handles all edge cases,
but test framework created here could be used for testing other
fixes/edge cases as well.
---
lldb/include/lldb/Symbol/Function.h | 15 +-
lldb/source/Commands/CommandObjectSource.cpp | 14 +-
lldb/source/Symbol/Function.cpp | 43 +--
.../test/Shell/Commands/command-source-list.s | 265 ++++++++++++++++++
4 files changed, 303 insertions(+), 34 deletions(-)
create mode 100644 lldb/test/Shell/Commands/command-source-list.s
diff --git a/lldb/include/lldb/Symbol/Function.h b/lldb/include/lldb/Symbol/Function.h
index f3b956139f3c5b..ee3a8304fc5b37 100644
--- a/lldb/include/lldb/Symbol/Function.h
+++ b/lldb/include/lldb/Symbol/Function.h
@@ -15,6 +15,7 @@
#include "lldb/Expression/DWARFExpressionList.h"
#include "lldb/Symbol/Block.h"
#include "lldb/Utility/UserID.h"
+#include "lldb/lldb-forward.h"
#include "llvm/ADT/ArrayRef.h"
#include <mutex>
@@ -460,6 +461,7 @@ class Function : public UserID, public SymbolContextScope {
}
lldb::LanguageType GetLanguage() const;
+
/// Find the file and line number of the source location of the start of the
/// function. This will use the declaration if present and fall back on the
/// line table if that fails. So there may NOT be a line table entry for
@@ -473,16 +475,9 @@ class Function : public UserID, public SymbolContextScope {
void GetStartLineSourceInfo(lldb::SupportFileSP &source_file_sp,
uint32_t &line_no);
- /// Find the file and line number of the source location of the end of the
- /// function.
- ///
- ///
- /// \param[out] source_file
- /// The source file.
- ///
- /// \param[out] line_no
- /// The line number.
- void GetEndLineSourceInfo(FileSpec &source_file, uint32_t &line_no);
+ using SourceRange = Range<uint32_t, uint32_t>;
+ /// Find the file and line number range of the function.
+ llvm::Expected<std::pair<lldb::SupportFileSP, SourceRange>> GetSourceInfo();
/// Get the outgoing call edges from this function, sorted by their return
/// PC addresses (in increasing order).
diff --git a/lldb/source/Commands/CommandObjectSource.cpp b/lldb/source/Commands/CommandObjectSource.cpp
index 936783216f6ff5..81bf1769808bad 100644
--- a/lldb/source/Commands/CommandObjectSource.cpp
+++ b/lldb/source/Commands/CommandObjectSource.cpp
@@ -784,14 +784,12 @@ class CommandObjectSourceList : public CommandObjectParsed {
if (sc.block == nullptr) {
// Not an inlined function
- sc.function->GetStartLineSourceInfo(start_file, start_line);
- if (start_line == 0) {
- result.AppendErrorWithFormat("Could not find line information for "
- "start of function: \"%s\".\n",
- source_info.function.GetCString());
- return 0;
- }
- sc.function->GetEndLineSourceInfo(end_file, end_line);
+ auto expected_info = sc.function->GetSourceInfo();
+ if (!expected_info)
+ result.AppendError(llvm::toString(expected_info.takeError()));
+ start_file = expected_info->first;
+ start_line = expected_info->second.GetRangeBase();
+ end_line = expected_info->second.GetRangeEnd();
} else {
// We have an inlined function
start_file = source_info.line_entry.file_sp;
diff --git a/lldb/source/Symbol/Function.cpp b/lldb/source/Symbol/Function.cpp
index 11a43a9172ea67..2594fecb27ceb2 100644
--- a/lldb/source/Symbol/Function.cpp
+++ b/lldb/source/Symbol/Function.cpp
@@ -320,25 +320,36 @@ void Function::GetStartLineSourceInfo(SupportFileSP &source_file_sp,
}
}
-void Function::GetEndLineSourceInfo(FileSpec &source_file, uint32_t &line_no) {
- line_no = 0;
- source_file.Clear();
-
- // The -1 is kind of cheesy, but I want to get the last line entry for the
- // given function, not the first entry of the next.
- Address scratch_addr(GetAddressRange().GetBaseAddress());
- scratch_addr.SetOffset(scratch_addr.GetOffset() +
- GetAddressRange().GetByteSize() - 1);
-
+llvm::Expected<std::pair<SupportFileSP, Function::SourceRange>>
+Function::GetSourceInfo() {
+ SupportFileSP source_file_sp;
+ uint32_t start_line;
+ GetStartLineSourceInfo(source_file_sp, start_line);
LineTable *line_table = m_comp_unit->GetLineTable();
- if (line_table == nullptr)
- return;
+ if (start_line == 0 || !line_table) {
+ return llvm::createStringError(llvm::formatv(
+ "Could not find line information for function \"{0}\".", GetName()));
+ }
- LineEntry line_entry;
- if (line_table->FindLineEntryByAddress(scratch_addr, line_entry, nullptr)) {
- line_no = line_entry.line;
- source_file = line_entry.GetFile();
+ uint32_t end_line = start_line;
+ for (const AddressRange &range : GetAddressRanges()) {
+ LineEntry entry;
+ uint32_t idx;
+ if (!line_table->FindLineEntryByAddress(range.GetBaseAddress(), entry,
+ &idx))
+ continue;
+
+ addr_t end_addr = range.GetBaseAddress().GetFileAddress() + range.GetByteSize();
+ while (line_table->GetLineEntryAtIndex(idx++, entry) &&
+ entry.range.GetBaseAddress().GetFileAddress() < end_addr) {
+ // Ignore entries belonging to inlined functions or #included files.
+ if (source_file_sp->Equal(*entry.file_sp,
+ SupportFile::eEqualFileSpecAndChecksumIfSet))
+ end_line = std::max(end_line, entry.line);
+ }
}
+ return std::make_pair(std::move(source_file_sp),
+ SourceRange(start_line, end_line - start_line));
}
llvm::ArrayRef<std::unique_ptr<CallEdge>> Function::GetCallEdges() {
diff --git a/lldb/test/Shell/Commands/command-source-list.s b/lldb/test/Shell/Commands/command-source-list.s
new file mode 100644
index 00000000000000..3dbc0a196f16a6
--- /dev/null
+++ b/lldb/test/Shell/Commands/command-source-list.s
@@ -0,0 +1,265 @@
+# REQUIRES: x86
+
+# RUN: split-file %s %t
+# RUN: llvm-mc --triple=x86_64-pc-linux -filetype=obj %t/a.s -o %t/a.o
+# RUN: %lldb %t/a.o -o "settings set target.source-map . %t" -s %t/commands -o exit | FileCheck %s
+
+#--- commands
+# CASE 0: function at the start of the file
+source list -n func0
+# CHECK-LABEL: source list -n func0
+# CHECK-NEXT: File: file0.c
+# CHECK-NEXT: 1 content of file0.c:1
+# CHECK-NEXT: 2 content of file0.c:2
+# CHECK-NEXT: 3 content of file0.c:3
+# CHECK-NEXT: 4 content of file0.c:4
+# CHECK-NEXT: 5 content of file0.c:5
+# CHECK-NEXT: 6 content of file0.c:6
+# CHECK-NEXT: 7 content of file0.c:7
+# CHECK-NEXT: 8 content of file0.c:8
+# CHECK-NEXT: 9 content of file0.c:9
+# CHECK-NEXT: 10 content of file0.c:10
+
+# CASE 1: function in the middle of the file
+source list -n func1
+# CHECK-NEXT: source list -n func1
+# CHECK-NEXT: File: file0.c
+# CHECK-NEXT: 5 content of file0.c:5
+# CHECK-NEXT: 6 content of file0.c:6
+# CHECK-NEXT: 7 content of file0.c:7
+# CHECK-NEXT: 8 content of file0.c:8
+# CHECK-NEXT: 9 content of file0.c:9
+# CHECK-NEXT: 10 content of file0.c:10
+# CHECK-NEXT: 11 content of file0.c:11
+# CHECK-NEXT: 12 content of file0.c:12
+# CHECK-NEXT: 13 content of file0.c:13
+# CHECK-NEXT: 14 content of file0.c:14
+# CHECK-NEXT: 15 content of file0.c:15
+# CHECK-NEXT: 16 content of file0.c:16
+# CHECK-NEXT: 17 content of file0.c:17
+
+# CASE 2: function at the end of the file
+source list -n func2
+# CHECK-NEXT: source list -n func2
+# CHECK-NEXT: File: file0.c
+# CHECK-NEXT: 20 content of file0.c:20
+# CHECK-NEXT: 21 content of file0.c:21
+# CHECK-NEXT: 22 content of file0.c:22
+# CHECK-NEXT: 23 content of file0.c:23
+# CHECK-NEXT: 24 content of file0.c:24
+# CHECK-NEXT: 25 content of file0.c:25
+# CHECK-NEXT: 26 content of file0.c:26
+# CHECK-NEXT: 27 content of file0.c:27
+# CHECK-NEXT: 28 content of file0.c:28
+# CHECK-NEXT: 29 content of file0.c:29
+# CHECK-NEXT: 30 content of file0.c:30
+
+# CASE 3: function ends in a different file
+source list -n func3
+# CHECK-NEXT: source list -n func3
+# CHECK-NEXT: File: file0.c
+# CHECK-NEXT: 1 content of file0.c:1
+# CHECK-NEXT: 2 content of file0.c:2
+# CHECK-NEXT: 3 content of file0.c:3
+# CHECK-NEXT: 4 content of file0.c:4
+# CHECK-NEXT: 5 content of file0.c:5
+# CHECK-NEXT: 6 content of file0.c:6
+# CHECK-NEXT: 7 content of file0.c:7
+# CHECK-NEXT: 8 content of file0.c:8
+# CHECK-NEXT: 9 content of file0.c:9
+# CHECK-NEXT: 10 content of file0.c:10
+
+# CASE 4: discontinuous function
+source list -n func4
+# CHECK-NEXT: source list -n func4
+# CHECK-NEXT: File: file0.c
+# CHECK-NEXT: 1 content of file0.c:1
+# CHECK-NEXT: 2 content of file0.c:2
+# CHECK-NEXT: 3 content of file0.c:3
+# CHECK-NEXT: 4 content of file0.c:4
+# CHECK-NEXT: 5 content of file0.c:5
+# CHECK-NEXT: 6 content of file0.c:6
+# CHECK-NEXT: 7 content of file0.c:7
+# CHECK-NEXT: 8 content of file0.c:8
+# CHECK-NEXT: 9 content of file0.c:9
+# CHECK-NEXT: 10 content of file0.c:10
+
+
+#--- a.s
+ .file 0 "." "file0.c"
+ .file 1 "." "file1.c"
+ .text
+func0:
+ .loc 0 1
+ nop
+ .loc 0 5
+ nop
+.Lfunc0_end:
+
+func1:
+ .loc 0 10
+ nop
+ .loc 0 12
+ nop
+.Lfunc1_end:
+
+func2:
+ .loc 0 25
+ nop
+ .loc 0 30
+ nop
+.Lfunc2_end:
+
+func3:
+ .loc 0 1
+ nop
+ .loc 0 5
+ nop
+ .loc 1 5
+ nop
+.Lfunc3_end:
+
+func4.__part.1:
+ .loc 0 1
+ nop
+.Lfunc4.__part.1_end:
+
+.Lpadding:
+ nop
+
+func4:
+ .loc 0 5
+ nop
+.Lfunc4_end:
+
+.Ltext_end:
+
+ .section .debug_abbrev,"", at progbits
+ .byte 1 # Abbreviation Code
+ .byte 17 # DW_TAG_compile_unit
+ .byte 1 # DW_CHILDREN_yes
+ .byte 3 # DW_AT_name
+ .byte 8 # DW_FORM_string
+ .byte 37 # DW_AT_producer
+ .byte 8 # DW_FORM_string
+ .byte 19 # DW_AT_language
+ .byte 5 # DW_FORM_data2
+ .byte 17 # DW_AT_low_pc
+ .byte 1 # DW_FORM_addr
+ .byte 18 # DW_AT_high_pc
+ .byte 1 # DW_FORM_addr
+ .byte 16 # DW_AT_stmt_list
+ .byte 23 # DW_FORM_sec_offset
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 2 # Abbreviation Code
+ .byte 46 # DW_TAG_subprogram
+ .byte 0 # DW_CHILDREN_no
+ .byte 17 # DW_AT_low_pc
+ .byte 1 # DW_FORM_addr
+ .byte 18 # DW_AT_high_pc
+ .byte 1 # DW_FORM_addr
+ .byte 3 # DW_AT_name
+ .byte 8 # DW_FORM_string
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 3 # Abbreviation Code
+ .byte 46 # DW_TAG_subprogram
+ .byte 0 # DW_CHILDREN_no
+ .byte 85 # DW_AT_ranges
+ .byte 23 # DW_FORM_sec_offset
+ .byte 3 # DW_AT_name
+ .byte 8 # DW_FORM_string
+ .byte 0 # EOM(1)
+ .byte 0 # EOM(2)
+ .byte 0 # EOM(3)
+ .section .debug_info,"", at progbits
+.Lcu_begin0:
+ .long .Ldebug_info_end0-.Ldebug_info_start0 # Length of Unit
+.Ldebug_info_start0:
+ .short 5 # DWARF version number
+ .byte 1 # DWARF Unit Type
+ .byte 8 # Address Size (in bytes)
+ .long .debug_abbrev # Offset Into Abbrev. Section
+ .byte 1 # Abbrev DW_TAG_compile_unit
+ .asciz "file0.c" # DW_AT_producer
+ .asciz "Hand-written DWARF" # DW_AT_producer
+ .short 29 # DW_AT_language
+ .quad .text # DW_AT_low_pc
+ .quad .Ltext_end # DW_AT_high_pc
+ .long .Lline_table_start0 # DW_AT_stmt_list
+ .rept 4
+ .byte 2 # Abbrev DW_TAG_subprogram
+ .quad func\+ # DW_AT_low_pc
+ .quad .Lfunc\+_end # DW_AT_high_pc
+ .asciz "func\+" # DW_AT_name
+ .endr
+ .byte 3 # Abbrev DW_TAG_subprogram
+ .long .Ldebug_ranges0
+ .asciz "func4" # DW_AT_name
+ .byte 0 # End Of Children Mark
+.Ldebug_info_end0:
+
+ .section .debug_rnglists,"", at progbits
+ .long .Ldebug_list_header_end0-.Ldebug_list_header_start0 # Length
+.Ldebug_list_header_start0:
+ .short 5 # Version
+ .byte 8 # Address size
+ .byte 0 # Segment selector size
+ .long 2 # Offset entry count
+.Lrnglists_table_base0:
+ .long .Ldebug_ranges0-.Lrnglists_table_base0
+.Ldebug_ranges0:
+ .byte 6 # DW_RLE_start_end
+ .quad func4
+ .quad .Lfunc4_end
+ .byte 6 # DW_RLE_start_end
+ .quad func4.__part.1
+ .quad .Lfunc4.__part.1_end
+ .byte 0 # DW_RLE_end_of_list
+.Ldebug_list_header_end0:
+ .section .debug_line,"", at progbits
+.Lline_table_start0:
+
+#--- file0.c
+content of file0.c:1
+content of file0.c:2
+content of file0.c:3
+content of file0.c:4
+content of file0.c:5
+content of file0.c:6
+content of file0.c:7
+content of file0.c:8
+content of file0.c:9
+content of file0.c:10
+content of file0.c:11
+content of file0.c:12
+content of file0.c:13
+content of file0.c:14
+content of file0.c:15
+content of file0.c:16
+content of file0.c:17
+content of file0.c:18
+content of file0.c:19
+content of file0.c:20
+content of file0.c:21
+content of file0.c:22
+content of file0.c:23
+content of file0.c:24
+content of file0.c:25
+content of file0.c:26
+content of file0.c:27
+content of file0.c:28
+content of file0.c:29
+content of file0.c:30
+#--- file1.c
+content of file1.c:1
+content of file1.c:2
+content of file1.c:3
+content of file1.c:4
+content of file1.c:5
+content of file1.c:6
+content of file1.c:7
+content of file1.c:8
+content of file1.c:9
+content of file1.c:10
More information about the lldb-commits
mailing list