[llvm] [DWARFVerifier] Verify that DW_AT_LLVM_stmt_sequence is set correctly (PR #152807)

Peter Rong via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 2 14:59:27 PDT 2025


================
@@ -0,0 +1,1640 @@
+# Object file copied from llvm/test/tools/dsymutil/ARM/stmt-seq-macho.test
+# Then manually tempered with some of the value of the attribute
+# I hope there are easier ways to construct tests like this.
+
+# RUN: yaml2obj %s -o verify_stmt_seq.o
+# RUN: not llvm-dwarfdump -verify -debug-info verify_stmt_seq.o | FileCheck %s --check-prefix=CHECK_INVALID --implicit-check-not=error:
+# RUN: llvm-dwarfdump -debug-line -verbose -debug-info verify_stmt_seq.o | FileCheck %s --check-prefix=CHECK_DEBUG_LINE
+
+
+# CHECK_INVALID: error: DW_AT_LLVM_stmt_sequence offset 0x00000000 is not within the line table bounds [0x0000002e, 0x000000fd)
+# CHECK_INVALID: DW_AT_LLVM_stmt_sequence [DW_FORM_sec_offset]     (0x00000000)
+
+# 0xd3 would be a valid offset, if the line table wan't ill formed with two rows having the same PC (0x8c).
+# CHECK_INVALID: error: DW_AT_LLVM_stmt_sequence offset 0x000000d3 does not point to a valid sequence offset in the line table
+# CHECK_INVALID: DW_AT_LLVM_stmt_sequence [DW_FORM_sec_offset]     (0x000000d3)
+
+# CHECK_DEBUG_LINE:      0x000000d3: 05 DW_LNS_set_column (85)
+# CHECK_DEBUG_LINE-NEXT: 0x000000d5: 0a DW_LNS_set_prologue_end
+# CHECK_DEBUG_LINE-NEXT: 0x000000d6: 00 DW_LNE_set_address (0x000000000000008c)
+# CHECK_DEBUG_LINE-NEXT: 0x000000e1: 03 DW_LNS_advance_line (30)
+# CHECK_DEBUG_LINE-NEXT: 0x000000e3: 01 DW_LNS_copy
+# CHECK_DEBUG_LINE-NEXT:             0x000000000000008c     30     85      1   0             0       0  is_stmt prologue_end
+# CHECK_DEBUG_LINE-NEXT: 0x000000e4: 00 DW_LNE_end_sequence
+# CHECK_DEBUG_LINE-NEXT:             0x000000000000008c     30     85      1   0             0       0  is_stmt end_sequence
----------------
DataCorrupted wrote:

Thanks for the code pointers. 

I agree the discussion is a little confusing. Let me summarize what we have discussed and see if we are on the same page.

Problems we have here:

**Zero-length sequence**

AFAIK, the non-compliant code you mentioned (https://godbolt.org/z/efhK69TMG) could lead to a zero-length function. By following your suggestion to "set TrapUnreachable to true, and NoTrapAfterNoReturn to true", or even making it a "hardcoded/non-optional behavior", we can resolve zero-length sequences once and for all by saying "all functions should have at least one instruction, whether it's a return or a trap". This turns every zero-length problem into a one-length problem, which makes DWARF's life easier. Am I understanding correctly?

**One-length sequence is not parsed correctly**

The problem is that consecutive one-length sequences can't be manually terminated correctly. Simply emitting `DW_LNE_end_sequence` won't advance the PC of the last row, which leaves the line table incorrect. However, `DW_AT_LLVM_stmt_sequence` is attached to the one-instruction functions that we care about, and it's causing the DWARF to be invalid.
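To make the PC problem concrete, here is a toy sketch of a line-number state machine. It is not LLVM's implementation: the op names, the `Row` type, and the 4-byte instruction size are assumptions for illustration only. It shows that ending a one-row sequence without first advancing the PC produces an empty `[low, high)` range, like the `0x8c` rows in the test above.

```python
from dataclasses import dataclass


@dataclass
class Row:
    address: int
    line: int
    end_sequence: bool = False


def run_line_program(ops):
    """Run a toy line-number program and return the rows it emits."""
    address, line, rows = 0, 1, []
    for op, arg in ops:
        if op == "set_address":
            address = arg
        elif op == "advance_pc":
            address += arg
        elif op == "advance_line":
            line += arg
        elif op == "copy":
            rows.append(Row(address, line))
        elif op == "end_sequence":
            # The end row reuses the current address; nothing advances it.
            rows.append(Row(address, line, end_sequence=True))
            address, line = 0, 1  # registers reset for the next sequence
        else:
            raise ValueError(f"unknown op: {op}")
    return rows


def sequence_ranges(rows):
    """Return the (low_pc, high_pc) pair covered by each sequence."""
    ranges, start = [], None
    for row in rows:
        if start is None:
            start = row.address
        if row.end_sequence:
            ranges.append((start, row.address))
            start = None
    return ranges


# One row at 0x8c followed directly by end_sequence: an empty range.
broken = [("set_address", 0x8C), ("advance_line", 29), ("copy", None),
          ("end_sequence", None)]
# Same program, but the PC is advanced past the (assumed 4-byte) instruction.
fixed = [("set_address", 0x8C), ("advance_line", 29), ("copy", None),
         ("advance_pc", 4), ("end_sequence", None)]

broken_ranges = sequence_ranges(run_line_program(broken))
fixed_ranges = sequence_ranges(run_line_program(fixed))
print([(hex(lo), hex(hi)) for lo, hi in broken_ranges])
print([(hex(lo), hex(hi)) for lo, hi in fixed_ranges])
```

In the `broken` program the sequence covers no addresses at all, which is why a consumer cannot treat its offset as the start of a usable sequence.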

Steps I propose to resolve these issues:

Step 1 (#154986): Since `DW_AT_LLVM_stmt_sequence` is blocking us the most, I propose that we avoid emitting `DW_AT_LLVM_stmt_sequence` for these zero/one-length functions: make the thunk two instructions long, and only emit `DW_AT_LLVM_stmt_sequence` for functions with 2+ instructions. That way, whatever problems zero/one-length functions have, we can leave them in place and they should work just as before.

To answer your questions:

"you're suggesting skipping stmt_sequence on 2 byte lengths?" I'm suggesting skipping `DW_AT_LLVM_stmt_sequence` for functions with fewer than 2 instructions.

"relaxable by the linker, so what's being emitted as a 2 byte sequence is being relaxed to zero bytes?" No, the hope is to steer clear of the existing problems until we fix them.

Step 2: Fix one-instruction sequences. The ideal approach is something similar to `-ffunction-sections`, but I've studied it for the past week and found that it is not as easy as it seems. Sections have better support than sequences: `MCStreamer` can terminate a section by simply generating a new section symbol, but it cannot directly terminate a sequence by emitting `DW_LNE_end_sequence`; otherwise the PC of the last row will be the same as the first row of the next sequence.

Step 3: Eliminate zero-length sequences: set `TrapUnreachable` to true and `NoTrapAfterNoReturn` to true, or even make them mandatory for all targets, as you mentioned above.

Step 4: Once this is done, we can revert Step 1 and land this PR as is, basically saying "every `DW_AT_LLVM_stmt_sequence` should point to a valid sequence start at this point."
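
For reference, the two checks that the test's `CHECK_INVALID` lines exercise can be sketched roughly as follows. This is only a toy model, not the verifier's actual code: the function name and the `sequence_starts` set are hypothetical, and the "valid" start offsets are made up. Only the bounds `[0x2e, 0xfd)` and the offsets `0x0` and `0xd3` come from the test.

```python
def check_stmt_sequence(offset, table_start, table_end, sequence_starts):
    """Return an error message, or None if the offset looks valid."""
    # Check 1: the offset must fall inside the half-open line table range.
    if not (table_start <= offset < table_end):
        return (f"DW_AT_LLVM_stmt_sequence offset 0x{offset:08x} is not within "
                f"the line table bounds [0x{table_start:08x}, 0x{table_end:08x})")
    # Check 2: the offset must be the start of a sequence the parser accepted.
    if offset not in sequence_starts:
        return (f"DW_AT_LLVM_stmt_sequence offset 0x{offset:08x} does not point "
                f"to a valid sequence offset in the line table")
    return None


TABLE_START, TABLE_END = 0x2E, 0xFD
# Hypothetical set of accepted sequence starts; 0xd3 is absent because its
# sequence is empty (both rows sit at PC 0x8c).
valid_starts = {0x2E, 0x5A}

results = {off: check_stmt_sequence(off, TABLE_START, TABLE_END, valid_starts)
           for off in (0x0, 0xD3, 0x2E)}
for off, err in results.items():
    print(hex(off), "->", err or "ok")
```

After Step 4, the second check would be expected to pass for every function that carries the attribute.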




https://github.com/llvm/llvm-project/pull/152807

