[llvm] 2cf550a - [DebugInfo] Force early line-zero calls to have meaningful locations (#156850)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 20 02:20:51 PST 2025
Author: Jeremy Morse
Date: 2025-11-20T10:20:47Z
New Revision: 2cf550a040414aee51f7958812573723380b7a4b
URL: https://github.com/llvm/llvm-project/commit/2cf550a040414aee51f7958812573723380b7a4b
DIFF: https://github.com/llvm/llvm-project/commit/2cf550a040414aee51f7958812573723380b7a4b.diff
LOG: [DebugInfo] Force early line-zero calls to have meaningful locations (#156850)
In functions that have been seriously deformed during optimisation,
there can be call instructions with line-zero immediately after frame
setup (see C reproducer in the test added). Our previous algorithms for
prologue_end ignored these, meaning someone entering a function at
prologue_end would break in after a function call had completed. Prefer
instead to place prologue_end and the function scope-line on the line
zero call: this isn't false (it's the first meaningful instruction of the
function) and is approximately true. Given a less-than-ideal function,
this is an OK solution.
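
The selection rule described above can be illustrated with a small,
self-contained C++ sketch. It is a simplified model only: ToyInst and
pickPrologueEnd are hypothetical names introduced here, not LLVM APIs,
and the real findPrologueEndLoc in DwarfDebug.cpp handles further cases
(such as backup locations) that this sketch leaves out.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical, simplified stand-in for a machine instruction.
// This is NOT llvm::MachineInstr; it only carries what the sketch needs.
struct ToyInst {
  bool FrameSetup; // true for frame-setup instructions (e.g. the initial push)
  bool IsCall;     // true for call instructions
  unsigned Line;   // source line; 0 means "no meaningful location"
};

// Returns the index of the instruction that should carry prologue_end,
// or std::nullopt if no suitable instruction exists.
//
// Assumed behaviour, modelled on the commit description:
//  - frame-setup instructions are skipped;
//  - the first instruction with a non-zero line is chosen;
//  - a line-zero call reached before that is given the scope line and
//    chosen instead, so a debugger never steps over the call.
std::optional<std::size_t> pickPrologueEnd(std::vector<ToyInst> &Insts,
                                           unsigned ScopeLine) {
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    ToyInst &Inst = Insts[I];
    if (Inst.FrameSetup)
      continue;
    if (Inst.Line != 0)
      return I;
    if (Inst.IsCall) {
      // Force the call to have a meaningful location: the scope line.
      Inst.Line = ScopeLine;
      return I;
    }
    // Line-zero non-call instruction: keep searching.
  }
  return std::nullopt;
}
```

Applied to a model of the reproducer in the added test (push, line-zero
xor, line-zero call, then an xor on line 7, with scope line 2), the
sketch selects the call and rewrites its line to 2 rather than selecting
the xor that follows the call.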
Added:
llvm/test/DebugInfo/X86/no-prologue-end-after-line0-calls.mir
Modified:
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
llvm/test/DebugInfo/MIR/X86/debug-loc-0.mir
Removed:
################################################################################
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index 30db817ba3144..a50bde1c37cbb 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -2211,6 +2211,11 @@ void DwarfDebug::beginInstruction(const MachineInstr *MI) {
PrevInstLoc = DL;
}
+// Returns the position where we should place prologue_end, potentially nullptr,
+// which means "no good place to put prologue_end". Returns true in the second
+// return value if there are no setup instructions in this function at all,
+// meaning we should not emit a start-of-function linetable entry, because it
+// would be zero-lengthed.
static std::pair<const MachineInstr *, bool>
findPrologueEndLoc(const MachineFunction *MF) {
// First known non-DBG_VALUE and non-frame setup location marks
@@ -2218,6 +2223,7 @@ findPrologueEndLoc(const MachineFunction *MF) {
const auto &TII = *MF->getSubtarget().getInstrInfo();
const MachineInstr *NonTrivialInst = nullptr;
const Function &F = MF->getFunction();
+ DISubprogram *SP = const_cast<DISubprogram *>(F.getSubprogram());
// Some instructions may be inserted into prologue after this function. Must
// keep prologue for these cases.
@@ -2305,6 +2311,26 @@ findPrologueEndLoc(const MachineFunction *MF) {
return *FoundInst;
}
+ // In very rare scenarios function calls can have line zero, and we
+ // shouldn't step over such a call while trying to reach prologue_end. In
+ // these extraordinary conditions, force the call to have the scope line
+ // and put prologue_end there. This isn't ideal, but signals that the call
+ // is where execution in the function starts, and is less catastrophic than
+ // stepping over the call.
+ if (CurInst->isCall()) {
+ if (const DILocation *Loc = CurInst->getDebugLoc().get();
+ Loc && Loc->getLine() == 0) {
+ // Create and assign the scope-line position.
+ unsigned ScopeLine = SP->getScopeLine();
+ DILocation *ScopeLineDILoc =
+ DILocation::get(SP->getContext(), ScopeLine, 0, SP);
+ const_cast<MachineInstr *>(&*CurInst)->setDebugLoc(ScopeLineDILoc);
+
+ // Consider this position to be where prologue_end is placed.
+ return std::make_pair(&*CurInst, false);
+ }
+ }
+
// Try to continue searching, but use a backup-location if substantive
// computation is happening.
auto NextInst = std::next(CurInst);
diff --git a/llvm/test/DebugInfo/MIR/X86/debug-loc-0.mir b/llvm/test/DebugInfo/MIR/X86/debug-loc-0.mir
index 01862f5905f9c..71489d5a5e485 100644
--- a/llvm/test/DebugInfo/MIR/X86/debug-loc-0.mir
+++ b/llvm/test/DebugInfo/MIR/X86/debug-loc-0.mir
@@ -5,7 +5,7 @@
# CHECK: Ltmp0:
# CHECK: .loc 1 0 0
# CHECK-NOT: .loc 1 0 0
-# CHECK: .loc 1 37 1 prologue_end
+# CHECK: .loc 1 37 1
--- |
; ModuleID = '<stdin>'
diff --git a/llvm/test/DebugInfo/X86/no-prologue-end-after-line0-calls.mir b/llvm/test/DebugInfo/X86/no-prologue-end-after-line0-calls.mir
new file mode 100644
index 0000000000000..abd7eb2528cb0
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/no-prologue-end-after-line0-calls.mir
@@ -0,0 +1,132 @@
+# RUN: llc %s -start-after=livedebugvalues -o - | FileCheck %s
+#
+## Original code, compiled clang -O2 -g -c
+##
+## void ext();
+## int main(int argc, char **argv) {
+## if (argc == 1)
+## ext();
+## else
+## ext();
+## return 0;
+## }
+##
+## In the code sequence above, the call to ext is given line zero during
+## optimisation, because the code is duplicated down all function paths thus
+## gets merged. We get something like this as the output:
+##
+## 0: 50 push %rax
+## 1: 31 c0 xor %eax,%eax
+## 3: e8 00 00 00 00 call 8 <main+0x8>
+## 4: R_X86_64_PLT32 ext-0x4
+## 8: 31 c0 xor %eax,%eax
+## a: 59 pop %rcx
+## b: c3 ret
+##
+## And we could choose to set prologue_end on address 8, the clearing of the
+## return register, because it's the first "real" instruction that isn't line
+## zero. But this then causes debuggers to skip over the call instruction when
+## entering the function, which is catastrophic.
+##
+## Instead: force the call itself to have a source location (the function scope
+## line number), and put a prologue_end there. While it's not the original
+## source of the call, it's better to have a prologue_end that means we'll stop
+## in the prologue than to step over the call. This gives consumers the
+## opportunity to recognise "this is a crazy function" and act accordingly.
+##
+## Check lines: ensure that we set prologue_end. The first entry is the
+## start-of-function scope line, the second entry is the prologue_end on the
+## call.
+#
+#
+# CHECK: main:
+# CHECK-NEXT: .Lfunc_begin0:
+# CHECK-NEXT: .file 0 "/tmp/test.c"
+# CHECK-NEXT: .loc 0 2 0
+# CHECK-NEXT: .cfi_startproc
+# CHECK-NEXT: # %bb.0:
+# CHECK-NEXT: pushq %rax
+# CHECK-NEXT: .cfi_def_cfa_offset 16
+# CHECK-NEXT: .Ltmp0:
+# CHECK-NEXT: .loc 0 0 0 is_stmt 0
+# CHECK-NEXT: xorl %eax, %eax
+# CHECK-NEXT: .loc 0 2 0 prologue_end is_stmt 1
+# CHECK-NEXT: callq ext@PLT
+
+--- |
+ target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+ target triple = "x86_64-unknown-linux-gnu"
+
+ ; Function Attrs: nounwind uwtable
+ define dso_local noundef i32 @main(i32 noundef %argc, ptr noundef readnone captures(none) %argv) local_unnamed_addr !dbg !10 {
+ entry:
+ tail call void (...) @ext(), !dbg !22
+ ret i32 0, !dbg !24
+ }
+
+ declare !dbg !25 void @ext(...) local_unnamed_addr
+
+ !llvm.dbg.cu = !{!0}
+ !llvm.module.flags = !{!2, !3, !4, !5, !6, !7, !8}
+ !llvm.ident = !{!9}
+
+ !0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang", isOptimized: true, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+ !1 = !DIFile(filename: "/tmp/test.c", directory: "")
+ !2 = !{i32 7, !"Dwarf Version", i32 5}
+ !3 = !{i32 2, !"Debug Info Version", i32 3}
+ !4 = !{i32 1, !"wchar_size", i32 4}
+ !5 = !{i32 8, !"PIC Level", i32 2}
+ !6 = !{i32 7, !"PIE Level", i32 2}
+ !7 = !{i32 7, !"uwtable", i32 2}
+ !8 = !{i32 7, !"debug-info-assignment-tracking", i1 true}
+ !9 = !{!"clang"}
+ !10 = distinct !DISubprogram(name: "main", scope: !11, file: !11, line: 2, type: !12, scopeLine: 2, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: !0, retainedNodes: !18, keyInstructions: true)
+ !11 = !DIFile(filename: "/tmp/test.c", directory: "")
+ !12 = !DISubroutineType(types: !13)
+ !13 = !{!14, !14, !15}
+ !14 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+ !15 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !16, size: 64)
+ !16 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !17, size: 64)
+ !17 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
+ !18 = !{!19, !20}
+ !19 = !DILocalVariable(name: "argc", arg: 1, scope: !10, file: !11, line: 2, type: !14)
+ !20 = !DILocalVariable(name: "argv", arg: 2, scope: !10, file: !11, line: 2, type: !15)
+ !21 = !DILocation(line: 0, scope: !10)
+ !22 = !DILocation(line: 0, scope: !23)
+ !23 = distinct !DILexicalBlock(scope: !10, file: !11, line: 3, column: 7)
+ !24 = !DILocation(line: 7, column: 4, scope: !10, atomGroup: 2, atomRank: 1)
+ !25 = !DISubprogram(name: "ext", scope: !11, file: !11, line: 1, type: !26, spFlags: DISPFlagOptimized)
+ !26 = !DISubroutineType(types: !27)
+ !27 = !{null}
+...
+---
+name: main
+alignment: 16
+tracksRegLiveness: true
+noPhis: true
+isSSA: false
+noVRegs: true
+hasFakeUses: false
+debugInstrRef: true
+tracksDebugUserValues: true
+frameInfo:
+ stackSize: 8
+ offsetAdjustment: -8
+ maxAlignment: 1
+ adjustsStack: true
+ hasCalls: true
+ maxCallFrameSize: 0
+ isCalleeSavedInfoValid: true
+machineFunctionInfo:
+ amxProgModel: None
+body: |
+ bb.0.entry:
+ frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp
+ frame-setup CFI_INSTRUCTION def_cfa_offset 16
+ dead $eax = XOR32rr undef $eax, undef $eax, implicit-def dead $eflags, implicit-def $al, debug-location !22
+ CALL64pcrel32 target-flags(x86-plt) @ext, csr_64, implicit $rsp, implicit $ssp, implicit killed $al, implicit-def $rsp, implicit-def $ssp, debug-location !22
+ $eax = XOR32rr undef $eax, undef $eax, implicit-def dead $eflags, debug-location !24
+ $rcx = frame-destroy POP64r implicit-def $rsp, implicit $rsp, debug-location !24
+ frame-destroy CFI_INSTRUCTION def_cfa_offset 8, debug-location !24
+ RET64 $eax, debug-location !24
+...