[lld] [lld][WebAssembly] Return 0 for synthetic function offsets (PR #96134)

Heejin Ahn via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 19 19:06:33 PDT 2024


https://github.com/aheejin created https://github.com/llvm/llvm-project/pull/96134

When the signatures of two or more functions with the same name differ, one of them is selected and `unreachable` stubs are generated for the other signatures: https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/SymbolTable.cpp#L975 https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/SymbolTable.cpp#L852-L870

When these `SyntheticFunction`s are generated, this constructor is used:
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/InputChunks.h#L266-L269 It does not set the `function` field:
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/InputChunks.h#L304 As a result, the `function` field contains a garbage value for these stub functions.

`InputFunction::getFunctionCodeOffset()` is called when relocations are resolved for the `.debug_info` section to get functions' PC locations. But because these stub functions don't have their `function` field set, this function dereferences a garbage pointer and segfaults:
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/InputChunks.h#L282

This PR initializes the field with `nullptr`, and in `InputFunction::getFunctionCodeOffset`, checks if `function` is `nullptr`, and if so, just returns 0. This function is called only for resolving relocations in the `.debug_info` section, and addresses of these stub functions, which are not the functions users wrote in the first place, are not really meaningful anyway.
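
For illustration only, the pattern described above can be sketched in isolation. This is a simplified stand-in and not the actual lld classes: the `WasmFunction` struct below is a hypothetical one-field placeholder, and the two constructors are assumptions that only mimic the "regular" and "synthetic stub" creation paths.

```cpp
#include <cstdint>

// Hypothetical placeholder for the real WasmFunction; only the field
// needed for this sketch is modeled.
struct WasmFunction {
  uint32_t CodeOffset;
};

// Simplified stand-in for InputFunction (assumed shape, not lld's class).
class InputFunction {
public:
  // Mimics the synthetic-stub path: no WasmFunction is ever attached.
  InputFunction() = default;

  // Mimics the regular path: the chunk is backed by a function read
  // from an input object file.
  explicit InputFunction(const WasmFunction *f) : function(f) {}

  uint32_t getFunctionCodeOffset() const {
    // Stubs have no backing function, so report offset 0 rather than
    // dereferencing a null pointer.
    return function ? function->CodeOffset : 0;
  }

  // The default member initializer guarantees a well-defined null value
  // even when a constructor does not set the field.
  const WasmFunction *function = nullptr;
};
```

Without the `= nullptr` initializer, the null check alone would not help, because a constructor that skips the field would leave it indeterminate rather than null.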

From d5ba71ee8431cad2fc8bd9bc2dddf6da805f75a7 Mon Sep 17 00:00:00 2001
From: Heejin Ahn <aheejin at gmail.com>
Date: Thu, 20 Jun 2024 01:36:20 +0000
Subject: [PATCH] [lld][WebAssembly] Return 0 for synthetic function offsets

When two or more functions' signatures differ, one of them is selected
and for other signatures `unreachable` stubs are generated:
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/SymbolTable.cpp#L975
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/SymbolTable.cpp#L852-L870

And when these `SyntheticFunction`s are generated, this constructor is
used,
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/InputChunks.h#L266-L269
which does not set its `function` field:
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/InputChunks.h#L304
As a result, the `function` field contains a garbage value for these
stub functions.

`InputFunction::getFunctionCodeOffset()` is called when relocations are
resolved for `.debug_info` section to get functions' PC locations. But
because these stub functions don't have their `function` field set, this
function segfaults:
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/InputChunks.h#L282

This PR initializes the field with `nullptr`, and in
`InputFunction::getFunctionCodeOffset`, checks if `function` is
`nullptr`, and if so, just returns 0. This function is called only for
resolving relocations in the `.debug_info` section, and addresses of
these stub functions, which are not the functions users wrote in the
first place, are not really meaningful anyway.
---
 .../Inputs/signature-mismatch-debug-info-a.ll | 31 +++++++++++++++++++
 .../Inputs/signature-mismatch-debug-info-b.ll | 31 +++++++++++++++++++
 .../signature-mismatch-debug-info-main.ll     | 30 ++++++++++++++++++
 .../wasm/signature-mismatch-debug-info.test   |  8 +++++
 lld/wasm/InputChunks.h                        | 11 +++++--
 5 files changed, 109 insertions(+), 2 deletions(-)
 create mode 100644 lld/test/wasm/Inputs/signature-mismatch-debug-info-a.ll
 create mode 100644 lld/test/wasm/Inputs/signature-mismatch-debug-info-b.ll
 create mode 100644 lld/test/wasm/Inputs/signature-mismatch-debug-info-main.ll
 create mode 100644 lld/test/wasm/signature-mismatch-debug-info.test

diff --git a/lld/test/wasm/Inputs/signature-mismatch-debug-info-a.ll b/lld/test/wasm/Inputs/signature-mismatch-debug-info-a.ll
new file mode 100644
index 0000000000000..9ebc5c4b9c922
--- /dev/null
+++ b/lld/test/wasm/Inputs/signature-mismatch-debug-info-a.ll
@@ -0,0 +1,31 @@
+target triple = "wasm32-unknown-emscripten"
+
+define void @foo(i32 %a) !dbg !6 {
+  ret void
+}
+
+define void @test0() !dbg !10 {
+entry:
+  call void @foo(i32 3), !dbg !13
+  ret void, !dbg !14
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4}
+!llvm.ident = !{!5}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 19.0.0git", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "a.c", directory: "")
+!2 = !{i32 7, !"Dwarf Version", i32 4}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{!"clang version 19.0.0git"}
+!6 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 3, type: !7, scopeLine: 3, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0)
+!7 = !DISubroutineType(types: !8)
+!8 = !{null, !9}
+!9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!10 = distinct !DISubprogram(name: "test0", scope: !1, file: !1, line: 7, type: !11, scopeLine: 7, spFlags: DISPFlagDefinition, unit: !0)
+!11 = !DISubroutineType(types: !12)
+!12 = !{null}
+!13 = !DILocation(line: 8, column: 3, scope: !10)
+!14 = !DILocation(line: 9, column: 1, scope: !10)
diff --git a/lld/test/wasm/Inputs/signature-mismatch-debug-info-b.ll b/lld/test/wasm/Inputs/signature-mismatch-debug-info-b.ll
new file mode 100644
index 0000000000000..7b8295363c802
--- /dev/null
+++ b/lld/test/wasm/Inputs/signature-mismatch-debug-info-b.ll
@@ -0,0 +1,31 @@
+target triple = "wasm32-unknown-emscripten"
+
+define void @foo(i32 %a, i32 %b) !dbg !6 {
+  ret void
+}
+
+define void @test1() !dbg !10 {
+entry:
+  call void @foo(i32 4, i32 5), !dbg !13
+  ret void, !dbg !14
+}
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4}
+!llvm.ident = !{!5}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 19.0.0git", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "b.c", directory: "")
+!2 = !{i32 7, !"Dwarf Version", i32 4}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{!"clang version 19.0.0git"}
+!6 = distinct !DISubprogram(name: "foo", scope: !1, file: !1, line: 3, type: !7, scopeLine: 3, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0)
+!7 = !DISubroutineType(types: !8)
+!8 = !{null, !9, !9}
+!9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!10 = distinct !DISubprogram(name: "test1", scope: !1, file: !1, line: 7, type: !11, scopeLine: 7, spFlags: DISPFlagDefinition, unit: !0)
+!11 = !DISubroutineType(types: !12)
+!12 = !{null}
+!13 = !DILocation(line: 8, column: 3, scope: !10)
+!14 = !DILocation(line: 9, column: 1, scope: !10)
diff --git a/lld/test/wasm/Inputs/signature-mismatch-debug-info-main.ll b/lld/test/wasm/Inputs/signature-mismatch-debug-info-main.ll
new file mode 100644
index 0000000000000..3d2f8c2e0a941
--- /dev/null
+++ b/lld/test/wasm/Inputs/signature-mismatch-debug-info-main.ll
@@ -0,0 +1,30 @@
+target triple = "wasm32-unknown-emscripten"
+
+define i32 @main() !dbg !6 {
+entry:
+  call void @test0(), !dbg !10
+  call void @test1(), !dbg !11
+  ret i32 0, !dbg !12
+}
+
+declare void @test0()
+
+declare void @test1()
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4}
+!llvm.ident = !{!5}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C11, file: !1, producer: "clang version 19.0.0git", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, splitDebugInlining: false, nameTableKind: None)
+!1 = !DIFile(filename: "main.c", directory: "")
+!2 = !{i32 7, !"Dwarf Version", i32 4}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{!"clang version 19.0.0git"}
+!6 = distinct !DISubprogram(name: "main", scope: !1, file: !1, line: 4, type: !7, scopeLine: 4, spFlags: DISPFlagDefinition, unit: !0)
+!7 = !DISubroutineType(types: !8)
+!8 = !{!9}
+!9 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!10 = !DILocation(line: 5, column: 3, scope: !6)
+!11 = !DILocation(line: 6, column: 3, scope: !6)
+!12 = !DILocation(line: 7, column: 3, scope: !6)
diff --git a/lld/test/wasm/signature-mismatch-debug-info.test b/lld/test/wasm/signature-mismatch-debug-info.test
new file mode 100644
index 0000000000000..fe1e8c2dbe579
--- /dev/null
+++ b/lld/test/wasm/signature-mismatch-debug-info.test
@@ -0,0 +1,8 @@
+# This is a regression test that checks that a function signature mismatch
+# in functions with debug info does not cause a segmentation fault when
+# writing the .debug_info section.
+
+; RUN: llc -filetype=obj %p/Inputs/signature-mismatch-debug-info-a.ll -o %t.a.o
+; RUN: llc -filetype=obj %p/Inputs/signature-mismatch-debug-info-b.ll -o %t.b.o
+; RUN: llc -filetype=obj %p/Inputs/signature-mismatch-debug-info-main.ll -o %t.main.o
+; RUN: wasm-ld -o %t.wasm %t.a.o %t.b.o %t.main.o --export=main --no-entry
diff --git a/lld/wasm/InputChunks.h b/lld/wasm/InputChunks.h
index cf8a5249b19a0..5174439facc67 100644
--- a/lld/wasm/InputChunks.h
+++ b/lld/wasm/InputChunks.h
@@ -279,7 +279,14 @@ class InputFunction : public InputChunk {
   }
   void setExportName(std::string exportName) { this->exportName = exportName; }
   uint32_t getFunctionInputOffset() const { return getInputSectionOffset(); }
-  uint32_t getFunctionCodeOffset() const { return function->CodeOffset; }
+  uint32_t getFunctionCodeOffset() const {
+    // For generated synthetic functions, such as unreachable stubs generated
+    // for signature mismatches, 'function' reference does not exist. This
+    // function is used to get function offsets for .debug_info section, and for
+    // those generated stubs function offsets are not meaningful anyway. So just
+    // return 0 in those cases.
+    return function ? function->CodeOffset : 0;
+  }
   uint32_t getFunctionIndex() const { return *functionIndex; }
   bool hasFunctionIndex() const { return functionIndex.has_value(); }
   void setFunctionIndex(uint32_t index);
@@ -301,7 +308,7 @@ class InputFunction : public InputChunk {
     return compressedSize;
   }
 
-  const WasmFunction *function;
+  const WasmFunction *function = nullptr;
 
 protected:
   std::optional<std::string> exportName;


