[lld] 7d2c2af - [lld][WebAssembly] Return 0 for synthetic function offsets (#96134)

via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 21 15:56:05 PDT 2024


Author: Heejin Ahn
Date: 2024-06-21T15:56:02-07:00
New Revision: 7d2c2af0453c28d0902668523099a1f46a0bc348

URL: https://github.com/llvm/llvm-project/commit/7d2c2af0453c28d0902668523099a1f46a0bc348
DIFF: https://github.com/llvm/llvm-project/commit/7d2c2af0453c28d0902668523099a1f46a0bc348.diff

LOG: [lld][WebAssembly] Return 0 for synthetic function offsets (#96134)

When two or more functions with the same name have different signatures,
one signature is selected and `unreachable` stubs are generated for the
others:
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/SymbolTable.cpp#L975
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/SymbolTable.cpp#L852-L870

When these `SyntheticFunction`s are generated, this constructor is used:
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/InputChunks.h#L266-L269

It does not set the `function` field:
https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/InputChunks.h#L304

As a result, the `function` field contains a garbage value for these
stub functions.

`InputFunction::getFunctionCodeOffset()` is called when relocations are
resolved for the `.debug_info` section to get functions' PC locations.
Because these stub functions don't have their `function` field set, the
call dereferences the garbage pointer and segfaults:

https://github.com/llvm/llvm-project/blob/57778ec36c9c7e96b76a167f19dccbe00d49c9d4/lld/wasm/InputChunks.h#L282

This bug seems to be triggered when all of these conditions are met:
- A signature mismatch warning is issued for multiple weak definitions
  of the same name with different signatures (one definition plus
  mismatched declarations is not sufficient)
- The 'stub' function containing `unreachable` has a callsite, meaning
  it isn't DCE'd
- The `.debug_info` section is generated (i.e., DWARF is used)

This PR initializes the field with `nullptr`, and in
`InputFunction::getFunctionCodeOffset`, checks if `function` is
`nullptr`, and if so, just returns 0. This function is called only for
resolving relocations in the `.debug_info` section, and addresses of
these stub functions, which are not the functions users wrote in the
first place, are not really meaningful anyway.
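
The two parts of the fix described above can be sketched in isolation as
follows. This is a minimal illustration, not the lld code: the types
`WasmFunctionSketch` and `InputFunctionSketch` and the helper
`checkSketch` are hypothetical stand-ins that keep only the `function`
pointer and the `CodeOffset` field mentioned in this log.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the function record; only the field that
// getFunctionCodeOffset() reads is modeled.
struct WasmFunctionSketch {
  uint32_t CodeOffset;
};

// Hypothetical stand-in for InputFunction.
class InputFunctionSketch {
public:
  // Models a function read from an object file: the record is known.
  explicit InputFunctionSketch(const WasmFunctionSketch *func)
      : function(func) {}

  // Models a synthetic stub: this constructor never mentions 'function',
  // so the default member initializer below is what gives it a value.
  InputFunctionSketch() = default;

  // Return 0 when there is no backing record instead of dereferencing.
  uint32_t getFunctionCodeOffset() const {
    return function ? function->CodeOffset : 0;
  }

  // The default member initializer applies to every constructor that does
  // not set the field explicitly.
  const WasmFunctionSketch *function = nullptr;
};

// Small self-check of both paths (assumed helper, for illustration only).
inline void checkSketch() {
  WasmFunctionSketch record{42};
  InputFunctionSketch real(&record);
  InputFunctionSketch stub;
  assert(real.getFunctionCodeOffset() == 42);
  assert(stub.getFunctionCodeOffset() == 0);
}
```

Using a default member initializer rather than touching each constructor
means any constructor added later also starts from a well-defined
`nullptr`.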

Added: 
    lld/test/wasm/Inputs/signature-mismatch-debug-info-a.s
    lld/test/wasm/Inputs/signature-mismatch-debug-info-b.s
    lld/test/wasm/Inputs/signature-mismatch-debug-info-main.s
    lld/test/wasm/signature-mismatch-debug-info.test

Modified: 
    lld/wasm/InputChunks.h

Removed: 
    


################################################################################
diff  --git a/lld/test/wasm/Inputs/signature-mismatch-debug-info-a.s b/lld/test/wasm/Inputs/signature-mismatch-debug-info-a.s
new file mode 100644
index 0000000000000..7da9a3de622d2
--- /dev/null
+++ b/lld/test/wasm/Inputs/signature-mismatch-debug-info-a.s
@@ -0,0 +1,22 @@
+  .functype  foo (i32) -> ()
+  .functype  test0 () -> ()
+
+  .section  .text.foo,"",@
+  .weak  foo
+  .type  foo, at function
+foo:
+  .functype  foo (i32) -> ()
+  end_function
+
+  .section  .text.test0,"",@
+  .globl  test0
+  .type  test0, at function
+test0:
+  .functype  test0 () -> ()
+  i32.const  3
+  call  foo
+  end_function
+
+  .section  .debug_info,"",@
+  .int32 foo
+  .int32 test0

diff  --git a/lld/test/wasm/Inputs/signature-mismatch-debug-info-b.s b/lld/test/wasm/Inputs/signature-mismatch-debug-info-b.s
new file mode 100644
index 0000000000000..8a1b9f8750f43
--- /dev/null
+++ b/lld/test/wasm/Inputs/signature-mismatch-debug-info-b.s
@@ -0,0 +1,23 @@
+  .functype  foo (i32, i32) -> ()
+  .functype  test1 () -> ()
+
+  .section  .text.foo,"",@
+  .weak  foo
+  .type  foo, at function
+foo:
+  .functype  foo (i32, i32) -> ()
+  end_function
+
+  .section  .text.test1,"",@
+  .globl  test1
+  .type  test1, at function
+test1:
+  .functype  test1 () -> ()
+  i32.const  4
+  i32.const  5
+  call  foo
+  end_function
+
+  .section  .debug_info,"",@
+  .int32 foo
+  .int32 test1

diff  --git a/lld/test/wasm/Inputs/signature-mismatch-debug-info-main.s b/lld/test/wasm/Inputs/signature-mismatch-debug-info-main.s
new file mode 100644
index 0000000000000..303ed68f249e4
--- /dev/null
+++ b/lld/test/wasm/Inputs/signature-mismatch-debug-info-main.s
@@ -0,0 +1,17 @@
+  .functype  test0 () -> ()
+  .functype  test1 () -> ()
+  .functype  main (i32, i32) -> (i32)
+
+  .section  .text.main,"",@
+  .globl  main
+  .type  main, at function
+main:
+  .functype  main (i32, i32) -> (i32)
+  call  test0
+  call  test1
+  i32.const  0
+  end_function
+
+  .section  .debug_info,"",@
+  .int32 test0
+  .int32 test1

diff  --git a/lld/test/wasm/signature-mismatch-debug-info.test b/lld/test/wasm/signature-mismatch-debug-info.test
new file mode 100644
index 0000000000000..71e6e4b655dfe
--- /dev/null
+++ b/lld/test/wasm/signature-mismatch-debug-info.test
@@ -0,0 +1,8 @@
+# This is a regression test that ensures a function signature mismatch in
+# functions with debug info does not cause a segmentation fault
+# when writing .debug_info section.
+
+; RUN: llvm-mc -filetype=obj -triple=wasm32-unknown-unknown %p/Inputs/signature-mismatch-debug-info-a.s -o %t.a.o
+; RUN: llvm-mc -filetype=obj -triple=wasm32-unknown-unknown %p/Inputs/signature-mismatch-debug-info-b.s -o %t.b.o
+; RUN: llvm-mc -filetype=obj -triple=wasm32-unknown-unknown %p/Inputs/signature-mismatch-debug-info-main.s -o %t.main.o
+; RUN: wasm-ld -o %t.wasm %t.a.o %t.b.o %t.main.o --export=main --no-entry

diff  --git a/lld/wasm/InputChunks.h b/lld/wasm/InputChunks.h
index cf8a5249b19a0..5174439facc67 100644
--- a/lld/wasm/InputChunks.h
+++ b/lld/wasm/InputChunks.h
@@ -279,7 +279,14 @@ class InputFunction : public InputChunk {
   }
   void setExportName(std::string exportName) { this->exportName = exportName; }
   uint32_t getFunctionInputOffset() const { return getInputSectionOffset(); }
-  uint32_t getFunctionCodeOffset() const { return function->CodeOffset; }
+  uint32_t getFunctionCodeOffset() const {
+    // For generated synthetic functions, such as unreachable stubs generated
+    // for signature mismatches, 'function' reference does not exist. This
+    // function is used to get function offsets for .debug_info section, and for
+    // those generated stubs function offsets are not meaningful anyway. So just
+    // return 0 in those cases.
+    return function ? function->CodeOffset : 0;
+  }
   uint32_t getFunctionIndex() const { return *functionIndex; }
   bool hasFunctionIndex() const { return functionIndex.has_value(); }
   void setFunctionIndex(uint32_t index);
@@ -301,7 +308,7 @@ class InputFunction : public InputChunk {
     return compressedSize;
   }
 
-  const WasmFunction *function;
+  const WasmFunction *function = nullptr;
 
 protected:
   std::optional<std::string> exportName;


        

