[llvm] [WebAssembly] Handle symbols in `.init_array` sections (PR #119127)
George Stagg via llvm-commits
llvm-commits at lists.llvm.org
Sun Dec 8 05:21:14 PST 2024
https://github.com/georgestagg created https://github.com/llvm/llvm-project/pull/119127
Follow on from #111008.
Consider the following C code, which supplies a function `init()` and an array written to the `.init_array` section so that the function runs before `main()`.
```c
void init() {}
typedef void (*fn)(void);
__attribute__((section(".init_array"))) fn p_init[1] = { &init };
int main() { return 0; }
```
Note that a label `p_init` is also provided for the array. I'm not sure if there exists a way to do this in C without additionally creating the `p_init` symbol.
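One possible answer to the question above, offered as a sketch rather than a confirmed one: GCC and Clang provide `__attribute__((constructor))`, which registers a function to run before `main()` without the source declaring a named array. The counter and the `init_call_count()` helper are hypothetical names, added only so the effect can be observed. It is an assumption, not verified here, that for WebAssembly targets this goes through the same `.init_array` mechanism without a named data symbol.

```c
/* Counts how many times the initializer has run (illustrative only). */
static int init_calls = 0;

/* GCC/Clang extension: run init() before main() without declaring
   a named function-pointer array in the source. */
__attribute__((constructor)) static void init(void) {
    init_calls++;
}

/* Hypothetical helper so callers can observe that init() already ran. */
int init_call_count(void) {
    return init_calls;
}
```

The rest of this description concerns the explicit `p_init` array form shown earlier, which is the pattern that triggers the error.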
In any case, compiling the `p_init` example with Emscripten fails at link time:
```
$ EM_LLVM_ROOT=[...]/llvm-project/build/bin emcc bug.c -o bug.js
wasm-ld: error: [...]/emscripten_temp_fgsxu6w6/bug_0.o: invalid data segment index: 0
```
The cause is that the array destined for `.init_array` is handled specially in `WasmObjectWriter.cpp`:
https://github.com/llvm/llvm-project/blob/4e0ba801ea2267d80ff875bdc40984da32db774d/llvm/lib/MC/WasmObjectWriter.cpp#L1485-L1487
This existing code skips writing the array out as a data segment. The array is instead handled later, when each function in it is written to the `InitFunctions` subsection of the Wasm custom linking section.
Unfortunately, the symbol `p_init` is also written to the Wasm custom linking section as a `wasm::WASM_SYMBOL_TYPE_DATA`, pointing to a data segment that now does not exist. As far as I can tell, we don't really intend to ever reference this symbol; the pattern is just used in C (and a similar construct can be used in Rust) to ensure that the `.init_array` array exists. So, while we could just throw an error here, that is not ideal because it breaks this useful pattern for life-before-main.
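The dangling reference described above can be illustrated with a toy model. This is not LLVM's code: the struct, the function names, and the `skip_init_array` flag are all hypothetical, and a missing segment is modelled as index `-1`, whereas the real object carries an index that `wasm-ld` later rejects.

```c
#include <string.h>

/* Toy model only: these types and names are hypothetical, not LLVM's. */
struct toy_section {
    const char *name;
    int segment_index; /* -1 when no data segment was emitted */
};

/* Assign a data segment index to each section. When skip_init_array is
   nonzero, sections whose name starts with ".init_array" are skipped,
   mimicking the check that this patch removes.
   Returns the number of segments emitted. */
int toy_emit_segments(struct toy_section *secs, int n, int skip_init_array) {
    int next = 0;
    for (int i = 0; i < n; i++) {
        secs[i].segment_index = -1;
        if (skip_init_array && strncmp(secs[i].name, ".init_array", 11) == 0)
            continue;
        secs[i].segment_index = next++;
    }
    return next;
}

/* A data symbol defined in a section is only resolvable if that
   section actually received a segment. */
int toy_symbol_is_valid(const struct toy_section *sec) {
    return sec->segment_index >= 0;
}
```

With the skip enabled, a symbol defined in `.init_array` has no segment to point at; with the skip disabled, every section gets a segment and the symbol resolves.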
In #111008 I suggested a change to simply not write the `p_init` symbol to the custom linking section of the Wasm object. However, after some prompting by @sbc100, it seems this is not a good idea, since it breaks any code that does reference the `p_init` symbol for whatever reason.
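As a hypothetical illustration of code that references the symbol, consider a variant of the example in which another function reads `p_init`. The helper `p_init_points_at_init()` and the counter are invented for this sketch. If the symbol were dropped from the object, a reference like this would presumably no longer resolve.

```c
/* Counter is illustrative only; whether init() runs before main()
   depends on the target honouring .init_array. */
static int init_calls = 0;

void init(void) { init_calls++; }

typedef void (*fn)(void);

/* Same pattern as the original example: place the pointer in .init_array. */
__attribute__((section(".init_array"))) fn p_init[1] = { &init };

/* Hypothetical helper: reads the array through its symbol, so the
   p_init symbol must remain resolvable at link time. */
int p_init_points_at_init(void) {
    return p_init[0] == &init;
}
```

This sketch assumes a target that accepts `.init_array` as a section name, as ELF and Wasm toolchains do.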
In this patch I instead remove the skip, so that data segments are also written for `.init_array` sections. This works around the problem above, since the `p_init` entry now references a data segment that exists. I don't think the data written there is useful, but the change at least avoids the error message above. It also fixes related issues when you have multiple `.init_array` sections.
Some tests related to the change have been modified to reflect the additional emitted data, and a new test `init-array-label.s` has been added to test the specific problem above.
From 0426b8dd567e61f5d0c249ccec36177de4636899 Mon Sep 17 00:00:00 2001
From: George Stagg <george.stagg at posit.co>
Date: Sat, 7 Dec 2024 16:44:21 +0000
Subject: [PATCH] WasmObjectWriter: Handle symbols in .init_array sections
---
llvm/lib/MC/WasmObjectWriter.cpp | 4 -
llvm/test/MC/WebAssembly/global-ctor-dtor.ll | 24 +++++-
llvm/test/MC/WebAssembly/init-array-label.s | 86 ++++++++++++++++++++
llvm/test/MC/WebAssembly/init-array.s | 15 ++++
4 files changed, 123 insertions(+), 6 deletions(-)
create mode 100644 llvm/test/MC/WebAssembly/init-array-label.s
diff --git a/llvm/lib/MC/WasmObjectWriter.cpp b/llvm/lib/MC/WasmObjectWriter.cpp
index a66c5713ff8a6e..04466ae0fddf13 100644
--- a/llvm/lib/MC/WasmObjectWriter.cpp
+++ b/llvm/lib/MC/WasmObjectWriter.cpp
@@ -1482,10 +1482,6 @@ uint64_t WasmObjectWriter::writeOneObject(MCAssembler &Asm,
LLVM_DEBUG(dbgs() << "Processing Section " << SectionName << " group "
<< Section.getGroup() << "\n";);
- // .init_array sections are handled specially elsewhere.
- if (SectionName.starts_with(".init_array"))
- continue;
-
// Code is handled separately
if (Section.isText())
continue;
diff --git a/llvm/test/MC/WebAssembly/global-ctor-dtor.ll b/llvm/test/MC/WebAssembly/global-ctor-dtor.ll
index f1ec71da1ebb64..b1b39ce02cfe85 100644
--- a/llvm/test/MC/WebAssembly/global-ctor-dtor.ll
+++ b/llvm/test/MC/WebAssembly/global-ctor-dtor.ll
@@ -63,7 +63,7 @@ declare void @func3()
; CHECK-NEXT: Value: 1
; CHECK-NEXT: Functions: [ 5, 7 ]
; CHECK-NEXT: - Type: DATACOUNT
-; CHECK-NEXT: Count: 1
+; CHECK-NEXT: Count: 3
; CHECK-NEXT: - Type: CODE
; CHECK-NEXT: Relocations:
; CHECK-NEXT: - Type: R_WASM_FUNCTION_INDEX_LEB
@@ -111,6 +111,18 @@ declare void @func3()
; CHECK-NEXT: Opcode: I32_CONST
; CHECK-NEXT: Value: 0
; CHECK-NEXT: Content: '01040000'
+; CHECK-NEXT: - SectionOffset: 15
+; CHECK-NEXT: InitFlags: 0
+; CHECK-NEXT: Offset:
+; CHECK-NEXT: Opcode: I32_CONST
+; CHECK-NEXT: Value: 4
+; CHECK-NEXT: Content: '0000000000000000'
+; CHECK-NEXT: - SectionOffset: 28
+; CHECK-NEXT: InitFlags: 0
+; CHECK-NEXT: Offset:
+; CHECK-NEXT: Opcode: I32_CONST
+; CHECK-NEXT: Value: 12
+; CHECK-NEXT: Content: '0000000000000000'
; CHECK-NEXT: - Type: CUSTOM
; CHECK-NEXT: Name: linking
; CHECK-NEXT: Version: 2
@@ -174,7 +186,15 @@ declare void @func3()
; CHECK-NEXT: - Index: 0
; CHECK-NEXT: Name: .data.global1
; CHECK-NEXT: Alignment: 3
-; CHECK-NEXT: Flags: [ ]
+; CHECK-NEXT: Flags: [ ]
+; CHECK-NEXT: - Index: 1
+; CHECK-NEXT: Name: .init_array.42
+; CHECK-NEXT: Alignment: 2
+; CHECK-NEXT: Flags: [ ]
+; CHECK-NEXT: - Index: 2
+; CHECK-NEXT: Name: .init_array
+; CHECK-NEXT: Alignment: 2
+; CHECK-NEXT: Flags: [ ]
; CHECK-NEXT: InitFunctions:
; CHECK-NEXT: - Priority: 42
; CHECK-NEXT: Symbol: 9
diff --git a/llvm/test/MC/WebAssembly/init-array-label.s b/llvm/test/MC/WebAssembly/init-array-label.s
new file mode 100644
index 00000000000000..4fe0b64b3bc512
--- /dev/null
+++ b/llvm/test/MC/WebAssembly/init-array-label.s
@@ -0,0 +1,86 @@
+# RUN: llvm-mc -triple=wasm32-unknown-unknown -filetype=obj < %s | obj2yaml | FileCheck %s
+
+init1:
+ .functype init1 () -> ()
+ end_function
+
+init2:
+ .functype init2 () -> ()
+ end_function
+
+ .section .init_array,"",@
+ .globl p_init1
+ .p2align 2, 0x0
+p_init1:
+ .section .init_array,"",@
+ .p2align 2, 0
+ .int32 init1
+ .size p_init1, 4
+
+ .section .init_array,"",@
+ .globl p_init2
+ .p2align 2, 0x0
+p_init2:
+ .section .init_array,"",@
+ .p2align 2
+ .int32 init2
+ .size p_init2, 4
+
+# CHECK: - Type: FUNCTION
+# CHECK-NEXT: FunctionTypes: [ 0, 0 ]
+# CHECK-NEXT: - Type: DATACOUNT
+# CHECK-NEXT: Count: 1
+# CHECK-NEXT: - Type: CODE
+# CHECK-NEXT: Functions:
+# CHECK-NEXT: - Index: 0
+# CHECK-NEXT: Locals: []
+# CHECK-NEXT: Body: 0B
+# CHECK-NEXT: - Index: 1
+# CHECK-NEXT: Locals: []
+# CHECK-NEXT: Body: 0B
+# CHECK-NEXT: - Type: DATA
+# CHECK-NEXT: Segments:
+# CHECK-NEXT: - SectionOffset: 6
+# CHECK-NEXT: InitFlags: 0
+# CHECK-NEXT: Offset:
+# CHECK-NEXT: Opcode: I32_CONST
+# CHECK-NEXT: Value: 0
+# CHECK-NEXT: Content: '0000000000000000'
+# CHECK-NEXT: - Type: CUSTOM
+# CHECK-NEXT: Name: linking
+# CHECK-NEXT: Version: 2
+# CHECK-NEXT: SymbolTable:
+# CHECK-NEXT: - Index: 0
+# CHECK-NEXT: Kind: FUNCTION
+# CHECK-NEXT: Name: init1
+# CHECK-NEXT: Flags: [ BINDING_LOCAL ]
+# CHECK-NEXT: Function: 0
+# CHECK-NEXT: - Index: 1
+# CHECK-NEXT: Kind: FUNCTION
+# CHECK-NEXT: Name: init2
+# CHECK-NEXT: Flags: [ BINDING_LOCAL ]
+# CHECK-NEXT: Function: 1
+# CHECK-NEXT: - Index: 2
+# CHECK-NEXT: Kind: DATA
+# CHECK-NEXT: Name: p_init1
+# CHECK-NEXT: Flags: [ ]
+# CHECK-NEXT: Segment: 0
+# CHECK-NEXT: Size: 4
+# CHECK-NEXT: - Index: 3
+# CHECK-NEXT: Kind: DATA
+# CHECK-NEXT: Name: p_init2
+# CHECK-NEXT: Flags: [ ]
+# CHECK-NEXT: Segment: 0
+# CHECK-NEXT: Offset: 4
+# CHECK-NEXT: Size: 4
+# CHECK-NEXT: SegmentInfo:
+# CHECK-NEXT: - Index: 0
+# CHECK-NEXT: Name: .init_array
+# CHECK-NEXT: Alignment: 2
+# CHECK-NEXT: Flags: [ ]
+# CHECK-NEXT: InitFunctions:
+# CHECK-NEXT: - Priority: 65535
+# CHECK-NEXT: Symbol: 0
+# CHECK-NEXT: - Priority: 65535
+# CHECK-NEXT: Symbol: 1
+# CHECK-NEXT: ...
diff --git a/llvm/test/MC/WebAssembly/init-array.s b/llvm/test/MC/WebAssembly/init-array.s
index e79fb453ec12a3..ab8be11b0ff4a2 100644
--- a/llvm/test/MC/WebAssembly/init-array.s
+++ b/llvm/test/MC/WebAssembly/init-array.s
@@ -18,6 +18,8 @@ init2:
# CHECK: - Type: FUNCTION
# CHECK-NEXT: FunctionTypes: [ 0, 0 ]
+# CHECK-NEXT: - Type: DATACOUNT
+# CHECK-NEXT: Count: 1
# CHECK-NEXT: - Type: CODE
# CHECK-NEXT: Functions:
# CHECK-NEXT: - Index: 0
@@ -26,6 +28,14 @@ init2:
# CHECK-NEXT: - Index: 1
# CHECK-NEXT: Locals: []
# CHECK-NEXT: Body: 0B
+# CHECK-NEXT: - Type: DATA
+# CHECK-NEXT: Segments:
+# CHECK-NEXT: - SectionOffset: 6
+# CHECK-NEXT: InitFlags: 0
+# CHECK-NEXT: Offset:
+# CHECK-NEXT: Opcode: I32_CONST
+# CHECK-NEXT: Value: 0
+# CHECK-NEXT: Content: '0000000000000000'
# CHECK-NEXT: - Type: CUSTOM
# CHECK-NEXT: Name: linking
# CHECK-NEXT: Version: 2
@@ -40,6 +50,11 @@ init2:
# CHECK-NEXT: Name: init2
# CHECK-NEXT: Flags: [ BINDING_LOCAL ]
# CHECK-NEXT: Function: 1
+# CHECK-NEXT: SegmentInfo:
+# CHECK-NEXT: - Index: 0
+# CHECK-NEXT: Name: .init_array
+# CHECK-NEXT: Alignment: 2
+# CHECK-NEXT: Flags: [ ]
# CHECK-NEXT: InitFunctions:
# CHECK-NEXT: - Priority: 65535
# CHECK-NEXT: Symbol: 0