[llvm] [WebAssembly][Object]Use file offset as function symbol address for linked files (PR #76198)

Derek Schuff via llvm-commits llvm-commits at lists.llvm.org
Tue Jan 2 14:52:09 PST 2024


https://github.com/dschuff updated https://github.com/llvm/llvm-project/pull/76198

From 0297b93a6d0996e138559d11982638c717acde31 Mon Sep 17 00:00:00 2001
From: Derek Schuff <dschuff at chromium.org>
Date: Thu, 21 Dec 2023 14:56:03 -0800
Subject: [PATCH 1/2] [WebAssembly][Object]Use file offset as function symbol
 address for linked files

WebAssembly doesn't have a single virtual memory space the way other object
formats or architectures do, so "addresses" mean different things depending
on the context.
Function symbol addresses in object files are offsets from the start of the code
section. This works well for linking and relocation. However, when dealing with
linked binaries, offsets from the start of the file/module are more commonly
used (e.g. for stack traces in browsers) and are more useful for use
cases like binary size attribution. This PR changes Object to use
the file offset instead of the section offset for function symbols, but
only for linked (non-DSO) files.
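
As a rough illustration of the rule described above, the following standalone
sketch models the address selection outside of LLVM. The names WasmFileKind,
FunctionInfo, and functionSymbolAddress are hypothetical and exist only for
this example; they are not part of the Object library, and the actual change
is the one in the diff below.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins for illustration only; this is not the
// LLVM Object API.
enum class WasmFileKind { RelocatableObject, SharedObject, LinkedModule };

struct FunctionInfo {
  // Offset of the function relative to the start of the code section.
  uint32_t CodeSectionOffset;
};

// Returns the "address" reported for a defined function symbol.
// Relocatable objects and shared objects keep the section-relative offset;
// linked modules add the code section's offset within the file.
uint64_t functionSymbolAddress(WasmFileKind Kind,
                               uint64_t CodeSectionFileOffset,
                               const FunctionInfo &F) {
  const bool UseSectionOffset = Kind == WasmFileKind::RelocatableObject ||
                                Kind == WasmFileKind::SharedObject;
  const uint64_t Adjustment = UseSectionOffset ? 0 : CodeSectionFileOffset;
  return F.CodeSectionOffset + Adjustment;
}
```

In this model, a function at section offset 0x02 in a linked module whose code
section sits at file offset 0x40 is reported at 0x42, while the same function
in a relocatable object is reported at 0x02.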
---
 llvm/lib/Object/WasmObjectFile.cpp            | 14 +++-
 llvm/test/tools/llvm-nm/wasm/linked.yaml      | 74 ++++++++++++++++++
 .../wasm/linked-symbol-table.yaml             | 75 +++++++++++++++++++
 3 files changed, 159 insertions(+), 4 deletions(-)
 create mode 100644 llvm/test/tools/llvm-nm/wasm/linked.yaml
 create mode 100644 llvm/test/tools/llvm-objdump/wasm/linked-symbol-table.yaml

diff --git a/llvm/lib/Object/WasmObjectFile.cpp b/llvm/lib/Object/WasmObjectFile.cpp
index dfe86a45df3227..99c6f1a8fbd4ee 100644
--- a/llvm/lib/Object/WasmObjectFile.cpp
+++ b/llvm/lib/Object/WasmObjectFile.cpp
@@ -1662,10 +1662,16 @@ Expected<StringRef> WasmObjectFile::getSymbolName(DataRefImpl Symb) const {
 Expected<uint64_t> WasmObjectFile::getSymbolAddress(DataRefImpl Symb) const {
   auto &Sym = getWasmSymbol(Symb);
   if (Sym.Info.Kind == wasm::WASM_SYMBOL_TYPE_FUNCTION &&
-      isDefinedFunctionIndex(Sym.Info.ElementIndex))
-    return getDefinedFunction(Sym.Info.ElementIndex).CodeSectionOffset;
-  else
-    return getSymbolValue(Symb);
+      isDefinedFunctionIndex(Sym.Info.ElementIndex)) {
+    // For object files, use the section offset. For linked files, use the file
+    // offset
+    uint32_t Adjustment = isRelocatableObject() || isSharedObject()
+                              ? 0
+                              : Sections[CodeSection].Offset;
+    return getDefinedFunction(Sym.Info.ElementIndex).CodeSectionOffset +
+           Adjustment;
+  }
+  return getSymbolValue(Symb);
 }
 
 uint64_t WasmObjectFile::getWasmSymbolValue(const WasmSymbol &Sym) const {
diff --git a/llvm/test/tools/llvm-nm/wasm/linked.yaml b/llvm/test/tools/llvm-nm/wasm/linked.yaml
new file mode 100644
index 00000000000000..992c1811743b7a
--- /dev/null
+++ b/llvm/test/tools/llvm-nm/wasm/linked.yaml
@@ -0,0 +1,74 @@
+# RUN: yaml2obj %s -o %t.wasm
+# RUN: llvm-nm %t.wasm | FileCheck %s
+
+# CHECK: 0000009f T my_func_export
+# CHECK-NEXT: 0000002a D my_global_export
+# CHECK-NEXT: 00000000 D my_table_export
+
+--- !WASM
+FileHeader:
+  Version:         0x1
+Sections:
+  - Type:            TYPE
+    Signatures:
+      - Index:           0
+        ParamTypes:      []
+        ReturnTypes:     []
+  - Type:            IMPORT
+    Imports:
+      - Module:          env
+        Field:           foo
+        Kind:            FUNCTION
+        SigIndex:        0
+      - Module:          env
+        Field:           bar
+        Kind:            GLOBAL
+        GlobalType:      I32
+        GlobalMutable:   true
+      - Module:          env
+        Field:           memory
+        Kind:            MEMORY
+        Memory:
+          Minimum:         0x1
+  - Type:            FUNCTION
+    FunctionTypes:   [ 0 ]
+  - Type:            TABLE
+    Tables:
+      - Index:           0
+        ElemType:        FUNCREF
+        Limits:
+          Flags:           [ HAS_MAX ]
+          Minimum:         0x1
+          Maximum:         0x1
+  - Type:            GLOBAL
+    Globals:
+      - Index:           1
+        Mutable:         false
+        Type:            I32
+        InitExpr:
+          Opcode:          I32_CONST
+          Value:           42
+  - Type:            EXPORT
+    Exports:
+      - Name:            my_func_export
+        Kind:            FUNCTION
+        Index:           1
+      - Name:            my_global_export
+        Kind:            GLOBAL
+        Index:           1
+      - Name:            my_table_export
+        Kind:            TABLE
+        Index:           0
+  - Type:            CODE
+    Functions:
+      - Index:           1
+        Locals:
+        Body:            00
+  - Type:            DATA
+    Segments:
+      - SectionOffset:   0
+        InitFlags:       0
+        Offset:
+          Opcode:          I32_CONST
+          Value:           0
+        Content:         ''
diff --git a/llvm/test/tools/llvm-objdump/wasm/linked-symbol-table.yaml b/llvm/test/tools/llvm-objdump/wasm/linked-symbol-table.yaml
new file mode 100644
index 00000000000000..6dd949a441496c
--- /dev/null
+++ b/llvm/test/tools/llvm-objdump/wasm/linked-symbol-table.yaml
@@ -0,0 +1,75 @@
+# RUN: yaml2obj %s -o %t.wasm
+# RUN: llvm-objdump -t %t.wasm | FileCheck %s
+#
+# CHECK:      SYMBOL TABLE:
+# CHECK-NEXT: 0000009f g F CODE my_func_export
+# CHECK-NEXT: 0000002a g O DATA my_global_export
+# CHECK-NEXT: 00000000 g   TABLE my_table_export
+
+--- !WASM
+FileHeader:
+  Version:         0x1
+Sections:
+  - Type:            TYPE
+    Signatures:
+      - Index:           0
+        ParamTypes:      []
+        ReturnTypes:     []
+  - Type:            IMPORT
+    Imports:
+      - Module:          env
+        Field:           foo
+        Kind:            FUNCTION
+        SigIndex:        0
+      - Module:          env
+        Field:           bar
+        Kind:            GLOBAL
+        GlobalType:      I32
+        GlobalMutable:   true
+      - Module:          env
+        Field:           memory
+        Kind:            MEMORY
+        Memory:
+          Minimum:         0x1
+  - Type:            FUNCTION
+    FunctionTypes:   [ 0 ]
+  - Type:            TABLE
+    Tables:
+      - Index:           0
+        ElemType:        FUNCREF
+        Limits:
+          Flags:           [ HAS_MAX ]
+          Minimum:         0x1
+          Maximum:         0x1
+  - Type:            GLOBAL
+    Globals:
+      - Index:           1
+        Mutable:         false
+        Type:            I32
+        InitExpr:
+          Opcode:          I32_CONST
+          Value:           42
+  - Type:            EXPORT
+    Exports:
+      - Name:            my_func_export
+        Kind:            FUNCTION
+        Index:           1
+      - Name:            my_global_export
+        Kind:            GLOBAL
+        Index:           1
+      - Name:            my_table_export
+        Kind:            TABLE
+        Index:           0
+  - Type:            CODE
+    Functions:
+      - Index:           1
+        Locals:
+        Body:            00
+  - Type:            DATA
+    Segments:
+      - SectionOffset:   0
+        InitFlags:       0
+        Offset:
+          Opcode:          I32_CONST
+          Value:           0
+        Content:         ''

From 1d62fdae3d9c64973c94d27afa6ed366dcdd0064 Mon Sep 17 00:00:00 2001
From: Derek Schuff <dschuff at chromium.org>
Date: Tue, 2 Jan 2024 14:51:56 -0800
Subject: [PATCH 2/2] Add detail to comment

---
 llvm/lib/Object/WasmObjectFile.cpp | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/llvm/lib/Object/WasmObjectFile.cpp b/llvm/lib/Object/WasmObjectFile.cpp
index 99c6f1a8fbd4ee..91703c86364aed 100644
--- a/llvm/lib/Object/WasmObjectFile.cpp
+++ b/llvm/lib/Object/WasmObjectFile.cpp
@@ -1663,8 +1663,10 @@ Expected<uint64_t> WasmObjectFile::getSymbolAddress(DataRefImpl Symb) const {
   auto &Sym = getWasmSymbol(Symb);
   if (Sym.Info.Kind == wasm::WASM_SYMBOL_TYPE_FUNCTION &&
       isDefinedFunctionIndex(Sym.Info.ElementIndex)) {
-    // For object files, use the section offset. For linked files, use the file
-    // offset
+    // For object files, use the section offset. The linker relies on this.
+    // For linked files, use the file offset. This behavior matches the way
+    // browsers print stack traces and is useful for binary size analysis.
+    // (see https://webassembly.github.io/spec/web-api/index.html#conventions)
     uint32_t Adjustment = isRelocatableObject() || isSharedObject()
                               ? 0
                               : Sections[CodeSection].Offset;


