[PATCH] D44028: [WebAssembly] Handle weak undefined functions with a synthetic stub

Rui Ueyama via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Mar 7 10:14:16 PST 2018


ruiu added a comment.

> The address of a function is its table index in the indirection function call table.  For weak undefined functions (or null functions) this will be zero.  However the following call is *direct* call, not a call_indirect.  And direct calls are more efficient we do want continue to use direct calls wherever we can.

I see. Thank you for the explanation!

> Yes, its for catching that specific case.   In this case the linker needs insert some valid function index into the direct call instruction.  It it doesn't insert a valid function index with the correct signature, the module will fail to validate (i.e. will fail to load).   We could generate a better synthetic function with a nice error message.   However, that would add complexity to the code generation, and I think we are already on shakey ground adding any kind of code generation to the linker.  Also, simply crashing is what existing platforms do, so I think that can/should be expected.  At least, I don't think we should block this change on making an nice error message, unless you want to push the code generation part into the compiler instead?

It shouldn't block this change indeed. I was just trying to understand this patch.

I'm not too worried that we generate wasm instructions in the linker. Unlike other file formats, wasm supports and will support only one machine instruction -- that's the wasm instructions. So, if something is easy  or natural to implement in a linker instead of a compiler, I think we shouldn't hesitate to do that.



================
Comment at: wasm/MarkLive.cpp:78-85
+      Symbol *Sym = C->File->getSymbol(Reloc.Index);
+      // Don't mark functions live if we're taking the address of a function
+      // that won't actually go in the function table (has index zero).  This
+      // is the case for some synthetic functions.
+      if ((Reloc.Type == R_WEBASSEMBLY_TABLE_INDEX_SLEB ||
+           Reloc.Type == R_WEBASSEMBLY_TABLE_INDEX_I32) &&
+          cast<FunctionSymbol>(Sym)->hasNullTableIndex())
----------------
Adding a special rule for a weak symbol seems a bit odd to me because they are orthogonal in concept. What if you create functions after you garbage-collect symbols? Then you can remove this logic, can't you?


================
Comment at: wasm/SymbolTable.cpp:36
 
+static const uint8_t UNREACHABLE_FN[] = {
+    0x03 /* ULEB length */, 0x00 /* ULEB num locals */,
----------------
I believe in LLVM style global variables are written in the same way as local variables. So it is UnreachableFn.


================
Comment at: wasm/SymbolTable.cpp:41
+
+void SymbolTable::handleWeakUndefines() {
+  DenseMap<WasmSignature, InputFunction *> UndefinedFunctions;
----------------
This needs comment. Please describe the semantics of the weak undefined functions in wasm and what we are doing in this function.


================
Comment at: wasm/SymbolTable.cpp:42
+void SymbolTable::handleWeakUndefines() {
+  DenseMap<WasmSignature, InputFunction *> UndefinedFunctions;
+  for (Symbol *Sym : SymVector) {
----------------
I'm not too worried about the use of a hash table in this function because I believe the number of weak symbols in a program is small. But I'd like to reiterate that doing something like this (use a hash table for all symbols) is in general discouraged in lld as it makes the linker noticeably slow.


================
Comment at: wasm/Symbols.h:65
   bool isHidden() const;
+  uint32_t getFlags() const { return Flags; }
 
----------------
You are not using this function.


================
Comment at: wasm/Symbols.h:126-128
+  bool hasNullTableIndex() const {
+    return hasTableIndex() && getTableIndex() == 0;
+  }
----------------
I'd remove this function and inline it because it's too small and both `hasTableIndex` and `getTableIndex` is externally available.


Repository:
  rLLD LLVM Linker

https://reviews.llvm.org/D44028





More information about the llvm-commits mailing list