[PATCH] D90948: [WebAssembly] call_indirect issues table number relocs

Andy Wingo via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jan 27 00:41:01 PST 2021


wingo added inline comments.


================
Comment at: llvm/lib/MC/WasmObjectWriter.cpp:530-531
+    // symbol has a symtab entry.
+    if (HasReferenceTypes)
+      Sym->setUsedInReloc();
     // Any time we have a TABLE_INDEX relocation against a function symbol, we
----------------
tlively wrote:
> What happens if reference-types is not enabled? It looks like we would still have a table number relocation and a table symbol, but the table symbol isn't marked as used. I assume the addition of `if (WasmSym->isNoStrip())` above causes the symbol not to be emitted in this case, but is the relocation still emitted?
> 
> Overall the WasmObjectWriter seems like the wrong place to be checking target features. The encoding of a Wasm object's contents shouldn't depend on whether reference-types is enabled; it's the contents themselves that should depend on whether reference-types is enabled. Can we encapsulate all the target feature checking in codegen so that we only emit table number expressions/relocations/symbols if reference-types is enabled?
> 
Let's say the compilation unit contains a function pointer but no `call_indirect`.  In that case the object file will need an `__indirect_function_table` import.  However, there is no explicit relocation against `__indirect_function_table`; this code handles `TABLE_INDEX` relocations, which are for the index of a function pointer in a table, but which don't explicitly mention which table.  Hence the `setNoStrip()` on the `__indirect_function_table` symbol -- even if it's not the target of a relocation, the table import should be written to the output file.  This result is the same whether reference types are enabled or not.

Additionally -- if & only if reference-types are enabled -- the table symbol will need to be written to the symbol table.  We indicate that by abusing the `setUsedInReloc` flag.  That feels like a hack in a way but perhaps a benign one.

Backing up a bit, though, and taking this opportunity to answer questions posed farther down -- when do we know if a table symbol should be emitted to the symbol table?  The writer and assembler layers don't have much access to features, which can change during the compilation unit.  Generally speaking it's the code that emits instructions that knows whether a feature is enabled or not.  So for example when we emit `call_indirect`, we emit either a proper symbol reference or a hard-coded reference to table 0, depending on the feature flag. 

But function pointers are a special case, because their symbol references only name their address space (the table) implicitly.  What we really want to do here, in all cases, is emit a `__indirect_function_table` import if and only if it is needed.  Whether it goes in the symbol table or not is a side question.  The normal way for the compiler to know if a symbol is needed is, well, explicit references: is it used in a relocation.  But here since there is no explicit reference, we make a hack and explicitly add the symbol to the "GC root set" via `setNoStrip`.

Do you see an earlier place in codegen or so that would be more appropriate for handling this logic?  I tried to do so but did not succeed.  It's not just function pointers in code; it's also static data.  I found that only at this lower level would I know for sure whether there were a function pointer in code or not.


================
Comment at: llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp:478
+    MCSymbolWasm *Sym = GetOrCreateFunctionTableSymbol(getContext(), TableName);
+    if (STI->checkFeatures("+reference-types")) {
+      auto *Val = MCSymbolRefExpr::create(Sym, getContext());
----------------
tlively wrote:
> I have a similar comment here. IMO, if the assembly contains a table symbol, it should be unconditionally parsed as containing a table symbol. It should be the responsibility of the producer of the assembly to not use table symbols if they want to be compatible with older tools.
> 
> I'm not sure what to do about the special case of `__indirect_function_table`, though. When was that introduced?
So it's OK to have `.s` files that use features but don't check / declare them?  Fine by me.

Still, even without reference types, emitting `call_indirect` needs to ensure the presence of its function table -- `__indirect_function_table` by default.  So there does need to be some logic here AFAIU.


================
Comment at: llvm/lib/Target/WebAssembly/WebAssemblyFastISel.cpp:883
+      // ensure the table is live.
+      Table->setNoStrip();
+      MIB.addImm(0);
----------------
tlively wrote:
> Why do we need a table symbol at all in this case (and also in DAG ISel), especially since it can't be emitted?
The table symbol has two functions:
1. Ensure that the table import is emitted to the object file.
2. When reference types are enabled, to allow the table reference in the `call_indirect` to be relocated, via the symbol table entry and appropriate `TABLE_NUMBER` relocations.

So the answer to your question is (1).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90948/new/

https://reviews.llvm.org/D90948



More information about the llvm-commits mailing list