[PATCH] D90948: [WebAssembly] call_indirect issues table number relocs

Thu Jan 28 01:16:19 PST 2021

tlively added a comment.

In D90948#2527413 <https://reviews.llvm.org/D90948#2527413>, @wingo wrote:

> In D90948#2525488 <https://reviews.llvm.org/D90948#2525488>, @sbc100 wrote:
>
>> I'm beginning to (re-)wonder if we should leave the current `call_indirect` as special in that it always implicitly refers to the magic table zero. Then we would introduce a new instruction, `call_indirect_explicit`, that could be used if and only if reference types is enabled and would have the relocation. Data and function addresses will always be special in llvm so I think it's OK to reflect that if it makes things simpler. I guess it would make things simpler in some ways and more complex in others? (Sorry if I'm re-litigating prior decisions here).
>
> I am having a hard time seeing what a new instruction would buy us.  I would think there is already a sufficiently clear difference between `call_indirect` with an immediate table operand -- which was the case before I started, and is still the case now if reference-types is disabled -- and `call_indirect` with a symbol reference.  There would only need to be a relocation in the latter case.

I agree that we just need a single `call_indirect` that can take either a constant zero table operand or a symbolic table operand. Both types of operands need to be able to coexist in the same link, or even in the same object file, to make MVP object files linkable with reference-types object files. I suggest that codegen produce symbolic table operands only to refer to user-defined funcref tables, not the default MVP indirect call table (table 0). That will give us the nice property that objects will contain new table symbols and relocations only if they //use// reference-types features, not just if they //enable// reference-types features. This property should also remove the need to do any form of feature detection in the WasmObjectWriter.
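
To make the property concrete, here is a minimal sketch of the rule I have in mind. Everything in it is hypothetical: `CallIndirect`, `needsTableRelocation`, and `countTableRelocations` are illustration-only names, not LLVM or MC-layer APIs. It only models the idea that a relocation is tied to a symbolic table operand, never to the hard-coded table 0.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of a call_indirect table operand; not an LLVM type.
struct CallIndirect {
  // Empty: hard-coded table 0 (the MVP encoding).
  // Set: symbolic operand naming a user-defined funcref table.
  std::optional<std::string> tableSymbol;
};

// A table-number relocation is needed only for a symbolic table operand.
bool needsTableRelocation(const CallIndirect &call) {
  if (!call.tableSymbol)
    return false;
  assert(!call.tableSymbol->empty() && "symbolic operand must name a table");
  return true;
}

// Number of table-number relocations an object would carry under this rule.
std::size_t countTableRelocations(const std::vector<CallIndirect> &calls) {
  std::size_t count = 0;
  for (const CallIndirect &call : calls)
    if (needsTableRelocation(call))
      ++count;
  return count;
}
```

Under this model, an object whose indirect calls all go through table 0 carries no table relocations at all, whether or not reference-types is enabled.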

> In D90948#2526457 <https://reviews.llvm.org/D90948#2526457>, @tlively wrote:
>
>> I like the idea of treating table zero as a magical table that is assumed to always exist and that all "normal" function pointers and indirect calls implicitly refer to, even when reference-types is enabled. That's how the table used to work, right? This would solve the problem of making sure the table is live when there are call_indirects or function pointers present in an object -- table 0 would always be live.
>
> I think this goes against what was already done in D91637 <https://reviews.llvm.org/D91637> and D92840 <https://reviews.llvm.org/D92840>.  If individual object files can signal their use of the indirect function table via the presence or absence of the `__indirect_function_table` import, the linker can then only include it in the final linking stage when needed -- a minor win.  I can see the point in treating `__indirect_function_table` as a good default, and which in the absence of reference-types will be table number 0, but reserving table 0 only complicates things when reference types *are* enabled -- because then you have two kinds of tables.

Yes, I agree that it's better not to emit the table at all if it's not needed. I'm not totally clear on whether those patches to emit the table only when necessary are ABI-breaking changes, though. Do I understand correctly that they are not ABI-breaking because wasm-ld previously ignored the presence of the table import, and that those patches signal whether the table is used by the presence of the import in the object file and not via new symbol or relocation types? If so, I agree that table 0 (i.e. `__indirect_function_table`) should be live only if necessary.

I think the crux of the matter is whether there should be two kinds of tables, though. AFAICT, wasm-ld needs to support two kinds of tables (hard-coded 0 and symbolic) if we want to keep supporting the MVP object ABI. Since we need to support both types of tables anyway, it seems simpler to let usage rather than enabled features determine when each kind of table is used. Normal function pointers and indirect calls to function pointers should use the hard-coded table 0, just as they do in MVP object files, and all indirect calls into user-defined funcref tables should use symbolic tables. This scheme prevents the MC layer from having to do any feature detection and maintains compatibility with the MVP ABI.
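
If I understand the liveness signal correctly, the linker-side decision could be sketched roughly as below. This is an assumption-laden illustration, not wasm-ld code: `ObjectFile`, `tableImports`, and `indirectFunctionTableIsLive` are hypothetical names, and the only thing taken from the discussion is the `__indirect_function_table` import name.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical model of the table imports recorded in one input object;
// not wasm-ld's actual data structure.
struct ObjectFile {
  std::set<std::string> tableImports;
};

// The default indirect function table is kept only if at least one input
// object signals its use by importing it.
bool indirectFunctionTableIsLive(const std::vector<ObjectFile> &objects) {
  for (const ObjectFile &obj : objects)
    if (obj.tableImports.count("__indirect_function_table") != 0)
      return true;
  return false;
}
```

The point of the sketch is that liveness depends only on an import that already exists in the object format, so no new symbol or relocation type is involved in that decision.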

In D90948#2527421 <https://reviews.llvm.org/D90948#2527421>, @wingo wrote:

> Just to be clear -- here is my use case.
>
> https://github.com/Igalia/ref-cpp/blob/master/milestones/m3/test.S#L76
>
> WAT version here:
>
> https://github.com/Igalia/ref-cpp/blob/master/milestones/m3/test.wat#L47
>
> Basically I am implementing a side table mapping externref to index, in WebAssembly instead of in JS.  I want to have named table symbols so I can have different mappings, and to be able to table.get / table.set / etc on them, but to do this I need to solve the table linking problem.  But to solve table linking I had to fix call_indirect.  That's why I'm here.  I can see how address spaces may relate to this on an IR level, but on the MC level (where I am right now) I think I can probably ignore it.

I'm very excited for this all to come together :) You're right that address spaces won't show up in the MC layer at all. I mentioned them only to point out that the choice of whether to use a hard-coded table 0 or a symbolic table operand for call_indirect can be made in codegen rather than in the MC layer.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D90948