[llvm-dev] Debugging a potential bug when generating wasm32

Ömer Sinan Ağacan via llvm-dev llvm-dev at lists.llvm.org
Thu Aug 6 07:29:13 PDT 2020


Thanks, I joined Discord as well.

Here's an update: I was able to come up with a tiny C program which is compiled
to a similar "GOT.func" import. C code:

    __attribute__ ((visibility("default")))
    int c_fn_2(int x, int y)
    {
        return x + y;
    }

    __attribute__ ((visibility("default")))
    int (*c_fn(void)) (int x, int y)
    {
        return &c_fn_2;
    }

Compile with:

    clang-10 -fPIC --target=wasm32-unknown-emscripten test.c -c -o test.o -O
    wasm2wat test.o > test.wat

Generated wat:

    (module
      (type (;0;) (func (param i32 i32) (result i32)))
      (type (;1;) (func (result i32)))
      (import "env" "__linear_memory" (memory (;0;) 0))
      (import "env" "__indirect_function_table" (table (;0;) 0 funcref))
      (import "GOT.func" "c_fn_2" (global (;0;) (mut i32)))
      (func $c_fn_2 (type 0) (param i32 i32) (result i32)
        local.get 1
        local.get 0
        i32.add)
      (func $c_fn (type 1) (result i32)
        global.get 0))

Now if I remove the visibility attribute in `c_fn_2` the GOT.func import
disappears:

    (module
      (type (;0;) (func (param i32 i32) (result i32)))
      (type (;1;) (func (result i32)))
      (import "env" "__linear_memory" (memory (;0;) 0))
      (import "env" "__indirect_function_table" (table (;0;) 0 funcref))
      (import "env" "__table_base" (global (;0;) i32))
      (func $c_fn_2 (type 0) (param i32 i32) (result i32)
        local.get 1
        local.get 0
        i32.add)
      (func $c_fn (type 1) (result i32)
        global.get 0
        i32.const 0
        i32.add))

Comparing these two programs I think I see a potential answer to my question of
why this GOT.func import is needed. I can't find documentation how what these
attributes mean exactly (the closest one I could find is [1]), but I think
without `visibility("default")` the symbol is not visible in other compilation
units (here I think "compilation unit" means a module in Wasm, though it may
also be a C compilation unit, I'm not sure), in other words it's DSO-local. When
it's DSO-local the importing modules do not need to know about the function's
table index as they can never refer to it directly (e.g. the C expression
`c_fn_2` doesn't make sense in the importing modules).

When we add `visibility("default")` and make c_fn_2 visible in other modules we
need to be able to compare return value of `c_fn` with the value of symbol
`c_fn_2`, as that pointer equality must hold according to C standard.

The approach taken here is we expect the symbol's table index to be defined in
the *loading module*, and import that index with that `GOT.func` import. The
host (e.g. wasmtime or the browser) is then responsible for generating a table
index for this function in the loading module and providing that "GOT.func"
import in load time. In importing module code we use that index for c_fn_2, so
if the importing module does something like `c_fn_2 == c_fn()` that's evaluated
to `true` as expected.

Can anyone deny or confirm that this is indeed the reason for that `GOT.func`
import?

Thanks,

Ömer

[1]: https://clang.llvm.org/docs/LTOVisibility.html

Thomas Lively <tlively at google.com>, 6 Ağu 2020 Per, 00:54 tarihinde şunu yazdı:
>
> +Sam Clegg Is the expert on this dynamic linking stuff.
>
> A lot of us WebAssembly toolchain folks have been hanging out in the WebAssembly Discord, especially on the #emscripten channel, so that would be another good place to ask future WebAssembly-specific questions.
>
> On Tue, Aug 4, 2020 at 9:06 PM Ömer Sinan Ağacan via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi,
>>
>> Sorry if you've seen this message before on llvm.discourse.group or elsewhere --
>> I've been trying to get to the bottom of this for a while now and asked about
>> this in a few different platforms before.
>>
>> I'm currently trying to debug a bug in a LLVM-generated Wasm code. The bug could
>> be in the code that generates LLVM (rustc) or in the LLVM, I'm not sure yet.
>> LLVM IR and Wasm can be seen in [1].
>>
>> The problem is this line:
>>
>>     (import "GOT.func"
>> "_ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$i32$GT$3fmt17h9ba9fea9cadf7bd5E"
>> (global (;3;) (mut i32)))
>>
>> The same symbol is already imported from "env" in the same module:
>>
>>     (import "env"
>> "_ZN5core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$i32$GT$3fmt17h9ba9fea9cadf7bd5E"
>> (func (;4;) (type 1)))
>>
>> So there's no need to import it from "GOT.func" and I want to get rid of that
>> "GOT.func" import.
>>
>> This LLVM IR is generated when compiling Rust code to a "staticlib", which is
>> supposed to include *all* dependencies of the code so that it'll be linkable
>> with code for other languages. Because of the "GOT.func" import this module is
>> not linkable, it needs to resolve that "GOT.func" import in runtime using
>> dynamic linking for Wasm [2].
>>
>> I'm trying to understand whether this is a rustc bug or an LLVM bug. I'm using
>> LLVM 10 downloaded from the official web page and rustc nightly. I can build
>> LLVM from source and use it, but I don't have any experience in LLVM code base.
>> Questions:
>>
>> - Given a reference to a symbol, how does LLVM decide how to import it?
>>   Currently I see these uses of the problematic symbol in LLVM IR:
>>
>>   - `store i8* bitcast (i1 (i32*, %"core::fmt::Formatter"*)*
>> @"_ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$i32$GT$3fmt17h9ba9fea9cadf7bd5E"
>> to i8*), i8** %11, align 4`
>>
>>   - `store i8* bitcast (i1 (i32*, %"core::fmt::Formatter"*)*
>> @"_ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$i32$GT$3fmt17h9ba9fea9cadf7bd5E"
>> to i8*), i8** %14, align 4`
>>
>>   - `store i8* bitcast (i1 (i32*, %"core::fmt::Formatter"*)*
>> @"_ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$i32$GT$3fmt17h9ba9fea9cadf7bd5E"
>> to i8*), i8** %17, align 4`
>>
>>   - `declare zeroext i1
>> @"_ZN4core3fmt3num3imp52_$LT$impl$u20$core..fmt..Display$u20$for$u20$i32$GT$3fmt17h9ba9fea9cadf7bd5E"(i32*
>> noalias readonly align 4 dereferenceable(4), %"core::fmt::Formatter"*
>> align 4 dereferenceable(36)) unnamed_addr #1`
>>
>>   First three look very similar so I'm guessing the first three are causing one
>>   of those imports, and the last one is causing the other import, but I'm not
>>   sure which one is generating which import. Any ideas?
>>
>> - Any suggestions on how to debug this? Just knowing which line in the LLVM IR
>>   listed above causes this "GOT.func" import would be helpful.
>>
>> Thanks,
>>
>> Ömer
>>
>> [1]: https://gist.github.com/osa1/4c672fe8998c8e8768cf9f7c014c61d8
>> [2]: https://github.com/WebAssembly/tool-conventions/blob/master/DynamicLinking.md
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


More information about the llvm-dev mailing list