[PATCH] D42095: [WebAssembly] Symbol changes #3: Make Clang object file tables match LLD output. NFC.
Nicholas Wilson via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed Jan 24 06:37:56 PST 2018
ncw added inline comments.
================
Comment at: test/MC/WebAssembly/weak-alias.ll:197
+; CHECK-NEXT: Value: 16
+; CHECK-NEXT: Content: '01000000'
+; CHECK-NEXT: - Type: CUSTOM
----------------
ncw wrote:
> ncw wrote:
> > sbc100 wrote:
> > > ncw wrote:
> > > > sbc100 wrote:
> > > > > sbc100 wrote:
> > > > > > ncw wrote:
> > > > > > > sbc100 wrote:
> > > > > > > > How as this change generated a new data segment? I guess maybe it was zero before so now it was elided?
> > > > > > > >
> > > > > > > > It looks to me like we really want to two different function addresses here for the address of the function itself and the address of the weak alias. Its not until link time that these two things become the same thing. Maybe it doesn't really matter as long as the linker can distinguish them, but it makes sense to be that there would be two different function addresses in this object file.
> > > > > > > Ah, the test data before was skipping the section - see the "CHECK" in the original file, rather than "CHECK-NEXT". I thought it was useful to expand the test to assert on the section that was being skipped; it was odd that previously it was deliberately not examining the final segment.
> > > > > > >
> > > > > > > "It's not until link time that they become the same thing" -> but the table in the object file is linked. The "provisional" values that are used for relocations in the object file are a preview of the values that would get written out. I think it makes sense to use the same rules for writing out provisional values as we do for writing out the actual values in LLD. Hence we don't put the exact same function twice in the table - there's really no point. Seeing a function appear multiple times in the table doesn't make the output any easier to read.
> > > > > > But for intermediate output such as this (or the lld -r) we are still waiting for the a potential strong version of the symbol, so the weak alias and the function in points too are still district (both here and in lld -r).
> > > > > >
> > > > > Hmm.. maybe i'll expand this test expectation now as a separate change to make this change more explicit.
> > > > Yes - LLD might well resolve the indexes separately. But it doesn't seem to me that there's any added clarity in making the object file different to the linked output here (ie we're just outputting the same table that you'd get if you run LLD to link the single object file).
> > > >
> > > > The output you're requesting is going to look like this. It doesn't seem to me to assist in examining the object file to put the same function twice in the table:
> > > >
> > > > ```
> > > > (table 3 anyfunc)
> > > > (elem (i32.const 1) $func $func)
> > > > (memory $0 1)
> > > > (export "memory" (memory $0))
> > > > (export "func" (func $func))
> > > > (export "funcAlias" (func $func)) // Weak alias for func
> > > > (func $func (type blah)
> > > > )
> > > > (func $anotherFunc (type blah)
> > > > // A call to the alias and a call to the non-alias...
> > > > // But there's no way to tell which is which! The alias is weakly-
> > > > // defined, so it's not an import
> > > > (call_indirect 1)
> > > > (call_indirect 2)
> > > > )
> > > > ```
> > > >
> > > > Of course the other alternative is just to output a totally empty table in object files!
> > > I'm taking a look at the `lld -r` output now. I'm sure we can agree that the clang output should match that of `lld -r` more that it matches just `lld` right?
> > >
> > > It does looks like `lld -r` currently doesn't do the useful thing I was expecting.
> > >
> > > The output if `wasm-objdump -x -d -r` a lot more useful to me that anything in llvm right now (until we can disassembly wasm files, which is coming soon). Hopefully we will have better tools that objtoyaml to use in our test cases at some point.
> > >
> > > In this case I modified `lld/test/wasm/weak-alias.ll` to use `--relocatable` and I got the following:
> > >
> > > ```
> > > 0000cd <call_alias_ptr>:
> > > 0000d1: 23 80 80 80 80 00 | get_global 0
> > > 0000d2: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
> > > 0000d7: 41 10 | i32.const 16
> > > 0000d9: 6b | i32.sub
> > > 0000da: 22 00 | tee_local 0
> > > 0000dc: 24 80 80 80 80 00 | set_global 0
> > > 0000dd: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
> > > 0000e2: 20 00 | get_local 0
> > > 0000e4: 41 80 80 80 80 00 | i32.const 0
> > > 0000e5: R_WEBASSEMBLY_TABLE_INDEX_SLEB 1
> > > 0000ea: 36 02 08 | i32.store 2 8
> > > 0000ed: 10 81 80 80 80 00 | call 1 <direct_fn>
> > > 0000ee: R_WEBASSEMBLY_FUNCTION_INDEX_LEB 1 <direct_fn>
> > > 0000f3: 21 01 | set_local 1
> > > 0000f5: 20 00 | get_local 0
> > > 0000f7: 41 10 | i32.const 16
> > > 0000f9: 6a | i32.add
> > > 0000fa: 24 80 80 80 80 00 | set_global 0
> > > 0000fb: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
> > > 000100: 20 01 | get_local 1
> > > 000102: 0b | end
> > > ```
> > > and
> > > ```
> > > 000103 <call_direct_ptr>:
> > > 000107: 23 80 80 80 80 00 | get_global 0
> > > 000108: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
> > > 00010d: 41 10 | i32.const 16
> > > 00010f: 6b | i32.sub
> > > 000110: 22 00 | tee_local 0
> > > 000112: 24 80 80 80 80 00 | set_global 0
> > > 000113: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
> > > 000118: 20 00 | get_local 0
> > > 00011a: 41 81 80 80 80 00 | i32.const 1
> > > 00011b: R_WEBASSEMBLY_TABLE_INDEX_SLEB 1
> > > 000120: 36 02 08 | i32.store 2 8
> > > 000123: 10 81 80 80 80 00 | call 1 <direct_fn>
> > > 000124: R_WEBASSEMBLY_FUNCTION_INDEX_LEB 1 <direct_fn>
> > > 000129: 21 01 | set_local 1
> > > 00012b: 20 00 | get_local 0
> > > 00012d: 41 10 | i32.const 16
> > > 00012f: 6a | i32.add
> > > 000130: 24 80 80 80 80 00 | set_global 0
> > > 000131: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
> > > 000136: 20 01 | get_local 1
> > > 000138: 0b | end
> > >
> > > ```
> > >
> > > I was hoping that the two R_WEBASSEMBLY_FUNCTION_INDEX_LEB entries where would be different, as I think they should be. So I think that might be bug in the `--relocatable` output generation.
> > Yes, I agree Clang should most closely match `lld -r` - but I was hoping that *both* of them would match normal LLD output for the table (ie one table entry per "real" function). If there is to be any difference between object files and linked files in terms of the table that's output, Clang and `lld -r` should match.
> >
> > The relocations should definitely be different, and reference the two different symbols directly. But since those two symbols each reference the same (Wasm) function index, I personally would find it odd to have multiple table indices for the same Wasm function.
> >
> > I think you're right that there is a bug in the `lld -r` output for relocations, but it's worth checking whether #5 fixes it? The handling of symbols vs functions in LLD is generally a bit patchy, and I think #5 fixes the overall approach.
> >
> > This is after all cosmetic, I really don't mind what the outcome is! I was trying to make Clang, `lld -r`, and normal LLD make a matching table, so that we don't have unnecessary variability.
> Aha - wait! I must have been very sleepy last night to forget why `lld -r` doesn't work with aliases!
>
> It's because your "double symbol fix" (so that aliases are both imported and exported) with the "alt index" isn't present in LLD. WasmObjectWriter hackily writes out the weak symbols twice, but LLD only has support for reading in twice-declared symbols, not for writing them out.
>
> There really isn't any point fixing this. Why introduce new code in LLD at this stage, for writing out the import-and-export format, when we're just days away from switching to using symbol indexes, at which point the need for doing that goes away.
>
> So that's why the fix isn't split out of patch #5.
>
> I see you've added a test - that's great, I'll check that #5 does indeed fix that test.
Oops - when I said "there really isn't any point fixing this", what I meant was, there really isn't any point fixing LLD so that `lld -r` output matches what Clang //currently// outputs. Namely, I don't see any point adding code to LLD to make the `-r` output use the imports/exports strategy that Clang currently has.
Of course it's worth fixing in the long run, with the symtab changes.
I've tested with #5, and can confirm that LLD does indeed write out the expected relocations:
```
0000cc <call_alias_ptr>:
0000d0: 23 80 80 80 80 00 | get_global 0
0000d1: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
0000d6: 41 10 | i32.const 16
0000d8: 6b | i32.sub
0000d9: 22 00 | tee_local 0
0000db: 24 80 80 80 80 00 | set_global 0
0000dc: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
0000e1: 20 00 | get_local 0
0000e3: 41 81 80 80 80 00 | i32.const 1
0000e4: R_WEBASSEMBLY_TABLE_INDEX_SLEB 7
0000e9: 36 02 08 | i32.store 2 8
0000ec: 10 81 80 80 80 00 | call 1 <direct_fn>
0000ed: R_WEBASSEMBLY_FUNCTION_INDEX_LEB 7
0000f2: 21 01 | set_local 1
0000f4: 20 00 | get_local 0
0000f6: 41 10 | i32.const 16
0000f8: 6a | i32.add
0000f9: 24 80 80 80 80 00 | set_global 0
0000fa: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
0000ff: 20 01 | get_local 1
000101: 0b | end
000102 <call_direct_ptr>:
000106: 23 80 80 80 80 00 | get_global 0
000107: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
00010c: 41 10 | i32.const 16
00010e: 6b | i32.sub
00010f: 22 00 | tee_local 0
000111: 24 80 80 80 80 00 | set_global 0
000112: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
000117: 20 00 | get_local 0
000119: 41 81 80 80 80 00 | i32.const 1
00011a: R_WEBASSEMBLY_TABLE_INDEX_SLEB 2
00011f: 36 02 08 | i32.store 2 8
000122: 10 81 80 80 80 00 | call 1 <direct_fn>
000123: R_WEBASSEMBLY_FUNCTION_INDEX_LEB 2 <call_direct>
000128: 21 01 | set_local 1
00012a: 20 00 | get_local 0
00012c: 41 10 | i32.const 16
00012e: 6a | i32.add
00012f: 24 80 80 80 80 00 | set_global 0
000130: R_WEBASSEMBLY_GLOBAL_INDEX_LEB 0
000135: 20 01 | get_local 1
000137: 0b | end
```
We have two instances of `call 1 <direct_fn>`, but the relocations are different indices. Clearly wasm-objdump is going to have to be updated to understand the new relocation format... maybe wait until the symtab format is stable.
Repository:
rL LLVM
https://reviews.llvm.org/D42095
More information about the llvm-commits
mailing list