[PATCH] D43391: [WebAssembly] Separate out InputGlobal from InputChunk

Mon Feb 19 10:38:30 PST 2018

sbc100 added a comment.

In https://reviews.llvm.org/D43391#1011480, @ncw wrote:

> I had a great idea on this last night - I think I've been misrepresenting globals somewhat. This came to me after Andreas Rossberg answered a question I had about globals.
>
> They're actually more like //functions// than data! Or rather, they are used/accessed as data at runtime, but the Globals section of the Wasm file actually contains a function body for each global, which contains executable code that's run through the interpreter to give the global its initial value.
>
> I kept saying "globals are like a chunk in every way except that they don't have relocations" - but that's wrong, they //should// have relocations! The "body" for a global can contain a get_global instruction, which requires a relocation for its operand, so what's really been misleading is that the linking conventions for wasm simply forgot to mention that the linker needs to handle a "reloc.GLOBAL" section.
>
> I know what you're thinking - "Nick, the clang front-end currently doesn't emit globals that contain a get_global instruction, so we don't need to process relocations for them".
>
> But the linking conventions should still specify it, regardless of the fact that the clang front-end doesn't yet emit them. One thing's sure, fPIC and shared-objects and threading will find more uses for globals than we had before.
>
> More to the point - the "wasmy" way to think of globals is as a runtime data register, which is packaged in the wasm file with a function body used to initialize it. That could be reflected in LLD's model. (A typical function body for a global is something like a single "ret i32 <immediate>" instruction, or "i32.const NNN" in wasm assembly, but a few other forms are legal.)
>
> To conclude: I'm not trying to complicate things for globals, I'm just trying to use exactly the same tried-and-tested abstraction for them, that LLD is already right now using for functions and segments, rather than try and make up something new.
>
> And since globals can contain relocations after all, they look a lot more like functions; treating them as chunks like the rest really should give us basically the least amount of code overall, as well as ensuring the various symbol types work in the same way for "free".
>
> Edit - I'll see if I have any time on Monday to confirm my conjecture on the potential simplification of the code, by updating this PR to model globals as a chunk like functions. I had been intending that later on as a tidy-up, before realising that globals would actually benefit from relocation processing.

I think the important distinction is "OutputSection" vs "SyntheticOutputSection". The former is constructed in parallel based on memcpy+relocations. The latter is created from scratch by the linker. You are suggesting that we might want to turn the global section into a non-synthetic section. We might want to do that one day, but I don't think we should for this initial patch. I think we are already getting a bit ahead of ourselves by even including first-class globals in this patch (I was hoping to switch to the symbol table without introducing a new symbol type, but I don't really think that can be avoided).
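
To make the distinction concrete, here is a rough sketch of the two styles. The types and function names (InputChunk, Reloc, writeChunk, synthesizeI32ConstInit) are hypothetical simplifications for illustration, not LLD's actual classes. It assumes relocatable operands are stored as 5-byte padded ULEB128 values so they can be patched in place.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified types; not LLD's real data structures.
struct Reloc {
  size_t Offset;     // byte offset of a 5-byte padded ULEB128 operand
  uint32_t NewValue; // value to patch in, e.g. a final global index
};

struct InputChunk {
  std::vector<uint8_t> Data; // raw bytes copied from the object file
  std::vector<Reloc> Relocs;
};

// Encode Value as a ULEB128 padded to exactly 5 bytes, so patching
// never changes the size of the chunk.
void writePaddedULEB128(uint8_t *Buf, uint32_t Value) {
  for (int I = 0; I < 4; ++I) {
    Buf[I] = static_cast<uint8_t>((Value & 0x7f) | 0x80);
    Value >>= 7;
  }
  Buf[4] = static_cast<uint8_t>(Value & 0x7f);
}

// Non-synthetic style: memcpy the input bytes, then apply relocations.
// Each chunk writes to its own output range, so chunks are independent.
void writeChunk(const InputChunk &C, uint8_t *Out) {
  std::memcpy(Out, C.Data.data(), C.Data.size());
  for (const Reloc &R : C.Relocs) {
    assert(R.Offset + 5 <= C.Data.size() && "relocation out of range");
    writePaddedULEB128(Out + R.Offset, R.NewValue);
  }
}

// Append Value as a minimal-length signed LEB128.
void appendSLEB128(std::vector<uint8_t> &Out, int32_t Value) {
  bool More = true;
  while (More) {
    uint8_t Byte = Value & 0x7f;
    Value >>= 7; // arithmetic shift on all mainstream compilers
    bool SignBit = (Byte & 0x40) != 0;
    More = !((Value == 0 && !SignBit) || (Value == -1 && SignBit));
    if (More)
      Byte |= 0x80;
    Out.push_back(Byte);
  }
}

// Synthetic style: the linker builds the bytes from its own model.
// Here: an initializer of the form "i32.const Value; end".
std::vector<uint8_t> synthesizeI32ConstInit(int32_t Value) {
  std::vector<uint8_t> Out;
  Out.push_back(0x41); // i32.const opcode
  appendSLEB128(Out, Value);
  Out.push_back(0x0b); // end opcode
  return Out;
}
```

In the first style the linker never needs to understand the bytes beyond the relocation offsets; in the second it has to know how to encode everything it emits, but needs no input bytes at all.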

Repository:
  rLLD LLVM Linker

https://reviews.llvm.org/D43391