[PATCH] D43264: [WebAssembly] Add explicit symbol table

Nicholas Wilson via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Feb 15 17:38:40 PST 2018

ncw added a comment.

In https://reviews.llvm.org/D43264#1009526, @sbc100 wrote:

> To make this even more complicated to follow, we previously modeled data symbols as wasm globals...  i.e. we use the wasm globals in the object format to store the addresses in linear memory of data symbols.   A little confusing, and one reason why we are making this change to have a more explicit

> We could simply store the wasm global data in the DefinedGlobal itself, rather than having it point to an InputGlobal.  I think the problem there is when more than one symbol aliases the same global.   But, we don't have that case yet so perhaps we can push that down the line a little.   Let me see if I can simplify things.

I'm not sure it would be simpler to put the definition (initializer) for the symbol in the symbol itself. Firstly, as you say, it's conceptually wrong because it doesn't work with aliases (unused in this case, but it demonstrates that the model wouldn't be right). Secondly, it breaks the link between defined symbols and the "chunk" containing their initializer. Thirdly, and worst, it prevents unifying the handling of globals and data (e.g. with comdat for globals, symbols could be initialized in the same way for all types).

I think we should aim to have globals work in the same way in LLD as data/function symbols, to keep the mental model simple, rather than arbitrarily constraining them with different resolution semantics (e.g. coding up a way of defining global symbols that doesn't allow aliases).
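
To illustrate the aliasing point, here is a minimal sketch of the "symbol points at a chunk" model. It is not the code in this patch: `InputGlobalSketch`, `GlobalSymbolSketch` and `assignIndices` are hypothetical names invented for the example, and the initializer is simplified to a single integer constant.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a chunk that owns a global's definition.
struct InputGlobalSketch {
  bool isMutable;
  int64_t initValue;    // simplified initializer: one integer constant
  uint32_t outputIndex; // filled in once, when output indices are assigned
};

// Hypothetical symbol: just a name plus a pointer to its definition.
struct GlobalSymbolSketch {
  std::string name;
  InputGlobalSketch *chunk; // nullptr means undefined (imported)
  bool isDefined() const { return chunk != nullptr; }
};

// Indices are assigned per chunk, not per symbol, so every alias of a
// chunk automatically sees the same output index.
inline void assignIndices(std::vector<InputGlobalSketch *> &chunks,
                          uint32_t firstIndex) {
  for (InputGlobalSketch *chunk : chunks)
    chunk->outputIndex = firstIndex++;
}
```

Two symbols that hold the same `chunk` pointer are aliases and resolve to one index, whereas storing the initializer inside each symbol would need extra bookkeeping to keep the copies in sync.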

In https://reviews.llvm.org/D43264#1009514, @ruiu wrote:

> - Global variables basically consist of types and indices

And an initializer! That's what distinguishes a defined global from an imported/undefined one, in the same way that the initializer (bss or data section) does for a data symbol, or the function body does for a function symbol.

> It looks more and more like global variables are just symbols with indices. Do you agree? I believe it can naturally be modeled as a symbol.
> That makes me wonder what the current DefinedGlobal represents. Is this a symbol that lives in the linear memory or global variable?

DefinedGlobal is a symbol representing storage that's only accessible via its global index, rather than via a linear memory address. "Defined" means the Wasm file allocates that storage, rather than importing the storage from another DSO. We definitely need the distinction between defined and undefined for globals.

The key takeaway is that a global is a storage/memory slot that can be allocated by a binary, just like a data symbol in ELF; only the addressing model is different (via index rather than RAM address).
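
As a rough sketch of that mental model (hypothetical names throughout, not LLD's actual classes): every symbol kind is either defined, meaning this binary provides the storage or body, or undefined, and the kinds differ only in how the defined thing is addressed.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical symbol kinds for the sketch.
enum class KindSketch { Function, Data, Global };

struct SymbolSketch {
  KindSketch kind;
  bool hasDefinition; // function body, data segment, or global initializer
  uint32_t location;  // an index or a linear memory address, by kind
};

// Describe how a symbol is addressed; the defined/undefined split is the
// same for every kind, only the addressing model changes.
inline std::string describe(const SymbolSketch &sym) {
  if (!sym.hasDefinition)
    return "undefined";
  switch (sym.kind) {
  case KindSketch::Function:
    return "defined, addressed by function index";
  case KindSketch::Data:
    return "defined, addressed by linear memory address";
  case KindSketch::Global:
    return "defined, addressed by global index";
  }
  return "unknown";
}
```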

Thanks for looking into this!

  rLLD LLVM Linker

