[llvm-dev] [LLD] Adding WebAssembly support to lld

Rui Ueyama via llvm-dev llvm-dev at lists.llvm.org
Wed Jul 12 15:23:36 PDT 2017


On Wed, Jul 12, 2017 at 11:31 AM, Sam Clegg <sbc at chromium.org> wrote:

> On Mon, Jul 10, 2017 at 4:13 PM, Rui Ueyama via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
> > Sorry for the belated response. I was on vacation last week. A couple of
> > thoughts on this patch and the story of webassembly linking.
>
> And I'm about to be on (mostly) vacation for the next 3 weeks :)
>
> >
> > - This patch adds wasm support as yet another major architecture besides
> > ELF and COFF. That is fine and actually aligned with the design principle
> > of the current lld. Wasm is probably more different from both than ELF is
> > from COFF, and the reason why we separated COFF and ELF was that they are
> > different enough that it is easier to handle them separately than to
> > write a complex compatibility layer for the two. So I think that is the
> > right design choice. That being said, some files are unnecessarily copied
> > to all targets. In particular, Error.{cpp,h} and Memory.{h,cpp} need to be
> > merged because they are mostly identical.
>
> I concur.  However, would you accept the wasm port landing first, and
> then factoring some kind of library out of the 3 backends after that?
>  Personally I would prefer to land the initial version without
> touching the ELF/COFF backends and refactor in a second pass.


Yes, we can do that later.

> > - I can imagine that you would eventually want to support two modes of
> > wasm object files. In one form, object files are represented in the
> > compact format using LEB128 encoding, and the linker has to decode and
> > re-encode LEB128 instruction streams. In the other form, they are still
> > in LEB128 but use the full 5 bytes for 4-byte numbers, so that you can
> > just concatenate them without decoding/re-encoding. Which mode do you
> > want to make the default? The latter should be much faster than the
> > former (or the former is probably unnecessarily slow), and because the
> > regular compile-link-run cycle is very important for developers, I'd
> > guess that making the latter the default is a reasonable choice,
> > although this patch implements the former. What do you think about it?
>
> Yes, currently relocatable wasm files (as produced by clang) use fixed
> width LEB128 (padded to five bytes) for any relocation targets.  This
> allows the linker to trivially apply relocations and blindly
> concatenate data and code sections.  We specifically avoid any
> instruction decoding in the linker.  The plan is to add an optional
> pass over the generated code section of an executable file to compress
> the relocation targets to their normal LEB128 size.  Whether or not to
> make this the default is TBD.
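
To make the padded encoding concrete, here is a minimal sketch in Python (not
lld code). The names encode_uleb128_padded, apply_relocation and link, and the
(body, relocations) input shape, are made up for illustration; the only thing
taken from the description above is that each relocation target occupies
exactly five bytes, so it can be patched in place after concatenation.

```python
FIELD_WIDTH = 5  # bytes reserved for every relocatable 32-bit LEB128 field


def encode_uleb128_padded(value):
    """Encode an unsigned 32-bit value as LEB128 padded to FIELD_WIDTH bytes."""
    if not 0 <= value < 1 << 32:
        raise ValueError("expected an unsigned 32-bit value")
    out = bytearray()
    for i in range(FIELD_WIDTH):
        byte = (value >> (7 * i)) & 0x7F
        if i < FIELD_WIDTH - 1:
            byte |= 0x80  # continuation bit on every byte except the last
        out.append(byte)
    return bytes(out)


def decode_uleb128(data, offset=0):
    """Decode unsigned LEB128 at `offset`; return (value, bytes consumed)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos - offset


def apply_relocation(section, offset, value):
    """Overwrite the fixed-width field at `offset`; the size never changes."""
    section[offset:offset + FIELD_WIDTH] = encode_uleb128_padded(value)


def link(inputs):
    """Concatenate section bodies, then patch each relocation in place.

    `inputs` is a hypothetical list of (body, [(offset, value), ...]) pairs,
    where each offset is relative to the start of that input body.
    """
    out = bytearray()
    for body, relocs in inputs:
        base = len(out)
        out += body
        for offset, value in relocs:
            apply_relocation(out, base + offset, value)
    return out


# Two inputs, each one opcode byte followed by a zeroed 5-byte placeholder.
call_a = b"\x10" + bytes(FIELD_WIDTH)
call_b = b"\x10" + bytes(FIELD_WIDTH)
image = link([(call_a, [(1, 7)]), (call_b, [(1, 300)])])
print(image.hex())
```

Because every field has the same width, the linker never has to move bytes or
look at the surrounding instructions; it only needs the offsets.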


Does this strategy make sense?

 - make compilers always emit fixed-width LEB128, so that linkers can link
them just by concatenating them and applying relocations,
 - make the linker emit fixed-width LEB128 by default as well, so that it
can create executables as fast as it can, and
 - write an optional re-encoder which decodes and re-encodes fixed-width
LEB128 to "compress" the final output.

The third one can be an internal linker pass which is invoked when you pass
-O1 or something to the linker, but conceptually it is separated from the
"main" linker.

The rationale behind this strategy is that

- Developers usually want to create outputs as fast as linkers can.
Creating final executables for shipping is (probably by far) less frequent.
I also expect that, if wasm is successful, you'll be compiling and
linking large programs using wasm as a target (on a successful platform,
people start doing something incredible/crazy in general), so the toolchain
performance will matter. You want to optimize it for the regular
compile-link-debug cycle.
- Creating an output just by concatenating input file sections is, I believe,
easier than decoding and re-encoding LEB128 fields. So I think we want to
construct the linker based on that design, instead of directly emitting
variable-size LEB128 fields.
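
As a rough illustration of the optional re-encoding pass, here is a minimal
Python sketch (not lld code). It assumes the pass is handed the offsets of the
padded fields, for example from the relocation records; compress_fields and its
arguments are hypothetical names. A real pass would also have to update any
sizes or offsets that cover the bytes being shrunk, which this sketch ignores.

```python
def encode_uleb128(value):
    """Encode an unsigned value as minimal-length LEB128."""
    if value < 0:
        raise ValueError("expected an unsigned value")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_uleb128(data, offset=0):
    """Decode unsigned LEB128 at `offset`; return (value, bytes consumed)."""
    result = 0
    shift = 0
    pos = offset
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return result, pos - offset


def compress_fields(section, field_offsets):
    """Re-encode each padded field at the given offsets in minimal form."""
    out = bytearray()
    cursor = 0
    for offset in sorted(field_offsets):
        out += section[cursor:offset]          # copy bytes before the field
        value, length = decode_uleb128(section, offset)
        out += encode_uleb128(value)           # shorter encoding, same value
        cursor = offset + length               # skip the padded original
    out += section[cursor:]
    return bytes(out)


# Two opcode bytes, each followed by a 5-byte padded field (values 7 and 300).
padded = b"\x10\x87\x80\x80\x80\x00" + b"\x10\xac\x82\x80\x80\x00"
compact = compress_fields(padded, [1, 7])
print(len(padded), len(compact))
```

The fast path never runs this; it is only the size optimization for outputs
that are going to be shipped.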


> > - Storing the length and a hash value for each symbol in the symbol table
> > may speed up linking. We've learned that finding terminating NULs and
> > computing hash values for symbols is a time-consuming process in the
> > linker.
>
> Yes, I imagine we could even share some of the core symbol table code
> via the above mentioned library?
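
To illustrate the symbol-table point quoted above, here is a minimal Python
sketch of a producer that stores a hash and a length in front of every name, and
a consumer that uses them directly. The record layout (two little-endian 32-bit
integers followed by the name bytes), the choice of FNV-1a, and all function
names are assumptions for illustration only, not the wasm object format.

```python
import struct


def fnv1a_32(data):
    """32-bit FNV-1a hash; an arbitrary choice for this sketch."""
    h = 0x811C9DC5
    for b in data:
        h ^= b
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


def write_symbols(names):
    """Producer side: emit (hash, length, name) records back to back."""
    out = bytearray()
    for name in names:
        out += struct.pack("<II", fnv1a_32(name), len(name)) + name
    return bytes(out)


def read_symbols(blob, num_buckets=8):
    """Consumer side: no scan for a terminating NUL and no rehashing."""
    buckets = [[] for _ in range(num_buckets)]
    pos = 0
    while pos < len(blob):
        name_hash, length = struct.unpack_from("<II", blob, pos)
        pos += 8
        name = blob[pos:pos + length]
        pos += length
        buckets[name_hash % num_buckets].append((name_hash, name))
    return buckets


def find(buckets, name, name_hash):
    """Look a name up using a hash the caller already has."""
    for h, candidate in buckets[name_hash % len(buckets)]:
        if h == name_hash and candidate == name:
            return True
    return False


blob = write_symbols([b"main", b"memcpy"])
buckets = read_symbols(blob)
print(find(buckets, b"main", fnv1a_32(b"main")))
```

The cost of hashing is paid once by whatever writes the object file rather than
on every link.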
>
> >
> >
> >
> > On Thu, Jul 6, 2017 at 3:38 PM, Rafael Avila de Espindola via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >>
> >> Dan Gohman <sunfish at mozilla.com> writes:
> >>
> >> >> Sorry, I meant why that didn't work with ELF (or what else didn't).
> >> >>
> >> >
> >> > The standard executable WebAssembly format does not use ELF, for
> >> > numerous reasons, most visibly that ELF is designed for sparse
> >> > decoding -- headers contain offsets to arbitrary points in the file,
> >> > while WebAssembly's format is designed for streaming decoding. Also,
> >> > as Sam mentioned, there are a lot of conceptual differences. In ELF,
> >> > virtual addresses are a pervasive organizing principle; in
> >> > WebAssembly, it's possible to think about various index spaces as
> >> > virtual address spaces, but not all address-space-oriented
> >> > assumptions apply.
> >>
> >> I can see why you would want your own format for distribution. My
> >> question was really about using ELF for the .o files.
> >>
> >> > It would also be possible for WebAssembly to use ELF ET_REL files just
> >> > for linking; however, telling LLVM and other tools to target ELF tends
> >> > to lead them to assume that the final output is ELF and rely on
> >> > ELF-specific features.
> >>
> >> Things like "the dynamic linker implements copy relocations"?
> >>
> >> Cheers,
> >> Rafael
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev