<div dir="ltr">For the record, I left review comments on <a href="https://reviews.llvm.org/D34851">https://reviews.llvm.org/D34851</a>.</div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jul 12, 2017 at 4:36 PM, Sam Clegg <span dir="ltr"><<a href="mailto:sbc@google.com" target="_blank">sbc@google.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="HOEnZb"><div class="h5">On Wed, Jul 12, 2017 at 3:23 PM, Rui Ueyama <<a href="mailto:ruiu@google.com">ruiu@google.com</a>> wrote:<br>

> On Wed, Jul 12, 2017 at 11:31 AM, Sam Clegg <<a href="mailto:sbc@chromium.org">sbc@chromium.org</a>> wrote:<br>

>><br>

>> On Mon, Jul 10, 2017 at 4:13 PM, Rui Ueyama via llvm-dev<br>

>> <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>

>> > Sorry for the belated response. I was on vacation last week. A couple of<br>

>> > thoughts on this patch and the story of webassembly linking.<br>

>><br>

>> And I'm about to be on (mostly) vacation for the next 3 weeks :)<br>

>><br>

>> ><br>

>> > - This patch adds wasm support as yet another major architecture<br>

>> > besides<br>

>> > ELF and COFF. That is fine and actually aligned to the design principle<br>

>> > of<br>

>> > the current lld. Wasm probably differs from both more than ELF differs from COFF,<br>

>> > and<br>

>> > the reason why we separated COFF and ELF was because they are different<br>

>> > enough that it is easier to handle them separately rather than writing a<br>

>> > complex compatibility layer for the two. So that is, I think, the right<br>

>> > design<br>

>> > choice. That being said, some files are unnecessarily copied to all<br>

>> > targets.<br>

>> > Particularly, Error.{cpp,h} and Memory.{h,cpp} need to be merged because<br>

>> > they are mostly identical.<br>

>><br>

>> I concur.  However, would you accept the wasm port landing first, and<br>

>> then factoring some kind of library out of the 3 backends after that?<br>

>>  Personally I would prefer to land the initial version without<br>

>> touching the ELF/COFF backends and refactor in a second pass.<br>

><br>

><br>

> Yes, we can do that later.<br>

><br>

>> > - I can imagine that you would eventually want to support two modes of<br>

>> > wasm<br>

>> > object files. In one form, object files are represented in the compact<br>

>> > format using LEB128 encoding, and the linker has to decode and re-encode<br>

>> > LEB128 instruction streams. In the other form, they are still in LEB128<br>

>> > but<br>

>> > use the full 5 bytes for 4-byte numbers, so that you can just concatenate<br>

>> > them<br>

>> > without decoding/re-encoding. Which mode do you want to make default?<br>

>> > The<br>

>> > latter should be much faster than the former (or the former is probably<br>

>> > unnecessarily slow), and because the regular compile-link-run cycle is<br>

>> > very<br>

>> > important for developers, I'd guess that making the latter default is a<br>

>> > reasonable choice, although this patch implements the former. What do<br>

>> > you<br>

>> > think about it?<br>

>><br>

>> Yes, currently relocatable wasm files (as produced by clang) use fixed<br>

>> width LEB128 (padded to five bytes) for any relocation targets.  This<br>

>> allows the linker to trivially apply relocations and blindly<br>

>> concatenate data and code sections.  We specifically avoid any<br>

>> instruction decoding in the linker.   The plan is to add an optional<br>

>> pass over the generated code section of an executable file to compress<br>

>> the relocation targets to their normal LEB128 size.  Whether or not to<br>

>> make this the default is TBD.<br>
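To make the fixed-width scheme concrete, here is a minimal sketch of how a padded 5-byte ULEB128 relocation target can be patched in place without touching the surrounding bytes. The helper names (writePaddedUleb32, readUleb32) are made up for illustration and are not taken from the patch under review; the only facts assumed are the standard LEB128 encoding and the five-byte padding described above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Width of a padded 32-bit ULEB128 field: ceil(32 / 7) = 5 bytes.
constexpr std::size_t kPaddedWidth = 5;

// Hypothetical helper: overwrite the 5-byte field at `offset` with `value`.
// The continuation bit is set on the first four bytes even when the
// remaining bits are zero, so the field never changes size.
void writePaddedUleb32(std::vector<uint8_t> &buf, std::size_t offset,
                       uint32_t value) {
  assert(offset + kPaddedWidth <= buf.size() && "field out of range");
  for (std::size_t i = 0; i < kPaddedWidth; ++i) {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (i + 1 < kPaddedWidth)
      byte |= 0x80; // continuation bit
    buf[offset + i] = byte;
  }
}

// Hypothetical helper: decode a ULEB128 value (padded or minimal).
uint32_t readUleb32(const std::vector<uint8_t> &buf, std::size_t offset) {
  uint32_t result = 0;
  for (unsigned shift = 0; shift < 35; shift += 7) {
    uint8_t byte = buf.at(offset++);
    result |= static_cast<uint32_t>(byte & 0x7f) << shift;
    if (!(byte & 0x80))
      break;
  }
  return result;
}
```

Because the field width is constant, applying a relocation is a plain overwrite: no byte after the field moves, which is what lets the linker skip instruction decoding.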

><br>

><br>

> Does this strategy make sense?<br>

><br>

>  - make compilers always emit fixed-width LEB128, so that linkers can link<br>

> them just by concatenating them and applying relocations,<br>

>  - make the linker emit fixed-width LEB128 by default as well, so that it<br>

> can create executables as fast as it can, and<br>

>  - write an optional re-encoder which decodes and re-encodes fixed-width<br>

> LEB128 to "compress" the final output.<br>

><br>

> The third one can be an internal linker pass which is invoked when you pass<br>

> -O1 or something to the linker, but conceptually it is separated from the<br>

> "main" linker.<br>
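As a rough illustration of the optional third step, the sketch below shrinks padded fields to their minimal LEB128 size. It assumes the pass is handed the offsets of the padded fields (for example from the relocation records), so it still does no instruction decoding. The function names are hypothetical, and a real pass would also have to rewrite any size prefixes that cover the shrunk bytes, which this sketch ignores.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append the shortest ULEB128 encoding of `value` to `out`.
void appendMinimalUleb32(std::vector<uint8_t> &out, uint32_t value) {
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0)
      byte |= 0x80; // more bytes follow
    out.push_back(byte);
  } while (value != 0);
}

// Hypothetical pass: copy `in`, replacing the 5-byte padded field at each
// offset in `fields` with its minimal encoding. `fields` is assumed to be
// sorted ascending and non-overlapping.
std::vector<uint8_t> compressPaddedFields(const std::vector<uint8_t> &in,
                                          const std::vector<std::size_t> &fields) {
  std::vector<uint8_t> out;
  out.reserve(in.size());
  std::size_t pos = 0;
  for (std::size_t field : fields) {
    assert(field >= pos && field + 5 <= in.size() && "bad field offset");
    // Bytes between fields are copied verbatim.
    out.insert(out.end(), in.begin() + pos, in.begin() + field);
    uint32_t value = 0;
    for (unsigned i = 0; i < 5; ++i)
      value |= static_cast<uint32_t>(in[field + i] & 0x7f) << (7 * i);
    appendMinimalUleb32(out, value);
    pos = field + 5;
  }
  out.insert(out.end(), in.begin() + pos, in.end());
  return out;
}
```

Keeping this as a separate function over the finished output mirrors the idea that the compressor is conceptually separate from the main linker.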

<br>

</div></div>IIUC that is exactly the strategy I am suggesting.   Perhaps my<br>

description of it was less clear.   The current implementation does this,<br>

 with the caveat that the final (optional) compression phase is not yet<br>

implemented :)<br>

<div class="HOEnZb"><div class="h5"><br>

><br>

> The rationale behind this strategy is that<br>

><br>

> - Developers usually want to create outputs as fast as linkers can. Creating<br>

> final executables for shipping is (probably by far) less frequent. I also<br>

> expect that, if wasm is successful, you'll be compiling and linking<br>

> large programs using wasm as a target (on a successful platform, people<br>

> start doing something incredible/crazy in general), so the toolchain<br>

> performance will matter. You want to optimize it for regular<br>

> compile-link-debug cycle.<br>

> - Creating an output just by concatenating input file sections is I believe<br>

> easier than decoding and re-encoding LEB128 fields. So I think we want to<br>

> construct the linker based on that design, instead of directly emitting<br>

> variable-size LEB128 fields.<br>
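A minimal sketch of the concatenation-based design, under the assumption that each input contributes a byte chunk plus the offsets of its padded relocation fields. InputChunk, Output and concatenate are illustrative names, not types from lld. The only bookkeeping needed is rebasing each relocation offset by the chunk's position in the output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical input: section bytes plus offsets of 5-byte relocation fields.
struct InputChunk {
  std::vector<uint8_t> bytes;
  std::vector<std::size_t> relocOffsets;
};

// Hypothetical output: concatenated bytes and rebased relocation offsets.
struct Output {
  std::vector<uint8_t> bytes;
  std::vector<std::size_t> relocOffsets;
};

Output concatenate(const std::vector<InputChunk> &chunks) {
  Output out;
  for (const InputChunk &chunk : chunks) {
    // Where this chunk starts in the output.
    std::size_t base = out.bytes.size();
    for (std::size_t r : chunk.relocOffsets) {
      assert(r + 5 <= chunk.bytes.size() && "relocation outside chunk");
      out.relocOffsets.push_back(base + r);
    }
    // Bytes are copied untouched; fixed-width fields keep every offset valid.
    out.bytes.insert(out.bytes.end(), chunk.bytes.begin(), chunk.bytes.end());
  }
  return out;
}
```

With variable-size fields, by contrast, every patch could shift all later offsets, so the offsets could no longer be computed in a single copy pass like this.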

><br>

><br>

>> > - Storing the length and a hash value for each symbol in the symbol<br>

>> > table<br>

>> > may speed up linking. We've learned that finding terminating NULs and<br>

>> > computing hash values for symbols is a time-consuming process in the<br>

>> > linker.<br>
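To illustrate the symbol table suggestion, here is a sketch in which each record carries the name length and a precomputed hash, so the linker neither scans for a NUL terminator nor hashes the name again. The record layout, the type names and the choice of 32-bit FNV-1a as the producer's hash are all assumptions made for this example; no existing wasm object format field is implied.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <unordered_map>

// Hypothetical record as the producer would write it.
struct SymbolRecord {
  uint32_t hash;    // precomputed by the compiler/assembler
  uint32_t length;  // name length in bytes, no terminator needed
  const char *name; // points into the mapped object file
};

struct SymbolKey {
  std::string_view name;
  uint32_t hash;
};

// Hasher that reuses the stored value instead of rehashing the name.
struct StoredHash {
  std::size_t operator()(const SymbolKey &k) const { return k.hash; }
};

struct SameName {
  bool operator()(const SymbolKey &a, const SymbolKey &b) const {
    return a.hash == b.hash && a.name == b.name;
  }
};

// 32-bit FNV-1a, standing in for whatever hash the producer would use.
uint32_t fnv1a(std::string_view s) {
  uint32_t h = 2166136261u;
  for (unsigned char c : s) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// Build a lookup key without calling strlen or hashing the name.
SymbolKey fromRecord(const SymbolRecord &r) {
  assert(r.name != nullptr);
  return {std::string_view(r.name, r.length), r.hash};
}

using SymbolTable = std::unordered_map<SymbolKey, int, StoredHash, SameName>;
```

The linker-side cost per symbol is then one map insertion with a ready-made hash; the hashing work moves to the producer, which runs once per object file rather than once per link.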

>><br>

>> Yes, I imagine we could even share some of the core symbol table code<br>

>> via the above mentioned library?<br>

>><br>

>> ><br>

>> ><br>

>> ><br>

>> > On Thu, Jul 6, 2017 at 3:38 PM, Rafael Avila de Espindola via llvm-dev<br>

>> > <<a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a>> wrote:<br>

>> >><br>

>> >> Dan Gohman <<a href="mailto:sunfish@mozilla.com">sunfish@mozilla.com</a>> writes:<br>

>> >><br>

>> >> >> Sorry, I meant why that didn't work with ELF (or what else didn't).<br>

>> >> >><br>

>> >> ><br>

>> >> > The standard executable WebAssembly format does not use ELF, for<br>

>> >> > numerous<br>

>> >> > reasons, most visibly that ELF is designed for sparse decoding --<br>

>> >> > headers<br>

>> >> > contain offsets to arbitrary points in the file, while WebAssembly's<br>

>> >> > format<br>

>> >> > is designed for streaming decoding. Also, as Sam mentioned, there are<br>

>> >> > a<br>

>> >> > lot<br>

>> >> > of conceptual differences. In ELF, virtual addresses are a pervasive<br>

>> >> > organizing principle; in WebAssembly, it's possible to think about<br>

>> >> > various<br>

>> >> > index spaces as virtual address spaces, but not all<br>

>> >> > address-space-oriented<br>

>> >> > assumptions apply.<br>

>> >><br>

>> >> I can see why you would want your own format for distribution. My<br>

>> >> question was really about using ELF for the .o files.<br>

>> >><br>

>> >> > It would also be possible for WebAssembly to use ELF ET_REL files<br>

>> >> > just<br>

>> >> > for<br>

>> >> > linking, however telling LLVM and other tools to target ELF tends to<br>

>> >> > lead<br>

>> >> > them to assume that the final output is ELF and rely on ELF-specific<br>

>> >> > features.<br>

>> >><br>

>> >> Things like "the dynamic linker implements copy relocations"?<br>

>> >><br>

>> >> Cheers,<br>

>> >> Rafael<br>

>> >> ______________________________<wbr>_________________<br>

>> >> LLVM Developers mailing list<br>

>> >> <a href="mailto:llvm-dev@lists.llvm.org">llvm-dev@lists.llvm.org</a><br>

>> >> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" rel="noreferrer" target="_blank">http://lists.llvm.org/cgi-bin/<wbr>mailman/listinfo/llvm-dev</a><br>

>> ><br>

>> ><br>

>> ><br>


>> ><br>

><br>

><br>

</div></div></blockquote></div><br></div>