[LLVMdev] Emscripten: LLVM => JavaScript

Fri Dec 16 19:47:00 PST 2011

On Fri, Dec 16, 2011 at 7:14 PM, Alon Zakai <azakai at mozilla.com> wrote:
>
>
> ----- Original Message -----
>> From: "Eli Friedman" <eli.friedman at gmail.com>
>> To: "Alon Zakai" <azakai at mozilla.com>
>> Cc: llvmdev at cs.uiuc.edu
>> Sent: Thursday, December 15, 2011 7:02:34 PM
>> Subject: Re: [LLVMdev] Emscripten: LLVM => JavaScript
>> On Thu, Dec 15, 2011 at 4:10 PM, Alon Zakai <azakai at mozilla.com>
>> wrote:
>> > On that topic, I see there is an LLVM users page,
>> >
>> > http://llvm.org/Users.html
>> >
>> > - what is the procedure for suggesting adding a project to
>> > there?
>>
>> Send a patch to llvm-commits.
>
> Thanks, I'll do that.
>
>>
>> > The third issue I want to raise is regarding closer
>> > integration with LLVM. Right now, Emscripten uses unmodified
>> > LLVM and Clang, parsing their normal output. There are
>> > however some reasons for integrating more closely, in
>> > particular Emscripten has a problem when all LLVM
>> > optimizations are run. This is not always important for
>> > performance, as a safe subset exists, and we do our own
>> > JS-level optimizations later which overlap somewhat. However,
>> > it would be nice to be able to run all the LLVM optimizations.
>> > The problems we have there are
>> >
>> > 1. i64s and doubles can be on 32-bit alignment, which is
>> >   a problem for a JavaScript implementation with typed arrays
>> >   with a shared buffer, since unaligned reads/writes there
>> >   are impossible to do in a quick way. This can happen
>> >   without optimizations, but is more common there due to
>> >   the next point.
>> >
>> >   I've been told by Rafael Ávila de Espíndola that for this,
>> >   I would need an Emscripten target in LLVM. Would that be
>> >   upstreamable? (With or without Emscripten itself, preferably
>> >   with?)
>>
>> Adding an Emscripten target to clang would be fine. Note that clang
>> might generate unaligned loads anyway, but specifying an appropriate
>> target will ensure it doesn't use such loads unless they are
>> necessary.
>
> In what situation would unaligned loads be necessary? I was
> hoping that unless the code literally did something crazy like
> a load of an 8-byte value from a hardcoded 4-byte aligned
> address (like 0x4), then otherwise "normal" C/C++ code would
> always end up aligned. Is that correct?

For normal unoptimized code, yes, everything should end up aligned.
If you're compiling random C code, you're likely to run into code that
does "something crazy" (like using "__attribute__((packed))") occasionally,
though.  Also, the optimizer will sometimes turn a memcpy into an
unaligned load+store, or a pair of small loads into an unaligned load.
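
As a rough illustration of the packed case (a sketch only, assuming a
GCC/Clang-style compiler and an 8-byte double; the struct and function
names are made up for this example), packing removes the padding that
would normally keep the double on an 8-byte boundary, so reading it
needs either an unaligned load or a byte-wise copy:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical struct for illustration. Without the attribute, `value`
   would sit at offset 8; packing removes the padding and puts it at
   offset 4, which is only 4-byte aligned. */
struct __attribute__((packed)) Record {
    uint32_t tag;
    double value;
};

/* Alignment-agnostic read: copy the bytes instead of dereferencing a
   double pointer. An optimizer is free to turn this memcpy into a
   single (possibly unaligned) 8-byte load when the target allows it. */
double load_double(const void *p) {
    double d;
    memcpy(&d, p, sizeof d);
    return d;
}

/* Reads the packed field through the byte-wise helper above. */
double record_value(const struct Record *r) {
    return load_double((const unsigned char *)r
                       + offsetof(struct Record, value));
}
```

On a target that declares unaligned 8-byte accesses expensive or
unavailable, the copy would presumably stay as smaller aligned
accesses rather than being merged into one wide load.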

>>
>> > 2. Optimization sometimes generates types like i288, which
>> >   Emscripten currently doesn't handle. From an optimizing
>> >   perspective, it isn't yet clear if it would be faster to
>> >   try to directly implement those, or to just break them up
>> >   into more manageable native (32-bit) sizes. Note that even
>> >   i64 is somewhat challenging to implement in a fast way
>> >   on JavaScript, since that environment is really a 32-bit
>> >   one, so it would be best to never do things like combine
>> >   two 32-bit writes into one 64-bit write. It would be nice
>> >   to have an option in LLVM to process the IR/bitcode back
>> >   into having only target-native types, is that possible?
>>
>> All the LLVM targets which use the common code generation
>> infrastructure have access to the legalizer, which handles that sort
>> of thing. It would in theory be possible to write an equivalent that
>> does most of that work on IR, but it's a substantial amount of work
>> without any obvious benefit for existing targets.
>>
>
> Ok, I guess that means I'll need to implement a legalizer. The
> simplest thing would probably be for me to do it in Emscripten,
> because the Emscripten IR is a simpler subset of LLVM IR (and
> I'm already familiar with the codebase). But if it would be
> useful for LLVM to have an IR pass that does legalization,
> I'd consider doing it in LLVM. Thoughts?

I don't think it would be very useful for the in-tree backends unless
we make major changes to the way instruction selection works;
legalization is closely integrated with other transformations.  That
said, the question does come up periodically on llvmdev; if you are
willing to write something, I'm sure some people would appreciate it.
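
For readers unfamiliar with the term, here is a minimal sketch of what
legalizing a single operation amounts to, assuming a 32-bit-native
target: a 64-bit add is expanded into two 32-bit adds plus a carry.
The type and function names are hypothetical, and this is not how the
LLVM legalizer is implemented; it only shows the shape of the
transformation. Wider types such as i288 would be handled the same
way, with more parts and a carry chain between them.

```c
#include <stdint.h>

/* Hypothetical representation of an i64 as two native 32-bit parts. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} u64_parts;

/* Split a 64-bit value into its low and high 32-bit halves. */
u64_parts split64(uint64_t v) {
    u64_parts p;
    p.lo = (uint32_t)v;
    p.hi = (uint32_t)(v >> 32);
    return p;
}

/* Recombine the halves (used here only to check the result). */
uint64_t join64(u64_parts p) {
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* "Legalized" 64-bit add using only 32-bit operations. */
u64_parts add64(u64_parts a, u64_parts b) {
    u64_parts r;
    r.lo = a.lo + b.lo;              /* wraps modulo 2^32 */
    uint32_t carry = r.lo < a.lo;    /* 1 if the low add overflowed */
    r.hi = a.hi + b.hi + carry;      /* high halves plus carry */
    return r;
}
```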

-Eli