[LLVMdev] MC-JIT Design

Daniel Dunbar daniel at zuster.org
Mon Nov 15 10:15:09 PST 2010


Hi all,

As promised, here is the rough design of the upcoming MC-JIT*.
Feedback appreciated!

(*) To be clear, we are only calling it the MC-JIT until we have
finished killing the old one. When I say JIT below, I mean the MC-JIT;
I am basically ignoring the existing JIT completely. I will keep
things API compatible whenever possible, of course.

I see two main design directions for the JIT:

--

#1 (aka MCJIT) - We make a new MCJITStreamer which communicates with
the JIT engine to arrange to plop code in the right place and update
various state information.

This is the most obvious approach; it is roughly similar to the way
the existing JIT works, and it is the way the proposed MC-JIT patches
work (see the MCJITState object).
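
For concreteness, a very rough sketch of what that streamer might look
like is below. The MCJITState helpers and the exact MCStreamer hook
signatures are simplified and partly made up here; this is only meant
to illustrate the shape of approach #1:

  #include "llvm/MC/MCStreamer.h"
  #include "llvm/MC/MCInst.h"
  using namespace llvm;

  // Illustration only: MCJITState and its helper methods are hypothetical,
  // and only a couple of the MCStreamer callbacks are shown.
  class MCJITStreamer : public MCStreamer {
    MCJITState &JIT;   // JIT-side bookkeeping: where code goes, symbol addrs
  public:
    MCJITStreamer(MCContext &Ctx, MCJITState &JIT)
        : MCStreamer(Ctx), JIT(JIT) {}

    // As the backend emits labels and instructions, tell the JIT engine
    // where everything landed so it can resolve calls, register symbols,
    // and update its other state.
    virtual void EmitLabel(MCSymbol *Symbol) {
      JIT.noteSymbolAddress(Symbol, JIT.currentAddress());
    }
    virtual void EmitInstruction(const MCInst &Inst) {
      JIT.encodeAndPlace(Inst);   // plop the encoded bytes in the right place
    }
    // ... plus the rest of the MCStreamer interface (sections, data, fixups).
  };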

It also happens to not be the approach I want to take. :)


#2 (aka FOOJIT) - MC grows a new "pure" backend, which is designed
around representing everything that "can be run" on a target platform.
This is very connected to the inherent capabilities of the hardware /
OS, and is usually a superset** of what the native object format
(Mach-O, ELF, COFF) can represent.

The "pure" backend defines a hard (but non-stable) object file format
which is more or less a direct encoding of the native MC APIs (it is
not stable, so it can directly encode things like FixupKind enum
values).

I don't have a name for this format, so for now I will call it FOO.
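
To give a feel for what "a direct encoding of the native MC APIs"
could mean, here is a purely illustrative sketch of one record such a
format might contain; the field layout is invented, not a proposal:

  #include <stdint.h>

  // Illustration only: one plausible shape for a FOO fixup record. Because
  // the format is non-stable, it can store MC-internal values such as the
  // llvm::MCFixupKind enum verbatim, instead of translating them into a
  // native relocation type.
  struct FOOFixup {
    uint32_t Offset;        // offset of the fixup within its fragment
    uint32_t SymbolIndex;   // index into the FOO symbol table
    int64_t  Addend;
    uint32_t Kind;          // raw MCFixupKind value, written as-is
  };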

The "MC-JIT" then becomes something more like a "FOO-JIT": it is
architected as a consumer of "FOO" object files that are produced over
time. The basic architecture is quite simple:
 (a) Load a module, emit it as a "FOO" object.
 (b) Load the object into a worklist, scan for undefined symbols,
dynamically emit more "FOO" modules.
 (c) Iterate until no undefined symbols remain.
 (d) Execute code -- if we hit a lazy compilation callback, go back to (a).
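
To make the loop concrete, here is a minimal sketch of (a)-(d).
Everything in it (emitFOO, the FOOObject fields) is hypothetical and
just mirrors the steps above:

  #include <set>
  #include <string>
  #include <vector>

  // Hypothetical in-memory view of a FOO object: just the symbol tables
  // the worklist algorithm needs.
  struct FOOObject {
    std::vector<std::string> Defined;
    std::vector<std::string> Undefined;
  };

  // (a) Run codegen for whatever provides 'Name' and return it as FOO.
  FOOObject emitFOO(const std::string &Name);

  void materialize(const std::string &Entry) {
    std::set<std::string> Seen;            // symbols already emitted/requested
    std::vector<FOOObject> Worklist;
    Worklist.push_back(emitFOO(Entry));                  // (a)
    while (!Worklist.empty()) {                          // (c) iterate to fixpoint
      FOOObject Obj = Worklist.back();
      Worklist.pop_back();
      Seen.insert(Obj.Defined.begin(), Obj.Defined.end());
      for (const std::string &Sym : Obj.Undefined)       // (b) scan undefined syms
        if (Seen.insert(Sym).second)                     // first time we see it?
          Worklist.push_back(emitFOO(Sym));              // emit another FOO module
    }
    // (d) Relocate, fix up, and jump to Entry; a lazy compilation callback
    // re-enters materialize() with the name of the function it needs.
  }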

(**) It more or less *must* be a superset, since object formats
usually don't bother to represent things which can't be run. Features
which require OS emulation are an obvious exception. As a concrete
example, consider the implementation of thread local storage. Each
platform typically chooses one implementation approach and limits its
format to supporting that, but the hardware itself supports many more
implementation approaches.

--

I apologize if my description is a bit terse, but I hope the basic
infrastructure comes through. I will make some pretty diagrams for it
at some point (hopefully before the next dev mtg, hahaha hmmm....).

Here are the reasons I want to follow approach #2:

1. It makes the JIT process look much more like the standard
compilation process. In fact, from the FOOJIT's perspective, it could
even run the compiler out of process to produce "FOO" object files,
with no real change in behavior.

This has two main implications:
 a. We are leveraging much more of the existing infrastructure.
 b. We can use more of the existing tools to test and debug the JIT.

2. It forces us to treat the JIT as a separate "subtarget".
 a. In reality, this is already true. The compiler needs to know it is
targeting a JIT in terms of what features are available (indirect
stubs? exception tables? thread local storage?), but the current
design papers over this. This design forces us to acknowledge that
fact up front, and should make the architecture more understandable.

3. It eases testing and debugging.
 a. We can build new tools to test the FOOJIT, for example, a tool
that just loads a couple FOO object files and runs them, but without
needing to do codegen. Since we can already use the existing tools to
work with the FOO objects, this basically gives us a new testing entry
point into the JIT.
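
For illustration, such a tool could be a tiny driver along these lines
(the names foo-run, FOOObject, and FOOLinker are all made up); note
that it never touches codegen:

  #include <cstdio>

  // Hypothetical loader-side interfaces; the point is that this tool only
  // exercises the "load FOO objects, link them, run" half of the JIT.
  struct FOOObject {
    static FOOObject loadFromFile(const char *Path);
  };
  struct FOOLinker {
    void add(const FOOObject &Obj);
    bool hasUndefinedSymbols() const;
    int run(const char *EntrySymbol);   // jump to the entry point
  };

  int main(int argc, char **argv) {
    FOOLinker Linker;
    for (int i = 1; i < argc; ++i)      // each argument names a FOO object file
      Linker.add(FOOObject::loadFromFile(argv[i]));
    if (Linker.hasUndefinedSymbols()) {
      std::fprintf(stderr, "error: unresolved symbols\n");
      return 1;
    }
    return Linker.run("main");
  }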

--

Some caveats of this design:

1. The initial implementation will probably work very much as
described: it will actually write "FOO" object files to memory and
then load them.

In practice, we would like to avoid the performance overhead of this
copy. My plan is that we would eventually have multiple
implementations of the FOO object writer: one would write out the
serialized form, while another would splat directly into the process
memory.

We could do other fancy things following the same approach, for
example allowing the JIT to pin symbols to their actual addresses, so
that the assembler can do the optimal relaxation for where the code
actually lands in memory.
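
As a rough sketch of what "multiple writers" could mean, assuming a
hypothetical FOOWriter interface (none of this exists yet):

  #include <stdint.h>
  #include <cstring>

  // Hypothetical interface both writers would implement.
  struct FOOWriter {
    virtual ~FOOWriter() {}
    // Report where a chunk of code/data will live, so the assembler can do
    // relaxation and fixups against the real final address.
    virtual uint64_t reserve(uint64_t Size, unsigned Alignment) = 0;
    virtual void write(uint64_t Addr, const void *Bytes, uint64_t Size) = 0;
  };

  // The "splat directly into process memory" writer: no serialized file and
  // no copy; symbols end up pinned to their actual addresses.
  struct FOOMemoryWriter : FOOWriter {
    uint64_t reserve(uint64_t Size, unsigned Alignment) {
      return allocateExecutableMemory(Size, Alignment);  // hypothetical allocator
    }
    void write(uint64_t Addr, const void *Bytes, uint64_t Size) {
      std::memcpy(reinterpret_cast<void *>(Addr), Bytes, Size);
    }
    static uint64_t allocateExecutableMemory(uint64_t Size, unsigned Alignment);
  };
  // A serializing sibling (say, FOOFileWriter) would instead append everything
  // to a buffer and produce the FOO file form used by the initial
  // implementation (caveat #1 above).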

2. It requires some more up front work, in that there is more stuff to
build. However, I feel it is a much stronger design, so I expect this
to pay off relatively quickly.

3. Some JIT tricks become a bit less obvious. For example, in a JIT
it is natural, when seeing an undefined symbol "bar", to go ahead and
see if you can find "bar" and immediately generate code for it. You
can't do that in the FOOJIT model, because you won't know "bar" is
undefined until you read the object back.

However, in practice one needs to be careful about recursion and
reentrancy, so you have to take care when trying to do things like
this anyway. The FOOJIT forces such tricks to go through a proper API
(the FOO object), which I end up seeing as a feature, not a bug.

--

And, a final word on API compatibility:

As mentioned before, I have no *plan* to break any existing public
interface to the JIT. The goal is that we eventually have a strict
superset of the current functionality.

The actual plan will be to roll out the FOOJIT in tree with some
option to allow clients to easily pick the implementation. A tentative
goal would be to have the FOOJIT working well enough in 2.9 that
clients can test it against the released LLVM, and then to make it the
default in 2.10 (*grin*).

Thoughts?

 - Daniel


