[LLVMdev] MC-JIT Design

Tue Nov 16 12:17:14 PST 2010

On 11/16/2010 4:55 AM, Conrado Miranda wrote:
> As previously mentioned here, I think the best design would be to JIT
> code fast (using the FOO type) and then allow the user to build to
> some other format later if he/she wants. Reloading pre-JITed functions
> is a feature I'd like to see, because sometimes you have to JIT fast
> an inefficient function just to get it working and later optimize it.
> Being able to save the functions for later use would be a major
> improvement.
>
> And I know I don't take part in many discussions here (usually I'm just a
> reader), but I'm trying to build a game based on JIT compilation for
> everything, including add-ons, patches and user scripts. So I just
> follow the JIT part of LLVM, but if there is anything I can help with, I'd
> be glad.
>
> Miranda.
>

in my own VM effort (not LLVM based) I have, for a very long time, 
typically worked by producing object files in memory and then 
"linking" them however is needed.

yeah, even for JIT, I usually produce textual ASM, convert this into 
object files (via an "assembler" library), and link these (via a 
"linker" library, which shares the same DLL/SO as the assembler for 
historical reasons).
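
As a rough illustration of the "object files in memory, then link" approach described above, here is a minimal Python sketch of a toy linker. It is not the poster's code: the `ObjectFile`, `Reloc`, and `link` names are hypothetical, the object layout is invented for the example, and only x86-style 32-bit PC-relative relocations are handled.

```python
import struct
from dataclasses import dataclass, field


@dataclass
class Reloc:
    offset: int   # offset of the 4-byte field inside the object's code
    symbol: str   # name of the symbol the field should point at


@dataclass
class ObjectFile:
    code: bytearray
    symbols: dict = field(default_factory=dict)  # name -> offset in code
    relocs: list = field(default_factory=list)


def link(objects, base=0x1000):
    """Place objects back to back at `base` and patch rel32 fields."""
    image = bytearray()
    addresses = {}
    starts = []
    # Pass 1: assign an address to every object and every symbol.
    for obj in objects:
        start = base + len(image)
        starts.append(start)
        for name, off in obj.symbols.items():
            if name in addresses:
                raise ValueError(f"duplicate symbol: {name}")
            addresses[name] = start + off
        image += obj.code
    # Pass 2: resolve relocations against the final addresses.
    for obj, start in zip(objects, starts):
        for r in obj.relocs:
            if r.symbol not in addresses:
                raise KeyError(f"undefined symbol: {r.symbol}")
            field_addr = start + r.offset
            # rel32 is measured from the end of the 4-byte field.
            rel = addresses[r.symbol] - (field_addr + 4)
            struct.pack_into("<i", image, field_addr - base, rel)
    return image, addresses
```

For example, linking one object containing `call helper; ret` (with a relocation at offset 1) against a second object that defines `helper` fills in the call displacement once both addresses are known. A real linker would also deal with sections, alignment, absolute relocations, and symbols imported from the host process.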

some people have complained to me that all this would be too slow, but 
in practice I have had nowhere near the levels of extreme code-spewing 
where this would actually affect much (and, meanwhile, textual ASM is 
much nicer to work with IMO).

with some tweaks, it is possible to process in excess of 15MB of textual 
ASM per second, which seems plenty good enough (though with default 
settings it is a little slower, around 2MB/s, due to supporting ASM 
macros and using multiple passes to compact jumps and similar).

currently all this is x86 and x86-64 only...

my assembler also uses a variant of NASM's syntax. basic syntax is about 
the same, but the preprocessor is different and many minor differences 
exist (including some extensions). still, it is possible to write code 
which works with both (with some care).

GC'ed JIT is also supported (where the linker links the objects into 
GC'ed executable memory). this is mostly used for one-off executable 
objects (typically implementing closures and special purpose thunks, 
which are usually used as C function pointers).

I am aware of the SELinux issue (memory that is both writable and 
executable may be refused), but haven't fully added support for it yet 
(lower priority, as I mostly develop on/for Windows...). mostly it 
would be done by using a software write barrier to redirect writes to 
an alternate address mapping the same memory (or similar).

single-mapping would still be used on systems supporting 
read/write/execute memory.
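
The double-mapping idea can be sketched roughly as follows, in Python. This only demonstrates the aliasing: two views of the same backing storage, with every store redirected to the writable one. It does not execute anything; a real JIT would map the second view as executable, and the `DualMappedRegion` class and its attribute names are hypothetical.

```python
import mmap
import tempfile


class DualMappedRegion:
    """Two views of the same pages: one writable, one read-only.

    The read-only view stands in for the executable mapping; no single
    view is ever both writable and executable.
    """

    def __init__(self, size=mmap.PAGESIZE):
        # An anonymous temporary file serves as the shared backing store.
        self._file = tempfile.TemporaryFile()
        self._file.truncate(size)
        fd = self._file.fileno()
        self.write_view = mmap.mmap(fd, size, access=mmap.ACCESS_WRITE)
        self.exec_view = mmap.mmap(fd, size, access=mmap.ACCESS_READ)

    def write(self, offset, data):
        # Software write barrier: callers think in terms of offsets in
        # the "executable" region, but the store goes to the alias.
        self.write_view[offset:offset + len(data)] = data

    def close(self):
        self.write_view.close()
        self.exec_view.close()
        self._file.close()
```

Bytes stored through `write()` become visible through `exec_view` at the same offset, because both views share the same pages. On a system that permits read/write/execute memory, the two views would simply collapse into one mapping.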

typically, I am using COFF internally, even on Linux and similar.

caching object files to disk is done by several of my frontends, because 
yes, it is sort of pointless to endlessly recompile the same code every 
time the app starts or similar (especially since my C compiler is slow...).
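
A minimal sketch of such an on-disk object cache, in Python, keyed by a hash of the source text. The `cached_compile` name, the `.obj` file naming, and the `compile_fn` callback are assumptions made for the example, not a description of the actual frontends.

```python
import hashlib
from pathlib import Path


def cached_compile(source: str, compile_fn, cache_dir: Path) -> bytes:
    """Return object bytes for `source`, compiling only on a cache miss."""
    key = hashlib.sha256(source.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.obj"
    if path.exists():
        # Cache hit: reuse the object file from a previous run.
        return path.read_bytes()
    obj = compile_fn(source)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Write to a temporary name first so a crash cannot leave a
    # truncated object file under the final name.
    tmp = path.with_suffix(".tmp")
    tmp.write_bytes(obj)
    tmp.replace(path)
    return obj
```

A practical cache key would likely also cover compiler version, target, and options, since any of these can change the generated object for identical source.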

or such...

> On Tue, Nov 16, 2010 at 8:47 AM, Olivier Meurant
> <meurant.olivier at gmail.com>  wrote:
>> On Tue, Nov 16, 2010 at 9:39 AM, James Molloy<James.Molloy at arm.com>  wrote:
>>> Hi,
>>>
>>> I've been watching the MC-JIT progress for some time, and #2 certainly
>>> looks like the best idea to me. I think however you've missed an important
>>> selling point of the "FOOJIT" architecture:
>>>
>>> * The use of a custom object file format directly enables the use of
>>> ahead-of-time compilation (using the JIT to recompile dynamically). Not only
>>> this but it allows the resaving of any functions that may have been
>>> JIT-optimised during runtime so they can be used immediately next run.
>>>
>>> This, coincidentally, is something that I was pondering on a way to try to
>>> crowbar into the current JIT (was thinking along the lines of parsing
>>> relocatable ELF into memory and running a link step manually, then
>>> "informing" the JIT about the memory object...)
>>>
>> I have "MCJIT"-like code in my own project (sadly not open-source...)
>> writing code in memory (without relocation information) or to a file
>> (with relocation information). This allows reloading code from a previous
>> run, or even having a powerful server prepare code for clients to execute.
>> This is not really tied to the first or second proposal. And even for
>> creating a "FOOJIT" format, you need a FOOJITStreamer, a FOOJITObjectWriter
>> and probably a raw_ostream interface to write in memory. Seems really
>> similar to the first set of patches to me.
>>
>> What needs to be done:
>> - We need to define a FOOJIT format. Maybe we can focus on having a FOOJIT
>> format only for "fast path" now, and adding relocations, symbols later?
>> - We need to discuss the mapping for external relocations (I really want to
>> cut dependency between runtime JIT and GlobalValue* : I want to run JITed
>> functions without a module)
>>
>>
>> What if the FOOJIT format is used for "fast path" only (as the current JIT
>> works) and we use ELF/MachO/COFF for more complex tasks ("fast path" +
>> reloading the binary on the next run)? JIT users will have to choose between
>> a faster but one-shot "FOOJIT" format and a maybe slower but reusable
>> "ELF/MachO/COFF" format?
>>
>>
>> Olivier.
>>
>>
>>