[LLVMdev] Scheme + LLVM JIT

Misha Brukman brukman at uiuc.edu
Wed May 4 22:24:41 PDT 2005


Hi, Alexander!

On Wed, May 04, 2005 at 11:59:06PM -0400, Alexander Friedman wrote:
> I am in the preliminary stages of adding a JIT compiler to a sizable
> Scheme system (PLT Scheme).

Cool!

> The original plan was to use GNU Lightning, but 1) it seems to be
> dead, and 2) LLVM has already done a huge amount of stuff that I would
> have had to write (poorly) from scratch.  

Maybe we can use you for a testimonial... :)

> At the moment, LLVM seems to be the ideal choice for implementing the
> Scheme JIT, but there are problems that need to be addressed first.  I
> hope you guys can help me with these - I'll list them in descending
> order of importance.

Sounds good, I'll do my best.

> Tail Call Elimination:
> 
> I've read over the "Random llvm notes", and see that you guys have
> thought about this already.
> 
> However, the note dates from last year, so I am wondering if there is
> an implementation in the works. If no one is working on this or is
> planning to work on this in the near future, I would be willing to
> give it a shot if I was given some direction as to where to start.

To the best of my knowledge, this has not been done and no one has
announced their intent to work on it, so if you are interested, you'd be
more than welcome to do so.
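For anyone unfamiliar with the transform being discussed: tail call
elimination turns a call in tail position into a jump that reuses the
current stack frame, which is what Scheme's semantics require.  A toy
C++ sketch (not LLVM code) of the before/after:

```cpp
#include <cassert>
#include <cstdint>

// Tail-recursive factorial: without tail-call elimination, each
// recursive call pushes a fresh stack frame.
uint64_t fact_rec(uint64_t n, uint64_t acc = 1) {
    if (n <= 1) return acc;
    return fact_rec(n - 1, acc * n);  // tail position: nothing runs after the call
}

// The same function after tail-call elimination: the self-call in tail
// position becomes a branch back to the entry, reusing the same frame,
// so it runs in constant stack space.
uint64_t fact_loop(uint64_t n, uint64_t acc = 1) {
    for (;;) {
        if (n <= 1) return acc;
        acc *= n;
        n -= 1;
    }
}
```

The general case (tail calls between *different* functions) is the hard
part for a code generator, since it must pop the caller's frame before
transferring control.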

> I have looked over the JIT documentation (which is a bit sparse) and
> the examples. So far I am completely unclear as to what the JIT
> compiler actually does with the code that is passed to it.

A target runs the passes listed in the method
<target>JITInfo::addPassesToJITCompile() to emit machine code.

> To be more precise, does the JIT perform all of the standard llvm
> optimizations on the code, or does it depend on its client to do so?
> Are there some examples of that?

No, the JIT performs no optimizations.  The method I mentioned above
just lowers constructs that the instruction selector cannot handle (yet)
or that the target does not support.  About the only thing the JIT does
(on X86) is eliminate unreachable blocks (dead code).  Then the code is
passed on to the instruction selector, which creates machine code; some
peephole optimizations are run, and then the prolog/epilog are inserted.
I glossed over the x86 floating-point details, but you get the idea.
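To give a flavor of what a peephole pass does at this stage (a toy
illustration, not LLVM's actual pass): it scans a small window of
machine instructions and deletes or rewrites obviously redundant ones.

```cpp
#include <string>
#include <vector>

// Toy machine instruction: an opcode plus up to two operands.
struct Inst {
    std::string op, dst, src;
};

// A peephole pass over a linear instruction stream: drop no-op copies
// (`mov r, r`) and additive identities (`add r, 0`).  Real passes match
// richer patterns, but the shape is the same.
std::vector<Inst> peephole(const std::vector<Inst>& in) {
    std::vector<Inst> out;
    for (const Inst& i : in) {
        if (i.op == "mov" && i.dst == i.src) continue;  // no-op copy
        if (i.op == "add" && i.src == "0") continue;    // x + 0 == x
        out.push_back(i);
    }
    return out;
}
```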

The use case scenario is usually like this:

llvm-gcc/llvm-g++ produces very simple, brain-dead code for a given
C/C++ file.  It does not create SSA form, but creates stack allocations
for all variables.  This makes it easier to write a front-end.  We
turned off all optimizations in GCC and so the code produced by the
C/C++ front-end is really not pretty.

Then, gccas is run on each LLVM assembly file.  gccas is basically an
optimizing assembler: it runs the optimizations listed in
llvm/tools/gccas/gccas.cpp, which you can inspect.

Once all the files for a program are compiled to bytecode, they are
linked with gccld, an optimizing linker that does a lot of
interprocedural optimization and creates the final bytecode file.

After this, you can use llc or lli (the JIT) on the resulting bytecode;
neither has to do any optimization, because the optimizations have
already been performed.
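Spelled out as a command sequence (file names illustrative; the exact
flags may differ depending on your version of the tools):

```shell
llvm-gcc -S foo.c -o foo.ll    # simple, unoptimized LLVM assembly
gccas foo.ll -o foo.bc         # optimizing assembler (passes in gccas.cpp)
llvm-gcc -S bar.c -o bar.ll
gccas bar.ll -o bar.bc
gccld foo.bc bar.bc -o prog    # optimizing linker, interprocedural opts
lli prog.bc                    # JIT-execute the fully optimized bytecode
llc prog.bc -o prog.s          # ...or generate native assembly statically
```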

> If it does indeed optimize the input, does it attempt to do global
> optimizations on the functions (intraprocedural register allocation,
> inlining, whatever)?

The default register allocator in use for most platforms is a
linear-scan register allocator, and the SparcV9 backend uses a
graph-coloring register allocator.  However, the JIT performs no
inlining, as mentioned above.
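For reference, the core of the textbook linear-scan algorithm (Poletto
and Sarkar's formulation, not LLVM's implementation) is quite small: walk
live intervals in order of start point, free registers whose intervals
have expired, and spill when none are free.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// A live interval: the instruction range over which a virtual register
// is live.
struct Interval {
    std::string vreg;
    int start, end;
};

// Minimal linear scan over `numRegs` physical registers.  Intervals
// that cannot get a register are marked "spill".  For simplicity this
// spills the newcomer; the real algorithm spills the interval with the
// farthest end point.
std::map<std::string, std::string> linearScan(std::vector<Interval> ivs, int numRegs) {
    std::sort(ivs.begin(), ivs.end(),
              [](const Interval& a, const Interval& b) { return a.start < b.start; });
    std::map<std::string, std::string> assign;
    std::vector<Interval> active;        // intervals currently holding a register
    std::vector<std::string> freeRegs;
    for (int i = numRegs; i > 0; --i) freeRegs.push_back("r" + std::to_string(i));

    for (const Interval& iv : ivs) {
        // Expire intervals that ended before this one starts.
        for (auto it = active.begin(); it != active.end();) {
            if (it->end < iv.start) {
                freeRegs.push_back(assign[it->vreg]);
                it = active.erase(it);
            } else {
                ++it;
            }
        }
        if (!freeRegs.empty()) {
            assign[iv.vreg] = freeRegs.back();
            freeRegs.pop_back();
            active.push_back(iv);
        } else {
            assign[iv.vreg] = "spill";
        }
    }
    return assign;
}
```

The appeal for a JIT is that this runs in a single pass over the
intervals, versus the iterative simplify/spill cycles of a
graph-coloring allocator.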

> Does it re-do these optimizations when functions are added/ removed/
> changed? Are there parameters to tune the compiler's aggressiveness?

There is a JIT::recompileAndRelinkFunction() method, but it doesn't
optimize the code.

> Does there happen to be a C interface to the jit ? Our scheme impl
> has a good FFI, but it doesn't do C++. If not, this is no big deal,
> and i'll just write something myself.

No, this would have to be added.
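The usual shape of such a binding, if you do write it yourself, is an
`extern "C"` shim that exposes the C++ objects as opaque pointers and
plain C types.  A sketch (all names here are hypothetical, and the JIT
class is a stand-in, not an actual LLVM type):

```cpp
#include <string>

// Stand-in for the C++ JIT object; a real binding would hold the
// LLVM execution-engine state here instead.
struct JITState {
    std::string lastModule;
};

// A C FFI sees only opaque pointers, C strings, and ints, so a Scheme
// FFI that "doesn't do C++" can still drive the JIT through this layer.
extern "C" {
    JITState* jit_create()             { return new JITState(); }
    void      jit_destroy(JITState* j) { delete j; }
    int       jit_add_module(JITState* j, const char* src) {
        j->lastModule = src;  // a real shim would parse and codegen here
        return 0;             // 0 = success, C convention
    }
}
```

Each wrapper is trivial; the work is just deciding which parts of the
C++ API the Scheme side actually needs to reach.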

> While the sources of llvm are not that big, the project builds very
> slowly into something very large. Someone already asked about what is
> the minimum needed for just a JIT compiler, and I think I have a
> vague idea of what needs to tweaked. However, I want to minimize the
> changes I make to my llvm tree. 

llvm/examples/HowToUseJIT pretty much has the minimal support one needs
for a JIT, but if you can make it even smaller, I'd be interested.

> [...] configure script seems to ignore my directives. For example, it
> always builds all architectures, ...

Are you using a release or CVS version?  Support for this just went into
CVS recently, so you should check it out and see if it works for you.
If you *are* using CVS, are you saying you used `configure
-enable-target=[blah]' and it compiled and linked them all?  In that
case, it's a bug, so please post your results over here:

http://llvm.cs.uiuc.edu/PR518

> ... and it always statically links each binary.

Yes, that is currently the default method of building libraries and
tools.  If you were to make all the libraries shared, you would be doing
the same linking/relocating at run-time every time you start the tool.

There is support for loading target backends, etc. from shared objects
with -load, and we may move to the model of having shared objects for
targets in the future, but at present, they are static.

Having more shared libraries may speed up link time, but I suspect will
negatively impact run time.

-- 
Misha Brukman :: http://misha.brukman.net :: http://llvm.cs.uiuc.edu
