[LLVMdev] Scheme + LLVM JIT

Thu May 5 09:41:21 PDT 2005

On Thu, 5 May 2005, Misha Brukman wrote:
> On Thu, May 05, 2005 at 03:46:58AM -0400, Alexander Friedman wrote:
>> On May  5, Misha Brukman wrote:
>>> To the best of my knowledge, this has not been done and no one has
>>> announced their intent to work on it, so if you are interested,
>>> you'd be more than welcome to do so.
>>
>> My C++ knowledge is completely non-existent, but so far I've had a
>> surprisingly easy time reading the source. This change seems somewhat
>> involved - I will have to implement different calling conventions -
>> ie, passing a return-address to the callee, etc. Who is the right
>> person to talk to about this?
>
> The notes you refer to belong to Chris Lattner, but you should just post
> your questions on llvmdev and someone will answer them.  The benefits
> are that you may get your response faster than emailing someone
> directly, you may get multiple perspectives, and the discussion is
> archived for future LLVMers who are looking for some similar advice.

I agree with Misha.  This should definitely be discussed on-list if 
possible.

>> Ok, this makes sense. However, am I correct in assuming that the
>> interprocedural optimizations performed in gccas will make it
>> problematic to call 'JIT::recompileAndRelinkFunction()'? For example,
>> suppose I run some module that looks like

...

>> through all of those optimizations. Will the result necessarily have a
>> bar() function?
>
> You are correct, it may get inlined.
>
>> If inlining is enabled, replacing bar might have no effect if it's
>> inlined in foo.
>
> True.

Yes, this is an important point.

We build LLVM to be as modular as possible, which means that you get to 
choose exactly which pieces you want to put together into your program. 
If you're interested in doing function-level replacement, you basically 
have to avoid *all interprocedural optimizations*.  There may be ways 
around this in specific cases, but anything that assumes something about a 
function that has been replaced will need to be updated.  I don't think 
there is anything LLVM-specific about this problem though.
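
To make the inlining hazard concrete, here is a minimal sketch in Python. It is a toy model, not LLVM code: the "module" is just a dictionary of callables, and `make_module` and `inline_bar_into_foo` are hypothetical names invented for this illustration. It assumes that inlining amounts to copying the callee's current body into the caller.

```python
# Toy model (not LLVM): a "module" is a table mapping names to callables.
def make_module():
    module = {}
    module["bar"] = lambda: 1
    # foo calls bar through the module table, i.e. an out-of-line call
    # that is resolved again every time foo runs.
    module["foo"] = lambda: module["bar"]() + 10
    return module


def inline_bar_into_foo(module):
    # "Inlining": capture bar's current body inside foo, so foo no longer
    # looks bar up in the table when it runs.
    bar_body = module["bar"]
    module["foo"] = lambda: bar_body() + 10


plain = make_module()
inlined = make_module()
inline_bar_into_foo(inlined)

# Replace bar in both modules after "optimization" has run.
plain["bar"] = lambda: 2
inlined["bar"] = lambda: 2

plain_result = plain["foo"]()      # picks up the replacement bar
inlined_result = inlined["foo"]()  # still runs the old, inlined body
```

In the un-inlined module the replacement is visible through foo; in the inlined one foo keeps computing with the stale copy, which is the situation any interprocedural optimization can create.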

> However, let's step back for a second.  I am talking about what effect
> gccas/gccld will have on code generated by some front-end.  Presumably,
> you want to write a single stand-alone JIT that will take scheme -> LLVM
> -> native code via JIT.  Hence, gccas/gccld optimization selection
> doesn't really apply to you.  You can write your own "tool" that will
> use the JIT libraries and the optimizations of your choosing, if you so
> desire.

Yup exactly, you get to choose exactly what you want to use :)

>> If there are complications like this, how much of a performance gain
>> do the interprocedural opts give?

This is impossible to say: it totally depends on the program.  I know that 
some real-world C codes are sped up by 30-50% in some cases, but others 
are not sped up at all.  I can't say for scheme programs, but I expect 
that the situation would be similar.

>> Also, compileAndRelink (F) seems to update references in call sites of
>> F. Does this mean that every function call incurs an extra 'load' , or
>> is there some cleverer solution?
>
> We don't track all the call sites.  Instead, recompile and relink adds
> an unconditional branch from the old function (in native code) to the
> new one (also in native code).  This adds an extra indirection to all
> subsequent calls to that function, but not an extra load.
>
> One cleverer solution would be to actually track all the call sites, but
> then if recompile-and-relink is called rarely, it would be an extra
> overhead, not a gain, so it would slow down the common case.

Actually, this is not clear: it might be a win to do this.  I don't think 
anyone has pounded on the function-replacement functionality enough for 
this to show up though.
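
The forwarding scheme Misha describes can be sketched with a toy model in Python. This is an illustration only, not the LLVM JIT API: `ToyCodeMemory`, its slot table, and its method names are hypothetical, and "addresses" are just integers. The assumption is that callers keep the old address and the old entry point is overwritten with a jump to the new code.

```python
# Toy model of "recompile and relink" (hypothetical, not the LLVM JIT API).
class ToyCodeMemory:
    def __init__(self):
        # address -> ("code", callable) or ("jump", target_address)
        self.slots = {}
        self.next_addr = 0

    def emit(self, fn):
        """Place a compiled function at the next free address."""
        addr = self.next_addr
        self.slots[addr] = ("code", fn)
        self.next_addr += 1
        return addr

    def recompile_and_relink(self, old_addr, new_fn):
        """Emit new code, then turn the old entry into a jump to it."""
        new_addr = self.emit(new_fn)
        self.slots[old_addr] = ("jump", new_addr)
        return new_addr

    def call(self, addr):
        """Run the code at addr, counting the jumps followed on the way."""
        hops = 0
        kind, payload = self.slots[addr]
        while kind == "jump":
            addr = payload
            hops += 1
            kind, payload = self.slots[addr]
        return payload(), hops


mem = ToyCodeMemory()
f_addr = mem.emit(lambda: "v1")
call_site = f_addr  # a caller that has F's original address baked in

first = mem.call(call_site)    # direct call, no forwarding
mem.recompile_and_relink(f_addr, lambda: "v2")
second = mem.call(call_site)   # same address, one extra jump
```

The call site is never touched, so nothing has to be tracked, but every later call pays for the extra hop. Tracking call sites would instead rewrite `call_site` itself to the new address.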

> Another cleverer solution would be to overwrite the machine code
> in-place with the code for the new function, but then the problem is
> that we lay out the code for functions sequentially in memory so as to
> not waste space, and hence, each recompilation of the function better
> fit into the place of the old one, or else we might run into the code
> region of the next function.  This means that we then have to break up
> the code region for a function into multiple code sections, possibly
> quite far apart in memory, and this leads to more issues.

Also, if the function is currently being executed by a stack frame higher 
up on the stack, when we get back to that function, chaos would be 
unleashed :)
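
The size constraint on in-place overwriting can be shown with a small Python sketch. It is a toy model rather than the JIT's actual memory manager: the function names, the byte sizes, and the helpers `layout` and `fits_in_place` are all made up for illustration, and it assumes functions are packed back to back with no padding.

```python
# Toy model of sequential code layout (hypothetical, not LLVM's allocator).
def layout(sizes):
    """Place functions back to back; return {name: (start, end)}."""
    regions = {}
    cursor = 0
    for name, size in sizes.items():
        regions[name] = (cursor, cursor + size)
        cursor += size
    return regions


def fits_in_place(regions, name, new_size):
    """True if recompiled code of new_size fits in the function's old region."""
    start, end = regions[name]
    return new_size <= end - start


# Made-up code sizes in bytes.
regions = layout({"foo": 40, "bar": 24, "baz": 64})

smaller_ok = fits_in_place(regions, "bar", 20)  # shrinks: fits
larger_ok = fits_in_place(regions, "bar", 32)   # grows: would spill into baz
```

Any recompilation that grows the function fails the check, which is what forces either splitting the code across regions or falling back to a forwarding jump.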

>> Finally, if I jit-compile several modules, can they reference each
>> other's functions? If this is answered somewhere in the docs, I
>> apologize.
>
> At present, I am not quite sure that the JIT will accept two different
> Modules; most tools (except the linkers) assume a single Module that is
> given to them.  I have not used two Modules with the JIT and I haven't
> seen anyone else do that, so it may be a limitation or it may just need
> some extension to support it.  I'm not sure.

I don't think lli supports this (yet!), but there is no fundamental reason 
why it could not be extended, and the JIT library might already work.  I'm 
not sure.

>> It's not the linking/relocating that's the problem. The problem is
>> that each binary winds up being rather large. However, since these
>> tools don't need to be distributed or compiled for my purposes, I
>> guess I'm not really worried about it.
>
> Compiling optimized binaries rather than a debug build would save quite a
> bit of space in the binaries.  Other than that, I'm not really sure,
> except maybe to compile LLVM with LLVM and then use its aggressive
> optimizations and dead-code elimination transformations? :)

Like Misha said, please try compiling with 'make ENABLE_OPTIMIZED=1'. 
This will produce files in the llvm/Release/bin directory which are much 
smaller than the debug files (e.g. opt goes from 72M -> 4M without debug 
info).

-Chris

-- 
http://nondot.org/sabre/
http://llvm.cs.uiuc.edu/