[LLVMdev] PHP Zend LLVM extension (SoC)
gordonhenriksen at mac.com
Tue Apr 22 16:17:33 PDT 2008
On Apr 22, 2008, at 18:44, Nuno Lopes wrote:
> PHP has a Google Summer of Code project approved to create an LLVM
> extension for PHP's VM (Zend)
> (http://code.google.com/soc/2008/php/appinfo.html?csaid=73D5F5E282F9163F).
> I'll be mentoring that project (and the student is CC'ed).
> Although I've already contributed a few patches to clang, I haven't
> hacked LLVM much, so I would like to gather some advice before
> misleading the student too much :P
This is very exciting!
> So my idea is to use the current PHP parser to produce PHP bytecode
> and then convert the PHP bytecode to LLVM's bitcode. The extra pass
> to create PHP bytecode seems necessary for now, as it makes things
> simpler in the PHP end. The first step would be to convert the PHP
> bytecode to LLVM by just producing function calls to the PHP
> interpreter opcode handlers. This has two advantages: it's a simple
> task and we can get something working fast. The disadvantage is that
> it would only bypass the opcode dispatcher, leaving not much room for
As far as I know, this is exactly how Apple's OpenGL shader JIT works
in Mac OS X. Unfortunately, LLVM will rarely make dramatic changes to
your memory representation, so this probably won't be as effective as
it is in the OpenGL context. (LLVM will only do aggregate->scalar
memory reorganizations; it probably won't be able to prove this safe
for a dynamic language very often.) Your challenge in generating very
fast code would likely be one of type inference.
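To make the first phase concrete, here is a minimal C sketch of the difference between dispatching bytecode and emitting direct calls to the handlers. Everything in it (the VM struct, the opcodes, the handle_* functions) is a hypothetical toy, not Zend's actual opcode set or handler signatures, and run_compiled is written by hand only to show the shape of the code a translator might generate for one fixed program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy VM; none of these names come from Zend. */
enum { STACK_MAX = 16 };

typedef struct {
    long stack[STACK_MAX];
    size_t sp;                 /* number of values currently on the stack */
} VM;

typedef enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT } Opcode;

typedef struct {
    Opcode op;
    long operand;              /* only meaningful for OP_PUSH */
} Insn;

/* Opcode handlers: shared by both execution strategies. */
static void handle_push(VM *vm, long v) {
    assert(vm->sp < STACK_MAX);
    vm->stack[vm->sp++] = v;
}

static void handle_add(VM *vm) {
    assert(vm->sp >= 2);
    long b = vm->stack[--vm->sp];
    long a = vm->stack[--vm->sp];
    vm->stack[vm->sp++] = a + b;
}

static void handle_mul(VM *vm) {
    assert(vm->sp >= 2);
    long b = vm->stack[--vm->sp];
    long a = vm->stack[--vm->sp];
    vm->stack[vm->sp++] = a * b;
}

/* Strategy 1: interpreter loop that decodes each instruction at run time. */
long run_interpreted(const Insn *code) {
    VM vm = { .sp = 0 };
    for (const Insn *pc = code; pc->op != OP_HALT; ++pc) {
        switch (pc->op) {
        case OP_PUSH: handle_push(&vm, pc->operand); break;
        case OP_ADD:  handle_add(&vm);               break;
        case OP_MUL:  handle_mul(&vm);               break;
        case OP_HALT: break;   /* unreachable: the loop stops first */
        }
    }
    assert(vm.sp == 1);
    return vm.stack[0];
}

/* The program (2 + 3) * 4 as bytecode. */
const Insn program[] = {
    { OP_PUSH, 2 }, { OP_PUSH, 3 }, { OP_ADD, 0 },
    { OP_PUSH, 4 }, { OP_MUL, 0 },  { OP_HALT, 0 },
};

/* Strategy 2: the same program as straight-line handler calls.
 * The dispatch loop and switch are gone, but each handler still
 * does all of its work through memory (the VM stack). */
long run_compiled(void) {
    VM vm = { .sp = 0 };
    handle_push(&vm, 2);
    handle_push(&vm, 3);
    handle_add(&vm);
    handle_push(&vm, 4);
    handle_mul(&vm);
    assert(vm.sp == 1);
    return vm.stack[0];
}
```

Both paths go through the same handlers, so the only saving in the second one is the per-instruction decode and switch. Any further gain depends on the optimizer seeing through the stack traffic.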
> In the second phase, we would start to inline some simple PHP
> bytecodes, like arithmetic operations and so on, by dumping LLVM
> assembly instead of calling the opcode handler. Eventually we could
> reach a point that no opcode handlers are necessary.
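As a rough illustration of what inlining an arithmetic opcode involves, the C sketch below contrasts a generic add handler that inspects type tags at run time with the single addition that could replace it if both operands were known to be integers. The Value layout and every function name are assumptions made up for this example (they are not Zend's zval or its handlers), and PHP's promotion of overflowing integers to floats is deliberately ignored.

```c
#include <assert.h>

/* Hypothetical tagged value, loosely in the spirit of a dynamic-language
 * value cell. Not the real Zend representation. */
typedef enum { T_LONG, T_DOUBLE } Tag;

typedef struct {
    Tag tag;
    union { long l; double d; } u;
} Value;

Value make_long(long l) {
    Value v;
    v.tag = T_LONG;
    v.u.l = l;
    return v;
}

Value make_double(double d) {
    Value v;
    v.tag = T_DOUBLE;
    v.u.d = d;
    return v;
}

/* Generic handler: must check the tags on every call. */
Value generic_add(Value a, Value b) {
    if (a.tag == T_LONG && b.tag == T_LONG) {
        return make_long(a.u.l + b.u.l);   /* overflow handling omitted */
    }
    /* Mixed or floating-point operands: promote both to double. */
    double x = (a.tag == T_LONG) ? (double)a.u.l : a.u.d;
    double y = (b.tag == T_LONG) ? (double)b.u.l : b.u.d;
    return make_double(x + y);
}

/* What an inlined add reduces to when type inference has shown that
 * both operands are always integers: no tags, no branches. */
long add_known_longs(long a, long b) {
    return a + b;
}
```

The specialized form is only valid where the operand types have been established ahead of time. Without that knowledge the inlined code still has to carry the tag checks, which limits how much inlining alone can buy.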
> So does this look like a sane thing? Any helpful advice? Another
> question: After having the LLVM assembly, how should the binary code
> be produced, loaded to memory, and then executed? I assume we can
> link directly to the LLVM code generation and optimization libs. And
> does it support dumping the code directly to the memory so that we
> can run it from there without much magic (and then cache it
You can use the facilities of ExecutionEngine to run code in-memory
without ever touching the filesystem. The LLVM tutorial has
information on how to do this.
You'll probably want to provide your opcode handlers as an LLVM IR
module. Your JIT can start up and “seed” the execution environment
with the predefined handlers, then progressively incorporate more
functions into the module as execution progresses.
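The seeding pattern can be sketched without any LLVM at all. In the C example below a plain name-to-function table stands in for the module: it is filled with predefined handlers at startup and later gains a function that represents newly compiled code. The table type, the handler names, and compiled_user_fn are all hypothetical, and a real JIT would register IR functions with the execution engine rather than C function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a module: a small name -> function table. */
typedef long (*Fn)(long);

typedef struct {
    const char *name;
    Fn fn;
} Entry;

enum { TABLE_MAX = 32 };

typedef struct {
    Entry entries[TABLE_MAX];
    size_t count;
} FnTable;

/* Add a function; returns 0 on success, -1 if the table is full. */
int table_add(FnTable *t, const char *name, Fn fn) {
    if (t->count >= TABLE_MAX) {
        return -1;
    }
    t->entries[t->count].name = name;
    t->entries[t->count].fn = fn;
    t->count++;
    return 0;
}

/* Look a function up by name; returns NULL if it is not present. */
Fn table_lookup(const FnTable *t, const char *name) {
    for (size_t i = 0; i < t->count; ++i) {
        if (strcmp(t->entries[i].name, name) == 0) {
            return t->entries[i].fn;
        }
    }
    return NULL;
}

/* Predefined handlers available from startup. */
long handler_inc(long x)    { return x + 1; }
long handler_double(long x) { return 2 * x; }

/* Seed the table with the predefined handlers. */
void table_seed(FnTable *t) {
    t->count = 0;
    table_add(t, "inc", handler_inc);
    table_add(t, "double", handler_double);
}

/* Stands in for a function compiled later, built on the handlers. */
long compiled_user_fn(long x) {
    return handler_double(handler_inc(x));
}
```

A usage sequence would call table_seed once at startup, then table_add each time a new function is compiled, so later lookups find both the original handlers and the added code.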
Hope that helps,