[LLVMdev] PHP Zend LLVM extension (SoC)

Tue Apr 22 15:44:34 PDT 2008

Hi,

PHP has an approved Google Summer of Code project to create an LLVM
extension for PHP's VM (Zend):
http://code.google.com/soc/2008/php/appinfo.html?csaid=73D5F5E282F9163F
I'll be mentoring that project (and the student is CC'ed).
Although I've already contributed a few patches to clang, I haven't hacked
LLVM much, so I would like to gather some advice before misleading the
student too much :P

So my idea is to use the current PHP parser to produce PHP bytecode and then
convert the PHP bytecode to LLVM's bitcode. The extra pass to create PHP
bytecode seems necessary for now, as it makes things simpler on the PHP side.
The first step would be to convert the PHP bytecode to LLVM by just
producing function calls to the PHP interpreter opcode handlers. This has
two advantages: it's a simple task and we can get something working quickly.
The disadvantage is that it would only bypass the opcode dispatcher, leaving
not much room for optimization.
In the second phase, we would start to inline some simple PHP bytecodes,
like arithmetic operations and so on, by dumping LLVM assembly instead of
calling the opcode handler. Eventually we could reach a point where no opcode
handlers are necessary.
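
To make the two phases concrete, here is a rough sketch of what the
translator could emit. It is only an illustration under several assumptions:
the opcode tuples, the @php_handler_* symbols and their signature, and the
flat array of i64 "slots" are all hypothetical and do not match Zend's real
data structures (real PHP values are dynamically typed, so real inlining
would need type checks). The emitted text uses LLVM's textual assembly with
opaque pointers, which is also an assumption about the LLVM version.

```python
# Toy translator from a made-up PHP-like bytecode to textual LLVM assembly.
# Each opcode is (name, result_slot, op1_slot, op2_slot); all hypothetical.

# Opcodes that phase 2 replaces with a native LLVM instruction.
INLINE_ARITH = {"ADD": "add", "SUB": "sub", "MUL": "mul"}


def emit_function(name, opcodes, inline_arith=False):
    """Return LLVM assembly for one straight-line opcode sequence."""
    body = []
    handlers = []  # handler names needing a `declare`, in first-use order
    counter = 0

    def fresh(prefix):
        # Produce a unique local value name such as %v3.
        nonlocal counter
        counter += 1
        return f"%{prefix}{counter}"

    def slot_ptr(slot):
        # Emit the address computation for one i64 slot.
        ptr = fresh("p")
        body.append(f"  {ptr} = getelementptr i64, ptr %slots, i32 {slot}")
        return ptr

    for op, res, a, b in opcodes:
        if inline_arith and op in INLINE_ARITH:
            # Phase 2: load operands, do the arithmetic inline, store result.
            pa = slot_ptr(a)
            va = fresh("v")
            body.append(f"  {va} = load i64, ptr {pa}")
            pb = slot_ptr(b)
            vb = fresh("v")
            body.append(f"  {vb} = load i64, ptr {pb}")
            r = fresh("r")
            body.append(f"  {r} = {INLINE_ARITH[op]} i64 {va}, {vb}")
            pr = slot_ptr(res)
            body.append(f"  store i64 {r}, ptr {pr}")
        else:
            # Phase 1: call the interpreter's handler; no dispatch loop.
            if op not in handlers:
                handlers.append(op)
            body.append(
                f"  call void @php_handler_{op}"
                f"(ptr %slots, i32 {res}, i32 {a}, i32 {b})"
            )

    decls = [
        f"declare void @php_handler_{h}(ptr, i32, i32, i32)" for h in handlers
    ]
    lines = decls + [f"define void @{name}(ptr %slots) {{", "entry:"]
    lines += body + ["  ret void", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    program = [("ADD", 2, 0, 1), ("ECHO", 0, 2, 0)]
    print(emit_function("php_main", program))                     # phase 1
    print(emit_function("php_main", program, inline_arith=True))  # phase 2
```

In phase 1 every opcode becomes a call, so only the dispatcher is gone. In
phase 2 the ADD turns into loads, an `add`, and a store that LLVM's
optimizers can see through, while ECHO still goes through its handler.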

So does this look like a sane approach? Any helpful advice?
Another question: once we have the LLVM assembly, how should the binary code
be produced, loaded into memory, and then executed? I assume we can link
directly against the LLVM code generation and optimization libs. And does
LLVM support emitting the code directly into memory so that we can run it
from there without much magic (and then cache it somewhere)?

Thanks,
Nuno