[LLVMdev] PHP Zend LLVM extension (SoC)
Razvan Aciu
admin at kam.ro
Thu Apr 24 01:08:13 PDT 2008
Hi Nuno,
this can be a great project. Some PHP opcodes, like branches or function
calls, can be optimized a lot by LLVM, while others, like operations on
variables, cannot be optimized so easily because of the dynamic nature of
PHP. For the latter, maybe you can use some automatic type inference, like
the kind used in languages such as Haskell, but that is a big project, and
there are also mixed cases like adding a number to a string. I think for
these you can use the PHP handlers for now. Even so, I feel that the speed
gain will be considerable.
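As a rough illustration of keeping the PHP handlers for the mixed cases, here
is a toy sketch in Python (not PHP or LLVM code). The names compiled_add and
generic_add_handler are hypothetical, and the string-to-number coercion is a
simplification, not PHP's exact rules. The idea is only that the generated
code takes a cheap path when both operands are known to be integers and calls
the generic handler otherwise.

```python
def generic_add_handler(a, b):
    """Hypothetical stand-in for the interpreter's generic ADD handler."""
    def to_number(v):
        # Simplified coercion: numbers pass through, numeric strings are
        # parsed, anything else counts as 0. Real PHP rules are richer.
        if isinstance(v, (int, float)):
            return v
        if isinstance(v, str):
            try:
                return int(v)
            except ValueError:
                try:
                    return float(v)
                except ValueError:
                    return 0
        return 0
    return to_number(a) + to_number(b)


def compiled_add(a, b):
    """What specialised code for ADD might do: fast path plus fallback."""
    if type(a) is int and type(b) is int:
        return a + b                      # fast path, no coercion needed
    return generic_add_handler(a, b)      # mixed case: defer to the handler
```

With type inference the check on the fast path could be dropped wherever both
types are proven at compile time; without it, the check is the price of
staying correct.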
Another thing you can do with only a little more work is to create an
abstraction layer between the webserver module and the content source, a
layer that works only with LLVM-compiled files (.bc). In that scenario you
can compile PHP files to the LLVM .bc file format. These files can also be
used as a cache, eliminating future parsing and compilation time. The speed
gain can be very high, because on heavily accessed sites some pages are
requested hundreds of times per minute. The generated .bc files will call
the handlers from the PHP runtime and libraries where needed.
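The caching idea could look roughly like the sketch below, written in Python
only to show the control flow. The function name cached_bitcode, the
compile_fn callback, and the choice to store the cache next to the source as
"<file>.bc" are all assumptions made for illustration; a real module would
call the actual compiler and probably use a dedicated cache directory.

```python
import os


def cached_bitcode(src_path, compile_fn, cache_suffix=".bc"):
    """Return compiled bytes for src_path, recompiling only when stale.

    compile_fn is a hypothetical callback taking the source path and
    returning the compiled output as bytes.
    """
    cache_path = src_path + cache_suffix
    try:
        # The cache is usable if it is at least as new as the source.
        fresh = os.path.getmtime(cache_path) >= os.path.getmtime(src_path)
    except OSError:
        fresh = False                     # no cache file yet
    if fresh:
        with open(cache_path, "rb") as f:
            return f.read()               # skip parsing and compiling
    data = compile_fn(src_path)
    with open(cache_path, "wb") as f:
        f.write(data)
    return data
```

Comparing modification times is the simplest invalidation rule; it does not
notice changes in included files, which a real cache would have to track.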
In the long term this abstraction layer, which is in fact a webserver module,
can be used with many frontends that generate .bc code from different
source languages (Ruby, Python, Lua, etc. come to mind right now),
turning the whole thing into a framework similar to the ones based on the
.class or .NET CLI formats. Of course, this can be done only if the .bc
format is mature and stable; otherwise it can only be used as a cache.
Good luck,
Razvan
> Hi,
>
> PHP has a Google Summer of Code project approved to create an LLVM
> extension for PHP's VM (Zend)
> (http://code.google.com/soc/2008/php/appinfo.html?csaid=73D5F5E282F9163F).
> I'll be mentoring that project (and the student is CC'ed).
> Although I've already contributed a few patches to clang, I haven't hacked
> LLVM much, so I would like to gather some advice before misleading the
> student too much :P
>
> So my idea is to use the current PHP parser to produce PHP bytecode and
> then convert the PHP bytecode to LLVM's bitcode. The extra pass to create
> PHP bytecode seems necessary for now, as it makes things simpler on the PHP
> end.
> The first step would be to convert the PHP bytecode to LLVM by just
> producing function calls to the PHP interpreter opcode handlers. This has
> two advantages: it's a simple task and we can get something working fast.
> The disadvantage is that it would only bypass the opcode dispatcher,
> leaving not much room for optimizations.
> In the second phase, we would start to inline some simple PHP bytecodes,
> like arithmetic operations and so on, by emitting LLVM assembly instead of
> calling the opcode handler. Eventually we could reach a point where no
> opcode handlers are necessary.
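
To make the first step concrete, here is a toy model in Python of what
"only bypassing the opcode dispatcher" means. The stack-based opcodes and the
HANDLERS table are invented for the example and do not reflect the real Zend
opcode format. The interpreter looks up a handler for every opcode each time
it runs, while the "compiled" form resolves the handlers once and then just
calls them in sequence.

```python
def op_push(stack, arg):
    stack.append(arg)

def op_add(stack, arg):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

def op_mul(stack, arg):
    b = stack.pop()
    a = stack.pop()
    stack.append(a * b)

# Hypothetical opcode table for this toy machine.
HANDLERS = {"PUSH": op_push, "ADD": op_add, "MUL": op_mul}


def interpret(ops):
    """Classic loop: a dispatch lookup on every opcode, on every run."""
    stack = []
    for name, arg in ops:
        HANDLERS[name](stack, arg)
    return stack.pop()


def compile_ops(ops):
    """Resolve handlers once; the result is a fixed sequence of calls."""
    calls = [(HANDLERS[name], arg) for name, arg in ops]

    def run():
        stack = []
        for handler, arg in calls:
            handler(stack, arg)           # same handlers, no lookup
        return stack.pop()

    return run
```

Both forms execute exactly the same handler bodies, which is why the gain
from this step alone is limited; the second phase corresponds to replacing a
call such as op_add with the addition itself.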
>
> So does this look like a sane thing? Any helpful advice?
> Another question: once we have the LLVM assembly, how should the binary
> code be produced, loaded into memory, and then executed? I assume we can
> link directly to the LLVM code generation and optimization libs. And does
> it support dumping the code directly to memory so that we can run it from
> there without much magic (and then cache it somewhere)?
>
>
> Thanks,
> Nuno
>