[LLVMdev] Is LLVM appropriate for implementing a shell interpreter?

Thu Mar 24 08:05:00 PDT 2011

Mu Qiao wrote:

> Hi devs,
> 
> We are implementing a library that interprets shell scripts so that
> other programs could efficiently talk to bash. We'd like to hear your
> advice on whether LLVM is appropriate for us. Here are our considerations:
> 
> In most cases our library will interpret each script just once. Our
> current approach is using a manual implementation based on ANTLR and
> C++, so actually we are executing the scripts while interpreting. If we
> turn to LLVM, we need to first compile it into LLVM IR, then into native
> code. Our guess is this may be slower than our current approach. Is that
> true?
> 
> Anyway, we do have several scripts that need to be sourced and reused
> while interpreting others. We guess this is where LLVM could help. LLVM
> optimized code for those scripts should run faster than our manual
> implementation. So the overall performance could be improved.
> 
> Could you please point out if we are wrong? Thanks.
> 

I'm currently implementing something similar: an interactive shell plus
compiled scripts, though only the interactive shell is implemented so far.

LLVM apparently has one problem here: an LLVMContext caches every
constant ever created in it and does not free any of them until the
LLVMContext object itself is destroyed.

So if your shell is, for example, connected to a pipe and accepts
generated scripts in rapid succession, memory use will keep growing on
inputs such as:

    print 0   # 0 is cached and never freed
    print 1   # 1 is cached and never freed
    ...

We haven't tried to tackle this problem yet, but we will probably need to.
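
To make the growth pattern concrete, here is a minimal sketch in plain C++.
It is a toy model only: ToyContext, ToyConstant, runWithSharedContext and
runWithContextPerScript are names invented for this illustration and are not
LLVM API. The model assumes what is described above, namely that a context
uniques its constants and releases them only when the context is destroyed.
It compares one long-lived context against one possible workaround, a fresh
context per script, which we have not tried against real LLVM.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>

// Toy stand-in for a constant owned by a context (hypothetical, not LLVM).
struct ToyConstant {
    std::int64_t value;
};

// Toy stand-in for a uniquing context (hypothetical, not LLVM).
// Constants are created on first use and freed only in the destructor.
class ToyContext {
public:
    // Return the unique constant for `v`, creating it if needed.
    const ToyConstant *getInt(std::int64_t v) {
        auto it = cache_.find(v);
        if (it == cache_.end()) {
            it = cache_.emplace(v, std::make_unique<ToyConstant>(ToyConstant{v}))
                     .first;
        }
        return it->second.get();
    }

    // Number of constants currently kept alive by this context.
    std::size_t cachedConstants() const { return cache_.size(); }

private:
    std::map<std::int64_t, std::unique_ptr<ToyConstant>> cache_;
};

// One long-lived context: script i is modelled as `print i`.
// Returns how many constants are still cached after all scripts ran.
std::size_t runWithSharedContext(int nScripts) {
    ToyContext ctx;
    for (int i = 0; i < nScripts; ++i) {
        ctx.getInt(i);  // stays cached until ctx is destroyed
    }
    return ctx.cachedConstants();
}

// Fresh context per script: the cache dies with each context.
// Returns the largest cache size seen in any single context.
std::size_t runWithContextPerScript(int nScripts) {
    std::size_t peak = 0;
    for (int i = 0; i < nScripts; ++i) {
        ToyContext ctx;  // destroyed at the end of this iteration
        ctx.getInt(i);
        peak = std::max(peak, ctx.cachedConstants());
    }
    return peak;
}
```

In this model, runWithSharedContext(n) ends with n cached constants, while
runWithContextPerScript(n) never holds more than one. The likely cost with
real LLVM is that IR built in one context cannot be used directly from
another, so scripts that are sourced and reused would presumably need a
longer-lived context of their own.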