[LLVMdev] Methodology for interfacing LLVM JITed code with C++

Austin Robison arobison at cs.utah.edu
Wed Sep 12 01:15:56 PDT 2007


Hi all,

I'm currently working on a C-like scripting language compiler backend 
designed to emit LLVM code.  This code will be loaded into LLVM's JIT at 
runtime and will make calls into a C++ library (including calling 
virtual methods on C++ objects).  The translation from our AST to an 
llvm::Module is fairly straightforward; the difficulty, however, comes 
in generating the appropriate LLVM code to call into the C++ code.  We 
see two main issues: name mangling and virtual functions.

For example, let's say this scripting language contains a function foo:

void foo()
{
  int x = builtin_func();
}


We would like for builtin_func() to generate code that calls a C++ 
function.  One option we see for implementing this is to write C++ code 
that contains stub functions, such as:

extern "C"
int call_builtin_func()
{
  return SomeCXXFunction();
}

or for something with method calls:

extern "C"
void call_func(Object* obj, ArgType* arg)
{
 obj->func(arg);  // func is a virtual method
}


and then compile this file with llvm-g++ to bitcode.  At runtime, the 
JIT would load both our scripting code and this stub bitcode, and link 
everything together with LTO.  This should remove the need to manually 
tweak name-mangled symbols when generating the LLVM code for the 
scripting language, and llvm-g++ will automatically generate LLVM type 
information for the C++ classes and types that live in the library.  
There will no doubt also be some build system trickery needed to glue 
everything together with appropriate symbol names.
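
A minimal, self-contained sketch of the name-mangling half of this idea 
follows.  The body of SomeCXXFunction, the lib namespace, the BuiltinStub 
typedef and call_like_generated_code are hypothetical stand-ins, not part 
of any real library; the function pointer only imitates how generated 
code would reach the stub through its unmangled name.

```cpp
#include <cassert>

// Hypothetical stand-in for the real library function.
namespace lib {
int SomeCXXFunction() { return 42; }
}

// extern "C" suppresses C++ name mangling, so the linker-visible symbol
// is simply "call_builtin_func", whatever the wrapped function is called.
extern "C" int call_builtin_func() {
  return lib::SomeCXXFunction();
}

// Function-pointer type with C linkage, matching the stub's signature.
extern "C" {
typedef int (*BuiltinStub)();
}

// Hypothetical helper that plays the role of the generated code: it only
// knows the stub's C signature, not the mangled C++ name behind it.
int call_like_generated_code(BuiltinStub stub) {
  assert(stub != nullptr);
  return stub();
}
```

The generated code would then need to declare only a function named 
call_builtin_func returning a 32-bit integer, assuming int is 32 bits on 
the target.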

Other than name mangling, calling virtual functions presents a problem.  
It would seem that llvm-g++ is required to generate the vtable layouts 
for C++ classes so that our generated LLVM code can grab the appropriate 
function pointer to make a call.  But what is the right way to patch 
everything together so that our compiler can output LLVM code that can 
call these virtual C++ methods?
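
To make the virtual-call case concrete, here is a small sketch under the 
assumption that the stub approach is used: the vtable lookup happens 
inside the stub, which the C++ compiler builds, so the caller only passes 
pointers.  The class bodies, Derived, and dispatch_through_stub are 
hypothetical and exist only for illustration.

```cpp
#include <cassert>

// Hypothetical argument and class definitions standing in for the library's.
struct ArgType {
  int value;
};

class Object {
public:
  virtual ~Object() {}
  virtual void func(ArgType* arg) { arg->value = 1; }
};

class Derived : public Object {
public:
  void func(ArgType* arg) override { arg->value = 2; }
};

// The stub has an unmangled name and takes plain pointers.  The virtual
// dispatch is emitted by the C++ compiler here, not by the caller.
extern "C" void call_func(Object* obj, ArgType* arg) {
  obj->func(arg);  // func is a virtual method
}

// Hypothetical helper playing the role of the generated code: it never
// touches the vtable and treats obj as an opaque pointer.
int dispatch_through_stub(Object* obj) {
  assert(obj != nullptr);
  ArgType arg = {0};
  call_func(obj, &arg);
  return arg.value;
}
```

In this arrangement the generated code never reads the vtable itself; 
whether the extra call through the stub is acceptable, or can be inlined 
away after linking, is a separate question.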

So my question is whether there exists a conventional technique for 
interfacing custom-generated (non-llvm-g++) LLVM code with C++ code.  
And as a side question, how do the target triple and data layout of the 
module play into this?  I assume they must match the C++ ABI 
definitions.  llvm-config can get the target triple, but what about the 
native C++ data layout?  It seems like this use case would come up 
often, but I haven't been able to find any discussion or documentation 
related to it.
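
As a rough illustration of what "native data layout" covers, the sketch 
below asks the host C++ compiler for its endianness and pointer size and 
alignment.  The function names are hypothetical, and the string it builds 
only imitates the "e-p:size:align" style of a data layout string; that 
formatting is an assumption to check against the LLVM documentation, and 
a real layout string carries many more fields.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>
#include <string>

// Detect host byte order by inspecting the first byte of a 16-bit value.
bool host_is_little_endian() {
  const std::uint16_t probe = 1;
  return *reinterpret_cast<const unsigned char*>(&probe) == 1;
}

// Hypothetical helper: describe the host's endianness and pointer layout
// as a short string such as "e-p:64:64" (sizes are given in bits).
std::string host_pointer_layout() {
  const std::size_t size_bits = sizeof(void*) * CHAR_BIT;
  const std::size_t align_bits = alignof(void*) * CHAR_BIT;
  std::string layout = host_is_little_endian() ? "e" : "E";
  layout += "-p:" + std::to_string(size_bits) + ":" + std::to_string(align_bits);
  assert(!layout.empty());
  return layout;
}
```

Values like these describe only the machine the code is compiled on, so 
they are useful only when the JIT targets that same host.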

Thanks!

Austin


