[LLVMdev] Tool for run-time code generation?

Fri Jul 16 22:20:05 PDT 2010

On Jul 16, 2010, at 9:30 PM, Nick Lewycky wrote:

> I strongly recommend that anyone doing this sort of specialization not write a system that generates C code as strings and then parses it, unless you happen to be starting with a system that already prints C.
> 
> Instead, break the chunks of C you would generate into functions and compile those ahead-of-time. At run time you use llvm only (no clang) to generate a series of function calls into those functions.
> 
> Then you can play tricks with that. Instead of fully compiling those functions ahead of time (i.e. to .o files), you can compile them into .bc and create an llvm Module out of it, either by loading it from a file or by using 'llc -march=cpp' to create C++ code using the LLVM API that produces said module when run. With your run-time generated code and the library code in the same Module, you can run the inliner before the other optimizers.
> 
> Alternately, if your chunks of C are very small you may find it easy to just produce LLVM IR in memory using the LLVM API directly. See the LLVM language at llvm.org/docs/LangRef.html and the IRBuilder at http://llvm.org/doxygen/classllvm_1_1IRBuilder.html .
> 
> Either of these techniques avoids the need to use clang at run-time, or spend time generating large strings just to re-parse them. Since the optimizers are all in LLVM proper, you should get the exact same assembly out.
> 
> Nick

In general, the suggestion makes sense. In my case, however, some of the content is actually human-generated and is somewhat free-form. So, some parseable entry language is a requirement and I chose C99 since it seemed to be well supported by clang. The auto-generated part comes into play only as glue that binds the C code variables with the rest of the parent process' runtime.
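
To make the split between human-written code and generated glue concrete, here is a minimal sketch in C99. It is an illustration only, not the actual code from this system: record_t, user_expr and eval_record are hypothetical names, and the record layout is assumed. The free-form part sees only plain variable names; the glue is the piece a generator would emit to bind those names to the host's data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record layout owned by the parent process (assumed). */
typedef struct {
    double price;
    long   quantity;
} record_t;

/* Human-written, free-form C99. It refers only to plain names and
 * knows nothing about how the host stores its records. */
static double user_expr(double price, long quantity)
{
    return quantity > 10 ? price * quantity * 0.9 : price * quantity;
}

/* Auto-generated glue (hypothetical). It binds the names used by
 * user_expr to the fields of the host's record, so the host only
 * ever calls this one entry point per record. */
double eval_record(const record_t *rec)
{
    assert(rec != NULL);
    return user_expr(rec->price, rec->quantity);
}
```

In a setup like this, the host would compile the combined text once and then call eval_record for every record, which is where the one-time parsing cost gets spread out.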

Also, any overhead of parsing and compiling is well amortized over subsequent processing time (millions of records).

Cheers,
Vlad