[LLVMdev] Tool for run-time code generation?
nicholas at mxc.ca
Fri Jul 16 19:30:31 PDT 2010
> I happen to be using LLVM for just this reason. I process large volumes
> of data records with schemas that are only known at runtime and/or can
> change dynamically as various transforms are applied to such records at
> various stages.
> To this end, I auto-generate C99 syntax at run time, parse it using
> clang, do some AST transformations, compile using LLVM JIT, and then
> execute within the same (C++) process. As a point of comparison, I've
> done similar things with Java bytecode and while the JVM approach was
> [much] easier learning curve- and documentation-wise, it is hard to
> complain about the level of control you get with LLVM. It is like having
> a WISS (When I Say So) dynamic compiler producing -O3-level native code.
> In my case, I target x86-64 and had only minor trouble supporting the
> same toolchain on Linux and Darwin.
I strongly recommend that anyone doing this sort of specialization not
write a system that generates C code as strings and then parses it,
unless you happen to be starting with a system that already prints C.
Instead, break the chunks of C you would generate into functions and
compile those ahead-of-time. At run time, use LLVM alone (no clang) to
generate a series of calls into those functions.
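As a rough illustration of that split, here is a minimal sketch in plain
C++. Every name in it (Cursor, read_i32, read_f64, skip_u8, make_plan,
run_plan, decode_example) is hypothetical and not taken from any real
project. The three reader functions stand in for the ahead-of-time
compiled chunks. The "plan" is only a stand-in for the run-time
generated code: it is a list of function pointers that gets interpreted,
whereas with LLVM you would emit one function containing direct calls to
the same helpers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical cursor over one binary record.
struct Cursor {
    const unsigned char* pos;  // next unread byte
};

// Signature shared by all ahead-of-time compiled chunks (an assumption
// of this sketch): read one field, append its value, advance the cursor.
using FieldReader = void (*)(Cursor&, std::vector<double>&);

// Copy a T out of the byte stream; memcpy avoids unaligned loads.
template <typename T>
static T take(Cursor& c) {
    T value;
    std::memcpy(&value, c.pos, sizeof value);
    c.pos += sizeof value;
    return value;
}

// The "chunks of C", written once and compiled ahead of time.
void read_i32(Cursor& c, std::vector<double>& out) {
    out.push_back(static_cast<double>(take<std::int32_t>(c)));
}
void read_f64(Cursor& c, std::vector<double>& out) {
    out.push_back(take<double>(c));
}
void skip_u8(Cursor& c, std::vector<double>&) {
    c.pos += 1;  // padding byte, nothing to output
}

// Schema description that only becomes known at run time.
enum class Field { I32, F64, SkipU8 };

// Run-time step: turn a schema into a sequence of calls into the chunks.
// With LLVM this loop would emit call instructions instead.
std::vector<FieldReader> make_plan(const std::vector<Field>& schema) {
    std::vector<FieldReader> plan;
    for (Field f : schema) {
        switch (f) {
            case Field::I32:    plan.push_back(read_i32); break;
            case Field::F64:    plan.push_back(read_f64); break;
            case Field::SkipU8: plan.push_back(skip_u8);  break;
        }
    }
    return plan;
}

// Execute the call sequence against one record.
std::vector<double> run_plan(const std::vector<FieldReader>& plan,
                             const unsigned char* record) {
    assert(record != nullptr);
    Cursor c{record};
    std::vector<double> out;
    for (FieldReader reader : plan) reader(c, out);
    return out;
}

// Usage: a record laid out as int32, one padding byte, double.
std::vector<double> decode_example() {
    unsigned char rec[sizeof(std::int32_t) + 1 + sizeof(double)];
    const std::int32_t a = 42;
    const double b = 2.5;
    std::memcpy(rec, &a, sizeof a);
    rec[sizeof a] = 0xFF;
    std::memcpy(rec + sizeof a + 1, &b, sizeof b);
    return run_plan(make_plan({Field::I32, Field::SkipU8, Field::F64}), rec);
}
```

The point of keeping the helpers small and uniform is that the run-time
side only has to decide which calls to make and in what order. It never
has to produce or parse C source.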
Then you can play tricks with that. Instead of fully compiling those
functions ahead of time (i.e. to .o files), you can compile them into .bc
and create an LLVM Module out of it, either by loading it from a file or
by using 'llc -march=cpp' to create C++ code using the LLVM API that
produces said module when run. With your run-time generated code and the
library code in the same Module, you can run the inliner before the
Alternatively, if your chunks of C are very small, you may find it
easy to just produce LLVM IR in memory using the LLVM API directly. See
the LLVM language at llvm.org/docs/LangRef.html and the IRBuilder at
Either of these techniques avoids the need to use clang at run-time, or
spend time generating large strings just to re-parse them. Since the
optimizers are all in LLVM proper, you should get the exact same
> So, the approach is definitely workable, but I must warn about the
> non-trivial amount of effort required to figure things out in both clang
> and LLVM codebases. For example, how to traverse or mutate ASTs produced
> by clang is pretty much a FAQ on the clang list but there is no good
> documentation addressing this very common use case.
> On Jul 16, 2010, at 10:47 AM, David Piepgrass wrote:
>> Using C++ code, I would like to generate code at run-time (the same
>> way .NET code can use dynamic methods or compiled expressions) in
>> order to obtain very high performance code (to read binary data
>> records whose formats are only known at run-time.) I need to target
>> x86 (Win32) and ARM (WinCE).
>> Can LLVM be used for this purpose, or would something else work
>> better? Are there any open-source projects that have done this, that I
>> could look to as an example?
>> *David Piepgrass, E.I.T. *
>> Software Developer
>> *Mentor Engineering Inc. <http://www.mentoreng.com/> *
>> 10, 2175 - 29th Street NE
>> Calgary, AB, Canada T1Y 7H8
>> Ph: (403) 777-3760 ext. 490 Fax: (403) 777-3769
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu <mailto:LLVMdev at cs.uiuc.edu>
>> http://llvm.cs.uiuc.edu <http://llvm.cs.uiuc.edu/>