[LLVMdev] Using LLVM to serialize object state -- and performance
Paul J. Lucas
paul at lucasmail.org
Fri Oct 26 16:16:49 PDT 2012
I have a legacy C++ application that constructs a tree of C++ objects (an iterator tree to implement a query language). I am trying to use LLVM to "serialize" the state of this tree to disk for later loading and execution (or "compile" it to disk, if you prefer).
Each of the C++ iterator objects now has a codegen() member function that appends LLVM code to an llvm::Function. The generated code is a sequence of instructions that sets up the arguments for, and then calls, the constructor of each C++ object. (I am using C "thunks" that provide a C API through which LLVM code can make C++ class constructor calls.) Hence, all the LLVM code, taken together in a single "reconstitute" function, is mostly a sequence of "call" instructions with a few "store" and "getelementptr" instructions here and there -- fairly straightforward LLVM code.
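To make the "thunk" idea concrete, here is a minimal sketch of what one such thunk might look like. The class ScanIterator, its constructor arguments, and the function names make_scan_iterator and destroy_scan_iterator are all hypothetical, invented for illustration; the real iterator classes and thunks differ. The point is only that each thunk has C linkage and takes plain C types, so the generated LLVM code can reach it with an ordinary "call" instruction.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical iterator class standing in for one node of the iterator tree.
class ScanIterator {
public:
    ScanIterator(std::string table, int limit)
        : table_(std::move(table)), limit_(limit) {}

    const std::string& table() const { return table_; }
    int limit() const { return limit_; }

private:
    std::string table_;
    int limit_;
};

// Hypothetical C-linkage thunk: an unmangled name and plain C argument
// types, so generated code can call it without knowing the C++ ABI
// details of the constructor itself.
extern "C" void* make_scan_iterator(const char* table, int limit) {
    assert(table != nullptr);
    return new ScanIterator(table, limit);
}

// Matching thunk to destroy an object created by make_scan_iterator().
extern "C" void destroy_scan_iterator(void* it) {
    delete static_cast<ScanIterator*>(it);
}
```

With thunks of this shape, the "reconstitute" function only has to materialize each argument and emit one call per object, passing the returned opaque pointer along to whichever parent constructor needs it.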
I then write the LLVM IR out to disk and, at some later time, read it back in with ParseIR(), call getPointerToFunction(), and execute the resulting function, at which point the C++ iterator tree has been reconstituted.
This all works, but the JIT compile step is *slow*. For a sequence of about 8000 LLVM instructions (most of which are "call"), it takes several seconds to execute.
It occurred to me that I don't really want JIT compiling. I really want to compile the LLVM code to machine code and write that to disk so that when I read it back, I can just run it. The "reconstitute" function is only ever run once per query invocation, so there's no benefit from JIT compiling it since it will never be run a second or subsequent time.
* Is what I'm doing with LLVM a "reasonable" thing to do with LLVM?
* If so, how can I speed it up? By generating machine code? If so, how?
I've looked at the source for llc, but that apparently only generates assembly source code, not object code.