[LLVMdev] Using LLVM to serialize object state -- and performance
Paul J. Lucas
paul at lucasmail.org
Fri Oct 26 16:16:49 PDT 2012
I have a legacy C++ application that constructs a tree of C++ objects (an iterator tree to implement a query language). I am trying to use LLVM to "serialize" the state of this tree to disk for later loading and execution (or "compile" it to disk, if you prefer).
Each of the C++ iterator objects now has a codegen() member function that adds to the LLVM code of an llvm::Function. The LLVM code generated is a sequence of instructions to set up the arguments for and call the constructor of each C++ object. (I am using C "thunks" that provide a C API to LLVM to make C++ class constructor calls.) Hence, all the LLVM code, taken together in a single "reconstitute" function, is mostly a sequence of "call" instructions with a few "store" and "getelementptr" instructions here and there -- fairly straightforward LLVM code.
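To make the thunk idea concrete, here is a minimal sketch of what such C thunks might look like. The class names (Iterator, ScanIterator, FilterIterator), the thunk names, and reconstitute_example() are all hypothetical stand-ins, not the application's real code. reconstitute_example() is written by hand in C++ only to mirror the sequence of "call" instructions that the generated LLVM function would contain.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical iterator classes standing in for the real query iterators.
class Iterator {
public:
    virtual ~Iterator() {}
    virtual std::string describe() const = 0;
};

class ScanIterator : public Iterator {
public:
    explicit ScanIterator(std::int64_t table_id) : table_id_(table_id) {}
    std::string describe() const override {
        return "scan(" + std::to_string(table_id_) + ")";
    }
private:
    std::int64_t table_id_;
};

class FilterIterator : public Iterator {
public:
    FilterIterator(Iterator* child, std::int64_t threshold)
        : child_(child), threshold_(threshold) {
        assert(child_ != nullptr);
    }
    ~FilterIterator() override { delete child_; }  // owns its child
    std::string describe() const override {
        return "filter(" + child_->describe() + ", " +
               std::to_string(threshold_) + ")";
    }
private:
    Iterator* child_;
    std::int64_t threshold_;
};

// C thunks: unmangled names and plain C types only, so generated code can
// "call" them without knowing anything about C++ constructors.
extern "C" {

void* thunk_new_scan(std::int64_t table_id) {
    return static_cast<Iterator*>(new ScanIterator(table_id));
}

void* thunk_new_filter(void* child, std::int64_t threshold) {
    return static_cast<Iterator*>(
        new FilterIterator(static_cast<Iterator*>(child), threshold));
}

}  // extern "C"

// Hand-written equivalent of a tiny generated "reconstitute" function:
// build the leaves first, then pass them to their parents' thunks.
void* reconstitute_example() {
    void* scan = thunk_new_scan(7);
    return thunk_new_filter(scan, 42);
}
```

In this sketch each thunk returns the object as an opaque pointer, so the generated code only ever passes around plain pointers and integers; the caller casts the root back to Iterator* once the tree has been rebuilt.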
I then write out the LLVM IR code to disk and, at some later time, read it back in with ParseIR(), do getPointerToFunction(), execute that function, and the C++ iterator tree has been reconstituted.
This all works, but the JIT compile step is *slow*. For a sequence of about 8000 LLVM instructions (most of which are "call"), that step takes several seconds.
It occurred to me that I don't really want JIT compiling. I really want to compile the LLVM code to machine code and write that to disk so that when I read it back, I can just run it. The "reconstitute" function is only ever run once per query invocation, so there's no benefit from JIT compiling it since it will never be run a second or subsequent time.
Questions:
* Is what I'm doing with LLVM a "reasonable" thing to do with LLVM?
* If so, how can I speed it up? By generating machine code? If so, how?
I've looked at the source for llc, but that apparently only generates assembly source code, not object code.
- Paul