[LLVMdev] Advice - llvm as binary to binary translator ?

Sun Jun 22 21:54:42 PDT 2008

On Sun, Jun 22, 2008 at 8:08 PM, Erik Buck <erik.buck at sbcglobal.net> wrote:
> I went ahead and tried to translate my legacy machine language into
> IR.  Legacy branch instructions have me stumped.  The branch
> instructions are already resolved to destination addresses in the
> legacy machine code.  For example, there is an instruction that
> performs an unconditional branch to the address stored in legacy
> register B1.
>
> I can represent register B1 as a local variable:
> %B1 = alloca i32  ; storage for emulated register B1
> I can generate IR corresponding to legacy machine code that calculates
> the value to store in %B1.  I can generate the IR to store in %B1.
> store i32 %indirectAddr, i32* %B1
> Now, how do I generate an IR "br" instruction to the calculated
> address in %B1?  I don't have a suitable destination label in my IR.
> I can't create a block for the destination address if the address is
> calculated by the legacy code (can I?).
>
> Am I taking the wrong approach?  Are there any suggestions?

Completely wrong approach; you're not going to get anywhere trying to
statically translate machine code, because the target of a computed
branch like the one through B1 isn't known until run time.  (I actually
tried something like this once, and it broke down in a similar way.)

As far as I know, the only project that made any real progress using
LLVM for binary translation is llvm-qemu
(http://code.google.com/p/llvm-qemu/).  The approach there is roughly
to JIT one basic block at a time instead of interpreting one
instruction at a time, which is relatively simple and has a relatively
low overhead.
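
To make the block-at-a-time idea concrete, here is a minimal sketch.  It
is not llvm-qemu's code: the toy instruction set, the PROGRAM layout and
the names translate_block and run_blocks are all made up for
illustration, and Python's exec() stands in for the LLVM JIT.  Each
translated block runs until its final branch and returns the next guest
address, so a branch through B1 is resolved by the dispatch loop at run
time rather than by a static label.

```python
# Hypothetical toy guest program: address -> instruction tuple.
PROGRAM = {
    0: ("li", "B1", 8),    # B1 = 8 (a computed branch target)
    1: ("li", "A", 1),     # A = 1
    2: ("jr", "B1"),       # unconditional branch to the address in B1
    8: ("addi", "A", 41),  # A += 41
    9: ("halt",),
}


def translate_block(program, pc):
    """Translate the straight-line code starting at pc into one callable.

    The callable mutates the register dict and returns the next guest
    address, or None when the guest halts.
    """
    lines = ["def block(regs):"]
    while True:
        insn = program[pc]
        op = insn[0]
        if op == "li":
            lines.append(f"    regs[{insn[1]!r}] = {insn[2]}")
        elif op == "addi":
            lines.append(f"    regs[{insn[1]!r}] += {insn[2]}")
        elif op == "jr":
            # The target is read from the emulated register at run time.
            lines.append(f"    return regs[{insn[1]!r}]")
            break
        elif op == "halt":
            lines.append("    return None")
            break
        else:
            raise ValueError(f"unknown opcode {op!r} at address {pc}")
        pc += 1
    namespace = {}
    exec("\n".join(lines), namespace)  # stand-in for JIT compilation
    return namespace["block"]


def run_blocks(program, entry=0):
    """Dispatch loop: look up (or translate) the block at pc, then run it."""
    regs = {"A": 0, "B1": 0}
    cache = {}  # guest address -> translated block
    pc = entry
    while pc is not None:
        block = cache.get(pc)
        if block is None:
            block = cache[pc] = translate_block(program, pc)
        pc = block(regs)
    return regs, cache
```

Calling run_blocks(PROGRAM) translates two blocks, one starting at
address 0 and one at address 8.  The second is only discovered when the
first returns the value of B1, which is why the translation has to
happen lazily.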

There's some documentation about llvm-qemu on its website, and there's
some good information on how to get started with the JIT at
http://llvm.org/docs/tutorial/.

That said, there are a lot of ways to speed up a pure interpreter; I'd
suggest trying that before attempting a JIT-based solution.  A JIT
can be a useful tool, but translation time can end up being a
significant factor, and it will likely take a lot of work to get good
performance with it.
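
As one example of speeding up a plain interpreter, the sketch below
decodes every instruction word once, up front, so the main loop only
does a table lookup and a call.  The 16-bit encoding (opcode in the top
4 bits, register in the next 4, immediate in the low 8) and all the
names here are assumptions made up for the example, not the legacy
machine from the original question.

```python
def op_li(regs, reg, imm, pc):
    regs[reg] = imm                 # load immediate
    return pc + 1


def op_addi(regs, reg, imm, pc):
    regs[reg] = (regs[reg] + imm) & 0xFFFFFFFF  # wrap like an i32
    return pc + 1


def op_jr(regs, reg, imm, pc):
    return regs[reg]                # branch to the address in a register


def op_halt(regs, reg, imm, pc):
    return -1                       # negative pc stops the loop


# Hypothetical opcode assignments.
HANDLERS = {0x1: op_li, 0x2: op_addi, 0x3: op_jr, 0xF: op_halt}


def predecode(words):
    """Split each instruction word into (handler, reg, imm) exactly once."""
    decoded = []
    for word in words:
        opcode = (word >> 12) & 0xF
        reg = (word >> 8) & 0xF
        imm = word & 0xFF
        decoded.append((HANDLERS[opcode], reg, imm))
    return decoded


def run_predecoded(words):
    """Interpreter loop over the pre-decoded form; no bit-twiddling here."""
    code = predecode(words)
    regs = [0] * 16
    pc = 0
    while pc >= 0:
        handler, reg, imm = code[pc]
        pc = handler(regs, reg, imm, pc)
    return regs


# li r1, 3 ; li r0, 1 ; jr r1 ; addi r0, 41 ; halt
GUEST_WORDS = [0x1103, 0x1001, 0x3100, 0x2029, 0xF000]
```

Note that the branch through a register needs no special handling in
this design: the handler simply returns the new program counter.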

-Eli