[LLVMdev] Advice - llvm as binary to binary translator ?

Sat Jun 21 18:53:18 PDT 2008

First, is there a way to search the archives for this list?  I
apologize in advance if I have stepped on a FAQ.

My goal is to execute legacy binary machine code from a very old,
one-of-a-kind computer on a variety of modern computers.  I have
already written an emulator for the legacy machine that executes the
old machine code.  However, my emulator is just an interpreter and
therefore has some limitations:

- The emulator spends a lot of time in an executive loop that fetches
legacy instructions, decodes them, and dispatches to the C function
that emulates each legacy instruction.  The executive loop also has
to handle emulated interrupts, support single-step debugging, etc.

- The emulator is compiled and run on only a few modern
hardware/operating system combinations.  The emulator is fairly
portable, but extensive optimizations for some platforms restrict
its capabilities on other platforms.

- The emulator executes the legacy machine code unmodified, which is
good, but that means opportunities for optimization are lost.  The
legacy machine code is full of dead code, jumps to jumps, redundant
subexpressions, unnecessary memory accesses, etc.  Back in the old
days, compilers really didn't optimize at all.  They generated
horrible code that was sometimes modified by hand.
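
To make the interpreter structure described above concrete, here is a
minimal sketch in C of a fetch/decode/dispatch executive loop.  The
two-byte toy instruction set, the Machine struct, and every function
name are invented for illustration; they are not the real legacy
machine or the real emulator.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical toy ISA: one opcode byte followed by one operand byte. */
enum { OP_HALT = 0, OP_LOADI = 1, OP_ADDI = 2, OP_JMP = 3, OP_COUNT };

typedef struct {
    uint8_t  mem[256];     /* legacy memory holding code and data */
    uint8_t  pc;           /* legacy program counter */
    int32_t  acc;          /* single accumulator register */
    int      halted;       /* set by HALT or by an illegal opcode */
    int      irq_pending;  /* would be set by an emulated device */
    unsigned irq_count;    /* number of interrupts acknowledged */
} Machine;

typedef void (*Handler)(Machine *m, uint8_t operand);

/* One C function per legacy instruction. */
static void op_halt(Machine *m, uint8_t operand)  { (void)operand; m->halted = 1; }
static void op_loadi(Machine *m, uint8_t operand) { m->acc = operand; }
static void op_addi(Machine *m, uint8_t operand)  { m->acc += operand; }
static void op_jmp(Machine *m, uint8_t operand)   { m->pc = operand; }

static const Handler handlers[OP_COUNT] = { op_halt, op_loadi, op_addi, op_jmp };

/* Executive loop: service interrupts, fetch, decode, dispatch.
   max_steps bounds the run; passing 1 gives single-step behaviour.
   Returns the number of legacy instructions executed. */
unsigned run(Machine *m, unsigned max_steps)
{
    unsigned steps = 0;
    while (!m->halted && steps < max_steps) {
        if (m->irq_pending) {          /* emulated interrupt: just acknowledge it */
            m->irq_pending = 0;
            m->irq_count++;
        }
        uint8_t opcode  = m->mem[m->pc];                 /* fetch */
        uint8_t operand = m->mem[(uint8_t)(m->pc + 1)];
        m->pc = (uint8_t)(m->pc + 2);
        if (opcode >= OP_COUNT) {                        /* illegal instruction */
            m->halted = 1;
            break;
        }
        handlers[opcode](m, operand);                    /* decode and dispatch */
        steps++;
    }
    return steps;
}
```

Every legacy instruction pays for the interrupt check, the fetch, and
an indirect call, which is the per-instruction overhead a translator
would try to remove.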

My idea is to convert my emulator into a translator that emits LLVM IR  
either directly or via calls to the LLVM library.  I would then  
execute the result via JIT or native code compilation...
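
As a rough sketch of the "emit LLVM IR directly" option, the C code
below translates one straight-line block of a hypothetical toy
instruction set into textual LLVM IR.  The opcodes, the function name
@legacy_block, and the single-accumulator model are assumptions made
for illustration only.  A real translator would also have to model
memory, flags, and control flow, and could build the IR through the
LLVM library instead of printing text.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical toy ISA: one opcode byte followed by one immediate byte. */
enum { T_HALT = 0, T_LOADI = 1, T_ADDI = 2 };

typedef struct { char *buf; size_t cap; size_t len; int overflow; } Out;

/* Append formatted text to the output buffer, recording overflow. */
static void emit(Out *o, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(o->buf + o->len, o->cap - o->len, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= o->cap - o->len)
        o->overflow = 1;
    else
        o->len += (size_t)n;
}

/* Translate a block that ends in T_HALT into one LLVM IR function that
   takes the incoming accumulator and returns the outgoing one.
   Returns the number of characters written, or -1 on error. */
int translate_block(const uint8_t *code, size_t len, char *buf, size_t cap)
{
    Out  o = { buf, cap, 0, 0 };
    char acc[24] = "%acc";   /* IR operand that currently holds the accumulator */
    int  next_tmp = 0;       /* next SSA temporary number */

    emit(&o, "define i32 @legacy_block(i32 %%acc) {\nentry:\n");
    for (size_t i = 0; i + 1 < len; i += 2) {
        uint8_t  op  = code[i];
        unsigned imm = code[i + 1];
        switch (op) {
        case T_LOADI:        /* accumulator becomes a constant: no IR needed */
            snprintf(acc, sizeof acc, "%u", imm);
            break;
        case T_ADDI:         /* one SSA value per legacy add */
            emit(&o, "  %%t%d = add i32 %s, %u\n", next_tmp, acc, imm);
            snprintf(acc, sizeof acc, "%%t%d", next_tmp++);
            break;
        case T_HALT:         /* block terminator */
            emit(&o, "  ret i32 %s\n}\n", acc);
            return o.overflow ? -1 : (int)o.len;
        default:
            return -1;       /* unknown opcode */
        }
    }
    return -1;               /* ran off the end without a terminator */
}
```

Once the code is in this form, redundant operations become visible to
LLVM's optimization passes instead of being hidden behind an
interpreter's dispatch.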

Is this a reasonable approach?
Can this approach be used even when the legacy code is
self-modifying?  After a code modification, a re-translation and
re-JIT would be needed.
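
One way the self-modifying case might be handled is to route every
emulated store through a check that discards any translation covering
the written address, so the code is re-translated the next time it is
executed.  The sketch below assumes a hypothetical 256-byte legacy
memory split into 16-byte pages; translate_page is only a stand-in
for the real translate-and-JIT step, and all names are invented.

```c
#include <stdint.h>

#define MEM_SIZE   256
#define PAGE_SHIFT 4                        /* 16-byte pages, chosen arbitrarily */
#define NUM_PAGES  (MEM_SIZE >> PAGE_SHIFT)

typedef struct {
    uint8_t  mem[MEM_SIZE];         /* legacy memory: code and data mixed */
    uint8_t  translated[NUM_PAGES]; /* 1 if the page has a live translation */
    unsigned translations;          /* how many times any page was (re)translated */
} CodeCache;

/* Stand-in for translating a page to IR and JIT-compiling it. */
static void translate_page(CodeCache *c, unsigned page)
{
    c->translated[page] = 1;
    c->translations++;
}

/* Every emulated store goes through here.  Writing into a page drops
   its translation, whether the bytes written were code or data. */
void legacy_store(CodeCache *c, uint8_t addr, uint8_t value)
{
    c->mem[addr] = value;
    c->translated[addr >> PAGE_SHIFT] = 0;
}

/* Called before executing legacy code at addr: translate on demand. */
void ensure_translated(CodeCache *c, uint8_t addr)
{
    unsigned page = addr >> PAGE_SHIFT;
    if (!c->translated[page])
        translate_page(c, page);
}
```

Page granularity is a trade-off: smaller pages mean fewer needless
re-translations when data shares a page with code, at the cost of a
larger table.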

Are there any general suggestions?