[LLVMdev] help decompiling x86 ASM to LLVM IR
James Courtier-Dutton
james.dutton at gmail.com
Tue Mar 12 09:20:26 PDT 2013
Hi,
I am looking to decompile x86 ASM to LLVM IR.
The original C is this:
int test61(unsigned value) {
    int ret;
    if (value < 1)
        ret = 0x40;
    else
        ret = 0x61;
    return ret;
}
GCC -O2 compiles it to the following, rather cleverly removing all branches:
0000000000000000 <test61>:
   0:   83 ff 01        cmp    $0x1,%edi
   3:   19 c0           sbb    %eax,%eax
   5:   83 e0 df        and    $0xffffffdf,%eax
   8:   83 c0 61        add    $0x61,%eax
   b:   c3              retq
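To check my reading of the listing, here is a minimal C sketch of what I believe the four instructions compute. The function name test61_branchless and the local variable cf are hypothetical names of my own; cf stands for the carry flag, which cmp sets when value is below 1 as an unsigned comparison.

```c
#include <stdint.h>

/* Hypothetical C model of the GCC output; not taken from any tool. */
int test61_branchless(unsigned value)
{
    uint32_t cf  = (value < 1u);   /* cmp $0x1,%edi: CF = 1 if value < 1 (unsigned) */
    uint32_t eax = 0u - cf;        /* sbb %eax,%eax: eax - eax - CF = 0 or 0xffffffff */
    eax &= 0xffffffdfu;            /* and $0xffffffdf,%eax: 0 or 0xffffffdf */
    eax += 0x61u;                  /* add $0x61,%eax: 0x61, or 0x40 after wrap-around */
    return (int)eax;
}
```

If this model is right, the result is 0x40 only for value == 0 and 0x61 for everything else, which matches the original C.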
How would I represent the SBB instruction in LLVM IR?
Would I have to first convert the ASM to something like:
0000000000000000 <test61>:
   0: cmp $0x1,%edi        Block A
   1: jb 4:                Block A
   2: mov $0x61,%eax       Block B
   3: jmp 5:               Block B
   4: mov $0x40,%eax       Block C
   5: retq                 Block D (due to join point)
...before I could convert it to LLVM IR?
I.e. rewrite it in such a way that the SBB instruction is not needed.
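One alternative I can imagine, and this is only an assumption on my part, is to keep SBB and treat the carry flag as an ordinary value, so that SBB becomes a subtract with an explicit borrow in and borrow out. A minimal sketch in C rather than IR, where the helper names cmp32_carry and sbb32 are hypothetical:

```c
#include <stdint.h>

/* Hypothetical helper: carry flag produced by "cmp src,dst" (AT&T order),
 * i.e. the borrow out of the unsigned subtraction dst - src. */
static unsigned cmp32_carry(uint32_t dst, uint32_t src)
{
    return dst < src;
}

/* Hypothetical helper: "sbb src,dst" computes dst - src - carry_in and
 * reports the borrow out of that subtraction through *carry_out. */
static uint32_t sbb32(uint32_t dst, uint32_t src, unsigned carry_in,
                      unsigned *carry_out)
{
    /* Do the subtraction in 64 bits; bit 32 is set exactly when it borrows. */
    uint64_t wide = (uint64_t)dst - (uint64_t)src - (uint64_t)(carry_in & 1u);
    *carry_out = (unsigned)((wide >> 32) & 1u);
    return (uint32_t)wide;
}
```

With helpers like these, each instruction could be translated one at a time without first restructuring the code into blocks, and I would hope the optimiser could then simplify the result.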
The aim is to be able to then recompile it, possibly for a different target.
That is, to go from binary -> LLVM IR -> binary for cases where the
C source code is not available or has been lost.
E.g. a binary is available for 32-bit x86, and I want to retarget it to ARM or x86-64.
The LLVM IR should be target agnostic, and would permit the
retargeting task without having to rebuild an AST and structure it as a C or
C++ source program.
Any comments?
James