[LLVMdev] how to transform elf binary to llvm IR?

mats petersson mats at planetcatfish.com
Fri Jul 17 09:45:45 PDT 2015


Every level of translation loses a little bit of the semantic meaning. [I
mean translation in the "human readable -> machine code" sense, not someone
translating a literary work from one language to another, although subtle
details are often lost there too.] This means that you can almost never
completely reconstruct the original source code from the machine code, or
the C code from the LLVM IR, or the C++ code from the output of something
like Cfront (the original C++ -> C translator), or the original Pascal code
from the output of a Pascal-to-C compiler, etc.
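
As a toy illustration of why the lost meaning cannot be recovered, here is
a minimal sketch of a many-to-one lowering step. The `lower` function and
its tuple encoding are hypothetical, invented for this example; they do not
correspond to any real compiler or to LLVM's APIs. Three different source
spellings of "double x" all become the same target instruction, so nothing
in the output says which one the programmer wrote.

```python
def lower(expr):
    """Lower a tiny source expression (op, lhs, rhs) to one target instruction.

    Hypothetical strength reduction: several source spellings of
    "double x" are all rewritten to the same shift instruction.
    """
    op, lhs, rhs = expr
    if (op, rhs) == ("*", 2) or (op, rhs) == ("<<", 1):
        return ("shl", lhs, 1)
    if op == "+" and lhs == rhs:
        return ("shl", lhs, 1)
    # Anything else passes through unchanged in this sketch.
    return (op, lhs, rhs)


# Three distinct "source" expressions...
sources = [("*", "x", 2), ("+", "x", "x"), ("<<", "x", 1)]

# ...collapse to a single lowered form, so lowering has no inverse.
lowered = {lower(s) for s in sources}
print(lowered)
```

A decompiler looking at the shift can pick any of the three spellings, but
it has to guess.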

It is, at least sometimes, possible to reconstruct from the binary file
something that can then be "compiled" [in quotes as it's a loose term in
this discussion] again, but the result often lacks some of the original
subtlety. And there are certainly cases where the original code is very
hard to derive from the machine code. I played with a "symbolic
disassembler" many years back, and on "well-behaved code" it would
reconstruct assembly code that could be reassembled. It struggled, for
example, with switch statements that had become a PC-relative jump table:
once you modified the code, the offsets in the table no longer pointed at
the right places, and the tool couldn't figure out what the jump targets
were. That is just one example.
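
The jump-table problem can be sketched with a toy model. Everything here is
an assumption made for illustration: the "binary" is a Python list of
slots, the table stores offsets relative to its own address, and the helper
names are made up. It is not how any particular disassembler works. The
point is that inserting one slot silently breaks the dispatch unless the
rewriter already knows where the table is and how many entries it has.

```python
def dispatch(image, table_addr, case):
    """Jump via a table of offsets relative to the table's own address."""
    target = table_addr + image[table_addr + case]
    return image[target]


def insert_naive(image, addr, value):
    """Insert one slot at addr, treating every existing slot as opaque data."""
    return image[:addr] + [value] + image[addr:]


def insert_table_aware(image, addr, value, table_addr, n_cases):
    """Insert one slot at addr and recompute the offsets of a *known* table.

    Assumes addr does not fall inside the table itself.
    """
    def shift(a):
        # Addresses at or after the insertion point move up by one slot.
        return a + 1 if a >= addr else a

    new_table_addr = shift(table_addr)
    new_image = insert_naive(image, addr, value)
    for i in range(n_cases):
        old_target = table_addr + image[table_addr + i]
        new_image[new_table_addr + i] = shift(old_target) - new_table_addr
    return new_image, new_table_addr


# Slots 0-2: offset table; slots 3-5: the three case handlers.
original = [3, 4, 5, "case0", "case1", "case2"]

# Insert an instrumentation slot between the case0 and case1 handlers.
naive = insert_naive(original, 4, "probe")
fixed, fixed_table = insert_table_aware(
    original, 4, "probe", table_addr=0, n_cases=3
)

print([dispatch(original, 0, c) for c in range(3)])
print([dispatch(naive, 0, c) for c in range(3)])
print([dispatch(fixed, fixed_table, c) for c in range(3)])
```

The table-aware version only works because `table_addr` and `n_cases` are
handed to it. A tool working from a stripped binary has to infer both from
raw bytes that look like any other data.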


I'm pretty sure it's possible, at least for a human, to write machine code
that is nearly impossible to translate back to a higher-level language.
Modern compilers may not use the same kinds of deliberate obfuscation, but
they will certainly produce code that is complex, hard to follow, and that
doesn't use the obvious instructions for a particular purpose.

--
Mats

On 17 July 2015 at 17:11, Shuai Wang <wangshuai901 at gmail.com> wrote:

> This is not an easy task. I believe there is *NO* (open-source) tool that
> can fully solve this problem (statically). Correct me if I am wrong.
>
> It would be more helpful if you could provide details about what you want
> to do: static or dynamic? Stripped binary or binary with symbolic
> information?
> What compiler do you work on?
>
> Check out the papers below if you are interested.
>
> http://dl.acm.org/citation.cfm?id=2465380
>
> http://dl.acm.org/citation.cfm?id=2462165
>
>
>
> Shuai
>
>
>
> On Fri, Jul 17, 2015 at 3:09 AM, 慕冬亮 <mudongliangabcd at gmail.com> wrote:
>
>> I want to transform an ELF binary to LLVM IR, and do some instrumentation
>> based on LLVM.
>> Is there any tool which can do the transformation?
>> Thanks in advance.
>>
>>     - mudongliang
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>
>>
>