[LLVMdev] Using LLVM for decompiling.
Joshua Cranmer
pidgeot18 at gmail.com
Mon May 7 06:40:38 PDT 2012
On 5/7/2012 5:47 AM, James Courtier-Dutton wrote:
> Hi,
>
> I am writing a decompiler. I was wondering if some of LLVM could be
> used for a decompiler.
> There are several stages in the decompiler process.
> 1) Take binary and create a higher level representation of it. Like RTL.
> 2) The output is then broken into blocks or nodes, each block ends in
> a CALL, JMP, RET, or 2-way or multiway conditional JMP.
> 3) The blocks or nodes are then analyzed for structure in order to
> extract loop information and if...then...else information.
> 4) Once structure is obtained, data types can be analyzed.
> 5) Lastly, source code is output in C or C++ or whatever is needed.
>
> I was wondering if LLVM could help with any of these steps.
> I am looking at doing step (3) better. Can LLVM help in that area?
The problem of recovering structured control flow from a control flow
graph is, to my knowledge, largely solved. I have seen some work by a few
groups on decompiling LLVM IR to C; one of the results may be found here:
<https://bitbucket.org/gnarf/axtor/>.
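
As a rough illustration of what step (3) involves, here is a minimal
Python sketch of one standard technique: compute dominators, find back
edges, and collect the natural loop for each loop header. It is not taken
from axtor or from LLVM. The CFG representation (a dict mapping block
names to successor lists), the function names, and the example block
names are all assumptions made for illustration, and the sketch assumes a
reducible CFG in which every block is reachable from the entry.

```python
from typing import Dict, List, Set

# Hypothetical CFG representation: block name -> list of successor names.
# Every block, including those with no successors, must appear as a key.
CFG = Dict[str, List[str]]


def predecessors(cfg: CFG) -> Dict[str, Set[str]]:
    """Invert the successor map to get each block's predecessors."""
    preds: Dict[str, Set[str]] = {n: set() for n in cfg}
    for src, succs in cfg.items():
        for dst in succs:
            preds[dst].add(src)
    return preds


def dominators(cfg: CFG, entry: str) -> Dict[str, Set[str]]:
    """Iterative dataflow: dom(n) = {n} | intersection of dom(p) over preds p."""
    nodes = set(cfg)
    preds = predecessors(cfg)
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom


def natural_loops(cfg: CFG, entry: str) -> Dict[str, Set[str]]:
    """Map each loop header to the set of blocks in its natural loop."""
    dom = dominators(cfg, entry)
    preds = predecessors(cfg)
    loops: Dict[str, Set[str]] = {}
    for src, succs in cfg.items():
        for header in succs:
            if header not in dom[src]:
                continue  # not a back edge: the target does not dominate the source
            # Walk backwards from the back-edge source until the header is reached.
            body = {header}
            stack = [src]
            while stack:
                n = stack.pop()
                if n not in body:
                    body.add(n)
                    stack.extend(preds[n])
            loops.setdefault(header, set()).update(body)
    return loops


if __name__ == "__main__":
    # A loop whose body contains an if/then/else diamond.
    example: CFG = {
        "entry": ["head"],
        "head": ["body", "exit"],
        "body": ["then", "else"],
        "then": ["latch"],
        "else": ["latch"],
        "latch": ["head"],
        "exit": [],
    }
    for header, blocks in natural_loops(example, "entry").items():
        print(header, sorted(blocks))
```

Once loops are known, if...then...else regions can be found in a similar
way by looking for a two-way branch whose arms rejoin at a common block.
Irreducible control flow (for example, a jump into the middle of a loop)
needs extra handling that this sketch does not attempt.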
The other issue is that translating machine code to LLVM IR is not very
feasible right now: LLVM IR is at a very high level, so the task of
translation would be akin to reimplementing much of, say, IDA.
--
Joshua Cranmer
News submodule owner
DXR coauthor