[LLVMdev] Using LLVM for decompiling.

Joshua Cranmer pidgeot18 at gmail.com
Mon May 7 06:40:38 PDT 2012


On 5/7/2012 5:47 AM, James Courtier-Dutton wrote:
> Hi,
>
> I am writing a decompiler. I was wondering if some of LLVM could be
> used for a decompiler.
> There are several stages in the decompiler process.
> 1) Take binary and create a higher level representation of it. Like RTL.
> 2) The output is then broken into blocks or nodes, each block ends in
> a CALL, JMP, RET, or 2-way or multiway conditional JMP.
> 3) The blocks or nodes are then analyzed for structure in order to
> extract loop information and if...then...else information.
> 4) Once structure is obtained, data types can be analyzed.
> 5) Lastly, source code is output in C or C++ or whatever is needed.
>
> I was wondering if LLVM could help with any of these steps.
> I am looking at doing step (3) better. Can LLVM help in that area?
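
For reference, step (2) above can be sketched roughly as follows. This is
only a toy illustration: the `(address, mnemonic, target)` tuple format and
the `JCC` mnemonic standing in for a conditional jump are assumptions made
up for the example, not any real disassembler's output, and multiway jumps
are left out.

```python
# Hypothetical toy instruction format: (address, mnemonic, jump_target_or_None).
# "JCC" is a stand-in name for any 2-way conditional jump.
TERMINATORS = {"CALL", "JMP", "RET", "JCC"}


def split_blocks(instructions):
    """Split a linear instruction list into blocks.

    A block ends right after a terminator instruction, or right before an
    instruction that some jump targets.
    """
    # Any address that is the target of a jump must start a new block.
    targets = {t for _, op, t in instructions
               if op in ("JMP", "JCC") and t is not None}

    blocks, current = [], []
    for insn in instructions:
        addr, op, _ = insn
        if current and addr in targets:
            # A jump lands here, so close the block in progress first.
            blocks.append(current)
            current = []
        current.append(insn)
        if op in TERMINATORS:
            blocks.append(current)
            current = []
    if current:
        blocks.append(current)
    return blocks


program = [
    (0, "MOV", None),
    (1, "CMP", None),
    (2, "JCC", 5),
    (3, "ADD", None),
    (4, "JMP", 1),
    (5, "RET", None),
]
blocks = split_blocks(program)
block_addresses = [[addr for addr, _, _ in block] for block in blocks]
```

Here address 1 starts its own block because the `JMP` at address 4 targets
it, even though the preceding `MOV` is not a terminator.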

Recovering structured control flow from a control-flow graph is, to my 
knowledge, a largely solved problem. A few groups have worked on 
decompiling LLVM IR to C; one of the results may be found here: 
<https://bitbucket.org/gnarf/axtor/>.
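
To give a flavor of the kind of analysis involved, here is a minimal sketch
of one textbook technique: compute dominators, then treat every edge whose
target dominates its source as a back edge and collect the natural loop
around it. I am not claiming this is what axtor or any other particular
tool does. The CFG encoding (a dict mapping each block name to its list of
successors, with every block present as a key) and the function names are
assumptions for the example.

```python
def dominators(cfg, entry):
    """Return (dom, preds) for a CFG given as {node: [successors]}.

    dom[n] is the set of nodes that dominate n, computed with the simple
    iterative data-flow formulation.
    """
    nodes = set(cfg)
    preds = {n: set() for n in nodes}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    # Start from "everything dominates everything" and shrink to a fixpoint.
    dom = {n: set(nodes) for n in nodes}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for n in nodes - {entry}:
            pred_doms = [dom[p] for p in preds[n]]
            new = {n} | (set.intersection(*pred_doms) if pred_doms else set())
            if new != dom[n]:
                dom[n] = new
                changed = True
    return dom, preds


def natural_loops(cfg, entry):
    """Map each loop header to the set of nodes in its natural loop(s)."""
    dom, preds = dominators(cfg, entry)
    loops = {}
    for tail, succs in cfg.items():
        for head in succs:
            if head in dom[tail]:  # tail -> head is a back edge
                # Walk predecessors backwards from the tail until the header.
                body = {head}
                stack = [tail]
                while stack:
                    n = stack.pop()
                    if n not in body:
                        body.add(n)
                        stack.extend(preds[n])
                loops.setdefault(head, set()).update(body)
    return loops


# A while-loop shaped CFG: entry -> cond; cond -> body | exit; body -> cond.
cfg = {
    "entry": ["cond"],
    "cond": ["body", "exit"],
    "body": ["cond"],
    "exit": [],
}
loops = natural_loops(cfg, "entry")
```

For this graph the only back edge is body -> cond, so `loops` has a single
header, `cond`, whose loop contains `cond` and `body`. Turning such loops
and the remaining branches into while/if-then-else constructs is the next
step and is not shown here.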

The other issue is that translating machine code to LLVM IR is not 
really feasible right now. LLVM IR is at a much higher level than 
machine code, so the translation would amount to reimplementing much 
of, say, IDA.

-- 
Joshua Cranmer
News submodule owner
DXR coauthor



