<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:SimSun;
panose-1:2 1 6 0 3 1 1 1 1 1;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.25in 1.0in 1.25in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Thanks, I just want to get the conclusion “LLVM IR” is easy to be reverted into source code.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Code
</span>obfuscation is not worth of discussion here, at least it is not IR’s coverage, haha.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">But one more question here, there are some optimization passes are applied in the frontend before generating BC, so it may not easy to revert IR to source code.<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Thanks<br>
Wan Xiaofei<span style="font-size:11.0pt;font-family:"Calibri","sans-serif""><o:p></o:p></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Serge Pavlov [mailto:sepavloff@gmail.com]
<br>
<b>Sent:</b> Tuesday, October 15, 2013 12:01 PM<br>
<b>To:</b> Wan, Xiaofei<br>
<b>Cc:</b> LLVMdev@cs.uiuc.edu; Chris Lattner (sabre@nondot.org)<br>
<b>Subject:</b> Re: [LLVMdev] Reverse engineering for LLVM bit-code<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">LLVM IR represents higher level than assembler code, it keeps some names and it is easier to revert the IR to source code than a binary format.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">The main task of LLVM IR is code generation. I don't think adding obfuscation has particular worth, those who need it can use tools and approaches specifically aimed at obfuscation. Even simple rename of identifiers in source code makes
C/C++ file very difficult to analyze. In other cases one might use anti-debugger tricks or execution code in virtual machine. Everything depends on the level of obfuscation, it is impractical to make LLVM IR a tool for that.<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Thanks,<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">--Serge<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><o:p> </o:p></p>
<div>
<p class="MsoNormal">2013/10/15 Wan, Xiaofei <<a href="mailto:xiaofei.wan@intel.com" target="_blank">xiaofei.wan@intel.com</a>><o:p></o:p></p>
<p class="MsoNormal">HI,<br>
<br>
I am interested in whether LLVM bit-code is ready for a distribution format(stored in software distribution package); is it easy to revert LLVM IR to C/C++ source code like Java byte code? My understanding is that.<br>
1. LLVM IR is more like assembly code, so it is not easy for reverse engineering.<br>
2. If it is easy for reverse engineering, does it mean it is not suitable for distribution format? Otherwise code obfuscation in IR level must be added.<br>
<br>
Thanks<br>
Wan Xiaofei<br>
<br>
_______________________________________________<br>
LLVM Developers mailing list<br>
<a href="mailto:LLVMdev@cs.uiuc.edu">LLVMdev@cs.uiuc.edu</a> <a href="http://llvm.cs.uiuc.edu" target="_blank">
http://llvm.cs.uiuc.edu</a><br>
<a href="http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev" target="_blank">http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev</a><o:p></o:p></p>
</div>
<p class="MsoNormal"><br>
<br clear="all">
<o:p></o:p></p>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<p class="MsoNormal">-- <br>
Thanks,<br>
--Serge<o:p></o:p></p>
</div>
</div>
</body>
</html>