<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
{font-family:"MS Gothic";
panose-1:2 11 6 9 7 2 5 8 2 4;}
@font-face
{font-family:"MS Gothic";
panose-1:2 11 6 9 7 2 5 8 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:"\@MS Gothic";
panose-1:2 11 6 9 7 2 5 8 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0in;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.EmailStyle17
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";}
@page WordSection1
{size:8.5in 11.0in;
margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="EN-US" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">I can imagine a machine-IR pass that could construct a CFG shortly before starting the AsmPrinter phase; I am pretty sure there is nothing that would affect
the CFG after that point. I don't know whether such an analysis exists currently. I also don't know whether it would be straightforward to map an LLVM IR CFG to the machine-IR CFG; maybe not, as there are certainly machine-IR passes that do things like splitting
and merging blocks.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Deriving final binary addresses for blocks in the CFG would require that you track or insert labels, and then examine the final binary to determine the addresses
for each of those labels. I am unclear how you would correlate these values with the CFG, however.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">--paulr<o:p></o:p></span></p>
<p class="MsoNormal"><a name="_MailEndCompose"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></a></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Muhui Jiang [mailto:jiangmuhui@gmail.com]
<br>
<b>Sent:</b> Wednesday, June 13, 2018 11:12 AM<br>
<b>To:</b> Robinson, Paul<br>
<b>Cc:</b> mayuyu.io; llvm-dev<br>
<b>Subject:</b> Re: [llvm-dev] IR to binary address mapping<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">Hi Paul<o:p></o:p></p>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Thanks for your comments. Suppose I can generate the control flow graph via LLVM Pass or the default option like '-dot-cfg' with opt. However, the control flow graph is based on llvm IR level. I would like to have a control flow graph based
on binary level. Thus, I want to map the IR to binary address. <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">As far as I know, we used to use the debug information to map the IR to source code and then use the dwarf line mapping table to map to binary address. However, I come across many problems. For example, dwarf mapping table's information
is not complete. Sometimes the line number could even be zero. Besides, one line and column number could map to more than one binary address. Thus, I may need the mapping from IR to binary to give me a control flow graph on binary level. Do you have any comments
or solutions? Many Thanks<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal">Regards<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal">Muhui<o:p></o:p></p>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<p class="MsoNormal">2018-06-13 23:05 GMT+08:00 <<a href="mailto:paul.robinson@sony.com" target="_blank">paul.robinson@sony.com</a>>:<o:p></o:p></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">We preserve the source line/column for two reasons. First, so that any compiler diagnostic messages
can point to a source location that caused the diagnostic; second, because debugging information in the final binary wants to be able to map machine instruction addresses back to source locations. There is never any need for the end-user to map machine instruction
addresses back to IR instructions, so we don't maintain any information that could produce such a mapping.</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">--paulr</span><o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><a name="m_-1688809287284250986__MailEndCompose"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"> </span></a><o:p></o:p></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0in 0in 0in 4.0pt">
<div>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> llvm-dev [mailto:<a href="mailto:llvm-dev-bounces@lists.llvm.org" target="_blank">llvm-dev-bounces@lists.llvm.org</a>]
<b>On Behalf Of </b>Muhui Jiang via llvm-dev<br>
<b>Sent:</b> Wednesday, June 13, 2018 3:09 AM<br>
<b>To:</b> <a href="http://mayuyu.io" target="_blank">mayuyu.io</a><br>
<b>Cc:</b> llvm-dev<br>
<b>Subject:</b> Re: [llvm-dev] IR to binary address mapping</span><o:p></o:p></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Hi<o:p></o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">However, frontend may also do various operations on the source code and one line number and column number could map to more than one binary address. Why LLVM IR cannot?<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Regrads<o:p></o:p></p>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">Muhui<o:p></o:p></p>
</div>
</div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">2018-06-12 23:18 GMT+08:00
<a href="http://mayuyu.io" target="_blank">mayuyu.io</a> <<a href="mailto:admin@mayuyu.io" target="_blank">admin@mayuyu.io</a>>:<o:p></o:p></p>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto">In theory that’s not exactly possible/accurate. Due to various operations in the Backend like Instruction Legalization, one IR instruction might got emitted into multiple assembly
instruction, for example<br>
<br>
Zhang<o:p></o:p></p>
<div>
<div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"><br>
> <span style="font-family:"MS Gothic"">在</span> 2018<span style="font-family:"MS Gothic"">年</span>6<span style="font-family:"MS Gothic"">月</span>12<span style="font-family:"MS Gothic"">日,</span>22:30<span style="font-family:"MS Gothic"">,</span>Muhui Jiang
via llvm-dev <<a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a>>
<span style="font-family:"MS Gothic"">写道:</span><br>
> <br>
> Hi<br>
> <br>
> I know that LLVM provide some debug API for us to know the source code information. For example, every IR instruction's source line number and column number.
<br>
> <br>
> However, are there any method to get a mapping from IR instruction to binary address directly. I don't want to use dwarf line mapping table as a bridge. I think the binary is generated by clang and llvm. I think there definitely is some information about
the mapping relationship between LLVM IR and the target binary address. Do anyone has suggestions? Many Thanks<br>
> <br>
> Regards<br>
> Muhui<o:p></o:p></p>
</div>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;margin-bottom:12.0pt">> _______________________________________________<br>
> LLVM Developers mailing list<br>
> <a href="mailto:llvm-dev@lists.llvm.org" target="_blank">llvm-dev@lists.llvm.org</a><br>
> <a href="http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev" target="_blank">
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev</a><o:p></o:p></p>
</div>
<p class="MsoNormal" style="mso-margin-top-alt:auto;mso-margin-bottom-alt:auto"> <o:p></o:p></p>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</div>
</div>
</body>
</html>