[llvm-dev] IR to binary address mapping

Muhui Jiang via llvm-dev llvm-dev at lists.llvm.org
Wed Jun 13 09:09:34 PDT 2018


Hi

It’s midnight in my time zone, so I will reply briefly from my phone now
and may give you more information tomorrow.

First, thanks for your comments. I will search to see whether such a
machine-IR pass already exists.

Second, here is how I derive a basic block’s binary address. I first use an
LLVM pass to get the source line number and column number for every
instruction inside the block. Then I query the DWARF line table to get the
binary address of each instruction. At first I used the first instruction’s
address as the block’s start address, but I found some exceptions. Then I
tried taking the lowest address among all the instructions as the block’s
address. However, there are still some exceptions. Thus, I don’t know how
to generate a precise binary-level control flow graph.
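
To illustrate where the exceptions come from, here is a minimal Python
sketch of the two heuristics. The line-table rows are made up for
illustration, and the helper names (index_line_table,
block_address_candidates) are hypothetical; they are not part of LLVM or
of any DWARF library. In practice the rows would come from a dump of the
binary's line table.

```python
from collections import defaultdict

# Hypothetical line-table rows: (address, line, column). Real rows could be
# taken from a dump such as `llvm-dwarfdump --debug-line`; these are made up.
LINE_TABLE = [
    (0x401000, 10, 3),
    (0x401004, 11, 7),
    (0x401010, 0, 0),    # line 0: no usable source location
    (0x401018, 11, 7),   # the same (line, column) at a second address
    (0x401020, 12, 5),
]


def index_line_table(rows):
    """Map each (line, column) to the sorted list of addresses that carry it."""
    index = defaultdict(list)
    for address, line, column in rows:
        if line == 0:
            continue  # line-0 rows cannot be matched to an IR debug location
        index[(line, column)].append(address)
    return {loc: sorted(addrs) for loc, addrs in index.items()}


def block_address_candidates(block_locs, index):
    """Apply both heuristics to one block.

    block_locs is the list of (line, column) pairs of the block's IR
    instructions, in IR order. Returns a tuple of:
      - the lowest address of the first instruction's location,
      - the lowest address over all of the block's locations,
      - the locations that map to more than one address.
    """
    all_addrs = []
    ambiguous = []
    for loc in block_locs:
        addrs = index.get(loc, [])
        if len(addrs) > 1:
            ambiguous.append(loc)
        all_addrs.extend(addrs)
    first = index.get(block_locs[0], []) if block_locs else []
    first_guess = first[0] if first else None
    lowest_guess = min(all_addrs) if all_addrs else None
    return first_guess, lowest_guess, ambiguous
```

With these made-up rows, a block whose instructions carry (11, 7) and then
(10, 3) gets 0x401004 from the first heuristic and 0x401000 from the
second, and (11, 7) is reported as ambiguous because it appears at two
addresses. Neither guess is guaranteed to be the real start of the block.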

Regards
Muhui

On Wed, Jun 13, 2018 at 11:53 PM, <paul.robinson at sony.com> wrote:

> I can imagine a machine-IR pass that could construct a CFG shortly before
> starting the AsmPrinter phase; I am pretty sure there is nothing that would
> affect the CFG after that point.  I don't know whether such an analysis
> exists currently.  I also don't know whether it would be straightforward to
> map an LLVM IR CFG to the machine-IR CFG; maybe not, as there are certainly
> machine-IR passes that do things like splitting and merging blocks.
>
> Deriving final binary addresses for blocks in the CFG would require that
> you track or insert labels, and then examine the final binary to determine
> the addresses for each of those labels.  I am unclear how you would
> correlate these values with the CFG, however.
>
> --paulr
>
>
>
> *From:* Muhui Jiang [mailto:jiangmuhui at gmail.com]
> *Sent:* Wednesday, June 13, 2018 11:12 AM
> *To:* Robinson, Paul
> *Cc:* mayuyu.io; llvm-dev
>
>
> *Subject:* Re: [llvm-dev] IR to binary address mapping
>
>
>
> Hi Paul
>
>
>
> Thanks for your comments. I can generate a control flow graph with an
> LLVM pass or with a built-in opt option such as '-dot-cfg'. However, that
> control flow graph is at the LLVM IR level. I would like to have a control
> flow graph at the binary level, so I want to map the IR to binary
> addresses.
>
>
>
> As far as I know, the usual approach is to use the debug information to
> map the IR to source code and then use the DWARF line table to map to
> binary addresses. However, I have come across many problems. For example,
> the DWARF line table is incomplete; sometimes the line number is even
> zero. Besides, one line and column number can map to more than one binary
> address. Thus, I may need a direct mapping from IR to binary to get a
> control flow graph at the binary level. Do you have any comments or
> solutions? Many thanks.
>
>
>
> Regards
>
> Muhui
>
> 2018-06-13 23:05 GMT+08:00 <paul.robinson at sony.com>:
>
> We preserve the source line/column for two reasons.  First, so that any
> compiler diagnostic messages can point to a source location that caused the
> diagnostic; second, because debugging information in the final binary wants
> to be able to map machine instruction addresses back to source locations.
> There is never any need for the end-user to map machine instruction
> addresses back to IR instructions, so we don't maintain any information
> that could produce such a mapping.
>
> --paulr
>
>
>
> *From:* llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] *On Behalf Of *Muhui
> Jiang via llvm-dev
> *Sent:* Wednesday, June 13, 2018 3:09 AM
> *To:* mayuyu.io
> *Cc:* llvm-dev
> *Subject:* Re: [llvm-dev] IR to binary address mapping
>
>
>
> Hi
>
>
>
> However, the frontend may also perform various operations on the source
> code, and one line number and column number can map to more than one
> binary address. Why can't LLVM IR be mapped in the same way?
>
>
>
> Regards
>
> Muhui
>
>
>
> 2018-06-12 23:18 GMT+08:00 mayuyu.io <admin at mayuyu.io>:
>
> In theory that’s not exactly possible or accurate. Because of various
> backend operations such as instruction legalization, one IR instruction
> might be emitted as multiple assembly instructions, for example.
>
> Zhang
>
>
> > On Jun 12, 2018, at 22:30, Muhui Jiang via llvm-dev
> > <llvm-dev at lists.llvm.org> wrote:
> >
> > Hi
> >
> > I know that LLVM provides debug APIs that expose source code
> > information, for example every IR instruction's source line number and
> > column number.
> >
> > However, is there any way to get a mapping from IR instructions to
> > binary addresses directly? I don't want to use the DWARF line table as
> > a bridge. Since the binary is generated by clang and LLVM, I think there
> > must be some information about the mapping between LLVM IR and the
> > target binary addresses. Does anyone have suggestions? Many thanks.
> >
> > Regards
> > Muhui
>