[LLVMdev] help converting llvm metadata into dwarf tags

Tue Sep 7 09:59:25 PDT 2010

Hi Devang, and thanks for the tips. I finally managed to fit all the pieces
together into something that seems to work.

It's probably not the best (or even correct!) way of doing it but here's a
brief overview for reference:

An instruction in the LLVM IR gets converted into an SDNode in the DAG then
later into a MachineInstr.
I'd already attached my own MDNodes to IR instructions I was interested in.
I wanted to propagate that info to the final binary.

I added a field to the SelectionDAGBuilder holding the current metadata,
which I update in SelectionDAGISel::SetDebugLoc() for every IR instruction.
Next, in SelectionDAGBuilder::visit(), I transfer the current instruction's
metadata from the DAGBuilder to the instruction's SDNode.
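
As a rough, self-contained sketch of that hand-off (toy types only: ToyMetadata,
ToyInstruction, ToySDNode, ToyMachineInstr and ToyDAGBuilder are hypothetical
stand-ins, not the real LLVM classes, and the function names are made up for
illustration), the "current metadata" field could be modelled like this:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins; none of these are LLVM classes.
struct ToyMetadata { int id; };            // plays the role of an MDNode

struct ToyInstruction {                    // IR level
  std::string opcode;
  const ToyMetadata *md = nullptr;         // null when the instruction is untagged
};

struct ToySDNode {                         // DAG level
  std::string opcode;
  const ToyMetadata *md = nullptr;
};

struct ToyMachineInstr {                   // machine level
  std::string opcode;
  const ToyMetadata *md = nullptr;
};

class ToyDAGBuilder {
  const ToyMetadata *curMD = nullptr;      // the added "current metadata" field
public:
  // Called for every IR instruction, so an untagged one resets the field.
  void setCurrentMetadata(const ToyInstruction &I) { curMD = I.md; }

  // Builds the node and stamps it with whatever metadata is current.
  ToySDNode visit(const ToyInstruction &I) const {
    ToySDNode N;
    N.opcode = I.opcode;
    N.md = curMD;
    return N;
  }
};

// Copies the metadata pointer from the node to the machine instruction.
ToyMachineInstr emitNode(const ToySDNode &N) {
  assert(!N.opcode.empty() && "node must have an opcode");
  ToyMachineInstr MI;
  MI.opcode = N.opcode;
  MI.md = N.md;
  return MI;
}

// Runs the whole chain: IR instruction -> node -> machine instruction.
std::vector<ToyMachineInstr> lower(const std::vector<ToyInstruction> &IR) {
  ToyDAGBuilder B;
  std::vector<ToyMachineInstr> Out;
  for (const ToyInstruction &I : IR) {
    B.setCurrentMetadata(I);
    Out.push_back(emitNode(B.visit(I)));
  }
  return Out;
}
```

Updating the field for every instruction, tagged or not, is what keeps a stale
tag from leaking onto the next node.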

In InstrEmitter::EmitNode() I copy the metadata from the SDNode to the
MachineInstr. DwarfDebug::endModule() creates my user-defined DIE (after
defining my own DW_TAG and DW_AT IDs in Dwarf.h) and adds it to the ModuleCU
(for simplicity I'm adding my DIEs to the module's debug_info section).
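
A minimal sketch of what such a user-defined DIE might look like, assuming toy
types (ToyDIE, ToyCompileUnit and addMarker are hypothetical, not LLVM's DIE
classes). The lo_user/hi_user bounds are the vendor-extension ranges from the
DWARF specification; the two "toy" IDs are arbitrary values picked inside those
ranges for illustration, and a real choice would have to avoid existing vendor
extensions:

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// User-extension ranges and one standard tag, from the DWARF specification.
constexpr unsigned DW_TAG_lo_user = 0x4080, DW_TAG_hi_user = 0xffff;
constexpr unsigned DW_AT_lo_user = 0x2000, DW_AT_hi_user = 0x3fff;
constexpr unsigned DW_TAG_compile_unit = 0x11;

// Hypothetical IDs, chosen arbitrarily inside the user ranges.
constexpr unsigned DW_TAG_toy_marker = 0x6000;
constexpr unsigned DW_AT_toy_marker_id = 0x3d00;

bool isUserTag(unsigned Tag) {
  return Tag >= DW_TAG_lo_user && Tag <= DW_TAG_hi_user;
}

bool isUserAttr(unsigned Attr) {
  return Attr >= DW_AT_lo_user && Attr <= DW_AT_hi_user;
}

// Toy DIE: a tag plus (attribute, value) pairs.
struct ToyDIE {
  unsigned tag;
  std::vector<std::pair<unsigned, std::uint64_t>> attrs;
};

// Toy compile unit that simply owns its child DIEs.
struct ToyCompileUnit {
  std::vector<ToyDIE> children;
};

// Creates one marker DIE and hangs it directly off the compile unit.
void addMarker(ToyCompileUnit &CU, std::uint64_t MarkerId) {
  assert(isUserTag(DW_TAG_toy_marker) && isUserAttr(DW_AT_toy_marker_id));
  ToyDIE Die;
  Die.tag = DW_TAG_toy_marker;
  Die.attrs.emplace_back(DW_AT_toy_marker_id, MarkerId);
  CU.children.push_back(Die);
}
```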

I added a few lines to Dwarf.cpp for emitting the correct name for my new
DW_TAG and DW_AT (useful when looking at commented assembly).
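
The name lookup amounts to one extra case in a switch. A sketch under the same
assumptions (toyTagString and DW_TAG_toy_marker are hypothetical; the two
standard tag values are from the DWARF specification; this is not the actual
code in Dwarf.cpp):

```cpp
#include <string>

constexpr unsigned DW_TAG_compile_unit = 0x11;  // standard DWARF value
constexpr unsigned DW_TAG_subprogram = 0x2e;    // standard DWARF value
constexpr unsigned DW_TAG_toy_marker = 0x6000;  // hypothetical user-range tag

// Maps a tag ID to the name printed in assembly comments.
// Returns an empty string for tags this toy table does not know.
std::string toyTagString(unsigned Tag) {
  switch (Tag) {
  case DW_TAG_compile_unit: return "DW_TAG_compile_unit";
  case DW_TAG_subprogram:   return "DW_TAG_subprogram";
  case DW_TAG_toy_marker:   return "DW_TAG_toy_marker";  // the added line
  default:                  return "";
  }
}
```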

Finally, in AsmPrinter::processDebugLoc() I grab the metadata from the
MachineInstr and pass it to the DwarfWriter to add it to its DwarfDebug
member. When the assembly is emitted, the debug_info section contains my new
dwarf DIE which I've managed to retrieve from the binary with a dwarf
consumer.
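
Roughly, the pattern is "record while printing, emit at end of module". A toy
sketch of it follows; ToyDwarfDebug, recordMarker, printFunction and the marker
tag value are all hypothetical names, and how the instruction address ends up
in the DIE is deliberately not modelled:

```cpp
#include <cstdint>
#include <string>
#include <vector>

constexpr unsigned DW_TAG_toy_marker = 0x6000;  // hypothetical user-range tag

struct ToyMachineInstr {
  std::string text;        // assembly text of the instruction
  bool hasMarker;          // true if custom metadata is attached
  std::uint64_t markerId;  // only meaningful when hasMarker is true
};

struct ToyMarkerDIE {
  unsigned tag;
  std::uint64_t markerId;
};

class ToyDwarfDebug {
  std::vector<std::uint64_t> pending;  // markers seen during printing
public:
  void recordMarker(std::uint64_t Id) { pending.push_back(Id); }

  // Turns every recorded marker into a DIE, then forgets the records.
  std::vector<ToyMarkerDIE> endModule() {
    std::vector<ToyMarkerDIE> Dies;
    for (std::uint64_t Id : pending)
      Dies.push_back(ToyMarkerDIE{DW_TAG_toy_marker, Id});
    pending.clear();
    return Dies;
  }
};

// Prints each instruction and, as a side effect, records tagged ones.
std::string printFunction(const std::vector<ToyMachineInstr> &MIs,
                          ToyDwarfDebug &DD) {
  std::string Out;
  for (const ToyMachineInstr &MI : MIs) {
    if (MI.hasMarker)
      DD.recordMarker(MI.markerId);
    Out += "\t" + MI.text + "\n";
  }
  return Out;
}
```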

cheers,
rw.

On 24 August 2010 02:46, Devang Patel <devang.patel at gmail.com> wrote:

> Hi Roger,
>
> On Mon, Aug 23, 2010 at 4:01 PM, Roger Wang <innit42 at gmail.com> wrote:
>
>> Dear all,
>>
>> I'd like to find the memory location of certain instructions in a
>> compiled/linked binary. During the IR phase, I tag instructions I'm
>> interested in with LLVM 2.7's new metadata (MDNodes with an identifiable
>> ID). I'd now like to propagate that data to the assembly via a custom DWARF
>> tag I attach to each X86 instruction created from a tagged IR instruction.
>> This will then find its way at assembly time into the binary from where I
>> retrieve it (by locating my custom tags with a DWARF consumer and dumping
>> the addresses of the instruction they're attached to).
>>
>> Does this sound reasonable?
>>
>> I've completed the first part, attaching the MDNodes to IR
>> instructions but I'm a bit overwhelmed by all the backend stuff.
>>
>> How can I identify which IR instruction an X86 instruction came from (with
>> a view to attaching an identifying DW_TAG to it)?
>>
>
> 1) LLVM IR instructions are converted into MachineInstrs during instruction
> selection. At this point you need to transfer your custom metadata to the
> MachineInstr. See how DebugLoc is transferred (in the CodeGen/SelectionDAG
> directory).
>
> 2) At AsmPrint time, while emitting your assembly instructions, you have
> access to the corresponding MachineInstr and any custom metadata attached
> to it.
>
>>
>> I've found the tag definitions in include/llvm/Support/Dwarf.h and added
>> my own.
>> lib/CodeGen/AsmPrinter/DwarfDebug.cpp seems to be the only place that
>> emits dwarf data into the assembly stream.
>>
>
> See how DwarfDebug.cpp handles the DebugLoc attached to each instruction
> (::beginScope() and ::endScope()).
>
>
>> It also seems to create a DebugInfoFinder which accesses the IR
>> instructions.
>>
>
> This path will allow you to browse entire function and collect info which
> you can use later on.
>
> -
> Devang
>