[llvm-dev] Writing a Pass in LLVM MC (Machine Code) level to Analyze Assembly Code

Aaron Smith via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 26 23:00:52 PST 2019


The MC layer doesn’t have passes. There is a streamer method called emitInstruction(), which is called once per instruction, in order, with the MCInst to emit.

In the past I have accomplished what you’d like by overriding the methods in MCObjectStreamer to buffer all the MCInsts for a function, then doing the analysis on the buffered instructions.
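
A minimal sketch of that buffering pattern follows. It is a toolchain-free model, not LLVM code: `Inst`, `Streamer`, `BufferingStreamer`, and `runExample` are hypothetical stand-ins for MCInst and the streamer classes, and the "analysis" (counting `syscall` instructions) is only an illustration. It also assumes every label starts a new function, which real assembly does not guarantee. In a real build you would subclass the LLVM streamer and override its per-instruction hook, whose exact signature differs between LLVM releases.

```cpp
// Stand-in model of "buffer the instructions of a function, then analyze".
// None of these types are LLVM classes; they only mimic the call pattern.
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an MCInst: a mnemonic plus operand strings.
struct Inst {
  std::string Mnemonic;
  std::vector<std::string> Operands;
};

// Hypothetical stand-in for the streamer interface: the parser calls it
// once per label and once per instruction, in program order.
class Streamer {
public:
  virtual ~Streamer() = default;
  virtual void emitLabel(const std::string &Name) = 0;
  virtual void emitInstruction(const Inst &I) = 0;
  virtual void finish() = 0;
};

// Buffers instructions per function and analyzes the buffer whenever the
// next label (assumed to start a new function) or the end of input arrives.
class BufferingStreamer : public Streamer {
  std::string CurrentFunction;
  std::vector<Inst> Buffer;
  std::map<std::string, std::size_t> SyscallCounts;

  // Example analysis: count `syscall` instructions in the buffered function.
  // Instructions seen before any label are dropped without being analyzed.
  void analyzeAndFlush() {
    if (!CurrentFunction.empty()) {
      std::size_t N = 0;
      for (const Inst &I : Buffer)
        if (I.Mnemonic == "syscall")
          ++N;
      SyscallCounts[CurrentFunction] = N;
    }
    Buffer.clear();
  }

public:
  void emitLabel(const std::string &Name) override {
    analyzeAndFlush(); // the previous function is complete
    CurrentFunction = Name;
  }

  void emitInstruction(const Inst &I) override { Buffer.push_back(I); }

  void finish() override {
    analyzeAndFlush(); // analyze the last function
    CurrentFunction.clear();
  }

  const std::map<std::string, std::size_t> &syscallCounts() const {
    return SyscallCounts;
  }
};

// Feeds a tiny hand-written listing through the streamer, the way an
// assembly parser would, and returns the per-function results.
std::map<std::string, std::size_t> runExample() {
  BufferingStreamer S;
  S.emitLabel("do_write");
  S.emitInstruction({"mov", {"$1", "%rax"}});
  S.emitInstruction({"syscall", {}});
  S.emitInstruction({"ret", {}});
  S.emitLabel("do_nothing");
  S.emitInstruction({"ret", {}});
  S.finish();
  return S.syscallCounts();
}
```

Because the analysis runs only once a whole function is buffered, it can look forward and backward across instructions, which a purely per-instruction callback cannot do.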

Here’s a link about how instructions are lowered which might shed some light on how all this works.

https://eli.thegreenplace.net/2012/11/24/life-of-an-instruction-in-llvm



> On Nov 27, 2019, at 5:51 AM, Lele Ma <lelema.cn at gmail.com> wrote:
> 
> 
> Hi All,
> 
> A self-follow up and rephrase of my previous question with updated subject:
> 
> What I want to do is to analyze hand-written assembly code in 'full detail', where the semantics of each instruction can be known in LLVM passes. Many such instructions have no corresponding counterparts in IR/MIR form, such as 'syscall', 'iret', etc. At the MC level, such assembly code can be translated to MCInst easily, since this level is closest to the assembly code. Therefore, I am thinking of writing a pass at the MC level instead of IR/MIR.
> 
> However, while searching for how to write MC-level passes, I cannot find any related classes in the LLVM infrastructure (such as FunctionPass at the IR level or MachineFunctionPass at the MIR level). Could anyone direct me to where I should start in writing an MC-level pass?
> 
> Best Regards,
> Lele
> 
> 
>> On Mon, Nov 25, 2019 at 5:24 PM Lele Ma <lelema.cn at gmail.com> wrote:
>> Thank you for the instructions, Aaron and Nicolai!
>> 
>> Raising a binary to LLVM IR, or raising to MIR, is a reasonable solution for me. However, given Nicolai's information that not all target-specific instructions are representable in MIR, I have two questions that need your help:
>> 
>> 1. Why does MIR not necessarily represent all target-specific instructions for certain hardware? If someone added that support, would this violate some design principles of MIR?
>> 
>> 2. Instead of IR/MIR raising, I am wondering whether a third path is possible to solve the problem of analyzing assembly code:
>>     - write a simple LLVM pass in the `MC` layer to process information not available in MIR/IR, and
>>     - pass the analysis results from the IR/MIR passes to the MC layer pass, where we can enhance them with the missing representations.
>> So the second question is whether it is possible to write passes directly in the MC layer? If so, is there any documentation or example for that?
>> 
>> 
>> Thank you in advance!
>> 
>> Best Regards,
>> Lele
>> 
>> 
>>> On Mon, Nov 25, 2019 at 9:15 AM Aaron Smith <aaron.lee.smith at gmail.com> wrote:
>>> Llvm-mctoll will raise a binary back to LLVM IR. 
>>> Not exactly what you want but it might be something you can leverage.
>>> 
>>> https://github.com/microsoft/llvm-mctoll 
>>> 
>>>> On Mon, Nov 25, 2019 at 1:19 PM Nicolai Hähnle via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>> On Thu, Nov 21, 2019 at 3:37 AM Lele Ma via llvm-dev
>>>> <llvm-dev at lists.llvm.org> wrote:
>>>> > My goal is to write LLVM Machine IR (MIR) passes to analyze the assembly source code. But it seems I need to find a way to translate the handwritten assembly code into MIR format first.
>>>> >
>>>> > Are there any materials, or any modules in the LLVM source code, that can help to translate assembly code into LLVM MIR for analysis?
>>>> >
>>>> > Or are there any easier ways to analyze assembly code in MIR passes without translating it?
>>>> 
>>>> MachineIR is designed for code generation, not for general assembly
>>>> representation. MIR is not even necessarily able to represent all
>>>> assembly instructions that a target's hardware supports. The
>>>> disassembler produces MCInsts, and if you wanted to go from there back
>>>> to MachineIR, you'd have to write your own target-specific code to do
>>>> so.
>>>> 
>>>> Cheers,
>>>> Nicolai
>>>> 
>>>> 
>>>> 
>>>> >
>>>> > Best Regards,
>>>> > Lele Ma
>>>> >
>>>> >
>>>> > _______________________________________________
>>>> > LLVM Developers mailing list
>>>> > llvm-dev at lists.llvm.org
>>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Learn how the world really is,
>>>> but never forget how it should be.