[llvm-dev] Writing a Pass in LLVM MC (Machine Code) level to Analyze Assembly Code

Lele Ma via llvm-dev llvm-dev at lists.llvm.org
Tue Nov 26 21:50:48 PST 2019


Hi All,

This is a follow-up to, and a rephrasing of, my previous question, with an
updated subject:

What I want to do is analyze hand-written assembly code in full detail, so
that the semantics of each instruction are available to LLVM passes. Many
such instructions, for example 'syscall' and 'iret', have no counterpart in
IR or MIR. At the MC level, such assembly code can be translated to MCInst
easily, since this level is closest to the assembly code. Therefore, I am
thinking of writing a pass at the MC level instead of at the IR/MIR level.
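
To make the goal concrete, here is a minimal sketch of the kind of analysis
I have in mind. It is plain C++ with no LLVM dependency and rests on stated
assumptions: `Inst` is a hypothetical stand-in for a decoded instruction
(the role MCInst would play), and `findSpecialInsts` is a made-up name, not
an LLVM API.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a decoded instruction. In LLVM this role would
// be played by MCInst; the fields here are simplified for illustration.
struct Inst {
  std::string Mnemonic;
  std::vector<std::string> Operands;
};

// Hypothetical "pass": walks an instruction sequence in order and returns
// the indices of instructions of the kind mentioned above ('syscall',
// 'iret'). The mnemonic list is an illustrative assumption, not complete.
std::vector<std::size_t> findSpecialInsts(const std::vector<Inst> &Insts) {
  static const std::unordered_set<std::string> Special = {"syscall", "iret"};
  std::vector<std::size_t> Hits;
  for (std::size_t I = 0; I < Insts.size(); ++I) {
    if (Special.count(Insts[I].Mnemonic) != 0)
      Hits.push_back(I);
  }
  return Hits;
}
```

Given a sequence such as 'mov', 'syscall', 'iret', the function would report
the positions of the last two instructions. A real analysis would of course
need per-target instruction semantics rather than a mnemonic match.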

However, when searching for how to write MC-level passes, I cannot find any
corresponding classes in the LLVM infrastructure (such as FunctionPass at
the IR level or MachineFunctionPass at the MIR level). Could anyone point me
to where I should start in order to write an MC-level pass?

Best Regards,
Lele


On Mon, Nov 25, 2019 at 5:24 PM Lele Ma <lelema.cn at gmail.com> wrote:

> Thank you for the instructions, Aaron and Nicolai!
>
> Raising a binary to LLVM IR, or raising to MIR, is a reasonable solution
> for me. However, given Nicolai's information that not all target-specific
> instructions are representable in MIR, I have two questions that I need
> your help with:
>
> 1. Why does MIR not necessarily represent all target-specific instructions
> for a given piece of hardware? If someone added that support, would it
> violate some design principle of MIR?
>
> 2. Instead of raising to IR/MIR, I am wondering whether a third path is
> possible for analyzing assembly code:
>     - write a simple LLVM pass in the `MC` layer to process information
>       not available in MIR/IR, and
>     - pass analysis results from the IR/MIR passes to the MC-layer pass,
>       where we can enhance them with the missing representations.
> So the second question is whether it is possible to write passes directly
> in the MC layer. If so, is there any documentation or an example for that?
>
>
> Thank you in advance!
>
> Best Regards,
> Lele
>
>
> On Mon, Nov 25, 2019 at 9:15 AM Aaron Smith <aaron.lee.smith at gmail.com>
> wrote:
>
>> Llvm-mctoll will raise a binary back to LLVM IR.
>> Not exactly what you want but it might be something you can leverage.
>>
>> https://github.com/microsoft/llvm-mctoll
>>
>> On Mon, Nov 25, 2019 at 1:19 PM Nicolai Hähnle via llvm-dev <
>> llvm-dev at lists.llvm.org> wrote:
>>
>>> On Thu, Nov 21, 2019 at 3:37 AM Lele Ma via llvm-dev
>>> <llvm-dev at lists.llvm.org> wrote:
>>> > My goal is to write LLVM Machine IR (MIR) passes to analyze
>>> > assembly source code. But it seems I first need to find a way to
>>> > translate the handwritten assembly code into MIR format.
>>> >
>>> > Are there any materials, or any modules in the LLVM source code, that
>>> > can help translate assembly code into LLVM MIR for analysis?
>>> >
>>> > Or is there an easier way to analyze assembly code in MIR passes
>>> > without translating it?
>>>
>>> MachineIR is designed for code generation, not for general assembly
>>> representation. MIR is not even necessarily able to represent all
>>> assembly instructions that a target's hardware supports. The
>>> disassembler produces MCInsts, and if you wanted to go from there back
>>> to MachineIR, you'd have to write your own target-specific code to do
>>> so.
>>>
>>> Cheers,
>>> Nicolai
>>>
>>>
>>>
>>> >
>>> > Best Regards,
>>> > Lele Ma
>>> >
>>> >
>>> > _______________________________________________
>>> > LLVM Developers mailing list
>>> > llvm-dev at lists.llvm.org
>>> > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>
>>>
>>>
>>> --
>>> Learn what the world is really like,
>>> but never forget how it ought to be.
>>>
>>

