[LLVMdev] Inspecting target-specific opcodes in machine function pass

Mon Mar 9 10:05:36 PDT 2015

There has been previous NOP insertion work, for example:
http://reviews.llvm.org/D3392
--paulr

From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Peter Edelstein
Sent: Sunday, March 08, 2015 1:57 PM
To: Tim Northover
Cc: LLVM Developers Mailing List
Subject: Re: [LLVMdev] Inspecting target-specific opcodes in machine function pass

By the way, everything mentioned in my last mail is of course X86-specific, because that is what I am currently targeting.

Cheers

2015-03-08 21:49 GMT+01:00 Peter Edelstein <peter.c.edelstein at gmail.com>:
Hello,

Thank you very much for answering. I am trying to do the following: get the encoding of each instruction and, if that encoding contains a C3 byte (the X86 near-return opcode), insert a NOP instruction (or multiple NOP instructions, or any other instructions) before that instruction. The idea behind this is to protect against ROP (Return Oriented Programming) attacks. By inserting a NOP, the attacker can no longer abuse alignment to get a useful gadget. I thought a machine function pass would be sufficient to accomplish this. However, I now realize that this was probably a rather naive thought.
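To make the idea concrete, here is a minimal sketch of the scan-and-pad step on its own. It assumes the instruction bytes are already available, and it does not address how to obtain them inside a pass. The names Encoding, containsRetByte and padRetBytes are made up for illustration and are not LLVM APIs.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model: one already-encoded instruction is just its bytes.
using Encoding = std::vector<std::uint8_t>;

constexpr std::uint8_t kRetByte = 0xC3; // X86 near RET opcode
constexpr std::uint8_t kNopByte = 0x90; // X86 one-byte NOP

// Returns true if any byte of the encoding equals 0xC3.
bool containsRetByte(const Encoding &Enc) {
  for (std::uint8_t B : Enc)
    if (B == kRetByte)
      return true;
  return false;
}

// Builds a new instruction stream with a single NOP placed before every
// instruction whose encoding contains a 0xC3 byte.
std::vector<Encoding> padRetBytes(const std::vector<Encoding> &Insts) {
  std::vector<Encoding> Out;
  for (const Encoding &Enc : Insts) {
    if (containsRetByte(Enc))
      Out.push_back(Encoding{kNopByte});
    Out.push_back(Enc);
  }
  return Out;
}
```

For example, "mov ebx, eax" encodes as 89 C3 and would get a NOP in front of it, while "add eax, ecx" (01 C8) would be left alone. Note that a real RET (a lone C3) matches as well, so an actual pass would probably want to skip those.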

Can you think of any other approaches for achieving this?

Cheers.

2015-03-08 21:21 GMT+01:00 Tim Northover <t.p.northover at gmail.com>:
Hi,

> As the comment suggests I want to inspect the target-specific opcode of each
> instruction. By opcode I mean the actual machine code (=encoding of that
> instruction as an array of bytes), not the integer descriptor returned by
> I->getOpcode().

That's not generally possible. The best you might be able to do would
be to lower each MachineInstr to an MCInst and emit its encoding via
an MCCodeEmitter. But that has quite a few problems:

  + There's virtually no hope before register allocation. You'd
probably get an assertion if you're lucky.
  + Many targets don't have really separate "opcode" and "operand"
bytes. Just 32-bit (say) instructions that have
  + A MachineInstr can become 0, 1, or many real target instructions.
  + What you get back may or may not be what's emitted finally anyway.
  + It's slow and redundant work.

What are you trying to do with the opcode? There may be a better way.

Cheers.

Tim.
