[LLVMdev] How to Find Instruction Encoding for a MachineInstr

Joshua Cranmer pidgeot18 at gmail.com
Tue Oct 23 18:22:30 PDT 2012

On 10/23/2012 1:58 PM, John Criswell wrote:
> Dear All,
> I'm enhancing a MachineFunctionPass that enforces control-flow 
> integrity.  One of the things I want to do is to set the alignment of 
> an instruction (by adding NOPs before it in the MachineBasicBlock or 
> by emitting an alignment directive to the assembler) if it causes a 
> specific sequence of bytes to be generated at a specific alignment.  
> The goal is to ensure that sequences of bytes used to label valid 
> targets of an indirect branch (e.g., a return instruction) do not 
> appear at a given alignment anywhere in a program other than where 
> I inserted them explicitly.
> It looks like MachineInstr has a method for finding the length of the 
> instruction's binary encoding, but I didn't see a method for finding 
> the exact bytes that would be emitted from the MachineInstr.  Is there 
> a way to do this in the MachineFunctionPass/MachineInstr 
> infrastructure, or do I need to use something like the MC classes?

As I recall (I haven't worked this deeply with MachineInstrs for close
to a year), neither the length nor the exact bytes that would be emitted
are necessarily knowable at that point, since some of them depend on
information not available until the final assembly emission pass. An
example here is the x86 jmp instruction: the choice between the short
and near forms (and hence 2 bytes or 5 bytes on x86-64) is not made
until the actual conversion to MCInst and after applying all of the
fixups, which only happens deep within the bowels of the AsmPrinter
pass.
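
To make the jmp example concrete, here is a minimal standalone sketch of how
the encoded bytes depend on the distance to the branch target. It does not use
any LLVM API: encodeJmp is a hypothetical helper invented for illustration, and
it assumes the distance fits in 32 bits. The opcodes (0xEB for the 2-byte form
with an 8-bit displacement, 0xE9 for the 5-byte form with a 32-bit
displacement) are the standard x86 encodings for a direct relative jmp.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of LLVM): encode a direct relative x86 jmp.
// `distance` is the byte offset from the START of the jmp to its target.
// Assumption: the distance fits in a signed 32-bit value.
std::vector<uint8_t> encodeJmp(int64_t distance) {
  // x86 displacements are relative to the END of the instruction, so the
  // displacement itself depends on which encoding gets picked.
  const int64_t rel8 = distance - 2; // the short form is 2 bytes long
  if (rel8 >= INT8_MIN && rel8 <= INT8_MAX)
    return {0xEB, static_cast<uint8_t>(rel8)};

  const int64_t rel32 = distance - 5; // the near form is 5 bytes long
  const uint32_t bits = static_cast<uint32_t>(static_cast<int32_t>(rel32));
  // The 32-bit displacement is stored little-endian after the opcode.
  return {0xE9,
          static_cast<uint8_t>(bits & 0xFF),
          static_cast<uint8_t>((bits >> 8) & 0xFF),
          static_cast<uint8_t>((bits >> 16) & 0xFF),
          static_cast<uint8_t>((bits >> 24) & 0xFF)};
}
```

A target 129 bytes past the start of the jmp still fits the 2-byte form, while
one at 130 bytes forces the 5-byte form. Since the distance is only fixed once
everything in between has been laid out, both the length and the byte pattern
of the jmp stay open until then.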

Joshua Cranmer
News submodule owner
DXR coauthor