[LLVMdev] How to Find Instruction Encoding for a MachineInstr

Sergei Larin slarin at codeaurora.org
Wed Oct 24 13:39:26 PDT 2012


Jim and everyone else,

   I have a somewhat related question then. I have a mechanism similar to
the x86-64 one for handling long jumps on Hexagon.
This means that on some occasions my jump instruction is 4 bytes long, on
others 8 bytes. It is important for reasons other than trampoline insertion
to know which version I am dealing with. Bundling is a prime example -
an extended address affects the number of instructions in a bundle. I can
compute an estimated jump distance every time I need it based on the
available CFG layout, but... it is rather expensive, and the test comes up
rather often.
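
For illustration, the distance estimate described above can be sketched
outside of LLVM as a prefix sum over block sizes followed by a range check.
This is only a toy model: BlockInfo, computeBlockOffsets, needsExtendedBranch
and the RangeBits parameter are hypothetical names, not LLVM or Hexagon API,
and the reach of the short form is left as a parameter rather than stated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model, not an LLVM class: a block only records its size.
struct BlockInfo {
  uint32_t SizeInBytes;
};

// Prefix sums: Offsets[i] is the start of block i relative to the
// start of the function, given the current layout order.
std::vector<uint32_t> computeBlockOffsets(const std::vector<BlockInfo> &Blocks) {
  std::vector<uint32_t> Offsets(Blocks.size());
  uint32_t Offset = 0;
  for (std::size_t I = 0; I < Blocks.size(); ++I) {
    Offsets[I] = Offset;
    Offset += Blocks[I].SizeInBytes;
  }
  return Offsets;
}

// Returns true if a branch at BranchOffset cannot reach block Target with
// a signed displacement of RangeBits bits (RangeBits is an assumption
// supplied by the caller, not a documented Hexagon value).
bool needsExtendedBranch(const std::vector<uint32_t> &Offsets,
                         uint32_t BranchOffset, std::size_t Target,
                         unsigned RangeBits) {
  assert(Target < Offsets.size() && RangeBits > 0 && RangeBits < 63);
  const int64_t Distance =
      static_cast<int64_t>(Offsets[Target]) - static_cast<int64_t>(BranchOffset);
  const int64_t Max = (int64_t(1) << (RangeBits - 1)) - 1;
  const int64_t Min = -(int64_t(1) << (RangeBits - 1));
  return Distance < Min || Distance > Max;
}
```

Note that block sizes themselves depend on which branches end up extended,
so an estimate like this has to be either conservative or recomputed until
it stops changing. That is where the cost comes from.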

  Here is the question - given the current infrastructure, how can I
tag/mark a given branch instruction as having "extended" address mode
without introducing a new opcode for it?
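
One possible shape of such a tag, sketched here as a standalone toy rather
than as LLVM code, is a flag bit stored next to the opcode, so the
(expensive) distance test runs once and later queries only read the bit.
ToyBranch, BranchFlags, setExtended, isExtended and sizeInBytes are all
hypothetical names. In LLVM itself, the target flags on a non-register
MachineOperand (getTargetFlags/setTargetFlags) might be one place to keep a
bit like this, but whether that suits Hexagon is an assumption, not
something verified here.

```cpp
#include <cstdint>

// Hypothetical flag bits; not taken from any LLVM header.
enum BranchFlags : uint8_t {
  BF_None = 0,
  BF_Extended = 1u << 0, // branch uses the 8-byte extended form
};

// Toy stand-in for a branch instruction. The opcode is never changed
// by tagging; only the side flag is.
struct ToyBranch {
  unsigned Opcode = 0;
  uint8_t Flags = BF_None;
};

// Record the result of the distance test on the instruction itself.
inline void setExtended(ToyBranch &B, bool Extended) {
  if (Extended)
    B.Flags = static_cast<uint8_t>(B.Flags | BF_Extended);
  else
    B.Flags = static_cast<uint8_t>(B.Flags & ~BF_Extended);
}

// Cheap query for later passes (e.g. a bundler).
inline bool isExtended(const ToyBranch &B) {
  return (B.Flags & BF_Extended) != 0;
}

// 4 bytes for the normal form, 8 bytes for the extended form,
// matching the two sizes mentioned above.
inline unsigned sizeInBytes(const ToyBranch &B) {
  return isExtended(B) ? 8u : 4u;
}
```

A cached bit like this goes stale whenever the layout changes, so any pass
that moves or resizes blocks would have to recompute it.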

  There are several unintuitive side effects, so if someone has already
solved this - I would love to know how.
Thanks.

Sergei

---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by
The Linux Foundation

> -----Original Message-----
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Jim Grosbach
> Sent: Wednesday, October 24, 2012 1:52 PM
> To: Joshua Cranmer
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] How to Find Instruction Encoding for a
> MachineInstr
> 
> 
> On Oct 23, 2012, at 6:22 PM, Joshua Cranmer <pidgeot18 at gmail.com>
> wrote:
> 
> > On 10/23/2012 1:58 PM, John Criswell wrote:
> >> Dear All,
> >>
> >> I'm enhancing a MachineFunctionPass that enforces control-flow
> integrity.  One of the things I want to do is to set the alignment of
> an instruction (by adding NOPs before it in the MachineBasicBlock or by
> emitting an alignment directive to the assembler) if it causes a
> specific sequence of bytes to be generated at a specific alignment.
> The goal is to ensure that sequences of bytes used to label valid
> targets of an indirect branch (e.g., a return instruction) do not
> appear at a given alignment anywhere in a program other than for where
> I inserted them explicitly.
> >>
> >> It looks like MachineInstr has a method for finding the length of
> the instruction's binary encoding, but I didn't see a method for
> finding the exact bytes that would be emitted from the MachineInstr.
> Is there a way to do this in the MachineFunctionPass/MachineInstr
> infrastructure, or do I need to use something like the MC classes?
> >>
> >
> > As I recall (I haven't played this deep with MachineInstrs for close
> to a year), it's not necessarily knowable what the length is or the
> exact bytes that would be emitted since some of them depend on
> information not known until the final assembly emission pass. An
> example here is the x86 jmp instruction: the choice between near and
> long jumps (and hence 2 bytes or 5 bytes on x86-64) is not made until
> the actual conversion to MCInst and after applying all of the fixups--
> which only happens deep within the bowels of the AsmPrinter pass.
> 
> Right. See X86AsmBackend::mayNeedRelaxation() and friends for the gory
> details.
> 
> -jim
> 
> >
> > --
> > Joshua Cranmer
> > News submodule owner
> > DXR coauthor
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 



