[llvm-dev] Printing PC-relative offsets - how to get the instruction length?

Oliver Stannard via llvm-dev llvm-dev at lists.llvm.org
Wed Mar 27 07:56:45 PDT 2019


Hi Mark,

For your first question, the MCInstPrinter has a reference to the MCInstrInfo
object for your target, so something like this should give you the instruction
encoding size in bytes:

  MII.get(MI->getOpcode()).getSize()
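
To make the idea concrete outside the LLVM tree, here is a minimal
self-contained sketch. InstrDesc and formatPCRel are hypothetical stand-ins
that I made up for illustration: InstrDesc plays the role of the descriptor
you get back from MII.get(), and formatPCRel mirrors the immediate branch of
your printPCRelImmOperand with the hard-coded 2 replaced by the looked-up
size. It assumes the Size field is filled in for every instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical stand-in for the per-opcode descriptor returned by MII.get().
struct InstrDesc {
  unsigned Size; // encoded length of the instruction in bytes
};

// Format a PC-relative immediate as "$+N" or "$-N", adding the encoded
// instruction size instead of a fixed constant.
std::string formatPCRel(int64_t Imm, const InstrDesc &Desc) {
  // A size of 0 would mean the size was never set for this instruction.
  assert(Desc.Size > 0 && "instruction size not set");
  const int64_t Target = Imm + static_cast<int64_t>(Desc.Size);
  std::string Out = "$";
  if (Target >= 0)
    Out += '+';
  Out += std::to_string(Target); // to_string supplies the '-' for negatives
  return Out;
}
```

With a three-byte instruction such as your LDAi8oPC example, an immediate of
0 would then print as $+3 rather than $+2.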

For your second question, it looks like the MCK_Imm8 operand class is matching
the immediate even when it is out of range. This should be checked by a
function in your assembly parser. The ImmediateAsmOperand<"Imm8"> record (which
you didn't show the definition of, so I'm guessing a bit here) should have a
PredicateMethod value giving the name of that function. If that's not
specified, the default function name is based on the tablegen class name, which
won't be correct for both Imm8 and Imm16. Note that the ImmLeaf in the code
snippet you posted is only used for code generation from IR, not by the
assembler.
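
As a rough illustration of what those predicate functions need to do, here is
a self-contained sketch. ParsedOperand, isSImm, isImm8, isImm16 and
narrowestOffsetBits are all hypothetical names chosen for this example; in
your backend the checks would live on your target's parsed-operand class, and
the function names must match whatever PredicateMethod resolves to. I'm
assuming signed ranges here, to match the ImmLeaf bounds in your snippet.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a target's parsed assembly operand.
struct ParsedOperand {
  bool IsImmediate;
  int64_t Value;

  // True if the operand is an immediate that fits in a signed Bits-wide field.
  template <int Bits> bool isSImm() const {
    static_assert(Bits > 0 && Bits < 64, "unsupported immediate width");
    if (!IsImmediate)
      return false;
    const int64_t Min = -(int64_t(1) << (Bits - 1));
    const int64_t Max = (int64_t(1) << (Bits - 1)) - 1;
    return Value >= Min && Value <= Max;
  }

  // One predicate per operand class, each with its own range.
  bool isImm8() const { return isSImm<8>(); }
  bool isImm16() const { return isSImm<16>(); }
};

// Returns the narrowest offset width (in bits) whose predicate accepts the
// operand, or 0 if the value fits neither class.
int narrowestOffsetBits(const ParsedOperand &Op) {
  assert(Op.IsImmediate && "expected an immediate operand");
  if (Op.isImm8())
    return 8;
  if (Op.isImm16())
    return 16;
  return 0;
}
```

With range-checked predicates like these, 1000 no longer qualifies as an Imm8
operand, so only the Imm16 form of the instruction can match it.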

Oliver

> -----Original Message-----
> From: llvm-dev [mailto:llvm-dev-bounces at lists.llvm.org] On Behalf Of Mark
> R V Murray via llvm-dev
> Sent: 25 March 2019 16:19
> To: llvm-dev at lists.llvm.org
> Subject: [llvm-dev] Printing PC-relative offsets - how to get the
> instruction length?
> 
> Hi
> 
> In my MC6809 backend, in
> llvm/lib/Target/MC6809/InstPrinter/MC6809InstPrinter.cpp, I have the
> routine
> 
> void MC6809InstPrinter::printPCRelImmOperand(const MCInst *MI, unsigned
> OpNo, raw_ostream &O) {
>   const MCOperand &Op = MI->getOperand(OpNo);
>
>   if (Op.isImm()) {
>     int64_t Imm = Op.getImm() + 2;  <<<========================
>     O << "$";
>     if (Imm >= 0)
>       O << '+';
>     O << Imm;
>   } else {
>     assert(Op.isExpr() && "unknown pcrel immediate operand");
>     Op.getExpr()->print(O, &MAI);
>   }
> }
> 
> Which works well enough except for the constant 2 that I've arrowed - it
> needs to be the length of the binary instruction in bytes. The MC6809 has
> a *LOT* of variability here, so a case statement would be a right pain to
> maintain.
> 
> An answer is tantalisingly close:
> 
> $ bin/llvm-mc -triple mc6809 -show-inst-operands -show-inst -show-encoding <<< "lda 0,pc"
> 	.text
> <stdin>:1:1: note: parsed instruction: ['lda', 0, <register 13>]
> lda 0,pc
> ^
> 	lda	$+2,pc                  ; encoding: [0xa6,0x8c,0x00]
> <<===========
>                                         ; <MCInst #1849 LDAi8oPC
>                                         ;  <MCOperand Imm:0>
>                                         ;  <MCOperand Imm:0>>
> 
> The "encoding:" knows that I have a three-byte instruction, but that is
> generated by another chunk of code miles away. I suppose I could
> replicate that, but it seems wasteful. Is there a better way, not
> involving nasty layering violations, to get the length of an instruction
> in bytes in the context of
> llvm/lib/Target/*/InstPrinter/*InstPrinter.cpp?
> 
> Also, both 8 and 16-bit variants are possible. The instruction picked is
> LDAi8oPC, which is the 8-bit offset version. If I supply a bigger offset:
> 
> $ bin/llvm-mc -triple mc6809 -show-inst-operands -show-inst -show-encoding <<< "lda 1000,pc"
> 	.text
> <stdin>:1:1: note: parsed instruction: ['lda', 1000, <register 13>]
> lda 1000,pc
> ^
> 	lda	$+1002,pc               ; encoding: [0xa6,0x8c,0xe8]
>                                         ; <MCInst #1849 LDAi8oPC
>                                         ;  <MCOperand Imm:0>
>                                         ;  <MCOperand Imm:1000>>
> 
> I still get the 8-bit variant instead of LDAi16oPC, and the operand is
> truncated.
> 
> The TableGen-generated .inc file has
> 
> { 444 /* lda */, MC6809::LDAi8oPC, Convert__imm_95_0__Imm81_0,
> AMFBS_None, { MCK_Imm8, MCK_PC }, },
> { 444 /* lda */, MC6809::LDAi16oPC, Convert__imm_95_0__Imm161_0,
> AMFBS_None, { MCK_Imm16, MCK_PC }, },
> 
> ... so how do I get the 16-bit variant with MCK_Imm16 selected instead?
> 
> The instructions are defined as
> 
> def LDAi8oPC : MC6809LoadIndexed_i8oPC_P1<
>                 (outs GR8:$dst8),
>                 (ins pcoffset8:$offset),
>                 !strconcat("lda", "\t", "${offset}", ",", "pc"),
>                 0x00,
>                 0xA6,
>                 []
> > { let Inst{23-16} = offset{7-0}; let Inst{15} = 0b1; let Inst{14-13} =
> 0b00; let Inst{12-8} = 0b01100; let Inst{7-0} = opcode; }
> 
> def LDAi16oPC : MC6809LoadIndexed_i16oPC_P1<
>                 (outs GR8:$dst8),
>                 (ins pcoffset16:$offset),
>                 !strconcat("lda", "\t", "${offset}", ",", "pc"),
>                 0x00,
>                 0xA6,
>                 []
> > { let Inst{31-24} = offset{7-0}; let Inst{23-16} = offset{15-8}; let
> Inst{15} = 0b1; let Inst{14-13} = 0b00; let Inst{12-8} = 0b01101; let
> Inst{7-0} = opcode; }
> 
> and I have
> 
> def pcoffset8 : Operand<i8>, ImmLeaf<i8, [{ return Immediate >= -128 &&
> Immediate <= 127; }]> {
>   let PrintMethod = "printPCRelImmOperand";
>   let MIOperandInfo = (ops i8imm);
>   let ParserMatchClass = ImmediateAsmOperand<"Imm8">;
>   let EncoderMethod = "getMemOpValue";
>   let DecoderMethod = "DecodeMemOperand";
> }
> 
> def pcoffset16 : Operand<i16>, ImmLeaf<i16, [{ return Immediate >= -32768
> && Immediate <= 32767; }]> {
>   let PrintMethod = "printPCRelImmOperand";
>   let MIOperandInfo = (ops i16imm);
>   let ParserMatchClass = ImmediateAsmOperand<"Imm16">;
>   let EncoderMethod = "getMemOpValue";
>   let DecoderMethod = "DecodeMemOperand";
> }
> 
> M
> --
> Mark R V Murray
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev