[llvm-commits] [patch] ARM/MC/ELF add new stub for movt/movw in ARMFixupKinds

Thu Nov 18 10:46:31 PST 2010

Yep, this is definitely the kind of thing I'm thinking. Looking good. Some comments below.

Index: lib/Target/ARM/ARMInstrInfo.td
===================================================================

--- lib/Target/ARM/ARMInstrInfo.td      (revision 119149)
+++ lib/Target/ARM/ARMInstrInfo.td      (working copy)@@ -425,6 +425,17 @@
   let PrintMethod = "printAddrModeImm12Operand";
   let MIOperandInfo = (ops GPR:$base, i32imm:$offsimm);
 }
+
+
+// For movt 
+// 
+def addrmode_movt : Operand<i32> {
+  let EncoderMethod = "getAddrModeMovtOpValue";
+  // let PrintMethod = "";
+  // let MIOperandInfo = (ops i32imm:$imm);
+}
+
+

A bit more verbose comment would be good. Also, we already have "lo16AllZero" as a predicate. If it works, it would be good to combine these. For naming, it's not really an addressing mode. Your usage here isn't inconsistent with the rest of the backend, as it's already abused elsewhere, I'd just like to avoid propagating the issue. Something like "movt_imm" perhaps?

Trailing whitespace and extraneous vertical whitespace. There are a few other examples in the patch. For brevity, I won't call them out individually. Just do a once-over of the diffs and fix them pre-commit, please.

Index: lib/Target/ARM/ARMCodeEmitter.cpp
===================================================================
--- lib/Target/ARM/ARMCodeEmitter.cpp   (revision 119149)
+++ lib/Target/ARM/ARMCodeEmitter.cpp   (working copy)
@@ -214,6 +214,12 @@
       Binary |= (Reg << 13);
       return Binary;
     }
+
+    unsigned getAddrModeMovtOpValue(const MachineInstr &MI, unsigned Op) const {
+      assert(0 && "Unsupported");
+      return 0;
+    }
+

This can't assert() if we want to even pretend to keep the current JIT working. It doesn't have to do anything since the MOVTi16 operand is already handled in emitDataProcessingInstruction(), so it should just return 0 and be done. The MC based stuff wants to avoid all of the dedicated .cpp code that's part of the current JIT ARMCodeEmitter, so it needs the getOpValue functions to do stuff.

 uint32_t ARMMCCodeEmitter::
+getAddrModeMovtOpValue(const MCInst &MI, unsigned OpIdx,
+                        SmallVectorImpl<MCFixup> &Fixups) const {
+  // {20-16} = imm{15-12}
+  // {11-0}  = imm{11-0}
+  const MCOperand &MO = MI.getOperand(OpIdx); 
+  if (MO.isImm()) {
+    return static_cast<unsigned>(MO.getImm());
+  } else if (MO.isFPImm()) {
+    return static_cast<unsigned>(APFloat(MO.getFPImm())
+                                 .bitcastToAPInt().getHiBits(32).getLimitedValue());

An FP immediate should never get here. The necessary bitcasting should have taken place at a higher level.

+  } else if (MO.isExpr()) {
+    const MCExpr *Expr = MO.getExpr();
+    switch (Expr->getKind()) {
+    case MCExpr::SymbolRef: {
+      const MCSymbolRefExpr &SRE = cast<MCSymbolRefExpr>(*Expr);

You can just use a dyn_cast<> here, I believe.

if (const MCSymbolRefExpr *Expr = dyn_cast<MCSymbolRefExpr>(MO.getExpr())) {
  MCFixupKind Kind;
  switch (Expr->getKind()) {
  return 0;
} 
llvm_unreachable("Unexpected expression type in MOVT operand!");

+#ifndef NDEBUG
+  errs() << MO;
+#endif
+  llvm_unreachable(0);
+  return 0;
+};

Just put a message in llvm_unreachable() like the above. No need to dump the operand here.

+  // The next two are for the movt/movw pair (armv7)

It's ARMv6T2 constrained, not ARMv7 (i.e., there are pre-v7 chips that have these instructions).

+  // the 16bit fields are split into a 4HI:xxxx:12LO 

Using the standard .td file syntax for specifying the bit-range might make this more readable. It took me a moment to realize what this was saying, anyway. Not a huge deal, just something to consider. Also, the bits of the immediate are split up different in T2 than in ARM mode.

+  // Fixme: do we need new ones for Thumb?

If the fixup value implies the bit-layout of the fixup to be done (which I think it does/should(?)), then yes, as the T2 encodings are different.

+  fixup_arm_movt_hi16, // :upper16: Encoding A1
+  fixup_arm_movw_lo16, // :lower16: Encoding A2

If referencing ARM documentation encodings, giving the section number also would help. "A8.6.99 Encoding A1", for example. True completeness would imply a document revision indicator as well, but that might be overkill.


-Jim

On Nov 17, 2010, at 9:03 AM, Jason Kim wrote:

> On Wed, Nov 17, 2010 at 8:45 AM, Jason Kim <jasonwkim at google.com> wrote:
>> -llvmdev
>> +llvmcommits
>> 
>> On Tue, Nov 16, 2010 at 4:27 PM, Jim Grosbach <grosbach at apple.com> wrote:
>>> 
>>> On Nov 16, 2010, at 4:01 PM, Jason Kim wrote:
>>> 
>>> +llvmdev
>>> -llvmcommits
>>> 
>>> On Fri, Nov 12, 2010 at 8:03 AM, Jim Grosbach <grosbach at apple.com> wrote:
>>> 
>>> Sorta. getBinaryCodeForInst() is auto-generated by tablegen, so shouldn't be
>>> modified directly. The target can register hooks for instruction operands
>>> for any special encoding needs, including registering fixups, using the
>>> EncoderMethod string. For an example, have a look at the LDRi12 instruction
>>> and how it registers a fixup for the addrmode_imm12 operand when it needs
>>> one.
>>> 
>>> Hi Jim,. follow up question for ya:
>>> 
>>> The current movt/movw pair (as defined in ARMInstrInfo.td)  does not
>>> use EncoderMethod string to declare  a special case handler.
>>> 
>>> There's two parts to how this works. First, is the handling of the movt/movw
>>> pair for instruction selection, and then the handling of the upper/lower
>>> operand flags for those instructions.
>>> MOVi32imm is expanded as a pseudo instruction in ARMExpandPseudos pass, and
>>> any other patterns that do similar things should also be expanded there. For
>>> a few special cases where we need the post-RA scheduler to handle the
>>> instruction sequences as an atomic unit (PICLDR, PICADD, the
>>> eh.sjlj.setjmp/longjmp patterns, for example), we expand them even later,
>>> during MC lowering in ARMAsmPrinter.cpp. As a result, the movt and movw
>>> instructions should make it into the code emitter as separate entries, not
>>> as a composite pair.
>>> For operands that aren't compile-time constants, the operand flag gets set
>>> in the expansion to the MOVT/MOVW instructions:
>>>       if (MO.isImm()) {
>>>         unsigned Imm = MO.getImm();
>>>         unsigned Lo16 = Imm & 0xffff;
>>>         unsigned Hi16 = (Imm >> 16) & 0xffff;
>>>         LO16 = LO16.addImm(Lo16);
>>>         HI16 = HI16.addImm(Hi16);
>>>       } else {
>>>         const GlobalValue *GV = MO.getGlobal();
>>>         unsigned TF = MO.getTargetFlags();
>>>         LO16 = LO16.addGlobalAddress(GV, MO.getOffset(), TF |
>>> ARMII::MO_LO16);
>>>         HI16 = HI16.addGlobalAddress(GV, MO.getOffset(), TF |
>>> ARMII::MO_HI16);
>>>       }
>>> That's then marked with VK_ARM_LO16/VK_ARM_HI16 by ARMMCInstLower.cpp when
>>> it's looking at symbol reference operands.
>>> 
>>> At the current time, for the assembly printing,
>>> MCAsmStreamer::EmitInstruction(const MCInst &Inst) calls out to
>>>  MCExpr::print(raw_ostream &OS)
>>>   which then calls out to MCSymbolRefExpr::getVariantKindName() to
>>> print the magic :lower16: and :upper16: asm tags for .s emission
>>> Currently, movt/movw emission works correctly in .s, but not in .o emission
>>> 
>>> This lead me to believe that the correct place to put the code to handle
>>> MCSymbolRefExpr::VK_ARM_(HI||LO)16 for the .o path was to place a case
>>> in getMachineOpValue() (i.e. not
>>> ARMMCCodeEmitter::getBinaryCodeForInstr like I mistakenly wrote in my
>>> prior email.)
>>> 
>>> Are you implying that the movt/movw instruction definition in the .td
>>> files need to be fixed up instead to declare a new special case for .o
>>> emission via the EncoderMethod string, for the .o emission of
>>> movt/movw to be considered "correct"?
>>> 
>>> A custom encoder may be the best way to differentiate the kinds of fixups
>>> we'll need for these operands (e.g., using LO16 may generate different
>>> fixups depending on which instruction it's used for, which implies different
>>> operand node types, and thus different encoder methods). In any case, the
>>> encoder, be it in getMachineOpValue() or a specialized method, will need to
>>> differentiate between an immediate operand and a symbol reference expression
>>> operand. In the latter case, it'll create a fixup for the lo/hi variantkind
>>> if one is present.
>> 
>> Got ya. Is this more in tune with what you had in mind? (applies cleanly to
>> -r119149)
> 
> Yikes. As sent, patch is missing two enum definitions
> The update is here.
> Sorry for the noise (again ):
> -jason
> 
>> 
>> Thanks!
>> 
>>> -Jim
>>> 
>>> On Nov 11, 2010, at 7:06 PM, Jason Kim wrote:
>>> 
>>> Is getBinaryCodeForInst the best place to place the case for
>>> 
>>> supporting movt/movw fixup emission?
>>> 
>>> The call stack seems to be:
>>> 
>>> #0 ARMMCCodeEmitter::getBinaryCodeForInstr
>>> 
>>> #1 ARMMCCodeEmitter::EncodeInstruction
>>> 
>>> #2 MCELFStreamer::EmitInstToData
>>> 
>>> #3 MCObjectStreamer::EmitInstruction
>>> 
>>> #4 ARMAsmPrinter::EmitInstruction
>>> 
>>> <arm-mc-elf-s08.patch>_______________________________________________
>>> 
>>> llvm-commits mailing list
>>> 
>>> llvm-commits at cs.uiuc.edu
>>> 
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits
>>> 
>>> 
>>> 
>>> 
>> 
> <arm-mc-elf-s09b-movt.patch>