[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

Thu Mar 13 01:21:47 PDT 2014

Hello,

Some of the backends seem to be combining positional and named operands when defining some instructions such that some of the positional operands overlap with some of the named operands. I suspect this is not intentional; here's an example:

AArch64 has the following instruction definition:

SMULHxxx {
  field bits<32> Inst = { 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, Rm{4}, Rm{3}, Rm{2}, Rm{1}, Rm{0}, 0, Ra{4}, Ra{3}, Ra{2}, Ra{1}, Ra{0}, Rn{4}, Rn{3}, Rn{2}, Rn{1}, Rn{0}, Rd{4}, Rd{3}, Rd{2}, Rd{1}, Rd{0} };
  ...
  dag OutOperandList = (outs GPR64:$Rd);
  dag InOperandList = (ins GPR64:$Rn, GPR64:$Rm);
  string AsmString = "smulh	$Rd, $Rn, $Rm";
  list<dag> Pattern = [(set i64:$Rd, (mulhs i64:$Rn, i64:$Rm))];
  ...
  bits<5> Rd = { ?, ?, ?, ?, ? };
  bits<5> Rn = { ?, ?, ?, ?, ? };
  bits<5> Rm = { ?, ?, ?, ?, ? };
  bits<5> Ra = { ?, ?, ?, ?, ? };
...
}

So how do the operands in OutOperandList and InOperandList get mapped to the variables used to encode the instruction (Rd,Rn,Rm,Ra)? The first three match by name (as they appear explicitly in OutOperandList and InOperandList), but what about Ra? Ra contributes to defining the bits in Inst, and because there is, by default, no overlap checking, it also gets mapped to the first operand: GPR64:$Rd. The result, from AArch64GenMCCodeEmitter.inc is:

    case AArch64::SMULHxxx:
    case AArch64::UMULHxxx: {
      // op: Rd
      op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
      Value |= op & UINT64_C(31);
      // op: Rn
      op = getMachineOpValue(MI, MI.getOperand(1), Fixups, STI);
      Value |= (op & UINT64_C(31)) << 5;
      // op: Rm
      op = getMachineOpValue(MI, MI.getOperand(2), Fixups, STI);
      Value |= (op & UINT64_C(31)) << 16;
      // op: Ra
      op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
      Value |= (op & UINT64_C(31)) << 10;
      Value = fixMulHigh(MI, Value, STI);
      break;
    }

This may be correct (I have no idea), but even if it is, it seems like an odd way to get this behavior: depending on the fact that, after operands are matched by name, the remaining operands are matched by position such that they might overlap with those operands matched by name.

In any case, I believe that the only in-tree target that still really needs the positional operand support is PowerPC (because the named-operand matching scheme still can't deal with the complex operands that PPC uses for handling memory operands), and PowerPC needs the behavior that named and positional operands are disjoint so that we can start transitioning toward using named mapping, so I've added a new TableGen target bit to disable this problematic overlapping operand behavior (noNamedPositionallyEncodedOperands) in r203767.

This means that in lib/Target/PowerPC/PPC.td, we have:

def PPCInstrInfo : InstrInfo {
...

  let noNamedPositionallyEncodedOperands = 1;

...
}

I'd like those maintaining the current backends (especially AArch64, Mips, AMDGPU, which I know to be problematic in this regard) to try setting this and: a) fix those definitions that are problematic or b) explain to me that the current behavior is useful.

Thanks again,
Hal

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory