[LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)

Hal Finkel hfinkel at anl.gov
Thu Mar 13 08:01:32 PDT 2014


----- Original Message -----
> From: "Tom Stellard" <tom at stellard.net>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
> Sent: Thursday, March 13, 2014 9:46:22 AM
> Subject: Re: [LLVMdev] Be Careful with Positionally-Encoded Operands (AArch64, Mips, AMDGPU, etc.)
> 
> On Thu, Mar 13, 2014 at 03:21:47AM -0500, Hal Finkel wrote:
> > Hello,
> > 
> > Some of the backends seem to be combining positional and named
> > operands when defining some instructions such that some of the
> > positional operands overlap with some of the named operands. I
> > suspect this is not intentional; here's an example:
> > 
> > AArch64 has the following instruction definition:
> > 
> > SMULHxxx {
> >   field bits<32> Inst = { 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, Rm{4},
> >   Rm{3}, Rm{2}, Rm{1}, Rm{0}, 0, Ra{4}, Ra{3}, Ra{2}, Ra{1},
> >   Ra{0}, Rn{4}, Rn{3}, Rn{2}, Rn{1}, Rn{0}, Rd{4}, Rd{3}, Rd{2},
> >   Rd{1}, Rd{0} };
> >   ...
> >   dag OutOperandList = (outs GPR64:$Rd);
> >   dag InOperandList = (ins GPR64:$Rn, GPR64:$Rm);
> >   string AsmString = "smulh	$Rd, $Rn, $Rm";
> >   list<dag> Pattern = [(set i64:$Rd, (mulhs i64:$Rn, i64:$Rm))];
> >   ...
> >   bits<5> Rd = { ?, ?, ?, ?, ? };
> >   bits<5> Rn = { ?, ?, ?, ?, ? };
> >   bits<5> Rm = { ?, ?, ?, ?, ? };
> >   bits<5> Ra = { ?, ?, ?, ?, ? };
> > ...
> > }
> > 
> > So how do the operands in OutOperandList and InOperandList get
> > mapped to the variables used to encode the instruction
> > (Rd,Rn,Rm,Ra)? The first three match by name (as they appear
> > explicitly in OutOperandList and InOperandList), but what about
> > Ra? Ra contributes to defining the bits in Inst, and because there
> > is, by default, no overlap checking, it also gets mapped to the
> > first operand: GPR64:$Rd. The result, from
> > AArch64GenMCCodeEmitter.inc is:
> > 
> >     case AArch64::SMULHxxx:
> >     case AArch64::UMULHxxx: {
> >       // op: Rd
> >       op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
> >       Value |= op & UINT64_C(31);
> >       // op: Rn
> >       op = getMachineOpValue(MI, MI.getOperand(1), Fixups, STI);
> >       Value |= (op & UINT64_C(31)) << 5;
> >       // op: Rm
> >       op = getMachineOpValue(MI, MI.getOperand(2), Fixups, STI);
> >       Value |= (op & UINT64_C(31)) << 16;
> >       // op: Ra
> >       op = getMachineOpValue(MI, MI.getOperand(0), Fixups, STI);
> >       Value |= (op & UINT64_C(31)) << 10;
> >       Value = fixMulHigh(MI, Value, STI);
> >       break;
> >     }
> > 
> > This may be correct (I have no idea), but even if it is, it seems
> > like an odd way to get this behavior: depending on the fact that,
> > after operands are matched by name, the remaining operands are
> > matched by position such that they might overlap with those
> > operands matched by name.
> > 
> > In any case, I believe that the only in-tree target that still
> > really needs the positional operand support is PowerPC (because
> > the named-operand matching scheme still can't deal with the
> > complex operands that PPC uses for handling memory operands), and
> > PowerPC needs the behavior that named and positional operands are
> > disjoint so that we can start transitioning toward using named
> > mapping, so I've added a new TableGen target bit to disable this
> > problematic overlapping operand behavior
> > (noNamedPositionallyEncodedOperands) in r203767.
> > 
> > This means that in lib/Target/PowerPC/PPC.td, we have:
> > 
> > def PPCInstrInfo : InstrInfo {
> > ...
> > 
> >   let noNamedPositionallyEncodedOperands = 1;
> > 
> > ...
> > }
> > 
> > I'd like those maintaining the current backends (especially
> > AArch64, Mips, AMDGPU, which I know to be problematic in this
> > regard) to try setting this and: a) fix those definitions that are
> > problematic or b) explain to me that the current behavior is
> > useful.
> >
> 
> Hi Hal,
> 
> Thanks for working on this, I have run into several problems with the
> positional encodings and I would be happy to see it go away.
>  However,
> I was under the impression that positional encoding was the only way
> to
> encode custom operand types that map to multiple machine operands.
> See for example the MEMxi Operand sub-class in
> R600/R600Instructions.td.

Yes, this is the remaining use case for the positional scheme. We need to invent something to replace it (maybe this should be something as simple as mapping '.' -> '_'). Thoughts?

 -Hal

> 
> Thanks,
> Tom
> 
> > Thanks again,
> > Hal
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> > _______________________________________________
> > LLVM Developers mailing list
> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory



More information about the llvm-dev mailing list