[PATCH] [PATCH][ARM] Fix issue with UMLAL (unsigned multiple accumulate long) lowering

Tue May 26 07:34:29 PDT 2015

In http://reviews.llvm.org/D9881#176525, @MatzeB wrote:

> In http://reviews.llvm.org/D9881#176316, @jyoti.allur wrote:
>
> > Hi Matthias,
> >  To clarify, Constant, SRA, CopyFromReg are possible input nodes to ADDE node.
>
>
> Hmm, why would you need to restrict the possible nodes here? My understanding is that UMLAL does a full 64bit addition, so I don't see why you would need to restrict those.

the cases that i was trying to cover (added newly in the testcase longMAC.ll here) contained one of the operands to ADDE from either of those nodes, hence it was added, but i now see that its not necessary to restrict to those nodes only.

> 

> 

> > I was trying to cover cases where ISD::MUL, ISD::ADDC, ISD::ADDE can be combined to a S/UMLAL where in ISD::MUL operands are less 8-bit or 16-bits operands. In these cases result can stored in a in single 32-bit register. Hence in such cases, none of ADDE 's operand will store result from ISD:MUL. Instead, one of the ADDE's operand will hold 32:63 bits of 64-bit accumulate value, while the 0:31 bits of the accumulate value is stored in one of the operands of ADDC.

> 

> > 

> 

> > for example: in the assembly below, 

> 

> >  R1 - ACC [ 32:63 ] 

> 

> >  R0 - ACC [ 0:31 ]

> 

> > 

> 

> >   foo:

> 

> >           ldrb    r2, [r2]

> 

> >           ldrh    r3, [r3]

> 

> >           mul     r2, r3, r2

> 

> >           adds    r0, r2, r0

> 

> >           adc     r1, r1, #0

> 

> >           bx      lr

> 

> > 

> 

> > 

> 

> > I agree, overflow cases were not handled yet. Could you suggest how to detect overflow ?

> 

> 

> Well you need to match patterns that are obviously fine and reject everything else.

> 

> Good things would probably be querying for the NoUnsignedWrap flag, or using SelectionDAG::computeKnownBits and test if both MUL operands have their upper 16bit known to be zero. One can probably think of more patterns so I'd factor it out into a function so people can add more patterns later.

mul r2, r1, r3
SelectionDAG::computeKnownBits does not return the bit info for this instruction ISD::MUL if the mul operands are of opcode ISD::CopyFromReg This is not handled in computeKnownBits. I tried to do that but, RegisterSDNode class only gives register number, not the value contained in the register. How could one access the value contained in a RegisterSDNode, so as to check that MUL operands have their upper 16bit known to be zero ?

I also tried,
MULOp.getOperand(0).getValueSizeInBits() > 16 = overflow
but MULOp.getOperand(0).getValueSizeInBits() always returns 32 irrespective of the value stored in MUL operands.

This is my code to check overflow

  MULOp = (AddcOp0->getOpcode() == ISD::MUL) ? AddcOp0 : AddcOp1;
  APInt KnownZero, KnownOne;
      APInt KnownZero1, KnownOne1;
      DCI.DAG.computeKnownBits(MULOp.getOperand(0), KnownZero, KnownOne);
      DCI.DAG.computeKnownBits(MULOp.getOperand(1), KnownZero1, KnownOne1);

    APInt Mask(APInt::getHighBitsSet(VT.getSizeInBits(), 16));
    if ((KnownZero & Mask).getBoolValue() && (KnownZero1 & Mask).getBoolValue())
        return SDValue();

This should do the trick, but KnownZero is always zero even when MUL operands contain 0xFFFFFFFF, 0XFFFFFFFF for reasons sited above.
I would highly appreciate your inputs on this.

REPOSITORY
  rL LLVM

http://reviews.llvm.org/D9881

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/