[PATCH] [PATCH][ARM] Fix issue with UMLAL (unsigned multiple accumulate long) lowering
Jyoti Allur
jyoti.allur at samsung.com
Tue May 26 07:34:29 PDT 2015
In http://reviews.llvm.org/D9881#176525, @MatzeB wrote:
> In http://reviews.llvm.org/D9881#176316, @jyoti.allur wrote:
>
> > Hi Matthias,
> > To clarify, Constant, SRA, CopyFromReg are possible input nodes to ADDE node.
>
>
> Hmm, why would you need to restrict the possible nodes here? My understanding is that UMLAL does a full 64bit addition, so I don't see why you would need to restrict those.
the cases that i was trying to cover (added newly in the testcase longMAC.ll here) contained one of the operands to ADDE from either of those nodes, hence it was added, but i now see that its not necessary to restrict to those nodes only.
>
>
> > I was trying to cover cases where ISD::MUL, ISD::ADDC, ISD::ADDE can be combined to a S/UMLAL where in ISD::MUL operands are less 8-bit or 16-bits operands. In these cases result can stored in a in single 32-bit register. Hence in such cases, none of ADDE 's operand will store result from ISD:MUL. Instead, one of the ADDE's operand will hold 32:63 bits of 64-bit accumulate value, while the 0:31 bits of the accumulate value is stored in one of the operands of ADDC.
>
> >
>
> > for example: in the assembly below,
>
> > R1 - ACC [ 32:63 ]
>
> > R0 - ACC [ 0:31 ]
>
> >
>
> > foo:
>
> > ldrb r2, [r2]
>
> > ldrh r3, [r3]
>
> > mul r2, r3, r2
>
> > adds r0, r2, r0
>
> > adc r1, r1, #0
>
> > bx lr
>
> >
>
> >
>
> > I agree, overflow cases were not handled yet. Could you suggest how to detect overflow ?
>
>
> Well you need to match patterns that are obviously fine and reject everything else.
>
> Good things would probably be querying for the NoUnsignedWrap flag, or using SelectionDAG::computeKnownBits and test if both MUL operands have their upper 16bit known to be zero. One can probably think of more patterns so I'd factor it out into a function so people can add more patterns later.
mul r2, r1, r3
SelectionDAG::computeKnownBits does not return the bit info for this instruction ISD::MUL if the mul operands are of opcode ISD::CopyFromReg This is not handled in computeKnownBits. I tried to do that but, RegisterSDNode class only gives register number, not the value contained in the register. How could one access the value contained in a RegisterSDNode, so as to check that MUL operands have their upper 16bit known to be zero ?
I also tried,
MULOp.getOperand(0).getValueSizeInBits() > 16 = overflow
but MULOp.getOperand(0).getValueSizeInBits() always returns 32 irrespective of the value stored in MUL operands.
This is my code to check overflow
MULOp = (AddcOp0->getOpcode() == ISD::MUL) ? AddcOp0 : AddcOp1;
APInt KnownZero, KnownOne;
APInt KnownZero1, KnownOne1;
DCI.DAG.computeKnownBits(MULOp.getOperand(0), KnownZero, KnownOne);
DCI.DAG.computeKnownBits(MULOp.getOperand(1), KnownZero1, KnownOne1);
APInt Mask(APInt::getHighBitsSet(VT.getSizeInBits(), 16));
if ((KnownZero & Mask).getBoolValue() && (KnownZero1 & Mask).getBoolValue())
return SDValue();
This should do the trick, but KnownZero is always zero even when MUL operands contain 0xFFFFFFFF, 0XFFFFFFFF for reasons sited above.
I would highly appreciate your inputs on this.
REPOSITORY
rL LLVM
http://reviews.llvm.org/D9881
EMAIL PREFERENCES
http://reviews.llvm.org/settings/panel/emailpreferences/
More information about the llvm-commits
mailing list