[PATCH] D62266: [DAGCombine][X86][AArch64][ARM] (C - x) + y -> (y - x) + C fold
Roman Lebedev via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Wed May 22 14:24:02 PDT 2019
lebedev.ri added a comment.
In D62266#1512613 <https://reviews.llvm.org/D62266#1512613>, @efriedma wrote:
> > So the DAG is simpler, but that particular case is worse for the backend.
>
> It's "simpler", in some sense, but it's more complicated to lower because it's using the carry output of a uaddo in a non-addcarry operation, which is legal, but more expensive than using in an addcarry, particularly on ARM. I think the missing combine is that we should be able to turn the subtraction into some form of addcarry/subcarry.
Hmm, that does not seem to be all.
I've added
bool HasSUBCARRY = TLI.isOperationLegalOrCustom(ISD::SUBCARRY, VT);
bool HasADDCARRY = TLI.isOperationLegalOrCustom(ISD::ADDCARRY, VT);
if (HasSUBCARRY || HasADDCARRY) {
unsigned ANYCARRY = HasSUBCARRY ? ISD::SUBCARRY : ISD::ADDCARRY;
// (sub X, Carry) -> (subcarry/addcarry X, 0, Carry)
if (SDValue Carry = getAsCarry(TLI, N1)) {
return DAG.getNode(ANYCARRY, DL, DAG.getVTList(VT, Carry.getValueType()),
N0, DAG.getConstant(0, DL, VT), Carry);
}
if (HasSUBCARRY) {
// (sub Carry, X) -> (subcarry 0, X, Carry)
if (SDValue Carry = getAsCarry(TLI, N0)) {
return DAG.getNode(ISD::SUBCARRY, DL,
DAG.getVTList(VT, Carry.getValueType()),
DAG.getConstant(0, DL, VT), N1, Carry);
}
}
}
at the end of `DAGCombiner::visitSUB()`, and get this
SelectionDAG has 17 nodes:
t0: ch = EntryToken
t6: i32,ch = CopyFromReg t0, Register:i32 %2
t4: i32,ch = CopyFromReg t0, Register:i32 %1
t2: i32,ch = CopyFromReg t0, Register:i32 %0
t30: i32,i32 = uaddo t4, t2
t31: i32,i32 = subcarry Constant:i32<0>, t6, t30:1
t39: i32 = and t31, Constant:i32<65535>
t40: ch = br_cc t0, seteq:ch, t39, Constant:i32<65535>, BasicBlock:ch<if.end 0x55706fb58828>
t19: ch = br t40, BasicBlock:ch<for.cond.preheader 0x55706fb58698>
which is then resulting in
SelectionDAG has 22 nodes:
t0: ch = EntryToken
t6: i32,ch = CopyFromReg t0, Register:i32 %2
t4: i32,ch = CopyFromReg t0, Register:i32 %1
t2: i32,ch = CopyFromReg t0, Register:i32 %0
t51: i32,i32 = ARMISD::ADDC t4, t2
t52: i32,i32 = ARMISD::ADDE Constant:i32<0>, Constant:i32<0>, t51:1
t45: i32 = sub Constant:i32<1>, t52
t46: i32,i32 = ARMISD::SUBC t45, Constant:i32<1>
t47: i32,i32 = ARMISD::SUBE Constant:i32<0>, t6, t46:1
t39: i32 = and t47, Constant:i32<65535>
t41: glue = ARMISD::CMPZ t39, Constant:i32<65535>
t43: ch = ARMISD::BRCOND t0, BasicBlock:ch<if.end 0x55706fb58828>, Constant:i32<0>, Register:i32 $cpsr, t41
t19: ch = br t43, BasicBlock:ch<for.cond.preheader 0x55706fb58698>
So some other ARM-specific combine is missing, too?
Or am i totally misreading this?
Repository:
rL LLVM
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D62266/new/
https://reviews.llvm.org/D62266
More information about the llvm-commits
mailing list