[PATCH] D137936: [AArch64] Optimize cmp chain before legalization
chenglin.bi via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 21 17:54:52 PST 2022
bcl5980 added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:8574
}
// All the non-leaf nodes must be OR.
----------------
Allen wrote:
> bcl5980 wrote:
> > Maybe the ISD::ZERO_EXTEND can remain here?
> In the current implementation, we generate the AND and ANY_EXTEND sequence when , so the ISD::ZERO_EXTEND is not required?
The case like that can trigger the zext part. https://www.godbolt.org/z/bPrT3G7bh
And I think the cost is very low so we can add it.
================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:8614
// Exit early by inverting the condition, which help reduce indentations.
- SDValue TVal = DAG.getConstant(1, DL, VT);
- SDValue FVal = DAG.getConstant(0, DL, VT);
- AArch64CC::CondCode CC = changeIntCCToAArch64CC(Cond);
- AArch64CC::CondCode InvCC = AArch64CC::getInvertedCondCode(CC);
- return DAG.getNode(AArch64ISD::CSEL, DL, VT, FVal, TVal,
- DAG.getConstant(InvCC, DL, MVT::i32), CCmp);
+ return DAG.getSetCC(DL, VT, Cmp, DAG.getConstant(0, DL, VT), Cond);
}
----------------
Allen wrote:
> bcl5980 wrote:
> > It looks this can continue to be simplified to
> >
> > ```
> > unsigned LogicOp = (Cond == ISD::SETEQ) ? ISD::AND : ISD::OR;
> > SDValue Cmp = DAG.getSetCC(DL, VT, XOR0, XOR1, Cond);
> > for (unsigned I = 1; I < WorkList.size(); I++) {
> > std::tie(XOR0, XOR1) = WorkList[I];
> > SDValue CmpChain = DAG.getSetCC(DL, VT, XOR0, XOR1, Cond);
> > Cmp = DAG.getNode(LogicOp, DL, VT, Cmp, CmpChain);
> > }
> >
> > return Cmp;
> > ```
> > Looks more cases can get benefit from it.
> Yes, this change fix the case @PR58675, while regression on case combine_setcc_glue, so I'll need more work on it.
> ```
> +++ b/llvm/test/CodeGen/AArch64/dag-combine-setcc.ll
> @@ -191,9 +191,11 @@ define i32 @combine_setcc_glue(i128 noundef %x, i128 noundef %y) {
> ; CHECK-LABEL: combine_setcc_glue:
> ; CHECK: // %bb.0: // %entry
> ; CHECK-NEXT: cmp x1, x3
> -; CHECK-NEXT: ccmp x0, x2, #0, eq
> -; CHECK-NEXT: ccmp x0, x2, #4, ne
> -; CHECK-NEXT: cset w0, eq
> +; CHECK-NEXT: cset w8, eq
> +; CHECK-NEXT: cmp x0, x2
> +; CHECK-NEXT: cset w9, eq
> +; CHECK-NEXT: and w8, w9, w8
> +; CHECK-NEXT: orr w0, w8, w9
> ; CHECK-NEXT: ret
> ```
Don't worry about the combine_setcc_glue regression. D138401 already fixed that.
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D137936/new/
https://reviews.llvm.org/D137936
More information about the llvm-commits
mailing list