[PATCH] D136244: [AArch64] Optimize memcmp when the result is tested for [in]equality with 0

chenglin.bi via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Oct 25 20:03:23 PDT 2022


bcl5980 added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:19499-19500
+      (LHS.getOperand(0)->getOpcode() == ISD::XOR &&
+       LHS.getOperand(1)->getOpcode() == ISD::XOR) &&
+      LHS.getOperand(0)->hasOneUse() && LHS.getOperand(1)->hasOneUse()) {
+    SDValue XOR0 = LHS.getOperand(0);
----------------
Allen wrote:
> bcl5980 wrote:
> > LHS should be OneUse also?
> The **LHS** node itself is not used in the return value when the pattern matched, so I don't think the OneUse is needed, correct me if I'm wrong, thanks.
For example:
```
int use(int);
int f(int a, int b, int c, int d)
{
    int xor0 = a ^ b;
    int xor1 = c ^ d;
    int or0 = xor0 | xor1;
    if (or0 != 0)
        return use(or0);
    return a;
}
```
Here or0 has more than one use, so the xor and or nodes all have to be kept.
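
For contrast, here is a minimal sketch of the case where the OR does have a single use. The function name `g` is hypothetical and not part of the patch; the assumption is only that the comparison is the sole user of or0, so the xor and or nodes would become dead once the compare is rewritten.
```c
/* Hypothetical counterpart to f: or0 feeds only the comparison,
   so nothing else needs the xor/or results afterwards. */
int g(int a, int b, int c, int d)
{
    int xor0 = a ^ b;
    int xor1 = c ^ d;
    int or0 = xor0 | xor1;
    if (or0 != 0) /* sole use of or0 */
        return 1;
    return 0;
}
```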
 


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:19510
+    SDValue NZCVOp = DAG.getConstant(0, DL, MVT::i32);
+    SDValue CCmp = DAG.getNode(AArch64ISD::CCMP, DL, MVT_CC, XOR1.getOperand(0),
+                               XOR1.getOperand(1), NZCVOp, CCVal, Overflow);
----------------
Allen wrote:
> bcl5980 wrote:
> > I am not sure if we can just combine to ISD::SETCC ? Maybe it can combine with some other op.
> Sorry, I don't understand what the **ISD::SETCC** is; could you please explain in more detail? I can't find it in my changes.
The code would be simpler if we combine to SetCC instead:

```
    SDValue XOR0 = LHS.getOperand(0);
    SDValue XOR1 = LHS.getOperand(1);
    SDValue Cmp0 = DAG.getSetCC(DL, VT, XOR0.getOperand(0), XOR0.getOperand(1),
                                ISD::SETNE);
    SDValue Cmp1 = DAG.getSetCC(DL, VT, XOR1.getOperand(0), XOR1.getOperand(1),
                                ISD::SETNE);
    SDValue Cmp = DAG.getNode(ISD::OR, DL, VT, Cmp0, Cmp1);
    return DAG.getSetCC(DL, VT, Cmp, DAG.getConstant(0, DL, VT), Cond);
```
But this may fall into an infinite combine loop if the reverse combine exists somewhere.
@dmgreen, which way do you think is better?
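
The SetCC form relies on `((a ^ b) | (c ^ d)) != 0` being the same as `((a != b) | (c != d)) != 0`. Below is a small brute-force check of that identity in plain C, as a sketch only: the helper names are hypothetical, it covers 4-bit operand values exhaustively rather than proving the general case, and it only models the SETNE condition (SETEQ is the negation of both sides).
```c
#include <stdint.h>

/* Original pattern: or (xor a, b), (xor c, d), compared against 0. */
static int xor_or_ne_zero(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((a ^ b) | (c ^ d)) != 0;
}

/* Rewritten pattern: or (setne a, b), (setne c, d), compared against 0. */
static int setcc_or_ne_zero(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((a != b) | (c != d)) != 0;
}

/* Count the inputs on which the two forms disagree (16^4 cases). */
static unsigned count_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned a = 0; a < 16; ++a)
        for (unsigned b = 0; b < 16; ++b)
            for (unsigned c = 0; c < 16; ++c)
                for (unsigned d = 0; d < 16; ++d)
                    if (xor_or_ne_zero(a, b, c, d) !=
                        setcc_or_ne_zero(a, b, c, d))
                        ++bad;
    return bad;
}
```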


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136244/new/

https://reviews.llvm.org/D136244


