[PATCH] D38160: [AArch64] Improve codegen for inverted overflow checking intrinsics

Thu Oct 5 05:52:07 PDT 2017

kristof.beyls added inline comments.

================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:1980-2012
+  ConstantSDNode *COp1 = dyn_cast<ConstantSDNode>(Other);
+  unsigned Opc = Sel.getOpcode();
+  // If the operand is an overflow checking operation, invert the condition
+  // code and kill the xor(op, 1).
+  if (Sel.getResNo() == 1 &&
+      (Opc == ISD::SADDO || Opc == ISD::UADDO || Opc == ISD::SSUBO ||
+       Opc == ISD::USUBO || Opc == ISD::SMULO || Opc == ISD::UMULO) &&
----------------
aemerson wrote:
> kristof.beyls wrote:
> > Hi Amara,
> > 
> > I'm only vaguely familiar with this area of the code base.
> > My understanding is that you're aiming to have one more pattern apply involving an xor node.
> > I think it'd be nice to write out the pattern in a comment, just as the existing code in LowerXOR does for the pattern it matches.
> > 
> > Apart from that, given my unfamiliarity with this code base, I wonder why this pattern-matching optimization is done during lowering.
> > Are there good reasons this optimization isn't done elsewhere, e.g. described by a tablegen pattern or during DAGCombine (e.g. in performXorCombine in this same source file)?
> > Apologies if the answer is blatantly obvious to people more experienced in this area.
> > 
> > Thanks,
> > 
> > Kristof
> Thanks for taking a look. This is part of lowering because I want to re-use the code for detecting the overflow arithmetic nodes in getAArch64XALUOOp(). That helper function is used to recognize the patterns in a few other places, such as LowerBR_CC and LowerSelect. If we leave this combine until later, we can't re-use the helper because the pattern has already been destroyed.
> 
> I'll improve the comment to better explain the transformation.
Thanks Amara, that makes sense to me.
Now that I've looked at getAArch64XALUOOp() and where it's used, I noticed that almost everywhere it's called, much the same boilerplate code surrounds it:

```
if (Sel.getResNo() == 1 &&
    (Opc == ISD::SADDO || Opc == ISD::UADDO || Opc == ISD::SSUBO ||
     Opc == ISD::USUBO || Opc == ISD::SMULO || Opc == ISD::UMULO) && ...) {
  // Only lower legal XALUO ops.
  if (!DAG.getTargetLoweringInfo().isTypeLegal(LHS->getValueType(0)))
    return SDValue();
...
  AArch64CC::CondCode OFCC;
  SDValue Value, Overflow;
  std::tie(Value, Overflow) = getAArch64XALUOOp(OFCC, CCVal.getValue(0), DAG);
  SDValue CCVal = DAG.getConstant(OFCC, DL, MVT::i32);
...
  return DAG.getNode(AArch64ISD::CSEL, DL, Op.getValueType(), TVal, FVal,
                     CCVal, Overflow);
}
```

Couldn't that code be factored out somehow instead of being copy-pasted a few times?
I think that would improve the readability of the code and bring the usual benefits of the DRY principle.
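
To make the suggestion a bit more concrete, here is a rough, self-contained sketch of what the shared predicate could look like. It is only an illustration: `Opcode` and `NodeResult` are simplified stand-ins I made up so the snippet compiles on its own (they are not the real ISD opcodes or SDValue), and the helper name `isOverflowFlagResult` is hypothetical.

```cpp
// Simplified stand-ins for illustration only; not the real LLVM types.
enum class Opcode { ADD, XOR, SADDO, UADDO, SSUBO, USUBO, SMULO, UMULO };

struct NodeResult {
  Opcode Opc;     // opcode of the node that defines this value
  unsigned ResNo; // which result of that node this value refers to
};

// Hypothetical helper: returns true if V is the overflow flag (result #1)
// of one of the six overflow-checking arithmetic nodes.
inline bool isOverflowFlagResult(const NodeResult &V) {
  if (V.ResNo != 1)
    return false;
  switch (V.Opc) {
  case Opcode::SADDO:
  case Opcode::UADDO:
  case Opcode::SSUBO:
  case Opcode::USUBO:
  case Opcode::SMULO:
  case Opcode::UMULO:
    return true;
  default:
    return false;
  }
}
```

With something along these lines, each call site would start with a single `if (isOverflowFlagResult(...))` check instead of repeating the six-way opcode comparison.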

Repository:
  rL LLVM

https://reviews.llvm.org/D38160