[PATCH] D38160: [AArch64] Improve codegen for inverted overflow checking intrinsics

Thu Oct 5 08:03:14 PDT 2017

aemerson added inline comments.

================
Comment at: lib/Target/AArch64/AArch64ISelLowering.cpp:1980-2012
+  ConstantSDNode *COp1 = dyn_cast<ConstantSDNode>(Other);
+  unsigned Opc = Sel.getOpcode();
+  // If the operand is an overflow checking operation, invert the condition
+  // code and kill the xor(op, 1).
+  if (Sel.getResNo() == 1 &&
+      (Opc == ISD::SADDO || Opc == ISD::UADDO || Opc == ISD::SSUBO ||
+       Opc == ISD::USUBO || Opc == ISD::SMULO || Opc == ISD::UMULO) &&
----------------
kristof.beyls wrote:
> aemerson wrote:
> > kristof.beyls wrote:
> > > Hi Amara,
> > > 
> > > I'm only vaguely familiar with this area of the code base.
> > > My understanding is that you're aiming to have one more pattern apply involving an xor node.
> > > I think it'd be nice to write out the pattern in a comment, like the one that already exists for the pattern matched in the existing code in LowerXOR.
> > > 
> > > Apart from that, given my unfamiliarity with this code base, I wonder why this pattern matching optimization is done during lowering.
> > > Are there good reasons this optimization isn't done elsewhere, e.g. described by a tablegen pattern or during DAGCombine (e.g. in performXorCombine in this same source file)?
> > > Apologies if the answer is blatantly obvious to people more experienced in this area.
> > > 
> > > Thanks,
> > > 
> > > Kristof
> > Thanks for taking a look. This is part of lowering because I want to re-use the code for detecting the overflow arithmetic nodes in getAArch64XALUOOp(). That helper function is used to recognize the patterns in a few other places, such as LowerBR_CC and LowerSELECT. If we leave this combine until later, we can't re-use the helper because the pattern has already been destroyed.
> > 
> > I'll improve the comment to better explain the transformation.
> Thanks Amara, that makes sense to me.
> Now that I've looked at getAArch64XALUOOp() and where it's used, I couldn't help but notice that almost everywhere it's used, the same boilerplate code surrounds it:
> 
> ```
> if (Sel.getResNo() == 1 &&
>       (Opc == ISD::SADDO || Opc == ISD::UADDO || Opc == ISD::SSUBO ||
>        Opc == ISD::USUBO || Opc == ISD::SMULO || Opc == ISD::UMULO) && ...) {
>       // Only lower legal XALUO ops.
>     if (!DAG.getTargetLoweringInfo().isTypeLegal(LHS->getValueType(0)))
>       return SDValue();
> ...
>     AArch64CC::CondCode OFCC;
>     SDValue Value, Overflow;
>     std::tie(Value, Overflow) = getAArch64XALUOOp(OFCC, CCVal.getValue(0), DAG);
>     SDValue CCVal = DAG.getConstant(OFCC, DL, MVT::i32);
> ...
>   return DAG.getNode(AArch64ISD::CSEL, DL, Op.getValueType(), TVal, FVal,
>                    CCVal, Overflow);
> }
> 
> ```
> 
> Couldn't that code be factored out somehow instead of being copy-pasted a few times?
> I think that would improve the readability of the code and bring the usual benefits of the DRY principle.
> 
Yeah, I think this is a borderline case. I can't see a way to factor this out cleanly while keeping it readable, because the initial if-condition differs between the call sites and the AArch64 CC is mutated in two of them. I can split out the Opc checks, though, since they're the same in all cases.
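
As a rough illustration of what splitting out the Opc checks might look like, here is a self-contained sketch. It uses a stand-in enum rather than the real ISD opcodes so it compiles on its own, and the names isOverflowOpcode and isOverflowResult are hypothetical, not taken from the patch:

```cpp
// Stand-in for the relevant ISD opcodes so this sketch compiles by itself.
// In-tree code would test the ISD:: enumerators directly (assumption).
enum class Opcode { SADDO, UADDO, SSUBO, USUBO, SMULO, UMULO, XOR, ADD };

// Hypothetical helper: true if Opc is one of the six overflow-checking
// arithmetic opcodes that the call sites currently test for inline.
inline bool isOverflowOpcode(Opcode Opc) {
  switch (Opc) {
  case Opcode::SADDO:
  case Opcode::UADDO:
  case Opcode::SSUBO:
  case Opcode::USUBO:
  case Opcode::SMULO:
  case Opcode::UMULO:
    return true;
  default:
    return false;
  }
}

// Hypothetical helper: true if the value being inspected is the overflow
// bit of such a node, i.e. result number 1.
inline bool isOverflowResult(Opcode Opc, unsigned ResNo) {
  return ResNo == 1 && isOverflowOpcode(Opc);
}
```

Each call site would keep its own extra conditions and its own handling of the condition code, and share only this predicate.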

Repository:
  rL LLVM

https://reviews.llvm.org/D38160