[llvm] r245198 - Remove hand-rolled matching for fmin and fmax.

Wed Aug 19 03:58:47 PDT 2015

Hi Ahmed,

Thanks for the explanation.

I think this should be simpler than you think. SDAGBuilder::visitSelect is very conservative currently. It will only look through SplitVectors, and expect to find a Legal or Custom action at the end of the chain.

If we teach SDAGBuilder::visitSelect to understand about promoted types, it could understand that f16 would be promoted to f32, and FMINNAN/FMINNUM are legal on f32.

If we do this (I’ve noticed a bug where fminnan doesn’t know how to promote itself… whoops), we get:

fcvt s1, h1
fcvt s0, h0
fminnm s0, s0, s1
fcvt h0, s0
fcvt s0, h0
ret

Which looks about right? So a fairly simple improvement.

Cheers,

James

On 18 Aug 2015, at 21:54, Ahmed Bougacha <ahmed.bougacha at gmail.com<mailto:ahmed.bougacha at gmail.com>> wrote:

On Tue, Aug 18, 2015 at 12:51 PM, James Molloy <James.Molloy at arm.com<mailto:James.Molloy at arm.com>> wrote:
Hi Ahmed,

I'm afraid I'm not quite certain what you mean - could you elaborate?

I was thinking of something like:

define double @f(float %a, float %b) #0 {
  %cc = fcmp olt float %a, %b
  %xa = fpext float %a to double
  %xb = fpext float %b to double
  %r = select i1 %cc, double %xa, double %xb
  ret double %r
}

attributes #0 = { "no-infs-fp-math"="true" "no-nans-fp-math"="true" "unsafe-fp-math"="true" }

for which the previous lowering would have generated something like:

    fcvt d0, s0
    fcvt d1, s1
    fmin d0, d0, d1

But we now generate:

    fcvt d2, s0
    fcvt d3, s1
    fcmp s0, s1
    fcsel d0, d2, d3, lt

For float/double, it's not a problem, because the now-canonical IR, per InstCombine, would be:

define double @f(float %a, float %b) #0 {
  %cc = fcmp olt float %a, %b
  %r.v = select i1 %cc, float %a, float %b
  %r = fpext float %r.v to double
  ret double %r
}

Which we can indeed match.  However, for half, fpexts on the fcmp operands are generated by the legalizer, so neither instcombine nor the builder can match those anymore.  We basically have a DAG for:

define double @f(half %a, half %b) #0 {
  %sa = fpext half %a to float
  %sb = fpext half %b to float
  %cc = fcmp olt float %sa, %sb
  %xa = fpext float %sa to double
  %xb = fpext float %sb to double
  %r = select i1 %cc, double %xa, double %xb
  ret double %r
}

Perhaps we want a similar DAG combine, or even have the builder match the fpext on the select result directly?  Not sure either's a good idea.

Again, not a big problem by any means, just wondering if we can/should improve this somehow!

-Ahmed

Cheers,

James



On 18 Aug 2015, at 19:36, Ahmed Bougacha <ahmed.bougacha at gmail.com<mailto:ahmed.bougacha at gmail.com>> wrote:

Hey James,

If I read this correctly, we can't fold fpext anymore, right? Not a big deal right now, as the main case where I see this being useful is a promoted f16 min/max, and that's not even caught currently.
Do you think we should do something similar in the builder?  If so, let's file a PR!

-Ahmed

On Mon, Aug 17, 2015 at 12:13 AM, James Molloy via llvm-commits <llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>> wrote:
Author: jamesm
Date: Mon Aug 17 02:13:20 2015
New Revision: 245198

URL: http://llvm.org/viewvc/llvm-project?rev=245198&view=rev
Log:
Remove hand-rolled matching for fmin and fmax.

SDAGBuilder now does this all for us.

Modified:
    llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp

Modified: llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp?rev=245198&r1=245197&r2=245198&view=diff
==============================================================================

--- llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp (original)
+++ llvm/trunk/lib/Target/AArch64/AArch64ISelLowering.cpp Mon Aug 17 02:13:20 2015
@@ -393,6 +393,8 @@ AArch64TargetLowering::AArch64TargetLowe
     setOperationAction(ISD::FROUND, Ty, Legal);
     setOperationAction(ISD::FMINNUM, Ty, Legal);
     setOperationAction(ISD::FMAXNUM, Ty, Legal);
+    setOperationAction(ISD::FMINNAN, Ty, Legal);
+    setOperationAction(ISD::FMAXNAN, Ty, Legal);
   }

   setOperationAction(ISD::PREFETCH, MVT::Other, Custom);
@@ -484,7 +486,6 @@ AArch64TargetLowering::AArch64TargetLowe

   setTargetDAGCombine(ISD::SELECT);
   setTargetDAGCombine(ISD::VSELECT);
-  setTargetDAGCombine(ISD::SELECT_CC);

   setTargetDAGCombine(ISD::INTRINSIC_VOID);
   setTargetDAGCombine(ISD::INTRINSIC_W_CHAIN);
@@ -3826,32 +3827,6 @@ SDValue AArch64TargetLowering::LowerSETC
   }
 }

-/// A SELECT_CC operation is really some kind of max or min if both values being
-/// compared are, in some sense, equal to the results in either case. However,
-/// it is permissible to compare f32 values and produce directly extended f64
-/// values.
-///
-/// Extending the comparison operands would also be allowed, but is less likely
-/// to happen in practice since their use is right here. Note that truncate
-/// operations would *not* be semantically equivalent.
-static bool selectCCOpsAreFMaxCompatible(SDValue Cmp, SDValue Result) {
-  if (Cmp == Result)
-    return (Cmp.getValueType() == MVT::f32 ||
-            Cmp.getValueType() == MVT::f64);
-
-  ConstantFPSDNode *CCmp = dyn_cast<ConstantFPSDNode>(Cmp);
-  ConstantFPSDNode *CResult = dyn_cast<ConstantFPSDNode>(Result);
-  if (CCmp && CResult && Cmp.getValueType() == MVT::f32 &&
-      Result.getValueType() == MVT::f64) {
-    bool Lossy;
-    APFloat CmpVal = CCmp->getValueAPF();
-    CmpVal.convert(APFloat::IEEEdouble, APFloat::rmNearestTiesToEven, &Lossy);
-    return CResult->getValueAPF().bitwiseIsEqual(CmpVal);
-  }
-
-  return Result->getOpcode() == ISD::FP_EXTEND && Result->getOperand(0) == Cmp;
-}
-
 SDValue AArch64TargetLowering::LowerSELECT_CC(ISD::CondCode CC, SDValue LHS,
                                               SDValue RHS, SDValue TVal,
                                               SDValue FVal, SDLoc dl,
@@ -9124,75 +9099,6 @@ static SDValue performSelectCombine(SDNo
   return DAG.getSelect(DL, ResVT, Mask, N->getOperand(1), N->getOperand(2));
 }

-/// performSelectCCCombine - Target-specific DAG combining for ISD::SELECT_CC
-/// to match FMIN/FMAX patterns.
-static SDValue performSelectCCCombine(SDNode *N, SelectionDAG &DAG) {
-  // Try to use FMIN/FMAX instructions for FP selects like "x < y ? x : y".
-  // Unless the NoNaNsFPMath option is set, be careful about NaNs:
-  // vmax/vmin return NaN if either operand is a NaN;
-  // only do the transformation when it matches that behavior.
-
-  SDValue CondLHS = N->getOperand(0);
-  SDValue CondRHS = N->getOperand(1);
-  SDValue LHS = N->getOperand(2);
-  SDValue RHS = N->getOperand(3);
-  ISD::CondCode CC = cast<CondCodeSDNode>(N->getOperand(4))->get();
-
-  unsigned Opcode;
-  bool IsReversed;
-  if (selectCCOpsAreFMaxCompatible(CondLHS, LHS) &&
-      selectCCOpsAreFMaxCompatible(CondRHS, RHS)) {
-    IsReversed = false; // x CC y ? x : y
-  } else if (selectCCOpsAreFMaxCompatible(CondRHS, LHS) &&
-             selectCCOpsAreFMaxCompatible(CondLHS, RHS)) {
-    IsReversed = true ; // x CC y ? y : x
-  } else {
-    return SDValue();
-  }
-
-  bool IsUnordered = false, IsOrEqual;
-  switch (CC) {
-  default:
-    return SDValue();
-  case ISD::SETULT:
-  case ISD::SETULE:
-    IsUnordered = true;
-  case ISD::SETOLT:
-  case ISD::SETOLE:
-  case ISD::SETLT:
-  case ISD::SETLE:
-    IsOrEqual = (CC == ISD::SETLE || CC == ISD::SETOLE || CC == ISD::SETULE);
-    Opcode = IsReversed ? ISD::FMAXNAN : ISD::FMINNAN;
-    break;
-
-  case ISD::SETUGT:
-  case ISD::SETUGE:
-    IsUnordered = true;
-  case ISD::SETOGT:
-  case ISD::SETOGE:
-  case ISD::SETGT:
-  case ISD::SETGE:
-    IsOrEqual = (CC == ISD::SETGE || CC == ISD::SETOGE || CC == ISD::SETUGE);
-    Opcode = IsReversed ? ISD::FMINNAN : ISD::FMAXNAN;
-    break;
-  }
-
-  // If LHS is NaN, an ordered comparison will be false and the result will be
-  // the RHS, but FMIN(NaN, RHS) = FMAX(NaN, RHS) = NaN. Avoid this by checking
-  // that LHS != NaN. Likewise, for unordered comparisons, check for RHS != NaN.
-  if (!DAG.isKnownNeverNaN(IsUnordered ? RHS : LHS))
-    return SDValue();
-
-  // For xxx-or-equal comparisons, "+0 <= -0" and "-0 >= +0" will both be true,
-  // but FMIN will return -0, and FMAX will return +0. So FMIN/FMAX can only be
-  // used for unsafe math or if one of the operands is known to be nonzero.
-  if (IsOrEqual && !DAG.getTarget().Options.UnsafeFPMath &&
-      !(DAG.isKnownNeverZero(LHS) || DAG.isKnownNeverZero(RHS)))
-    return SDValue();
-
-  return DAG.getNode(Opcode, SDLoc(N), N->getValueType(0), LHS, RHS);
-}
-
 /// Get rid of unnecessary NVCASTs (that don't change the type).
 static SDValue performNVCASTCombine(SDNode *N) {
   if (N->getValueType(0) == N->getOperand(0).getValueType())
@@ -9233,8 +9139,6 @@ SDValue AArch64TargetLowering::PerformDA
     return performSelectCombine(N, DCI);
   case ISD::VSELECT:
     return performVSelectCombine(N, DCI.DAG);
-  case ISD::SELECT_CC:
-    return performSelectCCCombine(N, DCI.DAG);
   case ISD::STORE:
     return performSTORECombine(N, DCI, DAG, Subtarget);
   case AArch64ISD::BRCOND:


_______________________________________________
llvm-commits mailing list
llvm-commits at lists.llvm.org<mailto:llvm-commits at lists.llvm.org>
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits


-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782


-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20150819/766dc3bb/attachment.html>