[llvm] [AMDGPU][ISel] Reduce `f64` compare to integer compare of upper half (PR #188356)

Thu Apr 2 09:16:33 PDT 2026

================
@@ -17387,6 +17387,201 @@ SDValue SITargetLowering::performFMACombine(SDNode *N,
   return SDValue();
 }
 
+// Given a double-precision ordered or unordered comparison, return the
+// condition code for an equivalent integral comparison of the operands' upper
+// 32 bits, or `SETCC_INVALID` if not possible.
+//
+// EQ/NE:
+//  If LHS.lo32 == RHS.lo32:
+//    setcc LHS, RHS, eq/ne => setcc LHS.hi32, RHS.hi32, eq/ne
+//  If LHS.lo32 != RHS.lo32:
+//    setcc LHS, RHS, eq/ne => setcc LHS.hi32, RHS.hi32, false/true
+//  The reduction is not possible if operands may be +0 and -0.
+//  For ordered eq / unordered ne, at most one operand may be NaN.
+//  For unordered eq / ordered ne, neither operand can be NaN.
+//
+// LT/GE:
+//  If LHS.lo32 >= RHS.lo32 (unsigned):
+//    setcc LHS, RHS, [u]lt/ge => LHS.hi32, RHS.hi32, [u]lt/ge
+//  If LHS.lo32 < RHS.lo32 (unsigned):
+//    setcc LHS, RHS, [u]lt/ge => LHS.hi32, RHS.hi32, [u]le/gt
+//  The reduction is only supported if both operands are nonnegative.
+//  For ordered lt / unordered ge, the RHS cannot be NaN.
+//  For unordered lt / ordered ge, neither operand can be NaN.
+//
+// LE/GT:
+//  If LHS.lo32 > RHS.lo32 (unsigned):
+//    setcc LHS, RHS, [u]le/gt => LHS.hi32, RHS.hi32, [u]lt/ge
+//  If LHS.lo32 <= RHS.lo32 (unsigned):
+//    setcc LHS, RHS, [u]le/gt => LHS.hi32, RHS.hi32, [u]le/gt
+//  The reduction is only supported if both operands are nonnegative.
+//  For unordered le / ordered gt, the LHS cannot be NaN.
+//  For ordered le / unordered gt, neither operand can be NaN.
+static ISD::CondCode tryReduceF64CompareToHiHalf(const ISD::CondCode CC,
+                                                 const SDValue LHS,
+                                                 const SDValue RHS,
+                                                 const SelectionDAG &DAG) {
+  assert(LHS.getValueType() == MVT::f64 && "Incorrect operand type!");
+
+  const KnownFPClass LHSFPClass = DAG.computeKnownFPClass(LHS, fcAllFlags);
+  const KnownFPClass RHSFPClass = DAG.computeKnownFPClass(RHS, fcAllFlags);
+  const KnownBits LHSKnownLo32 = DAG.computeKnownBits(LHS).trunc(32);
----------------
arsenm wrote:

Using both computeKnownFPClass and computeKnownBits is unfortunate. Can you stick to one or the other? You could teach computeKnownBits to report a result from FP sources 

https://github.com/llvm/llvm-project/pull/188356