[llvm] [InstCombine] Transform (fcmp + fadd + sel) into (fcmp + sel + fadd) (PR #106492)

Tue Sep 3 23:27:56 PDT 2024

================
@@ -3668,6 +3668,51 @@ static bool hasAffectedValue(Value *V, SmallPtrSetImpl<Value *> &Affected,
   return false;
 }
 
+static Value *foldSelectAddConstant(SelectInst &SI,
+                                    InstCombiner::BuilderTy &Builder) {
+  Value *Cmp;
+  Instruction *FAdd;
+  ConstantFP *C;
+
+  // select((fcmp OGT/OLT, X, 0), (fadd X, C), C) => fadd((select (fcmp OGT/OLT,
+  // X, 0), X, 0), C)
+
+  // This transformation enables the possibility of transforming fcmp + sel into
+  // a fmax/fmin.
+
+  // OneUse check for `Cmp` is necessary because it makes sure that other
+  // InstCombine folds don't undo this transformation and cause an infinite
+  // loop.
+  if (match(&SI, m_Select(m_OneUse(m_Value(Cmp)), m_OneUse(m_Instruction(FAdd)),
+                          m_ConstantFP(C))) ||
+      match(&SI, m_Select(m_OneUse(m_Value(Cmp)), m_ConstantFP(C),
+                          m_OneUse(m_Instruction(FAdd))))) {
+    Value *X;
+    CmpInst::Predicate Pred;
+    if (!match(Cmp, m_FCmp(Pred, m_Value(X), m_AnyZeroFP())))
+      return nullptr;
+
+    if (Pred != CmpInst::FCMP_OGT && Pred != CmpInst::FCMP_OLT)
----------------
arsenm wrote:

Codegen is worse when the -0 has to be materialized, so maybe only do this for nsz. This may be a case where we want the inverse transform in CGP, though.

test_ogt actually introduces the fmaxnum, but then requires the canonicalize. v_max_f32 follows IEEE 754-2008, which has the inverted signaling NaN behavior, so it requires pre-quieting. The legacy case doesn't require the extra instruction.

https://github.com/llvm/llvm-project/pull/106492