[llvm] [LV] Vectorize FMax w/o fast-math flags. (PR #146711)

Sat Jul 12 03:04:37 PDT 2025

================
@@ -47,6 +47,9 @@ enum class RecurKind {
   FMul,     ///< Product of floats.
   FMin,     ///< FP min implemented in terms of select(cmp()).
   FMax,     ///< FP max implemented in terms of select(cmp()).
+  FCmpOGTSelect, ///< FP max implemented in terms of select(cmp()), but without
+                 /// any fast-math flags. Users need to handle NaNs and signed
+                 /// zeros when generating code.
----------------
ayalz wrote:

The pattern, and how users need to handle it, should indeed be explained, but preferably elsewhere. The pattern seems to suggest how to handle FP reductions (min, max, possibly others as well?) in the presence of NaNs and/or signed zeros (are both equally challenging?), an issue that is sidestepped in the presence of certain fast-math flags (namely those asserting no NaNs and no signed zeros?).

Does the following sound right:
a. If the set is NaN-free, its reduction result is the same as with the fast-math flags.
b. If the set contains only NaNs, its reduction is either NaN or the initial value, depending on whether the reduction operation is unordered or ordered, respectively.
c. If the set contains both NaNs and non-NaNs, its reduction is either NaN or the reduction of all non-NaNs, depending on whether the reduction operation is unordered or ordered, respectively.

In case (a), the vector of partial (per-subset) reduction results contains only non-NaNs and is subject to the standard final reduction. In case (b), this vector holds only NaNs or only the initial value, which provides the respective final value. Does case (c) require "tie breaking" based on index? And what if the initial value is NaN?

https://github.com/llvm/llvm-project/pull/146711