[llvm] [IVDescriptors] Don't require nsz/nnan for (min|max)num. (PR #137003)
Florian Hahn via llvm-commits
llvm-commits at lists.llvm.org
Wed Apr 23 08:25:25 PDT 2025
https://github.com/fhahn created https://github.com/llvm/llvm-project/pull/137003
FP reductions with minnum and maxnum do not need to require nsz or nnan. NaNs and signed zeros are handled consistently with the vector versions of minnum and maxnum. vector.reduce.(fmin|fmax) also matches the minnum/maxnum semantics for comparisons.
IIUC this was also the conclusion when support for minimum/maximum was added (https://reviews.llvm.org/D151482?id=531021#inline-1481555).
Alive2 agrees that maxnum/minnum can be re-ordered (scalar version: https://alive2.llvm.org/ce/z/GVmgBX) and also verifies the vectorized code end-to-end, with a tweak to replace llvm.reduce.fmax with a scalar maxnum, as Alive2 doesn't support llvm.reduce.fmax: https://alive2.llvm.org/ce/z/EwJKeJ . Note that verification requires a higher timeout than available in the online version.
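The re-ordering argument above can be sketched outside of LLVM. The snippet below is only an illustration, not part of the patch: it assumes that `std::fmax` is an acceptable stand-in for `llvm.maxnum` on quiet NaNs (it returns the non-NaN operand when exactly one operand is NaN), and all function names are hypothetical. It compares a left-to-right reduction with a two-accumulator reduction that loosely models a VF=2 loop followed by a horizontal reduce. Results are compared with `==`, so +0.0 and -0.0 count as equal, and any two NaNs count as equal.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Left-to-right scalar reduction, like the original loop.
float reduceSequential(const std::vector<float> &xs, float init) {
  float acc = init;
  for (float x : xs)
    acc = std::fmax(acc, x); // stand-in for llvm.maxnum (assumption)
  return acc;
}

// Two interleaved accumulators combined at the end. Both lanes start
// from the same start value, then a final fmax plays the role of the
// horizontal reduction.
float reduceTwoLanes(const std::vector<float> &xs, float init) {
  float lane0 = init, lane1 = init;
  for (std::size_t i = 0; i < xs.size(); ++i) {
    float &lane = (i % 2 == 0) ? lane0 : lane1;
    lane = std::fmax(lane, xs[i]);
  }
  return std::fmax(lane0, lane1);
}

// Equality that ignores the sign of zero and treats all NaNs as equal.
bool sameResult(float a, float b) {
  if (std::isnan(a) || std::isnan(b))
    return std::isnan(a) && std::isnan(b);
  return a == b;
}

// Small self-check on inputs that contain quiet NaNs.
void selfCheck() {
  const float qnan = std::numeric_limits<float>::quiet_NaN();
  const std::vector<float> xs = {1.0f, qnan, 3.0f, qnan, -2.0f};
  assert(sameResult(reduceSequential(xs, 0.5f), reduceTwoLanes(xs, 0.5f)));
}
```

This is a spot check on a handful of values, not a proof; the Alive2 links above are what actually back the claim.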
From d00bc585951f24cdf3a6fa1eb627df1e493ec873 Mon Sep 17 00:00:00 2001
From: Florian Hahn <flo at fhahn.com>
Date: Wed, 23 Apr 2025 15:45:36 +0100
Subject: [PATCH] [IVDescriptors] Don't require nsz/nnan for (min|max)num.
We do not need to require nsz or nnan for FP reductions with minnum and
maxnum. NaNs and nsz are handled consistently with the vector versions
of minnum and maxnum. vector.reduce.(fmin|fmax) also matches the
minnum/maxnum semantics for comparisons.
IIUC this was also the conclusion when support for minimum/maximum was
added (https://reviews.llvm.org/D151482?id=531021#inline-1481555).
Alive2 agrees that maxnum/minnum can be re-ordered (scalar version:
https://alive2.llvm.org/ce/z/GVmgBX) and also verifies the vectorized
code end-to-end, with a tweak to replace llvm.reduce.fmax with a scalar
maxnum, as Alive2 doesn't support llvm.reduce.fmax:
https://alive2.llvm.org/ce/z/EwJKeJ . Note that verification requires a
higher timeout than available in the online version.
---
llvm/lib/Analysis/IVDescriptors.cpp | 9 ++++++---
llvm/test/Transforms/LoopVectorize/minmax_reduction.ll | 6 ++++--
2 files changed, 10 insertions(+), 5 deletions(-)
diff --git a/llvm/lib/Analysis/IVDescriptors.cpp b/llvm/lib/Analysis/IVDescriptors.cpp
index 94c347b01bbfb..4a6efd0ca821a 100644
--- a/llvm/lib/Analysis/IVDescriptors.cpp
+++ b/llvm/lib/Analysis/IVDescriptors.cpp
@@ -892,10 +892,13 @@ RecurrenceDescriptor::InstDesc RecurrenceDescriptor::isRecurrenceInstr(
return true;
if (isa<FPMathOperator>(I) && I->hasNoNaNs() && I->hasNoSignedZeros())
return true;
- // minimum and maximum intrinsics do not require nsz and nnan flags since
- // NaN and signed zeroes are propagated in the intrinsic implementation.
+ // minimum/minnum and maximum/maxnum intrinsics do not require nsz and nnan
+ // flags since NaN and signed zeroes are propagated in the intrinsic
+ // implementation.
return match(I, m_Intrinsic<Intrinsic::minimum>(m_Value(), m_Value())) ||
- match(I, m_Intrinsic<Intrinsic::maximum>(m_Value(), m_Value()));
+ match(I, m_Intrinsic<Intrinsic::maximum>(m_Value(), m_Value())) ||
+ match(I, m_Intrinsic<Intrinsic::minnum>(m_Value(), m_Value())) ||
+ match(I, m_Intrinsic<Intrinsic::maxnum>(m_Value(), m_Value()));
};
if (isIntMinMaxRecurrenceKind(Kind) ||
(HasRequiredFMF() && isFPMinMaxRecurrenceKind(Kind)))
diff --git a/llvm/test/Transforms/LoopVectorize/minmax_reduction.ll b/llvm/test/Transforms/LoopVectorize/minmax_reduction.ll
index 85a90f2e04c5e..97b65e3435b5a 100644
--- a/llvm/test/Transforms/LoopVectorize/minmax_reduction.ll
+++ b/llvm/test/Transforms/LoopVectorize/minmax_reduction.ll
@@ -1002,7 +1002,8 @@ for.body: ; preds = %entry, %for.body
}
; CHECK-LABEL: @fmin_intrinsic_nofast(
-; CHECK-NOT: <2 x float> @llvm.minnum.v2f32
+; CHECK: call <2 x float> @llvm.minnum.v2f32
+; CHECK: call float @llvm.vector.reduce.fmin.v2f32
define float @fmin_intrinsic_nofast(ptr nocapture readonly %x) {
entry:
br label %for.body
@@ -1022,7 +1023,8 @@ for.body: ; preds = %entry, %for.body
}
; CHECK-LABEL: @fmax_intrinsic_nofast(
-; CHECK-NOT: <2 x float> @llvm.maxnum.v2f32
+; CHECK: call <2 x float> @llvm.maxnum.v2f32
+; CHECK: call float @llvm.vector.reduce.fmax.v2f32
define float @fmax_intrinsic_nofast(ptr nocapture readonly %x) {
entry:
br label %for.body
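For readers skimming the IVDescriptors.cpp hunk, the visible part of the changed check can be modeled in standalone C++ roughly as follows. This is a hedged sketch rather than LLVM code: `FPInstModel`, `MinMaxIntrinsic`, and both function names are hypothetical, and any condition that precedes the lines shown in the hunk is deliberately not modeled.

```cpp
#include <cassert>

// Hypothetical stand-ins for the instruction properties the hunk inspects.
enum class MinMaxIntrinsic { None, Minimum, Maximum, MinNum, MaxNum };

struct FPInstModel {
  bool noNaNs;               // models I->hasNoNaNs()
  bool noSignedZeros;        // models I->hasNoSignedZeros()
  MinMaxIntrinsic intrinsic; // which min/max intrinsic, if any
};

// Behaviour of the visible lines before the patch: only minimum/maximum
// are accepted without both flags.
bool hasRequiredFMFBefore(const FPInstModel &I) {
  if (I.noNaNs && I.noSignedZeros)
    return true;
  return I.intrinsic == MinMaxIntrinsic::Minimum ||
         I.intrinsic == MinMaxIntrinsic::Maximum;
}

// Behaviour after the patch: minnum/maxnum are accepted as well.
bool hasRequiredFMFAfter(const FPInstModel &I) {
  if (I.noNaNs && I.noSignedZeros)
    return true;
  return I.intrinsic == MinMaxIntrinsic::Minimum ||
         I.intrinsic == MinMaxIntrinsic::Maximum ||
         I.intrinsic == MinMaxIntrinsic::MinNum ||
         I.intrinsic == MinMaxIntrinsic::MaxNum;
}

// A flag-free minnum call is rejected before the patch and accepted after,
// which is the case the updated *_nofast tests exercise.
void selfCheck() {
  const FPInstModel minnumNoFlags{false, false, MinMaxIntrinsic::MinNum};
  assert(!hasRequiredFMFBefore(minnumNoFlags));
  assert(hasRequiredFMFAfter(minnumNoFlags));
}
```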