[llvm] [LV][doc] Update and extend the docs on floating-point reduction vectorization (PR #172809)
Tibor Győri via llvm-commits
llvm-commits at lists.llvm.org
Fri Jan 2 08:38:23 PST 2026
https://github.com/TiborGY updated https://github.com/llvm/llvm-project/pull/172809
From 6eb45feae557776a711551efaa35568cd334d23f Mon Sep 17 00:00:00 2001
From: GYT <tiborgyri at gmail.com>
Date: Wed, 17 Dec 2025 06:41:00 +0100
Subject: [PATCH 1/6] Update and elaborate the status of floating point
reduction vectorization
---
llvm/docs/Vectorizers.rst | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)
diff --git a/llvm/docs/Vectorizers.rst b/llvm/docs/Vectorizers.rst
index 0dfa33753cdd0..964164522c987 100644
--- a/llvm/docs/Vectorizers.rst
+++ b/llvm/docs/Vectorizers.rst
@@ -200,7 +200,22 @@ reduction operations, such as addition, multiplication, XOR, AND and OR.
return sum;
}
-We support floating point reduction operations when `-ffast-math` is used.
+The full vectorization of reductions inherently involves reordering operations,
+which is problematic for floating-point reductions. Since floating-point
+operations are not associative, the result may depend on the order of operations.
+Therefore, vectorizing floating-point reductions is implicitly prohibited by
+the C and C++ standards, unless the compiler can ensure that the result does
+not change.
+
+For this reason, on most targets we support floating point reduction operations
+only when `-ffast-math` (or at least the `-fassociative-math -fno-signed-zeros
+-fno-trapping-math` subset of `-ffast-math`) is used. On select targets such as
+AArch64 and RISC-V we support generating ordered reductions which preserve the
+exact result, allowing a limited form of vectorization to take place while
+remaining standards-compliant. However, ordered reductions are typically less
+efficient than traditionally vectorized reductions, therefore enabling floating-
+point reordering may still result in more performant reductions on these
+targets.
Inductions
^^^^^^^^^^
From e0c057426dfadb3b709e37b808791c8d0819e77f Mon Sep 17 00:00:00 2001
From: Tibor Győri <tibor.gyori at chem.u-szeged.hu>
Date: Fri, 2 Jan 2026 17:09:33 +0100
Subject: [PATCH 2/6] Suggestion 1
Co-authored-by: Florian Hahn <flo at fhahn.com>
---
llvm/docs/Vectorizers.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/llvm/docs/Vectorizers.rst b/llvm/docs/Vectorizers.rst
index 964164522c987..1d60e5892c498 100644
--- a/llvm/docs/Vectorizers.rst
+++ b/llvm/docs/Vectorizers.rst
@@ -200,7 +200,7 @@ reduction operations, such as addition, multiplication, XOR, AND and OR.
return sum;
}
-The full vectorization of reductions inherently involves reordering operations,
+The full vectorization of reductions requires reordering operations,
which is problematic for floating-point reductions. Since floating-point
operations are not associative, the result may depend on the order of operations.
Therefore, vectorizing floating-point reductions is implicitly prohibited by
From 5129b2203b8ef171dbe38da5ff1fa1d9316aa8bc Mon Sep 17 00:00:00 2001
From: Tibor Győri <tibor.gyori at chem.u-szeged.hu>
Date: Fri, 2 Jan 2026 17:10:34 +0100
Subject: [PATCH 3/6] Suggestion 3
Co-authored-by: Florian Hahn <flo at fhahn.com>
---
llvm/docs/Vectorizers.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/llvm/docs/Vectorizers.rst b/llvm/docs/Vectorizers.rst
index 1d60e5892c498..b8515afb2b76a 100644
--- a/llvm/docs/Vectorizers.rst
+++ b/llvm/docs/Vectorizers.rst
@@ -210,7 +210,7 @@ not change.
For this reason, on most targets we support floating point reduction operations
only when `-ffast-math` (or at least the `-fassociative-math -fno-signed-zeros
-fno-trapping-math` subset of `-ffast-math`) is used. On select targets such as
-AArch64 and RISC-V we support generating ordered reductions which preserve the
+AArch64 and RISC-V LLVM supports generating ordered reductions which preserve the
exact result, allowing a limited form of vectorization to take place while
remaining standards-compliant. However, ordered reductions are typically less
efficient than traditionally vectorized reductions, therefore enabling floating-
From 9f31b59b9dbc87ed6ac377b1caa449ff4a6aa14c Mon Sep 17 00:00:00 2001
From: Tibor Győri <tibor.gyori at chem.u-szeged.hu>
Date: Fri, 2 Jan 2026 17:24:28 +0100
Subject: [PATCH 4/6] Suggestion 2
Co-authored-by: Florian Hahn <flo at fhahn.com>
---
llvm/docs/Vectorizers.rst | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/llvm/docs/Vectorizers.rst b/llvm/docs/Vectorizers.rst
index b8515afb2b76a..27c6c2be6c68e 100644
--- a/llvm/docs/Vectorizers.rst
+++ b/llvm/docs/Vectorizers.rst
@@ -207,9 +207,9 @@ Therefore, vectorizing floating-point reductions is implicitly prohibited by
the C and C++ standards, unless the compiler can ensure that the result does
not change.
-For this reason, on most targets we support floating point reduction operations
-only when `-ffast-math` (or at least the `-fassociative-math -fno-signed-zeros
--fno-trapping-math` subset of `-ffast-math`) is used. On select targets such as
+Therefore LLVM supports vectorizing floating point reductions
+only when at least the `-fassociative-math -fno-signed-zeros
+-fno-trapping-math` subset of `-ffast-math` is used on most targets. On select targets such as
AArch64 and RISC-V LLVM supports generating ordered reductions which preserve the
exact result, allowing a limited form of vectorization to take place while
remaining standards-compliant. However, ordered reductions are typically less
From 55306996e0e76a59225c730e85feef8b61c48c4f Mon Sep 17 00:00:00 2001
From: GYT <tiborgyri at gmail.com>
Date: Fri, 2 Jan 2026 17:33:44 +0100
Subject: [PATCH 5/6] implement suggestion 4
---
llvm/docs/Vectorizers.rst | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/llvm/docs/Vectorizers.rst b/llvm/docs/Vectorizers.rst
index 27c6c2be6c68e..1c151bbfa3a73 100644
--- a/llvm/docs/Vectorizers.rst
+++ b/llvm/docs/Vectorizers.rst
@@ -203,11 +203,9 @@ reduction operations, such as addition, multiplication, XOR, AND and OR.
The full vectorization of reductions requires reordering operations,
which is problematic for floating-point reductions. Since floating-point
operations are not associative, the result may depend on the order of operations.
-Therefore, vectorizing floating-point reductions is implicitly prohibited by
-the C and C++ standards, unless the compiler can ensure that the result does
-not change.
-Therefore LLVM supports vectorizing floating point reductions
+Changing floating-point results is implicitly prohibited by the C and C++ standards,
+therefore LLVM supports vectorizing floating point reductions
only when at least the `-fassociative-math -fno-signed-zeros
-fno-trapping-math` subset of `-ffast-math` is used on most targets. On select targets such as
AArch64 and RISC-V LLVM supports generating ordered reductions which preserve the
From f1c72d51be73ae26e0147b5877f5f997eeabafe1 Mon Sep 17 00:00:00 2001
From: GYT <tiborgyri at gmail.com>
Date: Fri, 2 Jan 2026 17:38:08 +0100
Subject: [PATCH 6/6] formatting
---
llvm/docs/Vectorizers.rst | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/llvm/docs/Vectorizers.rst b/llvm/docs/Vectorizers.rst
index 1c151bbfa3a73..33a145adfa83e 100644
--- a/llvm/docs/Vectorizers.rst
+++ b/llvm/docs/Vectorizers.rst
@@ -202,18 +202,19 @@ reduction operations, such as addition, multiplication, XOR, AND and OR.
The full vectorization of reductions requires reordering operations,
which is problematic for floating-point reductions. Since floating-point
-operations are not associative, the result may depend on the order of operations.
+operations are not associative, the result may depend on the order of
+operations.
-Changing floating-point results is implicitly prohibited by the C and C++ standards,
-therefore LLVM supports vectorizing floating point reductions
+Changing floating-point results is implicitly prohibited by the C and C++
+standards, therefore LLVM supports vectorizing floating point reductions
only when at least the `-fassociative-math -fno-signed-zeros
--fno-trapping-math` subset of `-ffast-math` is used on most targets. On select targets such as
-AArch64 and RISC-V LLVM supports generating ordered reductions which preserve the
-exact result, allowing a limited form of vectorization to take place while
-remaining standards-compliant. However, ordered reductions are typically less
-efficient than traditionally vectorized reductions, therefore enabling floating-
-point reordering may still result in more performant reductions on these
-targets.
+-fno-trapping-math` subset of `-ffast-math` is used on most targets. On select
+targets such as AArch64 and RISC-V LLVM supports generating ordered reductions
+which preserve the exact result, allowing a limited form of vectorization to
+take place while remaining standards-compliant. However, ordered reductions
+are typically less efficient than traditionally vectorized reductions,
+therefore enabling floating-point reordering may still result in more
+performant reductions on these targets.
Inductions
^^^^^^^^^^
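
Note for readers of this series: the documentation text above says that floating-point operations are not associative, so a reordered reduction may return a different result than the loop as written. The following is a minimal scalar sketch of that effect in C. It is not what LLVM emits: `LANES`, `sum_ordered` and `sum_reordered` are hypothetical names chosen for illustration, and the second function only models a reordered reduction by keeping one partial sum per lane and combining the lanes after the loop.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4 /* hypothetical vector width, chosen only for illustration */

/* Strict left-to-right sum: the evaluation order the source loop specifies.
 * An ordered (in-order) reduction must produce exactly this result. */
float sum_ordered(const float *a, size_t n) {
  float sum = 0.0f;
  for (size_t i = 0; i < n; ++i)
    sum += a[i];
  return sum;
}

/* Scalar model of a reordered reduction: each lane accumulates every
 * LANES-th element, and the partial sums are combined after the loop.
 * This changes the order in which the additions are performed. */
float sum_reordered(const float *a, size_t n) {
  assert(n % LANES == 0 && "sketch handles whole vectors only");
  float lane[LANES] = {0.0f};
  for (size_t i = 0; i < n; i += LANES)
    for (size_t l = 0; l < LANES; ++l)
      lane[l] += a[i + l];

  float sum = 0.0f;
  for (size_t l = 0; l < LANES; ++l)
    sum += lane[l];
  return sum;
}
```

Assuming `float` arithmetic is evaluated in IEEE-754 single precision without excess precision, and the file is compiled without `-ffast-math`, the input `{1.0e8f, 1.0f, 1.0f, 1.0f, -1.0e8f, 1.0f, 1.0f, 1.0f}` gives 3 from `sum_ordered` and 6 from `sum_reordered`. In the ordered sum the first three `1.0f` values are absorbed by `1.0e8f`, because adjacent floats near 1e8 are 8 apart. In the lane-wise sum the two large values cancel inside lane 0 first, so no small value is lost.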