[llvm-branch-commits] [llvm] [AMDGPU] Update documentation for wave reduction intrinsics (PR #175132)
Matt Arsenault via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Wed Jan 14 01:52:42 PST 2026
================
@@ -1378,19 +1378,87 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
0: Target default preference,
1: `Iterative strategy`, and
2: `DPP`.
- If target does not support the DPP operations (e.g. gfx6/7),
+ If the target does not support the DPP operations (e.g. gfx6/7),
reduction will be performed using default iterative strategy.
- Intrinsic is currently only implemented for i32.
+ Intrinsic is implemented for i32 and i64 types.
+
+ llvm.amdgcn.wave.reduce.min Similar to `llvm.amdgcn.wave.reduce.umin`, but performs a signed min
+ reduction on signed integers.
+ Intrinsic is implemented for i32 and i64 types.
+
+ llvm.amdgcn.wave.reduce.fmin Similar to `llvm.amdgcn.wave.reduce.umin`, but performs a floating point min
+ reduction on floating point values.
+ Intrinsic is implemented for float and double types.
+ NAN values are canonicalized.
+ However if there are two consecutive NAN values, and the second value is a SNAN,
+ wave_mode IEEE=False propogates the SNAN, while wave_mode IEEE=True quietens it.
----------------
arsenm wrote:
This is an inaccurate way to state this. The issue is signaling NaN quieting. I would just state that the ordering of signaling NaNs is nondeterministic and ignore the issue.
https://github.com/llvm/llvm-project/pull/175132