[llvm-branch-commits] [llvm] [AMDGPU] Update documentation for wave reduction intrinsics (PR #175132)

Matt Arsenault via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Sat Jan 24 11:39:42 PST 2026


================
@@ -1378,19 +1378,85 @@ The AMDGPU backend implements the following LLVM IR intrinsics.
                                                    0: Target default preference,
                                                    1: `Iterative strategy`, and
                                                    2: `DPP`.
-                                                   If target does not support the DPP operations (e.g. gfx6/7),
+                                                   If the target does not support the DPP operations (e.g. gfx6/7),
                                                    reduction will be performed using default iterative strategy.
-                                                   Intrinsic is currently only implemented for i32.
+                                                   Intrinsic is implemented for i32 and i64 types.
+
+  llvm.amdgcn.wave.reduce.min                      Similar to `llvm.amdgcn.wave.reduce.umin`, but performs a signed min
+                                                   reduction on signed integers.
+                                                   Intrinsic is implemented for i32 and i64 types.
+
+  llvm.amdgcn.wave.reduce.fmin                     Similar to `llvm.amdgcn.wave.reduce.umin`, but performs a floating point min
+                                                   reduction on floating point values.
+                                                   Intrinsic is implemented for float and double types.
+                                                   NAN values are not canonicalized.
+                                                   The ordering behaviour of SNANs is non-deterministic.
 
   llvm.amdgcn.wave.reduce.umax                     Performs an arithmetic unsigned max reduction on the unsigned values
                                                    provided by each lane in the wavefront.
                                                    Intrinsic takes a hint for reduction strategy using second operand
                                                    0: Target default preference,
                                                    1: `Iterative strategy`, and
                                                    2: `DPP`.
-                                                   If target does not support the DPP operations (e.g. gfx6/7),
+                                                   If the target does not support the DPP operations (e.g. gfx6/7),
                                                    reduction will be performed using default iterative strategy.
-                                                   Intrinsic is currently only implemented for i32.
+                                                   Intrinsic is implemented for i32 and i64 types.
+
+  llvm.amdgcn.wave.reduce.max                      Similar to `llvm.amdgcn.wave.reduce.umax`, but performs a signed max
+                                                   reduction on signed integers.
+                                                   Intrinsic is implemented for i32 and i64 types.
+
+  llvm.amdgcn.wave.reduce.fmax                     Similar to `llvm.amdgcn.wave.reduce.umax`, but performs a floating point max
+                                                   reduction on floating point values.
+                                                   Intrinsic is implemented for float and double types.
+                                                   NAN values are not canonicalized.
----------------
arsenm wrote:

```suggestion
```

Same as above: reference `llvm.maxnum`.

https://github.com/llvm/llvm-project/pull/175132
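
For readers skimming the hunk, the difference between the unsigned and signed integer reductions can be illustrated with a small host-side model. This is only a sketch of the documented semantics, not the backend's lowering. The function names (`wave_reduce_umin` and so on) and the four-lane "wavefront" are hypothetical, chosen for illustration. The strategy hint operand is left out because it is assumed to affect only how the reduction is performed, not its result.

```python
from functools import reduce

MASK32 = 0xFFFFFFFF  # model each lane as a 32-bit value (the i32 case)


def to_signed32(bits):
    """Reinterpret a 32-bit pattern as a two's-complement signed integer."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def wave_reduce_umin(lanes):
    """Unsigned min over all lane values (hypothetical model)."""
    return reduce(min, (v & MASK32 for v in lanes))


def wave_reduce_umax(lanes):
    """Unsigned max over all lane values (hypothetical model)."""
    return reduce(max, (v & MASK32 for v in lanes))


def wave_reduce_min(lanes):
    """Signed min: compare as signed, return the winning bit pattern."""
    return reduce(min, (to_signed32(v) for v in lanes)) & MASK32


def wave_reduce_max(lanes):
    """Signed max: compare as signed, return the winning bit pattern."""
    return reduce(max, (to_signed32(v) for v in lanes)) & MASK32


# Four illustrative lanes; real wavefronts have 32 or 64 lanes.
lanes = [0x00000005, 0xFFFFFFFE, 0x00000010, 0x80000000]

print(hex(wave_reduce_umin(lanes)))  # unsigned view: 5 is the smallest
print(hex(wave_reduce_min(lanes)))   # signed view: 0x80000000 is INT32_MIN
print(hex(wave_reduce_umax(lanes)))  # unsigned view: 0xfffffffe is the largest
print(hex(wave_reduce_max(lanes)))   # signed view: 0x10 is the largest
```

The same bit patterns give different winners depending on signedness, which is why the documentation lists `umin`/`umax` and `min`/`max` as separate intrinsics. The floating point variants are deliberately not modelled here, since their NaN behaviour is the point under review.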

