[llvm] [RISCV] Support vectorizing FMINIMUMNUM and FMAXIMUMNUM (PR #135727)

Sat Apr 26 21:28:28 PDT 2025

wzssyqa wrote:

> The CodeGen is OK, but I think we need more tests (I think you just need to reorganize the structure):
> 

If I understand your request correctly, I think these tests already exist.

> 1. `@llvm.maximumnum`/`@llvm.minimumnum` with scalable vector types.

For example:
```
define <2 x double> @max_v2f64(<2 x double> %a, <2 x double> %b) {
entry:
  %c = call <2 x double> @llvm.maximumnum.v2f64(<2 x double> %a, <2 x double> %b)
  ret <2 x double> %c
}
```

in `llvm/test/CodeGen/RISCV/rvv/maximumnum-minimumnum.ll`

> 2. `@llvm.maximumnum`/`@llvm.minimumnum` with fixed vector types.
> 

```
define void @fmin32(ptr noundef readonly captures(none) %input1, ptr noundef readonly captures(none) %input2, ptr noundef writeonly captures(none) %output) {
entry:
  %input23 = ptrtoint ptr %input2 to i64
  %input12 = ptrtoint ptr %input1 to i64
  %output1 = ptrtoint ptr %output to i64
  br label %vector.ph

vector.ph:
  %9 = call i64 @llvm.vscale.i64()
  %10 = mul i64 %9, 4
  %n.mod.vf = urem i64 4096, %10
  %n.vec = sub i64 4096, %n.mod.vf
  %11 = call i64 @llvm.vscale.i64()
  %12 = mul i64 %11, 4
  br label %vector.body

vector.body:                                      ; preds = %vector.body, %vector.ph
  %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
  %13 = getelementptr inbounds nuw [4096 x float], ptr %input1, i64 0, i64 %index
  %14 = getelementptr inbounds nuw float, ptr %13, i32 0
  %wide.load = load <vscale x 4 x float>, ptr %14, align 4
  %15 = getelementptr inbounds nuw [4096 x float], ptr %input2, i64 0, i64 %index
  %16 = getelementptr inbounds nuw float, ptr %15, i32 0
  %wide.load5 = load <vscale x 4 x float>, ptr %16, align 4
  %17 = call <vscale x 4 x float> @llvm.minimumnum.nxv4f32(<vscale x 4 x float> %wide.load, <vscale x 4 x float> %wide.load5)
  %18 = getelementptr inbounds nuw [4096 x float], ptr %output, i64 0, i64 %index
  %19 = getelementptr inbounds nuw float, ptr %18, i32 0
  store <vscale x 4 x float> %17, ptr %19, align 4
  %index.next = add nuw i64 %index, %12
  %20 = icmp eq i64 %index.next, %n.vec
  br i1 %20, label %exit, label %vector.body

exit:                                             ; preds = %vector.body
  ret void
}
```

> And also the VP intrinsics version (if exists). Each is in a standalone file.

We haven't defined the VP flavor yet (for any architecture or in common code).
If we add it, we will of course support it on RISC-V.

https://github.com/llvm/llvm-project/pull/135727