[llvm] [RISCV] Support vectorizing FMINIMUMNUM and FMAXIMUMNUM (PR #135727)
Pengcheng Wang via llvm-commits
llvm-commits at lists.llvm.org
Sat Apr 26 23:08:41 PDT 2025
wangpc-pp wrote:
> > The CodeGen is OK, but I think we need more tests (I think you just need to reorganize the structure):
>
> If I understand your request correctly, I think they already exist. See `llvm/test/CodeGen/RISCV/rvv/maximumnum-minimumnum.ll`
>
> > 1. `@llvm.maximumnum`/`@llvm.minimumnum` with scalable vector types.
>
> ```
> define void @fmin32(ptr noundef readonly captures(none) %input1, ptr noundef readonly captures(none) %input2, ptr noundef writeonly captures(none) %output) {
> entry:
>   %input23 = ptrtoint ptr %input2 to i64
>   %input12 = ptrtoint ptr %input1 to i64
>   %output1 = ptrtoint ptr %output to i64
>   br label %vector.ph
>
> vector.ph:
>   %9 = call i64 @llvm.vscale.i64()
>   %10 = mul i64 %9, 4
>   %n.mod.vf = urem i64 4096, %10
>   %n.vec = sub i64 4096, %n.mod.vf
>   %11 = call i64 @llvm.vscale.i64()
>   %12 = mul i64 %11, 4
>   br label %vector.body
>
> vector.body: ; preds = %vector.body, %vector.ph
>   %index = phi i64 [ 0, %vector.ph ], [ %index.next, %vector.body ]
>   %13 = getelementptr inbounds nuw [4096 x float], ptr %input1, i64 0, i64 %index
>   %14 = getelementptr inbounds nuw float, ptr %13, i32 0
>   %wide.load = load <vscale x 4 x float>, ptr %14, align 4
>   %15 = getelementptr inbounds nuw [4096 x float], ptr %input2, i64 0, i64 %index
>   %16 = getelementptr inbounds nuw float, ptr %15, i32 0
>   %wide.load5 = load <vscale x 4 x float>, ptr %16, align 4
>   %17 = call <vscale x 4 x float> @llvm.minimumnum.nxv4f32(<vscale x 4 x float> %wide.load, <vscale x 4 x float> %wide.load5)
>   %18 = getelementptr inbounds nuw [4096 x float], ptr %output, i64 0, i64 %index
>   %19 = getelementptr inbounds nuw float, ptr %18, i32 0
>   store <vscale x 4 x float> %17, ptr %19, align 4
>   %index.next = add nuw i64 %index, %12
>   %20 = icmp eq i64 %index.next, %n.vec
>   br i1 %20, label %exit, label %vector.body
>
> exit: ; preds = %vector.body
>   ret void
> }
> ```
>
> > 2. `@llvm.maximumnum`/`@llvm.minimumnum` with fixed vector types.
>
> ```
> define <2 x double> @max_v2f64(<2 x double> %a, <2 x double> %b) {
> entry:
>   %c = call <2 x double> @llvm.maximumnum.v2f64(<2 x double> %a, <2 x double> %b)
>   ret <2 x double> %c
> }
> ```
>
> > And also the VP intrinsics version (if exists). Each is in a standalone file.
>
> We haven't defined the VP flavor yet (for all architectures and common code). If we add them, of course we will support RISC-V.
I see them; just put them in standalone files (see how we test other operations). That means you need to add four tests:
1. fmaximumnum-sdnode.ll (scalable vectors)
2. fminimumnum-sdnode.ll (scalable vectors)
3. fixed-vectors-fmaximumnum.ll (fixed vectors)
4. fixed-vectors-fminimumnum.ll (fixed vectors)
Also, for scalable vectors, you don't need to test the whole vectorized loop; just test the intrinsic itself.
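
To illustrate what "just test the intrinsic itself" could look like, here is a minimal sketch of one function for a scalable-vector test such as `fminimumnum-sdnode.ll`. The function name `@vfmin_nxv4f32_vv` and the exact `RUN` line flags are assumptions for illustration, not taken from the PR. The `CHECK` lines are left out on purpose, since they would normally be generated with `utils/update_llc_test_checks.py` rather than written by hand.

```llvm
; Assumed RUN line; the real test may use different -mattr features or more RUN lines.
; RUN: llc -mtriple=riscv64 -mattr=+v -verify-machineinstrs < %s | FileCheck %s

; Hypothetical function name. It calls the intrinsic directly on scalable
; vector arguments, with no surrounding vectorized loop.
define <vscale x 4 x float> @vfmin_nxv4f32_vv(<vscale x 4 x float> %a, <vscale x 4 x float> %b) {
  %v = call <vscale x 4 x float> @llvm.minimumnum.nxv4f32(<vscale x 4 x float> %a, <vscale x 4 x float> %b)
  ret <vscale x 4 x float> %v
}

declare <vscale x 4 x float> @llvm.minimumnum.nxv4f32(<vscale x 4 x float>, <vscale x 4 x float>)
```

The same shape would presumably be repeated for other element types and LMULs, and mirrored with `@llvm.maximumnum` in `fmaximumnum-sdnode.ll`.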
https://github.com/llvm/llvm-project/pull/135727