[llvm] [RISCV] Lower VP_SELECT constant false to use vmerge.vxm/vmerge.vim (PR #144461)
Min-Yih Hsu via llvm-commits
llvm-commits at lists.llvm.org
Wed Jun 18 09:14:52 PDT 2025
================
@@ -34,10 +34,10 @@ define <vscale x 1 x i8> @masked_load_passthru_nxv1i8(ptr %a, <vscale x 1 x i1>
; ZVE32: # %bb.0:
; ZVE32-NEXT: csrr a1, vlenb
; ZVE32-NEXT: srli a1, a1, 3
-; ZVE32-NEXT: vsetvli a2, zero, e8, mf4, ta, ma
-; ZVE32-NEXT: vmv.v.i v8, 0
----------------
mshockwave wrote:
What I meant was that, ideally, instead of the existing
```
vsetvli a2, zero, e8, mf4, ta, ma
vmv.v.i v8, 0
vsetvli zero, a1, e8, mf4, ta, mu
vle8.v v8, (a0), v0.t
```
we can have
```
vsetvli zero, a1, e8, mf4, ta, mu
vmv.v.i v8, 0
vle8.v v8, (a0), v0.t
```
In other words, use the shorter VL for both instructions instead of VLMAX on the splat. I brought up ma/mu to justify that it is correct for `vmv.v.i` to run under `vsetvli zero, a1, e8, mf4, ta, mu`: aside from VL, the mask policy is the only difference between the two vsetvli instructions, and since `vmv.v.i` is unmasked, the mask policy does not affect its result.
Based on this, I think
```
vsetvli zero, a1, e8, mf4, ta, mu
vmv.v.i v8, 0
vle8.v v8, (a0), v0.t
```
will be faster than your proposed
```
vsetvli zero, a1, e8, mf4, ta, ma
vle8.v v8, (a0), v0.t
vmnot.m v0, v0
vmerge.vim v8, v8, 0, v0
```
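To make the equivalence argument concrete, here is a rough Python model of the three sequences. It is only a sketch: `VLMAX`, `AGNOSTIC`, and all function names are made up for illustration, agnostic elements are modeled as all 1s (one of the behaviors the spec allows), and it says nothing about which sequence is faster. It only checks that the first VL elements agree.

```python
# Toy model of the RVV semantics discussed above. Not LLVM code; every name
# here is hypothetical and exists only for illustration.
VLMAX = 8
AGNOSTIC = 0xFF  # agnostic elements may keep their value or become all 1s; we pick all 1s


def vmv_v_i(vd, imm, vl):
    """Unmasked splat: body elements [0, vl) get imm, tail (ta) is agnostic."""
    return [imm if i < vl else AGNOSTIC for i in range(len(vd))]


def vle8_masked(vd, mem, mask, vl, mask_undisturbed):
    """Masked unit-stride load (v0.t) under either mu or ma, with ta."""
    out = []
    for i in range(len(vd)):
        if i >= vl:
            out.append(AGNOSTIC)   # tail agnostic
        elif mask[i]:
            out.append(mem[i])     # active element: loaded from memory
        elif mask_undisturbed:
            out.append(vd[i])      # mu: keep the old (passthru) value
        else:
            out.append(AGNOSTIC)   # ma: masked-off element is agnostic
    return out


def vmnot_m(mask, vl):
    """Invert the mask for the body; mask tails are always agnostic."""
    return [(not m) if i < vl else True for i, m in enumerate(mask)]


def vmerge_vim(vs2, imm, mask, vl):
    """vd[i] = mask[i] ? imm : vs2[i] for the body, tail agnostic."""
    return [(imm if mask[i] else vs2[i]) if i < vl else AGNOSTIC
            for i in range(len(vs2))]


def existing_sequence(mem, mask, vl):
    v8 = [0xAA] * VLMAX                      # arbitrary old register contents
    v8 = vmv_v_i(v8, 0, VLMAX)               # splat at VLMAX (ta, ma)
    v8 = vle8_masked(v8, mem, mask, vl, mask_undisturbed=True)
    return v8[:vl]


def shorter_vl_sequence(mem, mask, vl):
    v8 = [0xAA] * VLMAX
    v8 = vmv_v_i(v8, 0, vl)                  # splat under the load's vsetvli (ta, mu)
    v8 = vle8_masked(v8, mem, mask, vl, mask_undisturbed=True)
    return v8[:vl]


def vmerge_sequence(mem, mask, vl):
    v8 = [0xAA] * VLMAX
    v8 = vle8_masked(v8, mem, mask, vl, mask_undisturbed=False)
    v0 = vmnot_m(mask, vl)
    v8 = vmerge_vim(v8, 0, v0, vl)           # zero the originally masked-off lanes
    return v8[:vl]


MEM = [10, 11, 12, 13, 14, 15, 16, 17]
MASK = [True, False, True, True, False, True, False, True]

if __name__ == "__main__":
    for seq in (existing_sequence, shorter_vl_sequence, vmerge_sequence):
        print(seq.__name__, seq(MEM, MASK, 5))
```

In this model all three sequences produce `[10, 0, 12, 13, 0]` for a VL of 5, so they differ only in instruction count and in what ends up in the tail.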
https://github.com/llvm/llvm-project/pull/144461