[llvm] [RISCV] Lower VP_SELECT constant false to use vmerge.vxm/vmerge.vim (PR #144461)

Wed Jun 18 09:14:52 PDT 2025

================
@@ -34,10 +34,10 @@ define <vscale x 1 x i8> @masked_load_passthru_nxv1i8(ptr %a, <vscale x 1 x i1>
 ; ZVE32:       # %bb.0:
 ; ZVE32-NEXT:    csrr a1, vlenb
 ; ZVE32-NEXT:    srli a1, a1, 3
-; ZVE32-NEXT:    vsetvli a2, zero, e8, mf4, ta, ma
-; ZVE32-NEXT:    vmv.v.i v8, 0
----------------
mshockwave wrote:

What I meant was that, ideally, instead of the existing sequence
```
vsetvli a2, zero, e8, mf4, ta, ma
vmv.v.i v8, 0
vsetvli zero, a1, e8, mf4, ta, mu
vle8.v v8, (a0), v0.t
```
we can have
```
vsetvli zero, a1, e8, mf4, ta, mu
vmv.v.i v8, 0
vle8.v v8, (a0), v0.t
```
In other words, both instructions would use the shorter VL instead of the splat running at VLMAX. I brought up ma/mu to justify that it is correct for `vmv.v.i` to run under `vsetvli zero, a1, e8, mf4, ta, mu`: aside from the VL, the mask policy is the only difference between those two vsetvli, and `vmv.v.i` is unmasked, so the mask policy does not affect it.

Based on this, I think
```
vsetvli zero, a1, e8, mf4, ta, mu
vmv.v.i v8, 0
vle8.v v8, (a0), v0.t
```
will be faster than your proposed sequence
```
vsetvli zero, a1, e8, mf4, ta, ma
vle8.v v8, (a0), v0.t
vmnot.m v0, v0
vmerge.vim v8, v8, 0, v0
```
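
To sanity-check that the two sequences agree on the body elements (indices below VL), here is a rough Python model. It is only a sketch under stated assumptions: the helper names (`vmv_v_i`, `vle8_masked`, `vmerge_vim`, `splat_then_load`, `load_then_merge`) are made up for illustration, mask-agnostic lanes are modeled as all ones (one of the outcomes the policy allows), and tail elements are simply left unchanged and not compared.

```python
ALL_ONES = 0xFF  # an e8 element with every bit set

def vmv_v_i(vd, imm, vl):
    # Unmasked splat: writes every body element [0, vl), regardless of ma/mu.
    return [imm if i < vl else vd[i] for i in range(len(vd))]

def vle8_masked(vd, mem, mask, vl, mask_undisturbed):
    # Masked unit-stride load (v0.t). Inactive body lanes keep their old
    # value under mu; under ma this model overwrites them with all ones.
    out = list(vd)
    for i in range(vl):
        if mask[i]:
            out[i] = mem[i]
        elif not mask_undisturbed:
            out[i] = ALL_ONES
    return out

def vmerge_vim(vs2, imm, mask, vl):
    # vmerge.vim vd, vs2, imm, v0: vd[i] = v0[i] ? imm : vs2[i]
    out = list(vs2)
    for i in range(vl):
        out[i] = imm if mask[i] else vs2[i]
    return out

def splat_then_load(mem, mask, vl, vlmax):
    # vsetvli zero, a1, e8, mf4, ta, mu; vmv.v.i v8, 0; vle8.v v8, (a0), v0.t
    v8 = [0xAA] * vlmax  # arbitrary prior register contents
    v8 = vmv_v_i(v8, 0, vl)
    return vle8_masked(v8, mem, mask, vl, mask_undisturbed=True)

def load_then_merge(mem, mask, vl, vlmax):
    # vsetvli zero, a1, e8, mf4, ta, ma; vle8.v; vmnot.m v0, v0; vmerge.vim
    v8 = [0xAA] * vlmax  # arbitrary prior register contents
    v8 = vle8_masked(v8, mem, mask, vl, mask_undisturbed=False)
    inverted = [not m for m in mask]  # vmnot.m v0, v0
    return vmerge_vim(v8, 0, inverted, vl)

mem = [1, 2, 3, 4, 5, 6, 7, 8]
mask = [True, False, True, False, True, False, True, False]
vl, vlmax = 5, 8

a = splat_then_load(mem, mask, vl, vlmax)
b = load_then_merge(mem, mask, vl, vlmax)
print(a[:vl], b[:vl])
```

Both sequences produce the loaded value in active lanes and 0 in inactive lanes for indices below VL; the model says nothing about performance, only about the result.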

https://github.com/llvm/llvm-project/pull/144461