[llvm] [RISCV] Lower unmasked zero-stride vp.stride to a splat of one scalar load. (PR #97394)
Pengcheng Wang via llvm-commits
llvm-commits at lists.llvm.org
Wed Jul 3 03:55:37 PDT 2024
wangpc-pp wrote:
> > We may use `-1` to represent VLMAX?
>
> This doesn't get picked up on RV64 but it does on RV32: I think it's because the EVL argument is 32 bits and we check against a 64 bit sentinel value. Probably something we should fix?
Agreed, I ran into the same problem in my quick prototype. We need to fix it.
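To make the mismatch concrete, here is a minimal sketch of the comparison as I understand it. It assumes the sentinel is an all-ones XLEN-wide constant and that the i32 EVL is zero-extended into a 64-bit register, as the `li a0, -1` / `srli a0, a0, 32` pair in the quoted output suggests. The names `kVLMaxSentinel64`, `zeroExtendEVL`, `signExtendEVL` and `isVLMaxSentinel` are hypothetical and are not taken from the LLVM sources.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: VLMAX is represented by an all-ones 64-bit value on RV64.
constexpr uint64_t kVLMaxSentinel64 = ~uint64_t{0};

// Models zero-extending an i32 EVL into a 64-bit register.
constexpr uint64_t zeroExtendEVL(int32_t evl) {
  return static_cast<uint64_t>(static_cast<uint32_t>(evl));
}

// Models sign-extending the same i32 EVL instead.
constexpr uint64_t signExtendEVL(int32_t evl) {
  return static_cast<uint64_t>(static_cast<int64_t>(evl));
}

// Hypothetical check against the 64-bit sentinel.
constexpr bool isVLMaxSentinel(uint64_t avl) {
  return avl == kVLMaxSentinel64;
}

// Zero-extended -1 is 0x00000000FFFFFFFF, which misses the sentinel.
static_assert(zeroExtendEVL(-1) == 0x00000000FFFFFFFFull,
              "upper 32 bits are cleared");
static_assert(!isVLMaxSentinel(zeroExtendEVL(-1)),
              "zero-extended -1 is not recognized as VLMAX");
// Sign-extended -1 keeps all 64 bits set and would match.
static_assert(isVLMaxSentinel(signExtendEVL(-1)),
              "sign-extended -1 matches the sentinel");
```

On RV32 the EVL already fills the whole register, so the same comparison would succeed there, which is consistent with the behaviour described in the quote.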
>
> ```llvm
> define <vscale x 2 x i32> @f(<vscale x 2 x i32> %x, <vscale x 2 x i32> %y, <vscale x 2 x i1> %mask) {
>   %z = call <vscale x 2 x i32> @llvm.vp.add.nxv2i32(<vscale x 2 x i32> %x, <vscale x 2 x i32> %y, <vscale x 2 x i1> %mask, i32 -1)
>   ret <vscale x 2 x i32> %z
> }
> ```
>
> ```
> f:
>   li a0, -1
>   srli a0, a0, 32
>   vsetvli zero, a0, e32, m1, ta, ma
>   vadd.vv v8, v8, v9, v0.t
>   ret
> ```
>
> > We need to add a passthru operand to vp.strided.load I think.
>
> We should be able to emulate the passthru with vp.merge:
>
> ```llvm
> define <vscale x 2 x i32> @vpmerge_vpload(<vscale x 2 x i32> %passthru, ptr %p, <vscale x 2 x i1> %m, i32 zeroext %vl) {
> ; CHECK-LABEL: vpmerge_vpload:
> ; CHECK: # %bb.0:
> ; CHECK-NEXT: vsetvli zero, a1, e32, m1, tu, mu
> ; CHECK-NEXT: vle32.v v8, (a0), v0.t
> ; CHECK-NEXT: ret
> %a = call <vscale x 2 x i32> @llvm.vp.load.nxv2i32.p0(ptr %p, <vscale x 2 x i1> splat (i1 -1), i32 %vl)
> %b = call <vscale x 2 x i32> @llvm.vp.merge.nxv2i32(<vscale x 2 x i1> %m, <vscale x 2 x i32> %a, <vscale x 2 x i32> %passthru, i32 %vl)
> ret <vscale x 2 x i32> %b
> }
> ```
Yeah! That's feasible!
But is there a reason why vp.strided.load doesn't have a passthru operand? Wouldn't it be more straightforward to add one? I don't know the history, but these intrinsics (gather/scatter vs. vp.strided.load/store) seem inconsistent.
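For reference, the lane-wise behaviour that the vp.merge emulation relies on can be modeled roughly as follows. This is only a sketch on fixed-length `std::vector`s, not how the backend implements it. `vpMergeModel` and `loadWithPassthruModel` are hypothetical names, and the `0` placeholder stands in for lanes at or past the EVL, which the real unmasked load leaves unspecified.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Lane-wise model of llvm.vp.merge: a lane below the pivot (EVL) whose mask
// bit is true comes from onTrue; every other lane comes from onFalse.
std::vector<int32_t> vpMergeModel(const std::vector<bool> &mask,
                                  const std::vector<int32_t> &onTrue,
                                  const std::vector<int32_t> &onFalse,
                                  std::size_t pivot) {
  std::vector<int32_t> result(onFalse);
  for (std::size_t i = 0; i < result.size() && i < pivot; ++i)
    if (mask[i])
      result[i] = onTrue[i];
  return result;
}

// Hypothetical helper: an unmasked load of the first `evl` lanes followed by
// the merge. All vectors are assumed to have the same length.
std::vector<int32_t> loadWithPassthruModel(const int32_t *base,
                                           const std::vector<bool> &mask,
                                           const std::vector<int32_t> &passthru,
                                           std::size_t evl) {
  // Placeholder 0 for lanes >= evl; the merge never selects them.
  std::vector<int32_t> loaded(passthru.size(), 0);
  for (std::size_t i = 0; i < loaded.size() && i < evl; ++i)
    loaded[i] = base[i];
  return vpMergeModel(mask, loaded, passthru, evl);
}
```

With memory `{10, 20, 30, 40}`, mask `{true, false, true, true}`, passthru `{1, 2, 3, 4}` and an EVL of 3, this model yields `{10, 2, 30, 4}`: the masked-off lane and the lane past the EVL both keep the passthru value, which matches what the `tu, mu` masked load in the test above produces.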
https://github.com/llvm/llvm-project/pull/97394