[llvm] [VPlan] Add support for in-loop AnyOf reductions (PR #131830)

Luke Lau via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 20 03:29:13 PDT 2025


lukel97 wrote:

> As I mentioned in https://github.com/llvm/llvm-project/issues/120405#issuecomment-2569024111, another possible approach is to widen the type of vp.merge,

> I believe what we talked about doing internally was a vp.zext to i8 then a i8 vp.merge in the loop with a vp.reduce.or after the loop. That avoids putting a vcpop.m in the loop.

I just ran some tests on the BPI-F3, and I think widening to i8 is actually more profitable there than putting a vcpop.m in the loop, e.g.:

```asm
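	vsetvli	a5, zero, e8, m1, ta, ma
	vmv.v.i	v9, 0			# v9 = all-zero i8 vector (false operand of the vmerge)
	vmv.v.i	v11, 0			# v11 = i8 accumulator, must start as all zeros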
loop:
	vsetvli	a5, a7, e32, m1, ta, ma
	vle32.v	v8, (a0)
	sh2add	a0, a5, a0		# advance by a5 e32 elements = a5 * 4 bytes (needs Zba)
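	vmseq.vx	v0, v8, zero		# v0 = mask of elements equal to zero
	vsetvli	zero, zero, e8, mf4, ta, ma
	vmerge.vim	v10, v9, 1, v0		# v10 = v0 ? 1 : 0, i.e. the mask widened to i8
	vor.vv	v11, v11, v10		# OR into the i8 accumulator, no vcpop.m in the loop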
	sub	a7, a7, a5
	bnez	a7, loop
exit:
	vmsne.vi	v10, v11, 0
	vcpop.m	a1, v10
```
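
To make the dataflow easier to compare, here is a rough scalar C++ model of the two strategies. It is only a sketch: the function names and the `vlmax` parameter (standing in for the hardware vector length) are made up for illustration and are not part of the PR or of LLVM.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of the widened approach: keep an i8 accumulator per lane,
// OR into it inside the loop, and reduce once after the loop.
bool anyOfZeroWidened(const std::vector<int32_t> &data, std::size_t vlmax) {
  assert(vlmax > 0 && "need at least one lane");
  std::vector<uint8_t> acc(vlmax, 0); // i8 accumulator, all zeros on entry
  for (std::size_t base = 0; base < data.size(); base += vlmax) {
    // Number of active lanes this iteration (the EVL).
    std::size_t vl = std::min(vlmax, data.size() - base);
    for (std::size_t lane = 0; lane < vl; ++lane) {
      // Compare, then widen the i1 result to i8.
      uint8_t widened = data[base + lane] == 0 ? 1 : 0;
      acc[lane] |= widened; // in-loop OR, no mask popcount
    }
    // Lanes >= vl keep their old value, like a tail-undisturbed vp.merge.
  }
  // Single horizontal OR after the loop (the vp.reduce.or step).
  uint8_t reduced = 0;
  for (uint8_t lane : acc)
    reduced |= lane;
  return reduced != 0;
}

// Hypothetical baseline: inspect the mask in every iteration, which models
// having a vcpop.m inside the loop.
bool anyOfZeroInLoopPopcount(const std::vector<int32_t> &data,
                             std::size_t vlmax) {
  assert(vlmax > 0 && "need at least one lane");
  bool any = false;
  for (std::size_t base = 0; base < data.size(); base += vlmax) {
    std::size_t vl = std::min(vlmax, data.size() - base);
    std::size_t popcount = 0;
    for (std::size_t lane = 0; lane < vl; ++lane)
      popcount += data[base + lane] == 0 ? 1 : 0;
    any = any || popcount != 0; // scalar result is needed every iteration
  }
  return any;
}
```

Both functions return the same answer for any input; the difference is only where the vector-to-scalar step sits, which is the part that was expensive in the loop on the BPI-F3.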

This sounds like an approach all microarchs can agree on. Is anyone at SiFive already working on this? Otherwise I can take a look at it.

https://github.com/llvm/llvm-project/pull/131830

