[llvm] [VPlan] Add support for in-loop AnyOf reductions (PR #131830)
Alexey Bataev via llvm-commits
llvm-commits at lists.llvm.org
Thu Mar 20 04:30:49 PDT 2025
alexey-bataev wrote:
> > As I mentioned in [#120405 (comment)](https://github.com/llvm/llvm-project/issues/120405#issuecomment-2569024111), another possible approach is to widen the type of vp.merge,
>
> > I believe what we talked about doing internally was a vp.zext to i8, then an i8 vp.merge in the loop with a vp.reduce.or after the loop. That avoids putting a vcpop.m in the loop.
>
> I just ran some tests, I think widening to i8 is also more profitable on the BPI-F3 vs vcpop.m, e.g:
>
> ```assembly
> vsetvli a5, zero, e8, m1, ta, ma
> vmv.v.i v9, 0
> loop:
> vsetvli a5, a7, e32, m1, ta, ma
> vle32.v v8, (a0)
> add a0, a0, a5
> vmseq.vx v0, v8, zero
> vsetvli zero, zero, e8, mf4, ta, ma
> vmerge.vim v10, v9, 1, v0
> vor.vv v11, v11, v10
> sub a7, a7, a5
> bnez a7, loop
> exit:
> vmsne.vi v10, v11, 0
> vcpop.m a1, v10
> ```
>
> This sounds like an approach all microarchs can agree on.
+1.
Generally speaking, all such transformations should be cost-based decisions. There should be 3 VPlans: the original, the one with vcpop, and the one with extensions. The cost-based decision should then choose the best plan.
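
To make the selection step concrete, here is a minimal standalone sketch of picking the cheapest of several candidate plans. It is not LLVM code: `CandidatePlan` and `selectCheapest` are hypothetical names, and the per-plan cost is assumed to come from whatever the target cost model reports.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one candidate plan; not an LLVM class.
struct CandidatePlan {
  std::string Name; // e.g. "original", "vcpop", "extend-i8"
  uint64_t Cost;    // assumed to be supplied by a target cost model
};

// Return the cheapest candidate. Ties go to the earliest entry, so listing
// the original plan first keeps it unless a transformed plan is strictly
// cheaper.
const CandidatePlan &selectCheapest(const std::vector<CandidatePlan> &Plans) {
  assert(!Plans.empty() && "need at least one candidate plan");
  const CandidatePlan *Best = &Plans.front();
  for (const CandidatePlan &P : Plans)
    if (P.Cost < Best->Cost)
      Best = &P;
  return *Best;
}
```

Breaking ties toward the first entry is just one possible policy; it means a transformation is only applied when the cost model says it is a strict improvement.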
> Is anyone at SiFive already working on this? Otherwise I can take a look at it.
Go ahead
https://github.com/llvm/llvm-project/pull/131830