[llvm] [VPlan] Don't convert widen recipes to VP intrinsics in EVL transform (PR #126177)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Fri Feb 7 01:14:19 PST 2025
lukel97 wrote:
> I have a question: could you check whether div is handled by VPWidenEVLRecipe? If so, we may need to handle div separately to avoid division-by-zero issues, rather than simply discarding VPWidenEVLRecipe.
I had the same question. I checked beforehand, and it looks like LoopVectorizationLegality is used when the VPlan is constructed to mask off any lanes that would trap, e.g.:
```c
void f(int *x, int *y, int n) {
  for (int i = 0; i < n; i++)
    x[i] /= y[i];
}
```
```
<x1> vector loop: {
vector.body:
EMIT vp<%4> = CANONICAL-INDUCTION ir<0>, vp<%index.next>
EXPLICIT-VECTOR-LENGTH-BASED-IV-PHI vp<%5> = phi ir<0>, vp<%index.evl.next>
EMIT vp<%avl> = sub vp<%3>, vp<%5>
EMIT vp<%6> = EXPLICIT-VECTOR-LENGTH vp<%avl>
vp<%7> = SCALAR-STEPS vp<%5>, ir<1>
CLONE ir<%arrayidx> = getelementptr inbounds nuw ir<%y>, vp<%7>
vp<%8> = vector-pointer ir<%arrayidx>
WIDEN ir<%0> = vp.load vp<%8>, vp<%6>
CLONE ir<%arrayidx2> = getelementptr inbounds nuw ir<%x>, vp<%7>
vp<%9> = vector-pointer ir<%arrayidx2>
WIDEN ir<%1> = vp.load vp<%9>, vp<%6>
WIDEN-INTRINSIC vp<%10> = call llvm.vp.merge(ir<true>, ir<%0>, ir<1>, vp<%6>)
WIDEN ir<%div> = sdiv ir<%1>, vp<%10>
vp<%11> = vector-pointer ir<%arrayidx2>
WIDEN vp.store vp<%11>, ir<%div>, vp<%6>
SCALAR-CAST vp<%12> = zext vp<%6> to i64
EMIT vp<%index.evl.next> = add nuw vp<%12>, vp<%5>
EMIT vp<%index.next> = add nuw vp<%4>, vp<%0>
EMIT branch-on-count vp<%index.next>, vp<%1>
No successors
}
```
The `llvm.vp.merge(ir<true>, ir<%0>, ir<1>, vp<%6>)` gets folded away by RISCVVectorPeephole, so it ends up generating:
```asm
.LBB0_5: # %vector.body
# =>This Inner Loop Header: Depth=1
sub t0, a2, a3
sh2add a6, a3, a1
sh2add a4, a3, a0
vsetvli t0, t0, e8, mf2, ta, ma
vmv2r.v v10, v8
vle32.v v12, (a4)
vsetvli zero, zero, e32, m2, tu, ma
vle32.v v10, (a6)
sub a5, a5, a7
vsetvli zero, zero, e32, m2, ta, ma
vdiv.vv v10, v12, v10
vse32.v v10, (a4)
add a3, a3, t0
bnez a5, .LBB0_5
```
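As a rough scalar model of what the `llvm.vp.merge` in the plan above does for the divisor, here is a minimal sketch in C. It is not LLVM code: `VF`, `safe_divisor` and `div_step` are hypothetical names made up for illustration, and the fixed lane count of 4 is an assumption. The idea is that lanes at or past the EVL get a divisor of 1, so the unpredicated `sdiv` can run on every lane without trapping, while only the first EVL lanes are stored back.

```c
#include <stddef.h>

/* Hypothetical lane count for the model; a real VF depends on the target. */
enum { VF = 4 };

/* Scalar model of llvm.vp.merge(mask, divisor, 1, evl) for one lane:
   lanes at or past evl, or with a false mask, take the constant 1. */
static int safe_divisor(int divisor, int mask, size_t lane, size_t evl) {
    return (mask && lane < evl) ? divisor : 1;
}

/* Models one vector iteration of x[i] /= y[i].
   x and y stand in for VF-lane vector registers; lanes >= evl hold
   arbitrary data (possibly 0) because the vp.load did not define them. */
static void div_step(int *x, const int *y, size_t evl) {
    int quotient[VF];

    /* The division is unpredicated: it runs on all VF lanes. */
    for (size_t lane = 0; lane < VF; lane++) {
        int d = safe_divisor(y[lane], /*mask=*/1, lane, evl);
        quotient[lane] = x[lane] / d;
    }

    /* Like vp.store, only the first evl lanes are written back. */
    for (size_t lane = 0; lane < evl && lane < VF; lane++)
        x[lane] = quotient[lane];
}
```

With `x = {8, 9, 7, 7}`, `y = {2, 3, 0, 0}` and an EVL of 2, the zero divisors in the two inactive lanes are replaced by 1 before dividing, and those lanes of `x` are left untouched.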
https://github.com/llvm/llvm-project/pull/126177