[llvm] [VPlan] Don't convert widen recipes to VP intrinsics in EVL transform (PR #126177)

Luke Lau via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 7 01:14:19 PST 2025


lukel97 wrote:

> I have a question: could you check whether div is handled by VPWidenEVLRecipe? If so, we may need to handle div separately to avoid division-by-zero issues, rather than simply discarding VPWidenEVLRecipe.

I had the same question. I checked beforehand, and it looks like LoopVectorizationLegality is used when the VPlan is constructed to mask off any lanes that would trap, e.g.:

```c
void f(int *x, int *y, int n) {
  for (int i = 0; i < n; i++)
    x[i] /= y[i];
}
```

```
<x1> vector loop: {
  vector.body:
    EMIT vp<%4> = CANONICAL-INDUCTION ir<0>, vp<%index.next>
    EXPLICIT-VECTOR-LENGTH-BASED-IV-PHI vp<%5> = phi ir<0>, vp<%index.evl.next>
    EMIT vp<%avl> = sub vp<%3>, vp<%5>
    EMIT vp<%6> = EXPLICIT-VECTOR-LENGTH vp<%avl>
    vp<%7> = SCALAR-STEPS vp<%5>, ir<1>
    CLONE ir<%arrayidx> = getelementptr inbounds nuw ir<%y>, vp<%7>
    vp<%8> = vector-pointer ir<%arrayidx>
    WIDEN ir<%0> = vp.load vp<%8>, vp<%6>
    CLONE ir<%arrayidx2> = getelementptr inbounds nuw ir<%x>, vp<%7>
    vp<%9> = vector-pointer ir<%arrayidx2>
    WIDEN ir<%1> = vp.load vp<%9>, vp<%6>
    WIDEN-INTRINSIC vp<%10> = call llvm.vp.merge(ir<true>, ir<%0>, ir<1>, vp<%6>)
    WIDEN ir<%div> = sdiv ir<%1>, vp<%10>
    vp<%11> = vector-pointer ir<%arrayidx2>
    WIDEN vp.store vp<%11>, ir<%div>, vp<%6>
    SCALAR-CAST vp<%12> = zext vp<%6> to i64
    EMIT vp<%index.evl.next> = add nuw vp<%12>, vp<%5>
    EMIT vp<%index.next> = add nuw vp<%4>, vp<%0>
    EMIT branch-on-count vp<%index.next>, vp<%1>
  No successors
}
```
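To make the role of the `llvm.vp.merge` in the dump above concrete, here is a rough scalar model in C of one trip through the vector body. It is only a sketch: the lane count `VF`, the helper names `vp_merge_all_true` and `div_step`, and the choice to model poison tail lanes as 0 are all assumptions made for illustration, not anything taken from the vectorizer. The point it shows is that lanes at or beyond the EVL get a divisor of 1, so the full-width `sdiv` cannot trap on them.

```c
#include <assert.h>
#include <stddef.h>

#define VF 4 /* hypothetical lane count, chosen only for illustration */

/* Scalar model of llvm.vp.merge with an all-true mask: lanes below evl
   take on_true, lanes at or above evl take the splatted on_false value. */
static void vp_merge_all_true(int *out, const int *on_true,
                              int on_false_splat, size_t evl) {
  for (size_t lane = 0; lane < VF; lane++)
    out[lane] = lane < evl ? on_true[lane] : on_false_splat;
}

/* Hypothetical model of one vector.body iteration for x[i] /= y[i]. */
static void div_step(int *x, const int *y, size_t evl) {
  int dividend[VF] = {0}; /* tail lanes are poison in real IR; 0 here */
  int divisor[VF] = {0};  /* deliberately 0 in the tail to show the hazard */
  int safe[VF];
  int quot[VF];

  /* vp.load: only the first evl lanes are read from memory. */
  for (size_t lane = 0; lane < evl; lane++) {
    dividend[lane] = x[lane];
    divisor[lane] = y[lane];
  }

  /* vp.merge(true, divisor, 1, evl): force the tail divisors to 1. */
  vp_merge_all_true(safe, divisor, 1, evl);

  /* Plain (non-VP) sdiv over every lane, including the tail. */
  for (size_t lane = 0; lane < VF; lane++)
    quot[lane] = dividend[lane] / safe[lane];

  /* vp.store: only the first evl lanes are written back. */
  for (size_t lane = 0; lane < evl; lane++)
    x[lane] = quot[lane];
}
```

Calling `div_step` with an EVL of 3 on four-element arrays divides the first three elements and leaves the fourth untouched, even when the fourth divisor in memory is 0. A zero divisor in an active lane is still a problem in this model, just as it is in the original scalar loop.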

The `llvm.vp.merge(ir<true>, ir<%0>, ir<1>, vp<%6>)` gets folded away by RISCVVectorPeephole, so it ends up generating:

```asm
.LBB0_5:                                # %vector.body
                                        # =>This Inner Loop Header: Depth=1
        sub     t0, a2, a3
        sh2add  a6, a3, a1
        sh2add  a4, a3, a0
        vsetvli t0, t0, e8, mf2, ta, ma
        vmv2r.v v10, v8
        vle32.v v12, (a4)
        vsetvli zero, zero, e32, m2, tu, ma
        vle32.v v10, (a6)
        sub     a5, a5, a7
        vsetvli zero, zero, e32, m2, ta, ma
        vdiv.vv v10, v12, v10
        vse32.v v10, (a4)
        add     a3, a3, t0
        bnez    a5, .LBB0_5
```

https://github.com/llvm/llvm-project/pull/126177

