[llvm] [VPlan] Convert EVL loops to variable-length stepping after dissolution (PR #147222)
Luke Lau via llvm-commits
llvm-commits at lists.llvm.org
Mon Jul 21 03:08:00 PDT 2025
https://github.com/lukel97 approved this pull request.
LGTM with the nits I posted earlier + Florian's comments. I tested this on TSVC too and most loops are unchanged, but there are a couple of places where we actually end up improving the loop, e.g. in Reductions-dbl:
```diff
+ sub s1, s11, a6
+ sh2add a4, a6, s2
+ slli a0, a6, 10
.LBB14_7: # %vector.body
# Parent Loop BB14_3 Depth=1
# Parent Loop BB14_5 Depth=2
# => This Inner Loop Header: Depth=3
- add a0, a5, a2
+ sub a3, s1, a2
add a1, s0, a2
- add s1, a3, s6
- flw fa5, 0(a6)
- sub a0, s8, a0
- sh2add a1, a1, s3
- vsetvli a0, a0, e32, m2, ta, ma
- add s1, s1, a1
- vle32.v v8, (s1)
- vle32.v v10, (a1)
- sub a4, a4, s2
- vfnmsac.vf v10, fa5, v8
- vse32.v v10, (a1)
- add a2, a2, a0
- bnez a4, .LBB14_7
+ add a5, a0, s6
+ flw fa5, 0(a4)
+ vsetvli a3, a3, e32, m2, ta, ma
+ sh2add a1, a1, s2
+ add a5, a5, a1
+ vle32.v v8, (a1)
+ vle32.v v10, (a5)
+ vfnmsac.vf v8, fa5, v10
+ add a2, a2, a3
+ vse32.v v8, (a1)
+ bne a2, s1, .LBB14_7
j .LBB14_4
```
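In the diff above, the old loop keeps a separate counter (`sub a4, a4, s2` / `bnez a4`) alongside the EVL-incremented `a2`, while the new loop increments `a2` by the `vsetvli` result and exits on `bne a2, s1`. A rough C++ model of those two loop shapes is sketched below. It is only an illustration, not the VPlan implementation: `getEVL` is a hypothetical stand-in for `vsetvli`, simplified to `min(remaining, vf)`, and the function names `fixedStep` and `variableStep` are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for vsetvli. Assumption: it returns
// min(remaining, vf); real hardware is allowed more freedom.
static std::size_t getEVL(std::size_t remaining, std::size_t vf) {
  return std::min(remaining, vf);
}

// Fixed-step shape: a canonical IV advances by vf up to the trip count
// rounded up to a multiple of vf, and a second IV tracks how many
// elements have actually been processed. Assumes vf > 0.
std::vector<std::size_t> fixedStep(std::size_t n, std::size_t vf) {
  std::vector<std::size_t> evls;
  std::size_t vecTripCount = (n + vf - 1) / vf * vf;
  std::size_t evlIV = 0;
  for (std::size_t canonIV = 0; canonIV != vecTripCount; canonIV += vf) {
    std::size_t evl = getEVL(n - evlIV, vf);
    evls.push_back(evl); // one vector iteration covering evl elements
    evlIV += evl;
  }
  return evls;
}

// Variable-length stepping shape: only the EVL-based IV remains, and the
// loop exits once it reaches the original trip count. Assumes vf > 0.
std::vector<std::size_t> variableStep(std::size_t n, std::size_t vf) {
  std::vector<std::size_t> evls;
  for (std::size_t evlIV = 0; evlIV != n;) {
    std::size_t evl = getEVL(n - evlIV, vf);
    evls.push_back(evl);
    evlIV += evl;
  }
  return evls;
}
```

Under the `min(remaining, vf)` assumption both functions visit the same sequence of lengths, so the second form simply drops one induction variable from the loop body.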
https://github.com/llvm/llvm-project/pull/147222