[llvm] [AArch64][SVE] Fold zero-extend into add reduction. (PR #102325)
David Green via llvm-commits
llvm-commits at lists.llvm.org
Fri Aug 9 01:47:52 PDT 2024
davemgreen wrote:
Thanks for this. All the added tests look good to me, but could this turn `i64 vecreduce(sext(v4i32))` into SVE `saddv d0, p0, z0.s`, as opposed to NEON `saddlv d0, v0.4s`? Can we get it to not do that one?
```
define i64 @add_v4i32_v4i64_sext(<4 x i32> %x) {
; CHECK-LABEL: add_v4i32_v4i64_sext:
; CHECK: // %bb.0: // %entry
; CHECK-NEXT: saddlv d0, v0.4s
; CHECK-NEXT: fmov x0, d0
; CHECK-NEXT: ret
entry:
  %xx = sext <4 x i32> %x to <4 x i64>
  %z = call i64 @llvm.vector.reduce.add.v4i64(<4 x i64> %xx)
  ret i64 %z
}
```
It might be worth running the examples with `bin/llc -mtriple aarch64-none-eabi -mattr=+sve2 ../llvm/test/CodeGen/AArch64/vecreduce-add.ll -o -` and comparing the output to what it was before this patch. There are quite a few examples in that test file now.
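One way to do that comparison is sketched below: a small Python helper that takes the assembly text from before and after the patch and lists any SVE `saddv`/`uaddv` reductions that newly appear. The function names, the regular expression, and the sample strings are assumptions made for illustration, not part of the patch or of any LLVM tooling, and the sample strings are not real `llc` output.

```python
import difflib
import re

# Assumption: an SVE add reduction looks like "saddv d0, p0, z0.s", i.e. a
# D-register destination, a governing predicate and a Z-register source.
SVE_REDUCTION = re.compile(r"\b[su]addv\s+d\d+,\s*p\d+,\s*z\d+\.[bhsd]\b")


def sve_reductions(asm: str) -> list[str]:
    """Return the stripped lines of `asm` that contain an SVE saddv/uaddv."""
    return [line.strip() for line in asm.splitlines() if SVE_REDUCTION.search(line)]


def new_sve_reductions(before: str, after: str) -> list[str]:
    """Return SVE reduction lines present in `after` but absent from `before`."""
    old = set(sve_reductions(before))
    return [line for line in sve_reductions(after) if line not in old]


def asm_diff(before: str, after: str) -> str:
    """Return a unified diff of the two assembly listings for manual review."""
    return "\n".join(
        difflib.unified_diff(
            before.splitlines(), after.splitlines(), "before", "after", lineterm=""
        )
    )


if __name__ == "__main__":
    # Illustrative inputs only; in practice these would be the saved output of
    # the llc command above from builds without and with the patch.
    before = "saddlv d0, v0.4s\nfmov x0, d0\nret"
    after = "ptrue p0.s, vl4\nsaddv d0, p0, z0.s\nfmov x0, d0\nret"
    print(new_sve_reductions(before, after))
    print(asm_diff(before, after))
```

Any line this reports is only a candidate for review; whether the SVE form is actually worse than the NEON one still needs to be judged per test case.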
https://github.com/llvm/llvm-project/pull/102325