[PATCH] D45336: Apply accumulator to fadd/fmul experimental vector reductions (PR36734)
Gonzalo BG via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Fri Apr 6 09:03:43 PDT 2018
gnzlbg added a comment.
> it should really be the default in both LLVM and source languages except where there's an explicit constraint that forces linear reduction.
Tree reductions are what the Rust portable packed SIMD RFC [0] currently specifies.
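To make the difference between the two reduction orders concrete, here is a minimal sketch in Python (doubles rather than vector lanes). The helper names `linear_reduce` and `tree_reduce` are hypothetical, and the pairing used by the tree version (lane `i` with lane `i + half`) is only one possible tree shape, assumed here for illustration; it is not taken from the RFC or from any particular hardware.

```python
def linear_reduce(xs, acc=0.0):
    """Strictly ordered sum: ((acc + x0) + x1) + ... in increasing index order."""
    for x in xs:
        acc = acc + x
    return acc


def tree_reduce(xs):
    """Pairwise (tree) sum of a power-of-two-length sequence.

    Each step adds the upper half onto the lower half lane by lane,
    halving the number of live lanes until one value remains.
    """
    xs = list(xs)
    # Illustrative restriction: non-empty, power-of-two length.
    assert len(xs) > 0 and len(xs) & (len(xs) - 1) == 0
    while len(xs) > 1:
        half = len(xs) // 2
        xs = [xs[i] + xs[i + half] for i in range(half)]
    return xs[0]


v = [1e16, 1.0, -1e16, 1.0]
print(linear_reduce(v))  # 1.0 (the first 1.0 is absorbed by 1e16)
print(tree_reduce(v))    # 2.0 (the large terms cancel before the small ones are added)
```

The two results differ because floating-point addition is not associative, which is why a tree reduction is only valid where reassociation is allowed.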
@aemerson
> For strictly ordered reductions which are supported on some vector architectures like ARM SVE, then
> the accumulator operand is used when there are no FMF flags on the call.
This makes sense. Reading from the ARM SVE spec [1]:
> Horizontal reductions
>
> These instructions perform arithmetic horizontally across Active elements of a single source vector and deliver a
> scalar result.
>
> The floating-point horizontal accumulating sum instruction, FADDA, operates strictly in order of increasing Element
> number across a vector, using the scalar destination register as a source for the initial value of the accumulator. This
> preserves the original program evaluation order where non-associativity is required.
> The other floating-point reductions calculate their result using a recursive pair-wise algorithm that does not preserve
> the original program order, but permits increased parallelism for code that does not require strict order of evaluation.
The accumulator really does make sense for `FADDA`: one can iterate over a large region of memory one vector at a time, horizontally adding each vector into the accumulator, and get a result that preserves strictly ordered arithmetic. That's pretty cool.
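A rough sketch of that loop, in Python for brevity. `fadda_like` is a hypothetical stand-in that models only the ordering behaviour described in the quoted spec text (start from the scalar accumulator, add lanes in increasing element order); the vector length and the helper names are assumptions for illustration, not the actual instruction or intrinsic.

```python
def fadda_like(acc, vec):
    """Model of an ordered horizontal add: acc is both input and result."""
    for lane in vec:  # strictly increasing element number
        acc = acc + lane
    return acc


def ordered_sum_chunked(data, vector_len):
    """Walk memory one vector at a time, threading the accumulator through."""
    acc = 0.0
    for start in range(0, len(data), vector_len):
        acc = fadda_like(acc, data[start:start + vector_len])
    return acc


def ordered_sum_scalar(data):
    """Reference: the original scalar loop, one element at a time."""
    acc = 0.0
    for x in data:
        acc = acc + x
    return acc


data = [1e16, 1.0, -1e16, 1.0, 0.1, 0.2, 0.3]
# Same sequence of additions, so the results are bit-for-bit identical.
assert ordered_sum_chunked(data, 4) == ordered_sum_scalar(data)
```

Because the accumulator carries across iterations, the chunked loop performs exactly the same additions in the same order as the scalar loop, regardless of the vector length chosen.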
[0]: https://github.com/rust-lang/rfcs/pull/2366
[1]: https://static.docs.arm.com/ddi0584/a/DDI0584A_b_SVE_supp_armv8A.pdf
Repository:
rL LLVM
https://reviews.llvm.org/D45336