[PATCH] D45336: Apply accumulator to fadd/fmul experimental vector reductions (PR36734)

Gonzalo BG via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Apr 6 09:03:43 PDT 2018


gnzlbg added a comment.

> it should really be the default in both LLVM and source languages except where there's an explicit constraint that forces linear reduction.

Tree reductions are what the Rust portable packed SIMD RFC [0] currently specifies.

@aemerson

> For strictly ordered reductions which are supported on some vector architectures like ARM SVE, then
>  the accumulator operand is used when there are no FMF flags on the call.

This makes sense. Reading from the ARM SVE spec [1]:

> Horizontal reductions
> 
> These instructions perform arithmetic horizontally across Active elements of a single source vector and deliver a
>  scalar result.
> 
> The floating-point horizontal accumulating sum instruction, FADDA, operates strictly in order of increasing Element
>  number across a vector, using the scalar destination register as a source for the initial value of the accumulator. This
>  preserves the original program evaluation order where non-associativity is required.
>  The other floating-point reductions calculate their result using a recursive pair-wise algorithm that does not preserve
>  the original program order, but permits increased parallelism for code that does not require strict order of evaluation.

The accumulator really does make sense for `FADDA`: one can iterate over a large sequence in memory, folding each loaded vector into the accumulator with an ordered horizontal add, and get a result that preserves the ordered arithmetic of the original scalar loop. That's pretty cool.
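
To make the difference concrete, here is a minimal sketch in plain Python (doubles, not IR or SVE code). All function names are hypothetical, and the pairing used by `tree_reduce_add` (split in halves, recurse) is only an assumption about what a "recursive pair-wise" reduction might look like; the order a real target uses may differ.

```python
def ordered_reduce_add(acc, vec):
    """Strictly ordered sum: ((acc + v[0]) + v[1]) + ... by increasing element index."""
    for x in vec:
        acc = acc + x
    return acc


def tree_reduce_add(vec):
    """Pairwise sum by recursive halving (assumed pairing; len(vec) must be a power of two)."""
    if len(vec) == 1:
        return vec[0]
    half = len(vec) // 2
    return tree_reduce_add(vec[:half]) + tree_reduce_add(vec[half:])


def chunked_ordered_sum(data, width):
    """Walk memory in vector-sized chunks, threading one accumulator through every chunk."""
    acc = 0.0
    for i in range(0, len(data), width):
        acc = ordered_reduce_add(acc, data[i:i + width])
    return acc


def scalar_loop_sum(data):
    """The original scalar program order that the ordered reduction must reproduce."""
    total = 0.0
    for x in data:
        total += x
    return total


# The values are chosen so that floating-point rounding exposes the evaluation order.
v = [1e16, 1.0, -1e16, 1.0]
ordered = ordered_reduce_add(0.0, v)  # ((1e16 + 1) - 1e16) + 1
tree = tree_reduce_add(v)             # (1e16 + 1) + (-1e16 + 1)
```

With these inputs `ordered` and `tree` differ, while `chunked_ordered_sum(data, 4)` performs exactly the same sequence of additions as `scalar_loop_sum(data)`. That is the property the accumulator operand buys for the ordered case; the tree form gives it up in exchange for parallelism.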

[0]: https://github.com/rust-lang/rfcs/pull/2366
[1]: https://static.docs.arm.com/ddi0584/a/DDI0584A_b_SVE_supp_armv8A.pdf


Repository:
  rL LLVM

https://reviews.llvm.org/D45336




