[llvm-dev] [RFC] Changes to llvm.experimental.vector.reduce intrinsics
David Greene via llvm-dev
llvm-dev at lists.llvm.org
Thu Apr 4 13:03:21 PDT 2019
Sander De Smalen <Sander.DeSmalen at arm.com> writes:
> Hi David,
> The reason for the asymmetry and requiring an explicit start-value
> operand is to be able to describe strict reductions that need to
> preserve the same associativity of a scalarized reduction.
> For example:
> %res = call float @llvm.experimental.vector.reduce.fadd(float %start, <4 x float> <float %elt0, float %elt1, float %elt2, float %elt3>)
> describes the following reduction:
> %res = (((%start + %elt0) + %elt1) + %elt2) + %elt3
> whereas:
> %tmp = call float @llvm.experimental.vector.reduce.fadd(<4 x float> <float %elt0, float %elt1, float %elt2, float %elt3>)
> %res = fadd float %start, %tmp
> describes:
> %res = %start + (((%elt0 + %elt1) + %elt2) + %elt3)
> Which is not the same, hence why the start operand is needed in the
> intrinsic itself. For fast-math (specifically the 'reassoc' property)
> the compiler is free to reassociate the expression, so the
> start/accumulator operand isn't needed.
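To make the difference between the two orderings concrete, here is a minimal sketch in Python rather than LLVM IR. The helper names are hypothetical and only model the two expressions quoted above. Python floats are IEEE-754 doubles, not the 32-bit float of the IR example, so the values are chosen to show the rounding effect at double precision.

```python
from functools import reduce


def strict_reduce_fadd(start, elts):
    """Model of the in-order form: (((start + e0) + e1) + e2) + e3."""
    return reduce(lambda acc, x: acc + x, elts, start)


def reduce_then_add(start, elts):
    """Model of the split form: start + (((e0 + e1) + e2) + e3)."""
    return start + reduce(lambda acc, x: acc + x, elts)


# Hypothetical inputs: at 1e16 adjacent doubles are 2 apart, so adding 1.0
# to the accumulator one element at a time is rounded away each time.
start = 1e16
elts = [1.0, 1.0, 1.0, 1.0]

in_order = strict_reduce_fadd(start, elts)
split = reduce_then_add(start, elts)

print(in_order == split)
```

With these inputs the in-order form never moves off the start value, while the split form first sums the elements to 4.0 and then adds that to the start value, so the two results differ.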
Ok, I see. I was assuming the scalar would just be folded into the
vector and then the vector would be reduced, but that could be awkward,
since it would require insertelement/extractelement.
Thanks for explaining.
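One possible reading of "folded into the vector" in the reply above is sketched below in Python: the start value is added to lane 0, and the resulting vector is reduced without a separate start operand. This is only an illustration under stated assumptions. The helper names are hypothetical, list indexing stands in for extractelement/insertelement, and the start-less reduction is assumed to proceed in lane order beginning at lane 0, which the thread does not specify.

```python
from functools import reduce


def strict_reduce_fadd(start, elts):
    """Model of the in-order form: (((start + e0) + e1) + e2) + e3."""
    return reduce(lambda acc, x: acc + x, elts, start)


def fold_start_into_lane0(start, elts):
    """Add start to lane 0, then reduce the vector in lane order."""
    lane0 = elts[0]                             # stands in for extractelement
    folded = [start + lane0] + list(elts[1:])   # stands in for insertelement
    return reduce(lambda acc, x: acc + x, folded)


# Hypothetical inputs, chosen only for illustration.
start = 0.1
elts = [0.2, 0.3, 0.4, 0.5]

folded_result = fold_start_into_lane0(start, elts)
strict_result = strict_reduce_fadd(start, elts)

print(folded_result == strict_result)
```

Under those assumptions both functions perform the same sequence of additions, so the results match. The cost is the extra extract/add/insert step before the reduction, which is the awkwardness the reply refers to.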