[llvm-dev] [RFC] Changes to llvm.experimental.vector.reduce intrinsics
David Greene via llvm-dev
llvm-dev at lists.llvm.org
Thu Apr 4 08:44:32 PDT 2019
Sander De Smalen via llvm-dev <llvm-dev at lists.llvm.org> writes:
> This means that for example:
> %res = call fast float @llvm.experimental.vector.reduce.fadd.f32.v4f32(float undef, <4 x float> %v)
> does not result in %res being 'undef', but rather a reduction of <4 x
> float> %v. The definitions of these intrinsics are different from their
> corresponding SelectionDAG nodes which explicitly split out a
> non-strict VECREDUCE_FADD that explicitly does not take a start-value
> operand, and a VECREDUCE_STRICT_FADD which does.
This seems very strange to me. What was the rationale for ignoring the
first argument? What was the rationale for the first argument existing
at all? Because that's how SVE reductions work? The asymmetry with
llvm.experimental.vector.reduce.add is odd.
> [Option B] Having separate ordered and unordered intrinsics (https://reviews.llvm.org/D60262).
> declare float @llvm.experimental.vector.reduce.ordered.fadd.f32.v4f32(float %start_value, <4 x float> %vec)
> declare float @llvm.experimental.vector.reduce.unordered.fadd.f32.v4f32(<4 x float> %vec)
> This will mean that the behaviour is explicit from the intrinsic and
> the use of 'fast' or 'reassoc' on the call has no effect on how that
> intrinsic is lowered. The ordered reduction intrinsic will take a
> scalar start-value operand, where the unordered reduction intrinsic
> will only take a vector operand.
This seems by far the better solution. I'd much rather have things be
explicit in the IR than implicit via flags that might accidentally get
Again, the asymmetry between these (one with a start value and one
without) seems strange and arbitrary. Why do we need start values at
all? Is it really difficult for isel to match s + vector.reduce(v)?
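To make the start-value question concrete, here is a rough Python model of the two reduction flavours. It is only a sketch: ordered_fadd and unordered_fadd are hypothetical helper names, not LLVM APIs, and the pairwise tree is just one association an unordered reduction might choose. It suggests that s + vector.reduce(v) is an acceptable rewrite for the unordered form, but need not produce the same bits as an ordered reduction that starts from s.

```python
def ordered_fadd(start, vec):
    """Strict left-to-right sum: (((start + v0) + v1) + ...)."""
    acc = start
    for x in vec:
        acc = acc + x
    return acc


def unordered_fadd(vec):
    """One possible reassociated sum: a pairwise tree, no start value.

    Assumes len(vec) is a power of two, as with <4 x float>.
    """
    vals = list(vec)
    while len(vals) > 1:
        vals = [vals[i] + vals[i + 1] for i in range(0, len(vals), 2)]
    return vals[0]


s = 1.0
v = [1e100, -1e100, 0.0, 0.0]

# Ordered: s is absorbed by 1e100 before the cancellation happens.
print(ordered_fadd(s, v))        # 0.0
# Unordered plus a separate scalar add: the vector cancels first, s survives.
print(s + unordered_fadd(v))     # 1.0
```

For well-conditioned inputs the two agree, so the difference only matters where strict ordering is actually required.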