[llvm-dev] [RFC] Changes to llvm.experimental.vector.reduce intrinsics

Simon Pilgrim via llvm-dev llvm-dev at lists.llvm.org
Fri Apr 5 01:37:03 PDT 2019

On 04/04/2019 14:11, Sander De Smalen wrote:
> Proposed change:
> ----------------------------
> In this RFC I propose changing the intrinsics for 
> llvm.experimental.vector.reduce.fadd and 
> llvm.experimental.vector.reduce.fmul (see options A and B). I also 
> propose renaming the 'accumulator' operand to 'start value' because 
> for fmul this is the start value of the reduction, rather than a value 
> into which the fmul reduction is accumulated.
> [Option A] Always using the start value operand in the reduction 
> (https://reviews.llvm.org/D60261)
>   declare float @llvm.experimental.vector.reduce.v2.fadd.f32.v4f32(float %start_value, <4 x float> %vec)
> This means that if the start value is 'undef', the result will be 
> undef and all code creating such a reduction will need to ensure it 
> has a sensible start value (e.g. 0.0 for fadd, 1.0 for fmul). When 
> using 'fast' or 'reassoc' on the call it will be implemented using an 
> unordered reduction, otherwise it will be implemented with an ordered 
> reduction. Note that a new intrinsic is required to capture the new 
> semantics. In this proposal the intrinsic is prefixed with a 'v2' for 
> the time being, with the expectation this will be dropped when we 
> remove 'experimental' from the reduction intrinsics in the future.
> [Option B] Having separate ordered and unordered intrinsics 
> (https://reviews.llvm.org/D60262).
>   declare float @llvm.experimental.vector.reduce.ordered.fadd.f32.v4f32(float %start_value, <4 x float> %vec)
>   declare float @llvm.experimental.vector.reduce.unordered.fadd.f32.v4f32(<4 x float> %vec)
> This will mean that the behaviour is explicit from the intrinsic and 
> the use of 'fast' or 'reassoc' on the call has no effect on how that 
> intrinsic is lowered. The ordered reduction intrinsic will take a 
> scalar start-value operand, whereas the unordered reduction intrinsic 
> will only take a vector operand.
> Both options auto-upgrade the IR to use the new (version of the) 
> intrinsics. I'm personally slightly in favour of [Option B], because 
> it better aligns with the definition of the SelectionDAG nodes and is 
> more explicit in its semantics. We also avoid having to use an 
> artificial 'v2' like prefix to denote the new behaviour of the intrinsic.
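
To make the ordered/unordered distinction in the quoted proposal concrete, here is a minimal Python sketch of the two evaluation orders for an fadd reduction. The function names ordered_fadd and unordered_fadd are hypothetical, Python floats stand in for the lanes, and the pairwise tree is only one of many orders a reassociated reduction could legally use; none of this is how LLVM actually lowers the intrinsics.

```python
def ordered_fadd(start_value, vec):
    # Strict left-to-right order: ((start + v0) + v1) + ...
    acc = start_value
    for x in vec:
        acc = acc + x
    return acc


def unordered_fadd(vec):
    # One of many legal reassociations (an assumption for illustration):
    # a pairwise tree that adds the upper half of the lanes onto the
    # lower half until one lane is left.
    # Assumes len(vec) is a power of two.
    lanes = list(vec)
    while len(lanes) > 1:
        half = len(lanes) // 2
        lanes = [lanes[i] + lanes[i + half] for i in range(half)]
    return lanes[0]


big = 2.0 ** 53  # adding 1.0 to this value is lost to rounding
vec = [big, 1.0, -big, 1.0]

ordered_result = ordered_fadd(0.0, vec)
unordered_result = unordered_fadd(vec)
print(ordered_result, unordered_result)
```

With these lanes the ordered form yields 1.0 and the tree form yields 2.0, which is why the choice between the two has to be visible somewhere, either via fast-math flags (Option A) or via the intrinsic name (Option B).
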
Do we have any targets with instructions that can actually use the start 
value? TBH I'd be tempted to suggest we just make the initial 
extractelement/fadd/insertelement pattern a manual extra stage and avoid 
having that argument entirely.
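
As a rough sketch of what I mean (Python standing in for IR, all function names hypothetical): for an ordered reduction, folding the start value into lane 0 first and then running a reduction with no start operand gives exactly the same sequence of additions.

```python
def ordered_fadd_with_start(start_value, vec):
    # Ordered reduction that takes a start value:
    # ((start + v0) + v1) + ...
    acc = start_value
    for x in vec:
        acc = acc + x
    return acc


def ordered_fadd_no_start(vec):
    # Ordered reduction without a start operand: (v0 + v1) + ...
    acc = vec[0]
    for x in vec[1:]:
        acc = acc + x
    return acc


def fold_start_into_lane0(start_value, vec):
    # Models the manual extra stage:
    #   extractelement lane 0, fadd with the start value,
    #   insertelement the sum back into lane 0.
    out = list(vec)
    out[0] = start_value + out[0]
    return out


vec = [0.1, 0.2, 0.3, 0.4]
start = 10.0

with_start = ordered_fadd_with_start(start, vec)
manual = ordered_fadd_no_start(fold_start_into_lane0(start, vec))
print(with_start == manual)
```

The two results are bitwise identical because the additions happen in the same order; whether that extra stage is cheaper than a start operand on any given target is a separate question.
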

> Further efforts:
> ----------------------------
> Here is a non-exhaustive list of items that I think work towards making 
> the intrinsics non-experimental:

>   * Adding SelectionDAG legalization for the _STRICT reduction
>     SDNodes. After some great work from Nikita in D58015, unordered
>     reductions are now legalized/expanded in SelectionDAG, so if we
>     add expansion in SelectionDAG for strict reductions this would
>     make the ExpandReductionsPass redundant.
>   * Better enforcing the constraints of the intrinsics (see
>     https://reviews.llvm.org/D60260 ).

>   * I think we'll also want to be able to overload the result operand
>     based on the vector element type for the intrinsics having the
>     constraint that the result type must match the vector element
>     type. e.g. dropping the redundant 'i32' in:
>     i32 @llvm.experimental.vector.reduce.and.i32.v4i32(<4 x i32> %a)
>     => i32 @llvm.experimental.vector.reduce.and.v4i32(<4 x i32> %a)
> since i32 is implied by <4 x i32>. This would have the added benefit 
> that LLVM would automatically check for the operands to match.

Won't this cause issues with overflow? Isn't the point of an add (or 
mul...) reduction of, say, <64 x i8> giving a larger (i32 or i64) result 
so we don't lose anything? I agree a wider result doesn't make sense for 
bitop reductions, though.
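
A small Python sketch of the overflow concern, assuming two's-complement wrap-around at the result width. The helpers wrap_signed and reduce_add are hypothetical and only model the arithmetic, not any actual intrinsic or lowering.

```python
def wrap_signed(value, bits):
    # Wrap an integer to a signed two's-complement value of the given width.
    mask = (1 << bits) - 1
    value &= mask
    if value >= (1 << (bits - 1)):
        value -= 1 << bits
    return value


def reduce_add(vec, result_bits):
    # Add reduction whose accumulator has result_bits bits.
    acc = 0
    for x in vec:
        acc = wrap_signed(acc + x, result_bits)
    return acc


vec = [127] * 64  # models a <64 x i8> vector with every lane at its maximum

narrow = reduce_add(vec, 8)   # result type equals the element type
wide = reduce_add(vec, 32)    # result type wider than the element type
print(narrow, wide)
```

With an i8 result the sum wraps to -64, while an i32 result holds the full 8128, so tying the result type to the element type would lose the wider answer for add and mul.
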

