[llvm-dev] RFC: Generic IR reductions

Renato Golin via llvm-dev llvm-dev at lists.llvm.org
Wed Feb 1 06:20:12 PST 2017

```
On 1 February 2017 at 13:06, Demikhovsky, Elena
<elena.demikhovsky at intel.com> wrote:
> Constant propagation:
>
> %sum = add <N x float> %a, %b
> @llvm.reduce(ext <N x double>  %sum)
>
> if %a and %b are vector of constants, the %sum also becomes a vector of constants.
> At this point you have @llvm.reduce(ext <N x double>  %sum) and don't know what kind of reduction do you need.

Well, sum is an add node. If %a and %b have special semantics for the
target, then @reduce is meaningful and this means some type of
summation.

But the more I think of it, the less I think this could actually solve
the semantic issues around reductions... The zeroinit idiom would be
more of a hack than re-using similar concepts in IR.

So, let me take a step back, and assume that, for scalable vectors, we
*have* to use intrinsics throughout. IR simply has no compatible idiom,
and unless we introduce some (like the stepvector), it won't work.

As I said before, adding intrinsics is better than new IR constructs,
so let's also assume this is the less costly way forward for now.

But using existing IR instructions is better than adding intrinsics,
and I'm not sure we want to completely replace what's already there
with intrinsics.

Does AVX512 suffer from any cost in using the current extract/op
idiom? NEON has small vectors, so IR sequences end up being 1~4 ops,
which I don't consider a problem.

Also, the current idiom can cope with ordered/unordered reduction by
interleaving the operations or not:

%sum0 = %vec[0] + %vec[1]
%sum1 = %vec[2] + %vec[3]
%sum = %sum0 + %sum1

or

%sum0 = %vec[0] + %vec[1]
%sum1 = %sum0 + %vec[2]
%sum = %sum1 + %vec[3]

It may not cope with special semantics leading to use of
target-specific instructions, in which case we obviously need
intrinsics. It certainly can't cope with unknown vector sizes either.

cheers,
--renato
```
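
To make the ordered/unordered point in the message concrete, here is a minimal Python sketch of the two association orders shown in the pseudo-IR above. It is an illustration only, not LLVM code: the function names `reduce_pairwise` and `reduce_in_order` and the sample vector are assumptions made up for this example. It shows that, for floating point, the two orders can give different results, which is why the distinction matters.

```python
def reduce_pairwise(vec):
    """Tree-style reduction: (v0 + v1) + (v2 + v3), as in the first sequence.

    Hypothetical helper; assumes a non-empty, power-of-two length.
    """
    vals = list(vec)
    if not vals or len(vals) & (len(vals) - 1):
        raise ValueError("length must be a non-zero power of two")
    while len(vals) > 1:
        # Add adjacent pairs, halving the number of partial sums each round.
        vals = [vals[i] + vals[i + 1] for i in range(0, len(vals), 2)]
    return vals[0]


def reduce_in_order(vec):
    """Strictly ordered reduction: ((v0 + v1) + v2) + v3, as in the second sequence.

    Hypothetical helper; assumes a non-empty input.
    """
    vals = list(vec)
    if not vals:
        raise ValueError("empty vector")
    acc = vals[0]
    for x in vals[1:]:
        acc = acc + x
    return acc


# Sample data chosen so that rounding makes the association order visible.
vec = [1e17, 1.0, -1e17, 1.0]
pairwise = reduce_pairwise(vec)   # both 1.0 terms are absorbed by the large values
in_order = reduce_in_order(vec)   # the last 1.0 is added after the large values cancel
```

With this input the two functions return different values, while for something like `[1.0, 2.0, 3.0, 4.0]` they agree. Neither helper handles an unknown vector length at compile time, which mirrors the limitation noted in the message for scalable vectors.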