[llvm] [LoopVectorizer] Add support for partial reductions (PR #92418)

Thu May 23 07:46:05 PDT 2024

paulwalker-arm wrote:

> It seems a bit weird to me to introduce a new intrinsic that, in the general case, isn't actually a natively supported operation on any target.

I see it more as giving LLVM IR a more powerful representation of reductions than we have today. The current representation effectively demands a specific order in which elements are reduced, and that order is hard to break down (as can be seen with Graham's original patches).

By dissociating input and output types we can make VF decisions that better reflect the input data whilst at the same time expressing that there is no defined ordering for how the inputs are reduced. For AArch64 specifically I'm hoping this goes beyond just dot instructions and allows us to make better use of paired and top-bottom instructions. I'd expect targets that have no special instructions to simply select the output type to match the input and then code generate a standard binop as they do today.
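
To make the "no defined ordering" point concrete, here is a rough scalar model in C++. It is only a sketch, not LLVM IR and not the intrinsic from the patch: the type aliases, the function names and the 16 x i8 to 4 x i32 shape are all assumptions chosen for illustration. It shows two different ways of folding a wide input into a narrower accumulator. Both are acceptable lowerings because only the final total is observable.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <numeric>

// Hypothetical shapes for illustration: a <16 x i8> input folded into a
// <4 x i32> accumulator. Neither alias comes from the patch.
using Input = std::array<std::int8_t, 16>;
using Accum = std::array<std::int32_t, 4>;

// Grouping A: each accumulator lane absorbs four adjacent inputs, which is
// the shape a dot-product style instruction would naturally produce.
Accum partialReduceAdjacent(Accum acc, const Input &in) {
  for (std::size_t i = 0; i < in.size(); ++i)
    acc[i / 4] += in[i];
  return acc;
}

// Grouping B: inputs are striped across the accumulator lanes, closer to
// what repeated widening adds of sub-vectors would produce.
Accum partialReduceStrided(Accum acc, const Input &in) {
  for (std::size_t i = 0; i < in.size(); ++i)
    acc[i % 4] += in[i];
  return acc;
}

// Final horizontal reduction performed once, after the loop.
std::int32_t reduceAll(const Accum &acc) {
  return std::accumulate(acc.begin(), acc.end(), std::int32_t{0});
}
```

The individual lanes differ between the two groupings, but `reduceAll` gives the same result for both, so a target is free to pick whichever grouping matches its instructions. When the accumulator has as many lanes as the input, either grouping degenerates into an ordinary element-wise add, which is the "standard binop" fallback mentioned above.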

Perhaps there's an argument that the new intrinsics can replace the current vector_reduce_ ones, which are another special case in that they have a single-element result.

https://github.com/llvm/llvm-project/pull/92418