[PATCH] D81397: [ARM] Better reductions

Mon Jun 8 08:11:19 PDT 2020

dmgreen created this revision.
dmgreen added reviewers: samparker, SjoerdMeijer, simon_tatham, efriedma, ostannard.
Herald added subscribers: danielkiss, hiraditya, kristof.beyls.
Herald added a project: LLVM.

MVE has native reductions for integer add and min/max. The others need to be expanded to a series of extracts and scalar operations to reduce the vector into a single scalar. The default codegen for that expands the reduction into a series of in-order operations.

This modifies that to something more suitable for MVE. The basic idea is to use vector operations until there are 4 remaining items, then switch to pairwise operations. For example, a v8f16 fadd reduction would become:
Y = VREV X
Z = ADD(X, Y)
z0 = Z[0] + Z[1]
z1 = Z[2] + Z[3]
return z0 + z1
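
To make the ordering concrete, here is a rough Python model of the sequence above next to the default in-order expansion. It is only a sketch, not the lowering code from the patch: the helper names are hypothetical, VREV is assumed to swap the two lanes of each adjacent pair, and the four remaining items are assumed to be the distinct pair sums left after the vector add.

```python
def vrev_adjacent(x):
    """Assumed VREV behaviour: swap the two lanes of every adjacent pair."""
    return [x[i ^ 1] for i in range(len(x))]


def vadd(a, b):
    """Lane-wise vector add."""
    return [p + q for p, q in zip(a, b)]


def reduce_fadd_v8(x):
    """Model of the proposed v8 reduction: one vector step, then pairwise."""
    assert len(x) == 8
    y = vrev_adjacent(x)
    z = vadd(x, y)        # lanes 2i and 2i+1 both hold x[2i] + x[2i+1]
    r = z[0::2]           # the four remaining items (assumed: even lanes)
    z0 = r[0] + r[1]
    z1 = r[2] + r[3]
    return z0 + z1


def reduce_fadd_in_order(x):
    """Model of the default expansion: a chain of scalar adds, lane by lane."""
    acc = x[0]
    for v in x[1:]:
        acc = acc + v
    return acc


values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(reduce_fadd_v8(values), reduce_fadd_in_order(values))
```

In this model the in-order form is a dependency chain of seven adds, while the reassociated form is one vector add followed by three scalar adds, two of which are independent. Because the additions are reassociated, the two forms can round differently on general floating-point inputs.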

The awkwardness (there is always some) comes from types like v4f16. Such a reduction is first legalized by adding identity values to the extra lanes, and those inserts cannot then be optimized away through the vrev; fadd combination, so they remain. I've made sure these reductions custom lower, so that we can produce the pairwise additions before the extra values are added.
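
The difference between the two routes for a 4-element reduction can be sketched in Python as follows. This is illustrative only: the function names are hypothetical, the identity value of 0.0 is an assumption made for the example, and the lane layout is the same simplified model as in the v8f16 example (VREV swapping adjacent lanes).

```python
def reduce_v4_widened(x, identity=0.0):
    """Widen to eight lanes with an assumed identity, then do the vector step."""
    assert len(x) == 4
    wide = list(x) + [identity] * 4            # the inserts that remain
    rev = [wide[i ^ 1] for i in range(8)]      # assumed VREV: swap adjacent lanes
    z = [a + b for a, b in zip(wide, rev)]     # vector add
    r = z[0::2]                                # four remaining items
    return (r[0] + r[1]) + (r[2] + r[3])


def reduce_v4_pairwise(x):
    """Pairwise additions on the original four lanes, with no padding."""
    assert len(x) == 4
    return (x[0] + x[1]) + (x[2] + x[3])


values = [1.0, 2.0, 3.0, 4.0]
print(reduce_v4_widened(values), reduce_v4_pairwise(values))
```

Both routes give the same result on this input, but the widened one carries the lane inserts, a vector reverse, and a vector add that contribute nothing, which is the overhead the custom lowering is meant to avoid.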

https://reviews.llvm.org/D81397

Files:
  llvm/lib/Target/ARM/ARMISelLowering.cpp
  llvm/test/CodeGen/Thumb2/mve-vecreduce-bit.ll
  llvm/test/CodeGen/Thumb2/mve-vecreduce-fadd.ll
  llvm/test/CodeGen/Thumb2/mve-vecreduce-fminmax.ll
  llvm/test/CodeGen/Thumb2/mve-vecreduce-fmul.ll
  llvm/test/CodeGen/Thumb2/mve-vecreduce-loops.ll
  llvm/test/CodeGen/Thumb2/mve-vecreduce-mul.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D81397.269220.patch
Type: text/x-patch
Size: 122428 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20200608/213fc179/attachment.bin>