[PATCH] D105680: [ARM] Lower v16i8 -> i64 VMLA reductions.

Fri Jul 9 01:02:12 PDT 2021

dmgreen created this revision.
dmgreen added reviewers: samtebbs, SjoerdMeijer, NickGuy, simon_tatham, ostannard.
Herald added subscribers: danielkiss, hiraditya, kristof.beyls.
dmgreen requested review of this revision.
Herald added a project: LLVM.

MVE does not have a VMLALV instruction that can perform v16i8->i64 reductions, as it does for v8i16->i64 and v4i32->i64 reductions. This means that the pattern to create them will be split up by type legalization, creating a lot of instructions.

This extends the patterns for matching i64 reductions a little to handle the v16i8->i64 case. We need to turn them into a pair of v8i16->i64 VMLALVs that each perform half of the reduction and are summed together (so the latter is a VMLALVA). The order of the lanes does not matter for the reduction, so we generate an MVEEXT for the extension, which will either be folded into an extending load or can be optimized to a VREV/VMOVL. Some of the resulting codegen isn't optimal, but will be improved in a later patch.
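
To illustrate why the split is legal, here is a minimal scalar sketch in C++. It is not code from the patch. It models only the signed case, and the function names (reduceMlaFull, reduceMlaHalf, extendLanes, reduceMlaSplit) are hypothetical. The even/odd lane partition is an assumption chosen for illustration; any partition gives the same sum, provided both operands are split the same way.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V16i8 = std::array<int8_t, 16>;
using V8i16 = std::array<int16_t, 8>;

// Reference semantics: sign-extend every i8 lane to i64, multiply, and sum.
int64_t reduceMlaFull(const V16i8 &a, const V16i8 &b) {
  int64_t acc = 0;
  for (std::size_t i = 0; i < 16; ++i)
    acc += int64_t(a[i]) * int64_t(b[i]);
  return acc;
}

// Models a v8i16->i64 multiply-accumulate reduction. With acc == 0 it plays
// the role of a VMLALV; with a non-zero incoming acc, that of a VMLALVA.
int64_t reduceMlaHalf(const V8i16 &a, const V8i16 &b, int64_t acc = 0) {
  for (std::size_t i = 0; i < 8; ++i)
    acc += int64_t(a[i]) * int64_t(b[i]);
  return acc;
}

// Sign-extends 8 of the 16 lanes to i16: first == 0 selects the even lanes,
// first == 1 the odd lanes (an assumed partition, for illustration only).
V8i16 extendLanes(const V16i8 &v, std::size_t first) {
  V8i16 out{};
  for (std::size_t i = 0; i < 8; ++i)
    out[i] = int16_t(v[first + 2 * i]);
  return out;
}

// The split form: the first half starts from zero, and the second half
// accumulates onto the first half's result.
int64_t reduceMlaSplit(const V16i8 &a, const V16i8 &b) {
  int64_t lo = reduceMlaHalf(extendLanes(a, 0), extendLanes(b, 0));
  return reduceMlaHalf(extendLanes(a, 1), extendLanes(b, 1), lo);
}
```

For any pair of inputs, reduceMlaSplit(a, b) should equal reduceMlaFull(a, b), because integer addition is associative and commutative and each lane pair is visited exactly once.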

https://reviews.llvm.org/D105680

Files:
  llvm/lib/Target/ARM/ARMISelLowering.cpp
  llvm/test/CodeGen/Thumb2/mve-vecreduce-mla.ll

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D105680.357332.patch
Type: text/x-patch
Size: 36100 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210709/841d7917/attachment.bin>