[all-commits] [llvm/llvm-project] 338314: [ARM] Lower v16i8 -> i64 VMLA reductions.
David Green via All-commits
all-commits at lists.llvm.org
Wed Jul 14 10:11:53 PDT 2021
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 338314f9c26d4594d49fdd3a7656d71c77255c54
https://github.com/llvm/llvm-project/commit/338314f9c26d4594d49fdd3a7656d71c77255c54
Author: David Green <david.green at arm.com>
Date: 2021-07-14 (Wed, 14 Jul 2021)
Changed paths:
M llvm/lib/Target/ARM/ARMISelLowering.cpp
M llvm/test/CodeGen/Thumb2/mve-vecreduce-mla.ll
Log Message:
-----------
[ARM] Lower v16i8 -> i64 VMLA reductions.
MVE does not have a VMLALV instruction that can perform v16i8 -> i64
reductions, like it does for v8i16->i64 and v4i32->i64 reductions. That
means that the pattern to create them will be split up by type
legalization, creating a lot of instructions.
This extends the patterns for matching i64 reductions a little to handle
the v16i8->i64 case. We need to turn them into a pair of v8i16->i64
VMLALVs that each perform half of the reduction and are summed together
(so the latter is a VMLALVA). The order of the lanes does not matter for
the reduction so we generate a MVEEXT for the extension, that will
either be folded into an extending load or can be optimized to a
VREV/VMOVL. Some of the resulting codegen isn't optimal, but will be
improved in a later patch.
Differential Revision: https://reviews.llvm.org/D105680
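
The arithmetic behind the split can be checked with a small scalar model. This is only an illustrative sketch, not the SelectionDAG code from the patch: every function name below is hypothetical, only the signed case is modelled, and the even/odd lane split is an assumption chosen to show that any partition of the 16 lanes into two groups of 8 gives the same sum.

```python
def ext_v16i8(v):
    """Hypothetical stand-in for the extension step: split 16 i8 lanes
    into two groups of 8 lanes (even lanes, odd lanes). The real lane
    order produced by the lowering is not modelled here."""
    assert len(v) == 16 and all(-128 <= x <= 127 for x in v)
    return v[0::2], v[1::2]


def vmlalv(a, b):
    """Model of a v8i16 -> i64 multiply-accumulate reduction."""
    assert len(a) == 8 and len(b) == 8
    return sum(x * y for x, y in zip(a, b))


def vmlalva(acc, a, b):
    """Model of the accumulating form: adds the reduction to acc."""
    return acc + vmlalv(a, b)


def reduce_v16i8_split(a, b):
    """v16i8 -> i64 reduction built from two v8i16 -> i64 halves."""
    a_even, a_odd = ext_v16i8(a)
    b_even, b_odd = ext_v16i8(b)
    first_half = vmlalv(a_even, b_even)
    return vmlalva(first_half, a_odd, b_odd)


def reduce_v16i8_reference(a, b):
    """Direct sum of the 16 lane products, for comparison."""
    return sum(x * y for x, y in zip(a, b))
```

Because addition is commutative and associative, `reduce_v16i8_split` and `reduce_v16i8_reference` agree for any pair of 16-lane inputs, which is why the lowering is free to pick whichever lane order is cheapest to extend.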