[llvm] [AArch64][NEON][SVE] Lower i8 to i64 partial reduction to a dot product (PR #110220)
Benjamin Maxwell via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 30 06:00:29 PDT 2024
================
@@ -21942,6 +21944,20 @@ SDValue tryLowerPartialReductionToDot(SDNode *N,
   else
     Opcode = AArch64ISD::UDOT;
+  // Partial reduction lowering for (nx)v16i8 to (nx)v4i64 requires an i32 dot
+  // product followed by a zero / sign extension
+  if ((ReducedType == MVT::nxv4i64 && MulSrcType == MVT::nxv16i8) ||
+      (ReducedType == MVT::v4i64 && MulSrcType == MVT::v16i8)) {
+    EVT ReducedTypeHalved =
+        (ReducedType.isScalableVector()) ? MVT::nxv4i32 : MVT::v4i32;
----------------
MacDue wrote:
(optional) nit: `ReducedTypeHalved` -> `ReducedTypeI32`?
To me, `Halved` could also be read as halving the vector width (e.g. v4 -> v2) rather than the element width (i64 -> i32).
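
For readers following the hunk, here is a rough scalar model of what the quoted code comment describes for the unsigned case: an i32 dot product over groups of four i8 elements, followed by a zero extension to i64. This is only a sketch, not LLVM code. The names `udot_i32` and `partial_reduce_u8_to_u64` are hypothetical, and adding the extended result into an i64 accumulator is an assumption about the surrounding lowering rather than something shown in the hunk.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of an unsigned i8 dot product into four i32 lanes.
// Each lane sums four adjacent u8*u8 products, so its value is at most
// 4 * 255 * 255 = 260100 and cannot overflow 32 bits when starting from zero.
std::array<std::uint32_t, 4> udot_i32(const std::array<std::uint8_t, 16> &a,
                                      const std::array<std::uint8_t, 16> &b) {
  std::array<std::uint32_t, 4> lanes{}; // zero-initialised accumulator
  for (std::size_t i = 0; i < 4; ++i)
    for (std::size_t j = 0; j < 4; ++j)
      lanes[i] += std::uint32_t(a[4 * i + j]) * std::uint32_t(b[4 * i + j]);
  return lanes;
}

// Assumed shape of the 16 x i8 -> 4 x i64 partial reduction: compute the dot
// product at i32 element width, zero-extend each lane to i64, then add it to
// the i64 accumulator.
std::array<std::uint64_t, 4>
partial_reduce_u8_to_u64(std::array<std::uint64_t, 4> acc,
                         const std::array<std::uint8_t, 16> &a,
                         const std::array<std::uint8_t, 16> &b) {
  const std::array<std::uint32_t, 4> dot = udot_i32(a, b);
  for (std::size_t i = 0; i < 4; ++i)
    acc[i] += std::uint64_t(dot[i]); // zero extension i32 -> i64
  return acc;
}
```

In this model the lane count stays at four throughout and only the element width changes (i32, then i64), which is the distinction the suggested rename is meant to make explicit.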
https://github.com/llvm/llvm-project/pull/110220