[PATCH] D141693: [AArch64] turn extended vecreduce bigger than v16i8 into udot/sdot

Zain Jaffal via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Jan 16 02:21:58 PST 2023


zjaffal added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:15226
+  bool IsValidSize = Op0VT.getScalarSizeInBits() == 8;
+  if (Op0VT != MVT::v8i8 && !IsValidElementCount && !IsValidSize)
     return SDValue();
----------------
dmgreen wrote:
> I think this should be something like !(IsValidElementCount && IsValidSize).
> It is worth adding a v4i8 test if one doesn't exist already:
> ```
> define i32 @src(ptr %p, i32 %b) {
> entry:
>   %a64 = load <4 x i8>, ptr %p
>   %a65 = sext <4 x i8> %a64 to <4 x i32>
>   %a66 = mul nsw <4 x i32> %a65, %a65
>   %a67 = tail call i32 @llvm.vector.reduce.add.v4i32(<4 x i32> %a66)
>   %a = add i32 %a67, %b
>   ret i32 %a
> }
> ```
>I think this should be something like !(IsValidElementCount && IsValidSize).
but then we won't cover the case for v8i8 or shall we change is validElementCount to be
`Op0VT.getVectorNumElements() % 16 == 0; || Op0VT.getVectorNumElements() % 8 == 0;`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141693/new/

https://reviews.llvm.org/D141693



More information about the llvm-commits mailing list