[llvm] [AArch64][NEON][SVE] Lower mixed sign/zero extended partial reductions to usdot (PR #107566)
Paul Walker via llvm-commits
llvm-commits at lists.llvm.org
Wed Sep 11 09:12:25 PDT 2024
================
@@ -3408,6 +3408,7 @@ let Predicates = [HasSVEorSME, HasMatMulInt8] in {
defm USDOT_ZZZ : sve_int_dot_mixed<"usdot", int_aarch64_sve_usdot>;
defm USDOT_ZZZI : sve_int_dot_mixed_indexed<0, "usdot", int_aarch64_sve_usdot_lane>;
defm SUDOT_ZZZI : sve_int_dot_mixed_indexed<1, "sudot", int_aarch64_sve_sudot_lane>;
+ def : Pat<(nxv4i32 (AArch64usdot (nxv4i32 ZPR32:$Rd), (nxv16i8 ZPR8:$Rm), (nxv16i8 ZPR8:$Rn))), (USDOT_ZZZ $Rd, $Rm, $Rn)>;
----------------
paulwalker-arm wrote:
As above, you can pass `AArch64usdot` directly into `defm USDOT_ZZZ`, replacing the existing `int_aarch64_sve_usdot` parameter.
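
A minimal sketch of what that suggestion might look like, based only on the hunk quoted above. It assumes `AArch64usdot` is declared in a form that `sve_int_dot_mixed` accepts as its operator parameter, and that the `int_aarch64_sve_usdot` intrinsic still reaches the instruction some other way (for example, by being lowered to the node, or by `AArch64usdot` matching both); neither is confirmed in this message.

```
// Sketch only: replaces the intrinsic parameter with the DAG node.
// Assumes AArch64usdot is usable as the multiclass's pattern operator.
defm USDOT_ZZZ : sve_int_dot_mixed<"usdot", AArch64usdot>;
```

If the multiclass already emits a selection pattern for its operator parameter, the standalone `def : Pat<...>` line added in this hunk would presumably become redundant.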
https://github.com/llvm/llvm-project/pull/107566