[llvm] [AArch64] Add ISel support for partial reductions to use SVE2.1 udot/sdot (PR #158310)
Sander de Smalen via llvm-commits
llvm-commits at lists.llvm.org
Mon Sep 15 05:40:43 PDT 2025
================
@@ -27,32 +27,60 @@ entry:
ret <vscale x 4 x i32> %partial.reduce
}
-define <vscale x 8 x i32> @udot_vl256(<vscale x 8 x i32> %acc, <vscale x 16 x i16> %a, <vscale x 16 x i16> %b) vscale_range(2,2) {
+define <8 x i32> @udot_vl256(<8 x i32> %acc, <16 x i16> %a, <16 x i16> %b) vscale_range(2,2) {
----------------
sdesmalen-arm wrote:
Sorry, I should have thought of this earlier, but to avoid the splice/ext instructions, it's better to pass pointers and load the operands and store the result than to pass the vectors by value. Passing them by value falls back on the NEON ABI, which assumes `<8 x i32>` is passed in registers v0-v3, so the value then has to be reconstructed into `z0`. See for example llvm/test/CodeGen/AArch64/sve-fixed-length-permute-rev.ll
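
A rough sketch of what the load/store form of this test could look like, as an illustration rather than the actual change in the PR. The pointer parameter names (`%accptr`, `%aptr`, `%bptr`), the zext/mul body, and the exact intrinsic spelling are assumptions; the intrinsic name in particular may differ in your tree:

```llvm
; Hypothetical variant of @udot_vl256: operands come from memory and the
; result is stored back, so no fixed-length vector is passed or returned
; by value under the NEON calling convention.
define void @udot_vl256(ptr %accptr, ptr %aptr, ptr %bptr) vscale_range(2,2) {
entry:
  ; Load the accumulator and both i16 inputs from memory.
  %acc = load <8 x i32>, ptr %accptr
  %a = load <16 x i16>, ptr %aptr
  %b = load <16 x i16>, ptr %bptr
  ; Widen to i32 and multiply (unsigned dot-product shape).
  %a.wide = zext <16 x i16> %a to <16 x i32>
  %b.wide = zext <16 x i16> %b to <16 x i32>
  %mult = mul <16 x i32> %a.wide, %b.wide
  ; Partial reduction of 16 x i32 products into the 8 x i32 accumulator.
  ; Assumed intrinsic spelling; newer trees may drop "experimental".
  %partial.reduce = tail call <8 x i32> @llvm.experimental.vector.partial.reduce.add.v8i32.v16i32(<8 x i32> %acc, <16 x i32> %mult)
  ; Store the result instead of returning it by value.
  store <8 x i32> %partial.reduce, ptr %accptr
  ret void
}
```

With this shape the 256-bit values should be loaded directly into and stored directly from SVE registers, so the splice/ext sequence used to rebuild `z0` from NEON argument registers should no longer be needed.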
https://github.com/llvm/llvm-project/pull/158310