[clang] [llvm] [AArch64][clang][llvm] Add structured sparsity outer product (TMOP) intrinsics (PR #135145)

Tue Apr 15 01:14:58 PDT 2025

================
@@ -0,0 +1,138 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4
+; RUN: llc -force-streaming -verify-machineinstrs < %s | FileCheck %s
+
+target triple = "aarch64-linux"
+
+define void @tmopa_za32_s8(<vscale x 16 x i8> %zn1, <vscale x 16 x i8> %zn2, <vscale x 16 x i8> %zm, <vscale x 16 x i8> %zk) #0 {
----------------
CarolineConcatto wrote:

I just noticed that all the tests are for za32 only; none of them lowers to za16.
https://developer.arm.com/documentation/ddi0602/2025-03/SME-Instructions/FTMOPA--non-widening---Floating-point-sparse-outer-product--accumulating-
I think you need a separate LLVM IR definition for each ZA size; otherwise we cannot distinguish between them.

https://github.com/llvm/llvm-project/pull/135145