[clang] [llvm] [AArch64][clang][llvm] Add structured sparsity outer product (TMOP) intrinsics (PR #135145)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 10 10:45:26 PDT 2025
================
@@ -3593,6 +3578,25 @@ class sme_tmopa_32b<bits<5> opc, RegisterOperand zn_ty, RegisterOperand zm_ty, s
let Constraints = "$ZAda = $_ZAda";
}
+multiclass sme_tmopa_16b<bits<5> opc, RegisterOperand zn_ty, RegisterOperand zm_ty, ValueType vt, string mnemonic, string intrinsic> {
+ def NAME : sme_int_sparse_outer_product_i16<opc, zn_ty, zm_ty, mnemonic>, SMEPseudo2Instr<NAME, 1> {
+ let Uses = [FPMR, FPCR];
+ }
+
+ def NAME # _PSEUDO : sme_sparse_outer_product_pseudo<zn_ty, zm_ty, SMEMatrixTileH>, SMEPseudo2Instr<NAME, 0>;
+
+ def _ : SME2_ZA_TMOP_Pat<NAME, !cast<SDPatternOperator>(intrinsic), timm32_0_3, vt>;
----------------
CarolineConcatto wrote:
Can we replace 'def _' with an anonymous 'def'?
Also, TileOp16:$ZAda and VectorIndexS32b:$imm have different valid ranges, but the pattern uses the same immediate operand, timm32_0_3, for both:
def _ : SME2_ZA_TMOP_Pat<NAME, !cast<SDPatternOperator>(intrinsic), timm32_0_3, vt>;
I think we should add another parameter to the pattern class, so there is one for ZAda and another for the index:
index: timm32_0_3
tile: timm32_0_1
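
A minimal sketch of how the call site in sme_tmopa_16b might look under the suggestion above, assuming SME2_ZA_TMOP_Pat were extended to take the tile operand and the index operand separately. The extra parameter and its position are hypothetical and not part of the PR as posted:

```
// Hypothetical: assumes SME2_ZA_TMOP_Pat gains a separate tile-operand
// parameter placed before the index operand.
// For the 16-bit variant the tile operand is timm32_0_1 and the
// index operand stays timm32_0_3.
def : SME2_ZA_TMOP_Pat<NAME, !cast<SDPatternOperator>(intrinsic), timm32_0_1, timm32_0_3, vt>;
```

The sme_tmopa_32b multiclass would presumably keep timm32_0_3 for both operands, so only the 16-bit case needs a different tile operand.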
https://developer.arm.com/documentation/ddi0602/2025-03/SME-Instructions/FTMOPA--widening--2-way--FP8-to-FP16---8-bit-floating-point-sparse-sum-of-two-outer-products--accumulating-
https://developer.arm.com/documentation/ddi0602/2025-03/SME-Instructions/FTMOPA--non-widening---Floating-point-sparse-outer-product--accumulating-
https://github.com/llvm/llvm-project/pull/135145