[Mlir-commits] [mlir] [mlir][amdgpu] implement amdgpu.sparse_mfma wrapper for smfmac instructions (PR #171968)
Krzysztof Drewniak
llvmlistbot at llvm.org
Wed Dec 17 08:43:14 PST 2025
================
@@ -1138,6 +1161,66 @@ def AMDGPU_WMMAOp :
let hasVerifier = 1;
}
+def AMDGPU_SparseMFMAOp :
+ AMDGPU_Op<"sparse_mfma", [AllTypesMatch<["destC", "destD"]>,
+ Pure]>,
+ Arguments<(ins
+ ConfinedAttr<I32Attr, [IntIsOneOf<[16, 32]>]>:$m,
+ ConfinedAttr<I32Attr, [IntIsOneOf<[16, 32]>]>:$n,
+ ConfinedAttr<I32Attr, [IntIsOneOf<[16, 32, 64, 128]>]>:$k,
+ SMFMACSparseInTypes:$sourceA,
+ SMFMACDenseInTypes:$sourceB,
+ SMFMACOutTypes:$destC,
+ I32:$sparseIdx,
+ DefaultValuedAttr<I32Attr, "0">:$cbsz,
+ DefaultValuedAttr<I32Attr, "0">:$abid)>,
+ Results<(outs SMFMACOutTypes: $destD)> {
+ let summary = "MLIR wrapper for CDNA sparse mfma (smfmac) instructions";
+ let description = [{
+ The `amdgpu.sparse_mfma` op is an MLIR wrapper around intrinsics for various
+ `smfmac` instructions in the AMDGPU architecture, which perform matrix
+ multiply-accumulate operations using 2:4 structured sparsity on matrix A
+ with dense matrices B, C, and D.
+
+ On gfx942, smfmac intrinsics support:
+ - M=N=16, K=32 and M=N=32, K=16 for f16 and bf16 sources
+ - M=N=16, K=64 and M=N=32, K=32 for i8 and fp8 sources
+
+ On gfx950, smfmac intrinsics additionally support:
+ - M=N=16, K=64 and M=N=32, K=32 for f16 and bf16 sources
+ - M=N=16, K=128 and M=N=32, K=64 for i8 and fp8 sources
+
+ The `sparseIdx` parameter (i32) contains packed indices identifying the
----------------
krzysz00 wrote:
Edit: reading below, this is either `vector<4 x i8>` or `vector<2 x i16>`, and you'll probably want to validate which is which.
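
For reference, the M/N/K combinations listed in the op description above can be read as a small predicate. The following is only a sketch of that list in standalone C++, not the verifier from the PR: the names `Chip`, `SourceWidth`, and `isSupportedSmfmacShape` are hypothetical, and grouping f16/bf16 as "16-bit" and i8/fp8 as "8-bit" sources is an assumption made to keep the example short.

```cpp
// Hypothetical helper, not part of the PR: encodes the shape list from the
// op description (gfx942 base shapes, plus the additional gfx950 shapes).
enum class Chip { Gfx942, Gfx950 };
enum class SourceWidth { Bits16, Bits8 }; // f16/bf16 vs. i8/fp8 (assumed grouping)

bool isSupportedSmfmacShape(Chip chip, SourceWidth width, int m, int n, int k) {
  // Only square 16x16 and 32x32 tiles appear in the description.
  if (m != n || (m != 16 && m != 32))
    return false;

  // gfx942 shapes: 16x16x32 / 32x32x16 for 16-bit sources,
  //                16x16x64 / 32x32x32 for 8-bit sources.
  int baseKFor16x16 = (width == SourceWidth::Bits16) ? 32 : 64;
  int baseK = (m == 16) ? baseKFor16x16 : baseKFor16x16 / 2;
  if (k == baseK)
    return true;

  // gfx950 additionally supports the same tiles with K doubled.
  return chip == Chip::Gfx950 && k == 2 * baseK;
}
```

For example, `isSupportedSmfmacShape(Chip::Gfx942, SourceWidth::Bits16, 16, 16, 64)` is false under this reading, while the same call with `Chip::Gfx950` is true. A check on the `sparseIdx` type would need the mapping between source type and `vector<4 x i8>` / `vector<2 x i16>`, which this excerpt does not state, so it is left out of the sketch.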
https://github.com/llvm/llvm-project/pull/171968