[llvm] 3eacda4 - [AArch64] Add all SME2.1 instructions Assembly/Disassembly
Caroline Concatto via llvm-commits
llvm-commits at lists.llvm.org
Mon Nov 14 06:56:53 PST 2022
Author: Caroline Concatto
Date: 2022-11-14T14:56:16Z
New Revision: 3eacda4547c59c3daa2daf275321c8013eb485cd
URL: https://github.com/llvm/llvm-project/commit/3eacda4547c59c3daa2daf275321c8013eb485cd
DIFF: https://github.com/llvm/llvm-project/commit/3eacda4547c59c3daa2daf275321c8013eb485cd.diff
LOG: [AArch64] Add all SME2.1 instructions Assembly/Disassembly
This patch adds a new feature flag:
sme-f16f16 to represent FEAT_SME_F16F16
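As a rough illustration of how the new flag composes with the existing ones, the sketch below copies three bit values from the ArchExtKind hunk later in this message and checks a feature mask against the two predicate groups the patch uses, [HasSME2p1, HasSMEF16F16] and [HasSME2p1, HasB16B16]. The helper hasAll and the two group constants are hypothetical names introduced only for this example; they are not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions copied from the ArchExtKind hunk in this commit.
enum ArchExtKind : uint64_t {
  AEK_SME2p1 = 1ULL << 45,    // FEAT_SME2p1
  AEK_B16B16 = 1ULL << 46,    // FEAT_B16B16
  AEK_SMEF16F16 = 1ULL << 47  // new in this patch
};

// Hypothetical helper: true when every required bit is set in Enabled.
constexpr bool hasAll(uint64_t Enabled, uint64_t Required) {
  return (Enabled & Required) == Required;
}

// Hypothetical constants modelling the two "let Predicates = [...]" groups.
constexpr uint64_t F16F16Group = AEK_SME2p1 | AEK_SMEF16F16;
constexpr uint64_t B16B16Group = AEK_SME2p1 | AEK_B16B16;

// The new flag alone does not satisfy the group: FeatureSMEF16F16 is
// defined with an empty implied-feature list, so sme2p1 must also be set.
static_assert(!hasAll(AEK_SMEF16F16, F16F16Group), "needs sme2p1 as well");
static_assert(hasAll(AEK_SME2p1 | AEK_SMEF16F16, F16F16Group), "both set");
```

The point of the sketch is only that sme-f16f16 does not imply sme2p1 in this patch, so both have to be enabled for the half-precision group to be accepted.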
This patch adds the following instructions:
SME2.1 standalone instructions:
MOVAZ (array to vector, four registers): Move and zero four ZA single-vector groups to vector registers.
      (array to vector, two registers): Move and zero two ZA single-vector groups to vector registers.
      (tile to vector, four registers): Move and zero four ZA tile slices to vector registers.
      (tile to vector, single): Move and zero ZA tile slice to vector register.
      (tile to vector, two registers): Move and zero two ZA tile slices to vector registers.
LUTI2 (Strided four registers): Lookup table read with 2-bit indexes.
      (Strided two registers): Lookup table read with 2-bit indexes.
LUTI4 (Strided four registers): Lookup table read with 4-bit indexes.
      (Strided two registers): Lookup table read with 4-bit indexes.
ZERO (double-vector): Zero ZA double-vector groups.
     (quad-vector): Zero ZA quad-vector groups.
     (single-vector): Zero ZA single-vector groups.
SME2p1 and SME-F16F16:
All of these instructions operate on half-precision elements:
FADD: Floating-point add multi-vector to ZA array vector accumulators.
FSUB: Floating-point subtract multi-vector from ZA array vector accumulators.
FMLA (multiple and indexed vector): Multi-vector floating-point fused multiply-add by indexed element.
     (multiple and single vector): Multi-vector floating-point fused multiply-add by vector.
     (multiple vectors): Multi-vector floating-point fused multiply-add.
FMLS (multiple and indexed vector): Multi-vector floating-point fused multiply-subtract by indexed element.
     (multiple and single vector): Multi-vector floating-point fused multiply-subtract by vector.
     (multiple vectors): Multi-vector floating-point fused multiply-subtract.
FCVT (widening): Multi-vector floating-point convert from half-precision to single-precision (in-order).
FCVTL: Multi-vector floating-point convert from half-precision to deinterleaved single-precision.
FMOPA (non-widening): Floating-point outer product and accumulate.
FMOPS (non-widening): Floating-point outer product and subtract.
SME2p1 and B16B16:
BFADD: BFloat16 floating-point add multi-vector to ZA array vector accumulators.
BFSUB: BFloat16 floating-point subtract multi-vector from ZA array vector accumulators.
BFCLAMP: Multi-vector BFloat16 floating-point clamp to minimum/maximum number.
BFMLA (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-add by indexed element.
      (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-add by vector.
      (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-add.
BFMLS (multiple and indexed vector): Multi-vector BFloat16 floating-point fused multiply-subtract by indexed element.
      (multiple and single vector): Multi-vector BFloat16 floating-point fused multiply-subtract by vector.
      (multiple vectors): Multi-vector BFloat16 floating-point fused multiply-subtract.
BFMAX (multiple and single vector): Multi-vector BFloat16 floating-point maximum by vector.
      (multiple vectors): Multi-vector BFloat16 floating-point maximum.
BFMAXNM (multiple and single vector): Multi-vector BFloat16 floating-point maximum number by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point maximum number.
BFMIN (multiple and single vector): Multi-vector BFloat16 floating-point minimum by vector.
      (multiple vectors): Multi-vector BFloat16 floating-point minimum.
BFMINNM (multiple and single vector): Multi-vector BFloat16 floating-point minimum number by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point minimum number.
BFMOPA (non-widening): BFloat16 floating-point outer product and accumulate.
BFMOPS (non-widening): BFloat16 floating-point outer product and subtract.
The reference can be found here:
https://developer.arm.com/documentation/ddi0602/2022-09
Differential Revision: https://reviews.llvm.org/D137571
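For readers following the SMEInstrFormats.td changes, here is a minimal sketch of how the refactored sme2_multivec_accum_add_sub class appears to lay out its fields, based only on the "let Inst{...}" lines in the diff below. The function encodeAccumAddSub is a hypothetical helper written for this note and is not an LLVM API. Op4 is the 4-bit value passed to the multiclass (for example 0b0100 for the half-precision FADD), and Rv, Zm and Imm3 are raw encoded field values rather than register numbers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs the fields of sme2_multivec_accum_add_sub into
// a 32-bit word, following the bit ranges given in the diff.
constexpr uint32_t encodeAccumAddSub(uint32_t Op4, bool VG4, uint32_t Rv,
                                     uint32_t Zm, uint32_t Imm3) {
  return (0b110000011u << 23)                    // Inst{31-23}, fixed
         | (((Op4 >> 3) & 1u) << 22)             // Inst{22} = sz = op{3}
         | (0b100u << 19)                        // Inst{21-19}, fixed
         | (((Op4 >> 2) & 1u) << 18)             // Inst{18} = op{2}
         | ((VG4 ? 1u : 0u) << 16)               // Inst{16} = vg4
         | ((Rv & 0x3u) << 13)                   // Inst{14-13} = Rv
         | (0b111u << 10)                        // Inst{12-10}, fixed
         | (VG4 ? ((Zm & 0x7u) << 7)             // vg4: Inst{9-7} = Zm
                : ((Zm & 0xFu) << 6))            // vg2: Inst{9-6} = Zm
         | ((Op4 & 0x3u) << 3)                   // Inst{4-3} = op{1-0}
         | (Imm3 & 0x7u);                        // Inst{2-0} = imm3
}

// FADD_VG2_M2Z_H uses op = 0b0100; with all operand fields zero the packed
// word works out to 0xC1A41C00.
static_assert(encodeAccumAddSub(0b0100, false, 0, 0, 0) == 0xC1A41C00u,
              "half-precision FADD, vgx2, zeroed operands");
```

Folding sz into the top bit of a single 4-bit op is what lets one multiclass serve the 16-bit, 32-bit and 64-bit variants, where the old code needed separate _S and _D multiclasses.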
Added:
llvm/test/MC/AArch64/SME2p1/bfadd-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfadd.s
llvm/test/MC/AArch64/SME2p1/bfclamp-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfclamp.s
llvm/test/MC/AArch64/SME2p1/bfmax-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfmax.s
llvm/test/MC/AArch64/SME2p1/bfmaxnm-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfmaxnm.s
llvm/test/MC/AArch64/SME2p1/bfmin-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfmin.s
llvm/test/MC/AArch64/SME2p1/bfminnm-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfminnm.s
llvm/test/MC/AArch64/SME2p1/bfmla-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfmla.s
llvm/test/MC/AArch64/SME2p1/bfmls-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfmls.s
llvm/test/MC/AArch64/SME2p1/bfmopa-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfmopa.s
llvm/test/MC/AArch64/SME2p1/bfmops-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfmops.s
llvm/test/MC/AArch64/SME2p1/bfsub-diagnostics.s
llvm/test/MC/AArch64/SME2p1/bfsub.s
llvm/test/MC/AArch64/SME2p1/fadd-diagnostics.s
llvm/test/MC/AArch64/SME2p1/fadd.s
llvm/test/MC/AArch64/SME2p1/fcvt-diagnostics.s
llvm/test/MC/AArch64/SME2p1/fcvt.s
llvm/test/MC/AArch64/SME2p1/fcvtl-diagnostics.s
llvm/test/MC/AArch64/SME2p1/fcvtl.s
llvm/test/MC/AArch64/SME2p1/fmla-diagnostics.s
llvm/test/MC/AArch64/SME2p1/fmla.s
llvm/test/MC/AArch64/SME2p1/fmls-diagnostics.s
llvm/test/MC/AArch64/SME2p1/fmls.s
llvm/test/MC/AArch64/SME2p1/fmopa-diagnostics.s
llvm/test/MC/AArch64/SME2p1/fmopa.s
llvm/test/MC/AArch64/SME2p1/fmops-diagnostics.s
llvm/test/MC/AArch64/SME2p1/fmops.s
llvm/test/MC/AArch64/SME2p1/fsub-diagnostics.s
llvm/test/MC/AArch64/SME2p1/fsub.s
llvm/test/MC/AArch64/SME2p1/luti2-diagnostics.s
llvm/test/MC/AArch64/SME2p1/luti2.s
llvm/test/MC/AArch64/SME2p1/luti4-diagnostics.s
llvm/test/MC/AArch64/SME2p1/luti4.s
llvm/test/MC/AArch64/SME2p1/movaz-diagnostics.s
llvm/test/MC/AArch64/SME2p1/movaz.s
llvm/test/MC/AArch64/SME2p1/zero-diagnostics.s
llvm/test/MC/AArch64/SME2p1/zero.s
Modified:
llvm/include/llvm/Support/AArch64TargetParser.def
llvm/include/llvm/Support/AArch64TargetParser.h
llvm/lib/Target/AArch64/AArch64.td
llvm/lib/Target/AArch64/AArch64InstrInfo.td
llvm/lib/Target/AArch64/AArch64RegisterInfo.td
llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
llvm/lib/Target/AArch64/AArch64SchedA64FX.td
llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
llvm/lib/Target/AArch64/SMEInstrFormats.td
llvm/test/MC/AArch64/SME2/fmla-diagnostics.s
llvm/test/MC/AArch64/SME2/fmls-diagnostics.s
llvm/unittests/Support/TargetParserTest.cpp
Removed:
################################################################################
diff --git a/llvm/include/llvm/Support/AArch64TargetParser.def b/llvm/include/llvm/Support/AArch64TargetParser.def
index 847b54c7a1d4..12956267f616 100644
--- a/llvm/include/llvm/Support/AArch64TargetParser.def
+++ b/llvm/include/llvm/Support/AArch64TargetParser.def
@@ -148,6 +148,7 @@ AARCH64_ARCH_EXT_NAME("flagm", AArch64::AEK_FLAGM, "+flagm",
AARCH64_ARCH_EXT_NAME("sme", AArch64::AEK_SME, "+sme", "-sme")
AARCH64_ARCH_EXT_NAME("sme-f64f64", AArch64::AEK_SMEF64F64, "+sme-f64f64", "-sme-f64f64")
AARCH64_ARCH_EXT_NAME("sme-i16i64", AArch64::AEK_SMEI16I64, "+sme-i16i64", "-sme-i16i64")
+AARCH64_ARCH_EXT_NAME("sme-f16f16", AArch64::AEK_SMEF16F16, "+sme-f16f16", "-sme-f16f16")
AARCH64_ARCH_EXT_NAME("sme2", AArch64::AEK_SME2, "+sme2", "-sme2")
AARCH64_ARCH_EXT_NAME("sme2p1", AArch64::AEK_SME2p1, "+sme2p1", "-sme2p1")
AARCH64_ARCH_EXT_NAME("hbc", AArch64::AEK_HBC, "+hbc", "-hbc")
diff --git a/llvm/include/llvm/Support/AArch64TargetParser.h b/llvm/include/llvm/Support/AArch64TargetParser.h
index 24ffb9195454..a3923de77082 100644
--- a/llvm/include/llvm/Support/AArch64TargetParser.h
+++ b/llvm/include/llvm/Support/AArch64TargetParser.h
@@ -75,7 +75,8 @@ enum ArchExtKind : uint64_t {
AEK_SME2 = 1ULL << 43, // FEAT_SME2
AEK_SVE2p1 = 1ULL << 44, // FEAT_SVE2p1
AEK_SME2p1 = 1ULL << 45, // FEAT_SME2p1
- AEK_B16B16 = 1ULL << 46 // FEAT_B16B16
+ AEK_B16B16 = 1ULL << 46, // FEAT_B16B16
+ AEK_SMEF16F16 = 1ULL << 47 // FEAT_SMEF16F16
};
enum class ArchKind {
diff --git a/llvm/lib/Target/AArch64/AArch64.td b/llvm/lib/Target/AArch64/AArch64.td
index 484baea311b4..ded63c636d15 100644
--- a/llvm/lib/Target/AArch64/AArch64.td
+++ b/llvm/lib/Target/AArch64/AArch64.td
@@ -475,6 +475,9 @@ def FeatureSMEF64F64 : SubtargetFeature<"sme-f64f64", "HasSMEF64F64", "true",
def FeatureSMEI16I64 : SubtargetFeature<"sme-i16i64", "HasSMEI16I64", "true",
"Enable Scalable Matrix Extension (SME) I16I64 instructions (FEAT_SME_I16I64)", [FeatureSME]>;
+def FeatureSMEF16F16 : SubtargetFeature<"sme-f16f16", "HasSMEF16F16", "true",
+ "Enable SME2.1 non-widening Float16 instructions (FEAT_SME_F16F16)", []>;
+
def FeatureSME2 : SubtargetFeature<"sme2", "HasSME2", "true",
"Enable Scalable Matrix Extension 2 (SME2) instructions", [FeatureSME]>;
@@ -653,7 +656,8 @@ def PAUnsupported : AArch64Unsupported {
}
def SMEUnsupported : AArch64Unsupported {
- let F = [HasSME, HasSMEF64F64, HasSMEI16I64, HasSME2, HasSVE2p1_or_HasSME2, HasSVE2p1_or_HasSME2p1];
+ let F = [HasSME, HasSMEF64F64, HasSMEI16I64, HasSME2, HasSVE2p1_or_HasSME2,
+ HasSVE2p1_or_HasSME2p1, HasSME2p1, HasSMEF16F16];
}
include "AArch64SchedA53.td"
diff --git a/llvm/lib/Target/AArch64/AArch64InstrInfo.td b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
index d100a0a576d7..e93d68a13c76 100644
--- a/llvm/lib/Target/AArch64/AArch64InstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64InstrInfo.td
@@ -144,12 +144,15 @@ def HasSME : Predicate<"Subtarget->hasSME()">,
AssemblerPredicateWithAll<(all_of FeatureSME), "sme">;
def HasSMEF64F64 : Predicate<"Subtarget->hasSMEF64F64()">,
AssemblerPredicateWithAll<(all_of FeatureSMEF64F64), "sme-f64f64">;
+def HasSMEF16F16 : Predicate<"Subtarget->hasSMEF16F16()">,
+ AssemblerPredicateWithAll<(all_of FeatureSMEF16F16), "sme-f16f16">;
def HasSMEI16I64 : Predicate<"Subtarget->hasSMEI16I64()">,
AssemblerPredicateWithAll<(all_of FeatureSMEI16I64), "sme-i16i64">;
def HasSME2 : Predicate<"Subtarget->hasSME2()">,
AssemblerPredicateWithAll<(all_of FeatureSME2), "sme2">;
def HasSME2p1 : Predicate<"Subtarget->hasSME2p1()">,
AssemblerPredicateWithAll<(all_of FeatureSME2p1), "sme2p1">;
+
// A subset of SVE(2) instructions are legal in Streaming SVE execution mode,
// they should be enabled if either has been specified.
def HasSVEorSME
diff --git a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
index fb847dfe3512..57e32d6009db 100644
--- a/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64RegisterInfo.td
@@ -1595,6 +1595,7 @@ class MatrixTileOperand<int EltSize, int NumBitsForTile, RegisterClass RC>
let PrintMethod = "printMatrixTile";
}
+def TileOp16 : MatrixTileOperand<16, 1, MPR16>;
def TileOp32 : MatrixTileOperand<32, 2, MPR32>;
def TileOp64 : MatrixTileOperand<64, 3, MPR64>;
diff --git a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
index 4bdadf1889ad..73de3b6049b5 100644
--- a/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
+++ b/llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td
@@ -262,17 +262,17 @@ defm FMLS_VG4_M4Z4Z_S : sme2_dot_mla_add_sub_array_vg4_multi<"fmls", 0b011001, M
defm FMLS_VG2_M2ZZI_S : sme2_multi_vec_array_vg2_index_32b<"fmls", 0b0010, ZZ_s_mul_r, ZPR4b32>;
defm FMLS_VG4_M4ZZI_S : sme2_multi_vec_array_vg4_index_32b<"fmls", 0b0010, ZZZZ_s_mul_r, ZPR4b32>;
-defm ADDA_VG2_M2Z2Z_S : sme2_multivec_accum_add_sub_vg2_S<"add", 0b10>;
-defm ADDA_VG4_M4Z4Z_S : sme2_multivec_accum_add_sub_vg4_S<"add", 0b10>;
+defm ADDA_VG2_M2Z2Z_S : sme2_multivec_accum_add_sub_vg2<"add", 0b0010, MatrixOp32, ZZ_s_mul_r>;
+defm ADDA_VG4_M4Z4Z_S : sme2_multivec_accum_add_sub_vg4<"add", 0b0010, MatrixOp32, ZZZZ_s_mul_r>;
-defm SUBA_VG2_M2Z2Z_S : sme2_multivec_accum_add_sub_vg2_S<"sub", 0b11>;
-defm SUBA_VG4_M4Z4Z_S : sme2_multivec_accum_add_sub_vg4_S<"sub", 0b11>;
+defm SUBA_VG2_M2Z2Z_S : sme2_multivec_accum_add_sub_vg2<"sub", 0b0011, MatrixOp32, ZZ_s_mul_r>;
+defm SUBA_VG4_M4Z4Z_S : sme2_multivec_accum_add_sub_vg4<"sub", 0b0011, MatrixOp32, ZZZZ_s_mul_r>;
-defm FADD_VG2_M2Z2Z_S : sme2_multivec_accum_add_sub_vg2_S<"fadd", 0b00>;
-defm FADD_VG4_M4Z4Z_S : sme2_multivec_accum_add_sub_vg4_S<"fadd", 0b00>;
+defm FADD_VG2_M2Z2Z_S : sme2_multivec_accum_add_sub_vg2<"fadd", 0b0000, MatrixOp32, ZZ_s_mul_r>;
+defm FADD_VG4_M4Z4Z_S : sme2_multivec_accum_add_sub_vg4<"fadd", 0b0000, MatrixOp32, ZZZZ_s_mul_r>;
-defm FSUB_VG2_M2Z2Z_S : sme2_multivec_accum_add_sub_vg2_S<"fsub", 0b01>;
-defm FSUB_VG4_M4Z4Z_S : sme2_multivec_accum_add_sub_vg4_S<"fsub", 0b01>;
+defm FSUB_VG2_M2Z2Z_S : sme2_multivec_accum_add_sub_vg2<"fsub", 0b0001, MatrixOp32, ZZ_s_mul_r>;
+defm FSUB_VG4_M4Z4Z_S : sme2_multivec_accum_add_sub_vg4<"fsub", 0b0001, MatrixOp32, ZZZZ_s_mul_r>;
defm SQDMULH_VG2_2ZZ : sme2_int_sve_destructive_vector_vg2_single<"sqdmulh", 0b1000000>;
defm SQDMULH_VG4_4ZZ : sme2_int_sve_destructive_vector_vg4_single<"sqdmulh", 0b1000000>;
@@ -703,11 +703,11 @@ defm SUB_VG4_M4ZZ_D : sme2_dot_mla_add_sub_array_vg24_single<"sub", 0b1111011,
defm SUB_VG2_M2Z2Z_D : sme2_dot_mla_add_sub_array_vg2_multi<"sub", 0b111011, MatrixOp64, ZZ_d_mul_r>;
defm SUB_VG4_M4Z4Z_D : sme2_dot_mla_add_sub_array_vg4_multi<"sub", 0b111011, MatrixOp64, ZZZZ_d_mul_r>;
-defm ADDA_VG2_M2Z2Z_D : sme2_multivec_accum_add_sub_vg2_D<"add", 0b10>;
-defm ADDA_VG4_M4Z4Z_D : sme2_multivec_accum_add_sub_vg4_D<"add", 0b10>;
+defm ADDA_VG2_M2Z2Z_D : sme2_multivec_accum_add_sub_vg2<"add", 0b1010, MatrixOp64, ZZ_d_mul_r>;
+defm ADDA_VG4_M4Z4Z_D : sme2_multivec_accum_add_sub_vg4<"add", 0b1010, MatrixOp64, ZZZZ_d_mul_r>;
-defm SUBA_VG2_M2Z2Z_D : sme2_multivec_accum_add_sub_vg2_D<"sub", 0b11>;
-defm SUBA_VG4_M4Z4Z_D : sme2_multivec_accum_add_sub_vg4_D<"sub", 0b11>;
+defm SUBA_VG2_M2Z2Z_D : sme2_multivec_accum_add_sub_vg2<"sub", 0b1011, MatrixOp64, ZZ_d_mul_r>;
+defm SUBA_VG4_M4Z4Z_D : sme2_multivec_accum_add_sub_vg4<"sub", 0b1011, MatrixOp64, ZZZZ_d_mul_r>;
defm SDOT_VG2_M2ZZI_HtoD : sme2_multi_vec_array_vg2_index_64b<"sdot", 0b01, ZZ_h_mul_r, ZPR4b16>;
defm SDOT_VG4_M4ZZI_HtoD : sme2_multi_vec_array_vg4_index_64b<"sdot", 0b001, ZZZZ_h_mul_r, ZPR4b16>;
@@ -779,9 +779,100 @@ defm FMLS_VG4_M4ZZ_D : sme2_dot_mla_add_sub_array_vg24_single<"fmls", 0b1111001
defm FMLS_VG2_M2Z2Z_D : sme2_dot_mla_add_sub_array_vg2_multi<"fmls", 0b111001, MatrixOp64, ZZ_d_mul_r>;
defm FMLS_VG4_M4Z4Z_D : sme2_dot_mla_add_sub_array_vg4_multi<"fmls", 0b111001, MatrixOp64, ZZZZ_d_mul_r>;
-defm FADD_VG2_M2Z2Z_D : sme2_multivec_accum_add_sub_vg2_D<"fadd", 0b00>;
-defm FADD_VG4_M4Z4Z_D : sme2_multivec_accum_add_sub_vg4_D<"fadd", 0b00>;
+defm FADD_VG2_M2Z2Z_D : sme2_multivec_accum_add_sub_vg2<"fadd", 0b1000, MatrixOp64, ZZ_d_mul_r>;
+defm FADD_VG4_M4Z4Z_D : sme2_multivec_accum_add_sub_vg4<"fadd", 0b1000, MatrixOp64, ZZZZ_d_mul_r>;
-defm FSUB_VG2_M2Z2Z_D : sme2_multivec_accum_add_sub_vg2_D<"fsub", 0b01>;
-defm FSUB_VG4_M4Z4Z_D : sme2_multivec_accum_add_sub_vg4_D<"fsub", 0b01>;
+defm FSUB_VG2_M2Z2Z_D : sme2_multivec_accum_add_sub_vg2<"fsub", 0b1001, MatrixOp64, ZZ_d_mul_r>;
+defm FSUB_VG4_M4Z4Z_D : sme2_multivec_accum_add_sub_vg4<"fsub", 0b1001, MatrixOp64, ZZZZ_d_mul_r>;
+}
+
+let Predicates = [HasSME2p1] in {
+defm MOVAZ_ZMI : sme2p1_movaz_tile_to_vec<"movaz">;
+defm MOVAZ_2ZMI : sme2p1_movaz_tile_to_vec_vg2<"movaz">;
+defm MOVAZ_4ZMI : sme2p1_movaz_tile_to_vec_vg4<"movaz">;
+defm MOVAZ_VG2_2ZM : sme2p1_movaz_array_to_vec_vg2<"movaz">;
+defm MOVAZ_VG4_4ZM : sme2p1_movaz_array_to_vec_vg4<"movaz">;
+
+defm ZERO_MXI : sme2p1_zero_matrix<"zero">;
+
+defm LUTI2_S_2ZTZI : sme2p1_luti2_vector_vg2_index<"luti2">;
+defm LUTI2_S_4ZTZI : sme2p1_luti2_vector_vg4_index<"luti2">;
+
+defm LUTI4_S_2ZTZI : sme2p1_luti4_vector_vg2_index<"luti4">;
+defm LUTI4_S_4ZTZI : sme2p1_luti4_vector_vg4_index<"luti4">;
+}
+
+let Predicates = [HasSME2p1, HasSMEF16F16] in {
+defm FADD_VG2_M2Z_H : sme2_multivec_accum_add_sub_vg2<"fadd", 0b0100, MatrixOp16, ZZ_h_mul_r>;
+defm FADD_VG4_M4Z_H : sme2_multivec_accum_add_sub_vg4<"fadd", 0b0100, MatrixOp16, ZZZZ_h_mul_r>;
+defm FSUB_VG2_M2Z_H : sme2_multivec_accum_add_sub_vg2<"fsub", 0b0101, MatrixOp16, ZZ_h_mul_r>;
+defm FSUB_VG4_M4Z_H : sme2_multivec_accum_add_sub_vg4<"fsub", 0b0101, MatrixOp16, ZZZZ_h_mul_r>;
+
+defm FMLA_VG2_M2ZZI_H : sme2p1_multi_vec_array_vg2_index_16b<"fmla", 0b00>;
+defm FMLA_VG4_M4ZZI_H : sme2p1_multi_vec_array_vg4_index_16b<"fmla", 0b00>;
+defm FMLA_VG2_M2ZZ_H : sme2_dot_mla_add_sub_array_vg24_single<"fmla", 0b0011100, MatrixOp16, ZZ_h, ZPR4b16>;
+defm FMLA_VG4_M4ZZ_H : sme2_dot_mla_add_sub_array_vg24_single<"fmla", 0b0111100, MatrixOp16, ZZZZ_h, ZPR4b16>;
+defm FMLA_VG2_M2Z4Z_H : sme2_dot_mla_add_sub_array_vg2_multi<"fmla", 0b010001, MatrixOp16, ZZ_h_mul_r>;
+defm FMLA_VG4_M4Z4Z_H : sme2_dot_mla_add_sub_array_vg4_multi<"fmla", 0b010001, MatrixOp16, ZZZZ_h_mul_r>;
+
+defm FMLS_VG2_M2ZZI_H : sme2p1_multi_vec_array_vg2_index_16b<"fmls", 0b01>;
+defm FMLS_VG4_M4ZZI_H : sme2p1_multi_vec_array_vg4_index_16b<"fmls", 0b01>;
+defm FMLS_VG2_M2ZZ_H : sme2_dot_mla_add_sub_array_vg24_single<"fmls", 0b0011101, MatrixOp16, ZZ_h, ZPR4b16>;
+defm FMLS_VG4_M4ZZ_H : sme2_dot_mla_add_sub_array_vg24_single<"fmls", 0b0111101, MatrixOp16, ZZZZ_h, ZPR4b16>;
+defm FMLS_VG2_M2Z2Z_H : sme2_dot_mla_add_sub_array_vg2_multi<"fmls", 0b010011, MatrixOp16, ZZ_h_mul_r>;
+defm FMLS_VG4_M4Z2Z_H : sme2_dot_mla_add_sub_array_vg4_multi<"fmls", 0b010011, MatrixOp16, ZZZZ_h_mul_r>;
+
+defm FCVT_2ZZ_H : sme2p1_fp_cvt_vector_vg2_single<"fcvt", 0b0>;
+defm FCVTL_2ZZ_H : sme2p1_fp_cvt_vector_vg2_single<"fcvtl", 0b1>;
+
+defm FMOPA_MPPZZ_H : sme2p1_fmop_tile_fp16<"fmopa", 0b0, 0b0>;
+defm FMOPS_MPPZZ_H : sme2p1_fmop_tile_fp16<"fmops", 0b0, 0b1>;
+}
+
+let Predicates = [HasSME2p1, HasB16B16] in {
+defm BFADD_VG2_M2Z_H : sme2_multivec_accum_add_sub_vg2<"bfadd", 0b1100, MatrixOp16, ZZ_h_mul_r>;
+defm BFADD_VG4_M4Z_H : sme2_multivec_accum_add_sub_vg4<"bfadd", 0b1100, MatrixOp16, ZZZZ_h_mul_r>;
+defm BFSUB_VG2_M2Z_H : sme2_multivec_accum_add_sub_vg2<"bfsub", 0b1101, MatrixOp16, ZZ_h_mul_r>;
+defm BFSUB_VG4_M4Z_H : sme2_multivec_accum_add_sub_vg4<"bfsub", 0b1101, MatrixOp16, ZZZZ_h_mul_r>;
+
+defm BFMLA_VG2_M2ZZI : sme2p1_multi_vec_array_vg2_index_16b<"bfmla", 0b10>;
+defm BFMLA_VG4_M4ZZI : sme2p1_multi_vec_array_vg4_index_16b<"bfmla", 0b10>;
+defm BFMLA_VG2_M2ZZ : sme2_dot_mla_add_sub_array_vg24_single<"bfmla", 0b1011100, MatrixOp16, ZZ_h, ZPR4b16>;
+defm BFMLA_VG4_M4ZZ : sme2_dot_mla_add_sub_array_vg24_single<"bfmla", 0b1111100, MatrixOp16, ZZZZ_h, ZPR4b16>;
+defm BFMLA_VG2_M2Z2Z : sme2_dot_mla_add_sub_array_vg2_multi<"bfmla", 0b110001, MatrixOp16, ZZ_h_mul_r>;
+defm BFMLA_VG4_M4Z4Z : sme2_dot_mla_add_sub_array_vg4_multi<"bfmla", 0b110001, MatrixOp16, ZZZZ_h_mul_r>;
+
+defm BFMLS_VG2_M2ZZI : sme2p1_multi_vec_array_vg2_index_16b<"bfmls", 0b11>;
+defm BFMLS_VG4_M4ZZI : sme2p1_multi_vec_array_vg4_index_16b<"bfmls", 0b11>;
+defm BFMLS_VG2_M2ZZ : sme2_dot_mla_add_sub_array_vg24_single<"bfmls", 0b1011101, MatrixOp16, ZZ_h, ZPR4b16>;
+defm BFMLS_VG4_M4ZZ : sme2_dot_mla_add_sub_array_vg24_single<"bfmls", 0b1111101, MatrixOp16, ZZZZ_h, ZPR4b16>;
+defm BFMLS_VG2_M2Z2Z : sme2_dot_mla_add_sub_array_vg2_multi<"bfmls", 0b110011, MatrixOp16, ZZ_h_mul_r>;
+defm BFMLS_VG4_M4Z4Z : sme2_dot_mla_add_sub_array_vg4_multi<"bfmls", 0b110011, MatrixOp16, ZZZZ_h_mul_r>;
+
+
+defm BFMAX_VG2_2ZZ : sme2p1_bf_max_min_vector_vg2_single<"bfmax", 0b0010000>;
+defm BFMAX_VG4_4ZZ : sme2p1_bf_max_min_vector_vg4_single<"bfmax", 0b0010000>;
+defm BFMAX_VG2_2Z2Z : sme2p1_bf_max_min_vector_vg2_multi<"bfmax", 0b0010000>;
+defm BFMAX_VG4_4Z2Z : sme2p1_bf_max_min_vector_vg4_multi<"bfmax", 0b0010000>;
+
+defm BFMIN_VG2_2ZZ : sme2p1_bf_max_min_vector_vg2_single<"bfmin", 0b0010001>;
+defm BFMIN_VG4_4ZZ : sme2p1_bf_max_min_vector_vg4_single<"bfmin", 0b0010001>;
+defm BFMIN_VG2_2Z2Z : sme2p1_bf_max_min_vector_vg2_multi<"bfmin", 0b0010001>;
+defm BFMIN_VG4_4Z2Z : sme2p1_bf_max_min_vector_vg4_multi<"bfmin", 0b0010001>;
+
+defm BFMAXNM_VG2_2ZZ : sme2p1_bf_max_min_vector_vg2_single<"bfmaxnm", 0b0010010>;
+defm BFMAXNM_VG4_4ZZ : sme2p1_bf_max_min_vector_vg4_single<"bfmaxnm", 0b0010010>;
+defm BFMAXNM_VG2_2Z2Z : sme2p1_bf_max_min_vector_vg2_multi<"bfmaxnm", 0b0010010>;
+defm BFMAXNM_VG4_4Z2Z : sme2p1_bf_max_min_vector_vg4_multi<"bfmaxnm", 0b0010010>;
+
+defm BFMINNM_VG2_2ZZ : sme2p1_bf_max_min_vector_vg2_single<"bfminnm", 0b0010011>;
+defm BFMINNM_VG4_4ZZ : sme2p1_bf_max_min_vector_vg4_single<"bfminnm", 0b0010011>;
+defm BFMINNM_VG2_2Z2Z : sme2p1_bf_max_min_vector_vg2_multi<"bfminnm", 0b0010011>;
+defm BFMINNM_VG4_4Z2Z : sme2p1_bf_max_min_vector_vg4_multi<"bfminnm", 0b0010011>;
+
+defm BFCLAMP_VG2_2ZZZ: sme2p1_bfclamp_vector_vg2_multi<"bfclamp">;
+defm BFCLAMP_VG4_4ZZZ: sme2p1_bfclamp_vector_vg4_multi<"bfclamp">;
+
+defm BFMOPA_MPPZZ_H : sme2p1_fmop_tile_fp16<"bfmopa", 0b1, 0b0>;
+defm BFMOPS_MPPZZ_H : sme2p1_fmop_tile_fp16<"bfmops", 0b1, 0b1>;
}
diff --git a/llvm/lib/Target/AArch64/AArch64SchedA64FX.td b/llvm/lib/Target/AArch64/AArch64SchedA64FX.td
index afdb1d47d39f..cb88eddc2b22 100644
--- a/llvm/lib/Target/AArch64/AArch64SchedA64FX.td
+++ b/llvm/lib/Target/AArch64/AArch64SchedA64FX.td
@@ -23,7 +23,7 @@ def A64FXModel : SchedMachineModel {
list<Predicate> UnsupportedFeatures =
[HasSVE2, HasSVE2AES, HasSVE2SM4, HasSVE2SHA3, HasSVE2BitPerm, HasPAuth,
HasSVE2orSME, HasMTE, HasMatMulInt8, HasBF16, HasSME2, HasSME2p1, HasSVE2p1,
- HasSVE2p1_or_HasSME2p1];
+ HasSVE2p1_or_HasSME2p1, HasSMEF16F16];
let FullInstRWOverlapCheck = 0;
}
diff --git a/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp b/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
index b8a23876dc90..39501c2cc868 100644
--- a/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
+++ b/llvm/lib/Target/AArch64/AsmParser/AArch64AsmParser.cpp
@@ -3552,6 +3552,7 @@ static const struct Extension {
{"rme", {AArch64::FeatureRME}},
{"sme", {AArch64::FeatureSME}},
{"sme-f64f64", {AArch64::FeatureSMEF64F64}},
+ {"sme-f16f16", {AArch64::FeatureSMEF16F16}},
{"sme-i16i64", {AArch64::FeatureSMEI16I64}},
{"sme2", {AArch64::FeatureSME2}},
{"sme2p1", {AArch64::FeatureSME2p1}},
diff --git a/llvm/lib/Target/AArch64/SMEInstrFormats.td b/llvm/lib/Target/AArch64/SMEInstrFormats.td
index 2a9b3e60052d..0aede40e8b61 100644
--- a/llvm/lib/Target/AArch64/SMEInstrFormats.td
+++ b/llvm/lib/Target/AArch64/SMEInstrFormats.td
@@ -36,7 +36,7 @@ class sme_outer_product_pseudo<ZPRRegOp zpr_ty>
let usesCustomInserter = 1;
}
-class sme_fp_outer_product_inst<bit S, bit sz, MatrixTileOperand za_ty,
+class sme_fp_outer_product_inst<bit S, bits<2> sz, bit op, MatrixTileOperand za_ty,
ZPRRegOp zpr_ty, string mnemonic>
: I<(outs za_ty:$ZAda),
(ins za_ty:$_ZAda, PPR3bAny:$Pn, PPR3bAny:$Pm, zpr_ty:$Zn, zpr_ty:$Zm),
@@ -47,21 +47,22 @@ class sme_fp_outer_product_inst<bit S, bit sz, MatrixTileOperand za_ty,
bits<3> Pm;
bits<3> Pn;
bits<5> Zn;
- let Inst{31-23} = 0b100000001;
- let Inst{22} = sz;
- let Inst{21} = 0b0;
+ let Inst{31-25} = 0b1000000;
+ let Inst{24} = op;
+ let Inst{23} = 0b1;
+ let Inst{22-21} = sz;
let Inst{20-16} = Zm;
let Inst{15-13} = Pm;
let Inst{12-10} = Pn;
let Inst{9-5} = Zn;
let Inst{4} = S;
- let Inst{3} = 0b0;
+ let Inst{3} = op;
let Constraints = "$ZAda = $_ZAda";
}
multiclass sme_outer_product_fp32<bit S, string mnemonic, SDPatternOperator op> {
- def NAME : sme_fp_outer_product_inst<S, 0b0, TileOp32, ZPR32, mnemonic> {
+ def NAME : sme_fp_outer_product_inst<S, 0b00, 0b0, TileOp32, ZPR32, mnemonic> {
bits<2> ZAda;
let Inst{1-0} = ZAda;
let Inst{2} = 0b0;
@@ -75,7 +76,7 @@ multiclass sme_outer_product_fp32<bit S, string mnemonic, SDPatternOperator op>
}
multiclass sme_outer_product_fp64<bit S, string mnemonic, SDPatternOperator op> {
- def NAME : sme_fp_outer_product_inst<S, 0b1, TileOp64, ZPR64, mnemonic> {
+ def NAME : sme_fp_outer_product_inst<S, 0b10, 0b0, TileOp64, ZPR64, mnemonic> {
bits<3> ZAda;
let Inst{2-0} = ZAda;
}
@@ -87,6 +88,14 @@ multiclass sme_outer_product_fp64<bit S, string mnemonic, SDPatternOperator op>
(!cast<Instruction>(NAME # _PSEUDO) timm32_0_7:$tile, $pn, $pm, $zn, $zm)>;
}
+multiclass sme2p1_fmop_tile_fp16<string mnemonic, bit bf, bit s>{
+ def NAME : sme_fp_outer_product_inst<s, {0,bf}, 0b1, TileOp16, ZPR16, mnemonic> {
+ bits<1> ZAda;
+ let Inst{2-1} = 0b00;
+ let Inst{0} = ZAda;
+ }
+}
+
class sme_int_outer_product_inst<bits<3> opc, bit sz, bit sme2,
MatrixTileOperand za_ty, ZPRRegOp zpr_ty,
string mnemonic>
@@ -1315,81 +1324,65 @@ multiclass sme2_dot_mla_add_sub_array_vg4_multi<string mnemonic, bits<6> op,
//===----------------------------------------------------------------------===//
// SME2 multiple vectors binary two or four registers
-class sme2_multivec_accum_add_sub<string mnemonic, bit sz, bit is_int,
- bit s, MatrixOperand matrix_ty,
- RegisterOperand vector_ty,
- string vg_acronym>
+class sme2_multivec_accum_add_sub<string mnemonic, bit sz, bit vg4, bits<3> op,
+ MatrixOperand matrix_ty,
+ RegisterOperand vector_ty>
: I<(outs matrix_ty:$ZAdn),
(ins matrix_ty:$_ZAdn, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3, vector_ty:$Zm),
- mnemonic, "\t$ZAdn[$Rv, $imm3, " # vg_acronym # "], $Zm",
+ mnemonic, "\t$ZAdn[$Rv, $imm3, " # !if(vg4, "vgx4", "vgx2") # "], $Zm",
"", []>, Sched<[]> {
bits<2> Rv;
bits<3> imm3;
let Inst{31-23} = 0b110000011;
let Inst{22} = sz;
- let Inst{21-17} = 0b10000;
+ let Inst{21-19} = 0b100;
+ let Inst{18} = op{2};
+ let Inst{17} = 0b0;
+ let Inst{16} = vg4;
let Inst{15} = 0b0;
let Inst{14-13} = Rv;
let Inst{12-10} = 0b111;
let Inst{5} = 0b0;
- let Inst{4} = is_int;
- let Inst{3} = s;
+ let Inst{4-3} = op{1-0};
let Inst{2-0} = imm3;
let Constraints = "$ZAdn = $_ZAdn";
}
-class sme2_multivec_accum_add_sub_vg2<string mnemonic, bit sz, bit is_int,
- bit s, MatrixOperand matrix_ty,
+class sme2_multivec_accum_add_sub_vg2<string mnemonic, bit sz, bits<3> op,
+ MatrixOperand matrix_ty,
RegisterOperand vector_ty>
- : sme2_multivec_accum_add_sub<mnemonic, sz, is_int, s,
- matrix_ty, vector_ty , "vgx2"> {
+ : sme2_multivec_accum_add_sub<mnemonic, sz, 0b0, op, matrix_ty, vector_ty> {
bits<4> Zm;
- let Inst{16} = 0b0;
let Inst{9-6} = Zm;
}
-multiclass sme2_multivec_accum_add_sub_vg2_S<string mnemonic, bits<2> op> {
- def NAME : sme2_multivec_accum_add_sub_vg2<mnemonic, 0b0, op{1}, op{0},
- MatrixOp32, ZZ_s_mul_r>;
+multiclass sme2_multivec_accum_add_sub_vg2<string mnemonic, bits<4> op,
+ MatrixOperand matrix_ty,
+ RegisterOperand vector_ty> {
+ def NAME : sme2_multivec_accum_add_sub_vg2<mnemonic, op{3}, op{2-0}, matrix_ty, vector_ty>;
def : InstAlias<mnemonic # "\t$ZAdn[$Rv, $imm3], $Zm",
- (!cast<Instruction>(NAME) MatrixOp32:$ZAdn, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3, ZZ_s_mul_r:$Zm), 0>;
+ (!cast<Instruction>(NAME) matrix_ty:$ZAdn, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3, vector_ty:$Zm), 0>;
}
-multiclass sme2_multivec_accum_add_sub_vg2_D<string mnemonic, bits<2> op> {
- def NAME : sme2_multivec_accum_add_sub_vg2<mnemonic, 0b1, op{1}, op{0},
- MatrixOp64, ZZ_d_mul_r>;
- def : InstAlias<mnemonic # "\t$ZAdn[$Rv, $imm3], $Zm",
- (!cast<Instruction>(NAME) MatrixOp64:$ZAdn, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3, ZZ_d_mul_r:$Zm), 0>;
-
-}
-
-class sme2_multivec_accum_add_sub_vg4<string mnemonic, bit sz, bit is_int,
- bit s, MatrixOperand matrix_ty,
+class sme2_multivec_accum_add_sub_vg4<string mnemonic, bit sz, bits<3> op,
+ MatrixOperand matrix_ty,
RegisterOperand vector_ty>
- : sme2_multivec_accum_add_sub<mnemonic, sz, is_int, s,
- matrix_ty, vector_ty , "vgx4"> {
+ : sme2_multivec_accum_add_sub<mnemonic, sz, 0b1, op, matrix_ty, vector_ty> {
bits<3> Zm;
- let Inst{16} = 0b1;
let Inst{9-7} = Zm;
let Inst{6} = 0b0;
}
-multiclass sme2_multivec_accum_add_sub_vg4_S<string mnemonic, bits<2> op> {
- def NAME : sme2_multivec_accum_add_sub_vg4<mnemonic, 0b0, op{1}, op{0},
- MatrixOp32, ZZZZ_s_mul_r>;
-
- def : InstAlias<mnemonic # "\t$ZAdn[$Rv, $imm3], $Zm",
- (!cast<Instruction>(NAME) MatrixOp32:$ZAdn, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3, ZZZZ_s_mul_r:$Zm), 0>;
-}
+multiclass sme2_multivec_accum_add_sub_vg4<string mnemonic, bits<4> op,
+ MatrixOperand matrix_ty,
+ RegisterOperand vector_ty> {
+ def NAME : sme2_multivec_accum_add_sub_vg4<mnemonic, op{3}, op{2-0}, matrix_ty, vector_ty>;
-multiclass sme2_multivec_accum_add_sub_vg4_D<string mnemonic, bits<2> op> {
- def NAME : sme2_multivec_accum_add_sub_vg4<mnemonic, 0b1, op{1}, op{0},
- MatrixOp64, ZZZZ_d_mul_r>;
def : InstAlias<mnemonic # "\t$ZAdn[$Rv, $imm3], $Zm",
- (!cast<Instruction>(NAME) MatrixOp64:$ZAdn, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3, ZZZZ_d_mul_r:$Zm), 0>;
+ (!cast<Instruction>(NAME) matrix_ty:$ZAdn, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3, vector_ty:$Zm), 0>;
}
//===----------------------------------------------------------------------===//
@@ -1430,6 +1423,12 @@ multiclass sme2_int_sve_destructive_vector_vg2_single<string mnemonic, bits<7> o
def _D : sme2_sve_destructive_vector_vg2_single<0b11, op, ZZ_d_mul_r, ZPR4b64, mnemonic>;
}
+// SME2.1 fmax/fmin instructions.
+multiclass sme2p1_bf_max_min_vector_vg2_single<string mnemonic, bits<7>op> {
+ def _H : sme2_sve_destructive_vector_vg2_single<0b00, op, ZZ_h_mul_r,
+ ZPR4b16, mnemonic>;
+}
+
class sme2_sve_destructive_vector_vg4_single<bits<2> sz, bits<7> op,
RegisterOperand vector_ty,
ZPRRegOp zpr_ty,
@@ -1465,6 +1464,12 @@ multiclass sme2_int_sve_destructive_vector_vg4_single<string mnemonic, bits<7> o
def _D : sme2_sve_destructive_vector_vg4_single<0b11, op, ZZZZ_d_mul_r, ZPR4b64, mnemonic>;
}
+// SME2.1 fmax/fmin instructions.
+multiclass sme2p1_bf_max_min_vector_vg4_single<string mnemonic, bits<7>op> {
+ def _H : sme2_sve_destructive_vector_vg4_single<0b00, op, ZZZZ_h_mul_r,
+ ZPR4b16, mnemonic>;
+}
+
class sme2_sve_destructive_vector_vg2_multi<bits<2> sz, bits<7> op,
RegisterOperand vector_ty,
string mnemonic>
@@ -1498,6 +1503,12 @@ multiclass sme2_int_sve_destructive_vector_vg2_multi<string mnemonic, bits<7> op
def _D : sme2_sve_destructive_vector_vg2_multi<0b11, op, ZZ_d_mul_r, mnemonic>;
}
+// SME2.1 fmax/fmin instructions.
+multiclass sme2p1_bf_max_min_vector_vg2_multi<string mnemonic, bits<7>op> {
+ def _H : sme2_sve_destructive_vector_vg2_multi<0b00, op, ZZ_h_mul_r,
+ mnemonic>;
+}
+
class sme2_sve_destructive_vector_vg4_multi<bits<2> sz, bits<7> op,
RegisterOperand vector_ty,
string mnemonic>
@@ -1532,6 +1543,11 @@ multiclass sme2_int_sve_destructive_vector_vg4_multi<string mnemonic, bits<7> op
def _D : sme2_sve_destructive_vector_vg4_multi<0b11, op, ZZZZ_d_mul_r, mnemonic>;
}
+// SME2.1 fmax/fmin instructions.
+multiclass sme2p1_bf_max_min_vector_vg4_multi<string mnemonic, bits<7>op> {
+ def _H : sme2_sve_destructive_vector_vg4_multi<0b00, op, ZZZZ_h_mul_r,
+ mnemonic>;
+}
//===----------------------------------------------------------------------===//
// SME2 Multi-vector - Index/Single/Multi Array Vectors FMA sources
@@ -1862,8 +1878,7 @@ multiclass sme2_cvt_vg2_single<string mnemonic, bits<4> op> {
def NAME : sme2_cvt_vg2_single<mnemonic, op>;
}
-
-class sme2_unpk_vector_vg2<bits<2>sz, bit u, RegisterOperand first_ty,
+class sme2_cvt_unpk_vector_vg2<bits<2>sz, bits<3> op, bit u, RegisterOperand first_ty,
RegisterOperand second_ty, string mnemonic>
: I<(outs first_ty:$Zd), (ins second_ty:$Zn),
mnemonic, "\t$Zd, $Zn", "", []>, Sched<[]> {
@@ -1871,7 +1886,9 @@ class sme2_unpk_vector_vg2<bits<2>sz, bit u, RegisterOperand first_ty,
bits<4> Zd;
let Inst{31-24} = 0b11000001;
let Inst{23-22} = sz;
- let Inst{21-10} = 0b100101111000;
+ let Inst{21-19} = 0b100;
+ let Inst{18-16} = op;
+ let Inst{15-10} = 0b111000;
let Inst{9-5} = Zn;
let Inst{4-1} = Zd;
let Inst{0} = u;
@@ -1879,11 +1896,15 @@ class sme2_unpk_vector_vg2<bits<2>sz, bit u, RegisterOperand first_ty,
// SME2 multi-vec unpack two registers
multiclass sme2_unpk_vector_vg2<string mnemonic, bit u> {
- def _H : sme2_unpk_vector_vg2<0b01, u, ZZ_h_mul_r, ZPR8, mnemonic>;
- def _S : sme2_unpk_vector_vg2<0b10, u, ZZ_s_mul_r, ZPR16, mnemonic>;
- def _D : sme2_unpk_vector_vg2<0b11, u, ZZ_d_mul_r, ZPR32, mnemonic>;
+ def _H : sme2_cvt_unpk_vector_vg2<0b01, 0b101, u, ZZ_h_mul_r, ZPR8, mnemonic>;
+ def _S : sme2_cvt_unpk_vector_vg2<0b10, 0b101, u, ZZ_s_mul_r, ZPR16, mnemonic>;
+ def _D : sme2_cvt_unpk_vector_vg2<0b11, 0b101, u, ZZ_d_mul_r, ZPR32, mnemonic>;
}
+// SME2.1 multi-vec convert two registers
+multiclass sme2p1_fp_cvt_vector_vg2_single<string mnemonic, bit l> {
+ def _S : sme2_cvt_unpk_vector_vg2<0b10, 0b000, l, ZZ_s_mul_r, ZPR16, mnemonic>;
+}
class sme2_cvt_vg4_single<bit sz, bits<3> op, RegisterOperand first_ty,
RegisterOperand second_ty, string mnemonic>
@@ -1983,11 +2004,17 @@ multiclass sme2_zip_vector_vg2<string mnemonic, bit op> {
def _Q : sme2_zip_clamp_vector_vg2_multi<0b00, 0b101, op, ZZ_q_mul_r, ZPR128, mnemonic>;
}
+// SME2.1 multi-vec FCLAMP two registers
+multiclass sme2p1_bfclamp_vector_vg2_multi<string mnemonic> {
+ def _H : sme2_zip_clamp_vector_vg2_multi<0b00, 0b000, 0b0, ZZ_h_mul_r, ZPR16,
+ mnemonic>;
+}
+
class sme2_clamp_vector_vg4_multi<bits<2> sz, bits<3> op1, bit u,
RegisterOperand multi_vector_ty,
ZPRRegOp vector_ty, string mnemonic>
: sme2_zip_clamp_vector_vg24_multi<sz, op1, u, multi_vector_ty, vector_ty,
- mnemonic>{
+ mnemonic>{
bits<3> Zd;
let Inst{4-2} = Zd;
let Inst{1} = 0b0;
@@ -2006,30 +2033,34 @@ multiclass sme2_int_clamp_vector_vg4_multi<string mnemonic, bit u>{
def _D : sme2_clamp_vector_vg4_multi<0b11, 0b011, u, ZZZZ_d_mul_r, ZPR64, mnemonic>;
}
+// SME2.1 multi-vec FCLAMP four registers
+multiclass sme2p1_bfclamp_vector_vg4_multi<string mnemonic> {
+ def _H : sme2_clamp_vector_vg4_multi<0b00, 0b010, 0b0, ZZZZ_h_mul_r, ZPR16,
+ mnemonic>;
+}
//===----------------------------------------------------------------------===//
// SME2 Dot Products and MLA
-// SME2 multi-vec ternary indexed two registers 32-bit
-class sme2_multi_vec_array_vg2_index_32b<bits<4> op,
- RegisterOperand multi_vector_ty,
- ZPRRegOp vector_ty,
- string mnemonic>
- : I<(outs MatrixOp32:$ZAda),
- (ins MatrixOp32:$_ZAda, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3,
- multi_vector_ty:$Zn, vector_ty:$Zm, VectorIndexS:$i2),
- mnemonic, "\t$ZAda[$Rv, $imm3, vgx2], $Zn, $Zm$i2",
+class sme2_multi_vec_array_vg2_index<bit sz, bits<6> op, MatrixOperand matrix_ty,
+ RegisterOperand multi_vector_ty,
+ ZPRRegOp vector_ty, Operand index_ty,
+ string mnemonic>
+ : I<(outs matrix_ty:$ZAda),
+ (ins matrix_ty:$_ZAda, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3,
+ multi_vector_ty:$Zn, vector_ty:$Zm, index_ty:$i),
+ mnemonic, "\t$ZAda[$Rv, $imm3, vgx2], $Zn, $Zm$i",
"", []>, Sched<[]> {
bits<4> Zm;
bits<2> Rv;
- bits<2> i2;
bits<4> Zn;
bits<3> imm3;
- let Inst{31-20} = 0b110000010101;
+ let Inst{31-23} = 0b110000010;
+ let Inst{22} = sz;
+ let Inst{21-20} = 0b01;
let Inst{19-16} = Zm;
let Inst{15} = 0b0;
let Inst{14-13} = Rv;
- let Inst{12} = op{3};
- let Inst{11-10} = i2;
+ let Inst{12-10} = op{5-3};
let Inst{9-6} = Zn;
let Inst{5-3} = op{2-0};
let Inst{2-0} = imm3;
@@ -2037,15 +2068,32 @@ class sme2_multi_vec_array_vg2_index_32b<bits<4> op,
let Constraints = "$ZAda = $_ZAda";
}
+// SME2 multi-vec ternary indexed two registers 32-bit
multiclass sme2_multi_vec_array_vg2_index_32b<string mnemonic, bits<4> op,
RegisterOperand multi_vector_ty,
ZPRRegOp vector_ty> {
- def NAME : sme2_multi_vec_array_vg2_index_32b<op, multi_vector_ty, vector_ty,
- mnemonic>;
-
- def : InstAlias<mnemonic # "\t$ZAda[$Rv, $imm3], $Zn, $Zm$i2",
+ def NAME : sme2_multi_vec_array_vg2_index<0b1, {op{3},?,?,op{2-0}}, MatrixOp32, multi_vector_ty, vector_ty,
+ VectorIndexS, mnemonic> {
+ bits<2> i;
+ let Inst{11-10} = i;
+ }
+ def : InstAlias<mnemonic # "\t$ZAda[$Rv, $imm3], $Zn, $Zm$i",
(!cast<Instruction>(NAME) MatrixOp32:$ZAda, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3,
- multi_vector_ty:$Zn, vector_ty:$Zm, VectorIndexS:$i2), 0>;
+ multi_vector_ty:$Zn, vector_ty:$Zm, VectorIndexS:$i), 0>;
+}
+
+// SME2.1 multi-vec ternary indexed two registers 16-bit
+multiclass sme2p1_multi_vec_array_vg2_index_16b<string mnemonic, bits<2> op> {
+ def NAME : sme2_multi_vec_array_vg2_index<0b0, {0b1,?,?,op,?}, MatrixOp16,
+ ZZ_h_mul_r, ZPR4b16,
+ VectorIndexH, mnemonic> {
+ bits<3> i;
+ let Inst{11-10} = i{2-1};
+ let Inst{3} = i{0};
+ }
+ def : InstAlias<mnemonic # "\t$ZAda[$Rv, $imm3], $Zn, $Zm$i",
+ (!cast<Instruction>(NAME) MatrixOp16:$ZAda, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3,
+ ZZ_h_mul_r:$Zn, ZPR4b16:$Zm, VectorIndexH:$i), 0>;
}
// SME2 multi-vec ternary indexed two registers 64-bit
@@ -2089,28 +2137,26 @@ multiclass sme2_multi_vec_array_vg2_index_64b<string mnemonic, bits<2> op,
multi_vector_ty:$Zn, vector_ty:$Zm, VectorIndexD:$i1), 0>;
}
-// SME2 multi-vec ternary indexed four registers 32-bit
-
-class sme2_multi_vec_array_vg4_index_32b<bits<4> op,
- RegisterOperand multi_vector_ty,
- ZPRRegOp vector_ty,
- string mnemonic>
- : I<(outs MatrixOp32:$ZAda),
- (ins MatrixOp32:$_ZAda, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3,
- multi_vector_ty:$Zn, vector_ty:$Zm, VectorIndexS:$i2),
- mnemonic, "\t$ZAda[$Rv, $imm3, vgx4], $Zn, $Zm$i2",
+class sme2_multi_vec_array_vg4_index<bit sz, bits<6> op, MatrixOperand matrix_ty,
+ RegisterOperand multi_vector_ty,
+ ZPRRegOp vector_ty, Operand index_ty,
+ string mnemonic>
+ : I<(outs matrix_ty:$ZAda),
+ (ins matrix_ty:$_ZAda, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3,
+ multi_vector_ty:$Zn, vector_ty:$Zm, index_ty:$i),
+ mnemonic, "\t$ZAda[$Rv, $imm3, vgx4], $Zn, $Zm$i",
"", []>, Sched<[]> {
bits<4> Zm;
bits<2> Rv;
- bits<2> i2;
bits<3> Zn;
bits<3> imm3;
- let Inst{31-20} = 0b110000010101;
+ let Inst{31-23} = 0b110000010;
+ let Inst{22} = sz;
+ let Inst{21-20} = 0b01;
let Inst{19-16} = Zm;
let Inst{15} = 0b1;
let Inst{14-13} = Rv;
- let Inst{12} = op{3};
- let Inst{11-10} = i2;
+ let Inst{12-10} = op{5-3};
let Inst{9-7} = Zn;
let Inst{6} = 0b0;
let Inst{5-3} = op{2-0};
@@ -2119,15 +2165,34 @@ class sme2_multi_vec_array_vg4_index_32b<bits<4> op,
let Constraints = "$ZAda = $_ZAda";
}
+// SME2 multi-vec ternary indexed four registers 32-bit
multiclass sme2_multi_vec_array_vg4_index_32b<string mnemonic, bits<4> op,
RegisterOperand multi_vector_ty,
ZPRRegOp vector_ty> {
- def NAME : sme2_multi_vec_array_vg4_index_32b<op, multi_vector_ty, vector_ty,
- mnemonic>;
+ def NAME : sme2_multi_vec_array_vg4_index<0b1, {op{3},?,?,op{2-0}}, MatrixOp32, multi_vector_ty,
+ vector_ty, VectorIndexS, mnemonic>{
+ bits<2> i;
+ let Inst{11-10} = i;
+ }
- def : InstAlias<mnemonic # "\t$ZAda[$Rv, $imm3], $Zn, $Zm$i2",
+ def : InstAlias<mnemonic # "\t$ZAda[$Rv, $imm3], $Zn, $Zm$i",
(!cast<Instruction>(NAME) MatrixOp32:$ZAda, MatrixIndexGPR32Op8_11:$Rv, sme_elm_idx0_7:$imm3,
- multi_vector_ty:$Zn, vector_ty:$Zm, VectorIndexS:$i2), 0>;
+ multi_vector_ty:$Zn, vector_ty:$Zm, VectorIndexS:$i), 0>;
+}
+
+// SME2.1 multi-vec ternary indexed four registers 16-bit
+multiclass sme2p1_multi_vec_array_vg4_index_16b<string mnemonic, bits<2> op> {
+ def NAME : sme2_multi_vec_array_vg4_index<0b0,{0b1,?,?,op,?}, MatrixOp16,
+ ZZZZ_h_mul_r, ZPR4b16,
+ VectorIndexH, mnemonic>{
+ bits<3> i;
+ let Inst{11-10} = i{2-1};
+ let Inst{3} = i{0};
+ }
+
+ def : InstAlias<mnemonic # "\t$ZAda[$Rv, $imm3], $Zn, $Zm$i",
+ (!cast<Instruction>(NAME) MatrixOp16:$ZAda, MatrixIndexGPR32Op8_11:$Rv,
+ sme_elm_idx0_7:$imm3, ZZZZ_h_mul_r:$Zn, ZPR4b16:$Zm, VectorIndexH:$i), 0>;
}
// SME2 multi-vec ternary indexed four registers 64-bit
@@ -3153,12 +3218,12 @@ multiclass sme2_mova_vec_to_array_vg4_multi<string mnemonic> {
}
-class sme2_mova_tile_to_vec_vg2_multi_base<bits<2> sz, bit v,
+class sme2_mova_tile_to_vec_vg2_multi_base<bits<2> sz, bit v, bits<3> op,
RegisterOperand vector_ty,
RegisterOperand tile_ty,
Operand index_ty,
string mnemonic>
- : I<(outs vector_ty:$Zd),
+ : I<!if(op{1}, (outs vector_ty:$Zd, tile_ty:$_ZAn), (outs vector_ty:$Zd)),
(ins tile_ty:$ZAn, MatrixIndexGPR32Op12_15:$Rs, index_ty:$imm),
mnemonic,
"\t$Zd, $ZAn[$Rs, $imm, vgx2]",
@@ -3170,9 +3235,12 @@ class sme2_mova_tile_to_vec_vg2_multi_base<bits<2> sz, bit v,
let Inst{21-16} = 0b000110;
let Inst{15} = v;
let Inst{14-13} = Rs;
- let Inst{12-8} = 0b00000;
+ let Inst{12-11} = 0b00;
+ let Inst{10-8} = op;
let Inst{4-1} = Zd;
let Inst{0} = 0b0;
+
+ let Constraints = !if(op{1}, "$ZAn = $_ZAn", "");
}
multiclass sme2_mova_tile_or_array_to_vec_aliases<int op, Instruction inst,
@@ -3190,7 +3258,7 @@ def : InstAlias<mnemonic # "\t$Zd, $ZAn[$Rs, $imm" # !if(!eq(vg_acronym, ""), ""
// SME2 move tile to vector, two registers
multiclass sme2_mova_tile_to_vec_vg2_multi_inst<bit v, string mnemonic> {
- def _B : sme2_mova_tile_to_vec_vg2_multi_base<0b00, v, ZZ_b_mul_r,
+ def _B : sme2_mova_tile_to_vec_vg2_multi_base<0b00, v, 0b000, ZZ_b_mul_r,
!if(v, TileVectorOpV8,
TileVectorOpH8),
uimm3s2range, mnemonic> {
@@ -3198,7 +3266,7 @@ multiclass sme2_mova_tile_to_vec_vg2_multi_inst<bit v, string mnemonic> {
let Inst{7-5} = imm;
}
- def _H : sme2_mova_tile_to_vec_vg2_multi_base<0b01, v, ZZ_h_mul_r,
+ def _H : sme2_mova_tile_to_vec_vg2_multi_base<0b01, v, 0b000, ZZ_h_mul_r,
!if(v, TileVectorOpV16,
TileVectorOpH16),
uimm2s2range, mnemonic> {
@@ -3208,7 +3276,7 @@ multiclass sme2_mova_tile_to_vec_vg2_multi_inst<bit v, string mnemonic> {
let Inst{6-5} = imm;
}
- def _S : sme2_mova_tile_to_vec_vg2_multi_base<0b10, v, ZZ_s_mul_r,
+ def _S : sme2_mova_tile_to_vec_vg2_multi_base<0b10, v, 0b000, ZZ_s_mul_r,
!if(v, TileVectorOpV32,
TileVectorOpH32),
uimm1s2range, mnemonic> {
@@ -3218,9 +3286,9 @@ multiclass sme2_mova_tile_to_vec_vg2_multi_inst<bit v, string mnemonic> {
let Inst{5} = imm;
}
- def _D : sme2_mova_tile_to_vec_vg2_multi_base<0b11, v, ZZ_d_mul_r,
+ def _D : sme2_mova_tile_to_vec_vg2_multi_base<0b11, v, 0b000, ZZ_d_mul_r,
!if(v, TileVectorOpV64,
- TileVectorOpH64),
+ TileVectorOpH64),
uimm0s2range, mnemonic> {
bits<3> ZAn;
let Inst{7-5} = ZAn;
@@ -3229,7 +3297,7 @@ multiclass sme2_mova_tile_to_vec_vg2_multi_inst<bit v, string mnemonic> {
defm : sme2_mova_tile_or_array_to_vec_aliases<1,!cast<Instruction>(NAME # _B),
ZZ_b_mul_r,
!if(v, TileVectorOpV8,
- TileVectorOpH8),
+ TileVectorOpH8),
MatrixIndexGPR32Op12_15,
uimm3s2range, "mov">;
defm : sme2_mova_tile_or_array_to_vec_aliases<1,!cast<Instruction>(NAME # _H),
@@ -3254,7 +3322,7 @@ multiclass sme2_mova_tile_to_vec_vg2_multi_inst<bit v, string mnemonic> {
defm : sme2_mova_tile_or_array_to_vec_aliases<0,!cast<Instruction>(NAME # _B),
ZZ_b_mul_r,
!if(v, TileVectorOpV8,
- TileVectorOpH8),
+ TileVectorOpH8),
MatrixIndexGPR32Op12_15,
uimm3s2range, "mova">;
defm : sme2_mova_tile_or_array_to_vec_aliases<0,!cast<Instruction>(NAME # _H),
@@ -3283,13 +3351,76 @@ multiclass sme2_mova_tile_to_vec_vg2_multi<string mnemonic>{
defm _V : sme2_mova_tile_to_vec_vg2_multi_inst<0b1, mnemonic>;
}
-// SME2 move tile to vector, four registers
-class sme2_mova_tile_to_vec_vg4_multi_base<bits<2> sz, bit v, bits<3> op,
+// SME2.1 zeroing move tile to vector, two registers
+multiclass sme2p1_movaz_tile_to_vec_vg2_base<bit v, string mnemonic> {
+ def _B : sme2_mova_tile_to_vec_vg2_multi_base<0b00, v, 0b010, ZZ_b_mul_r,
+ !if(v, TileVectorOpV8, TileVectorOpH8),
+ uimm3s2range, mnemonic> {
+ bits<3> imm;
+ let Inst{7-5} = imm;
+ }
+
+ def _H : sme2_mova_tile_to_vec_vg2_multi_base<0b01, v, 0b010, ZZ_h_mul_r,
+ !if(v, TileVectorOpV16, TileVectorOpH16),
+ uimm2s2range, mnemonic> {
+ bits<1> ZAn;
+ bits<2> imm;
+ let Inst{7} = ZAn;
+ let Inst{6-5} = imm;
+ }
+
+ def _S : sme2_mova_tile_to_vec_vg2_multi_base<0b10, v, 0b010, ZZ_s_mul_r,
+ !if(v, TileVectorOpV32, TileVectorOpH32),
+ uimm1s2range, mnemonic> {
+ bits<2> ZAn;
+ bits<1> imm;
+ let Inst{7-6} = ZAn;
+ let Inst{5} = imm;
+ }
+
+ def _D : sme2_mova_tile_to_vec_vg2_multi_base<0b11, v, 0b010, ZZ_d_mul_r,
+ !if(v, TileVectorOpV64, TileVectorOpH64),
+ uimm0s2range, mnemonic> {
+ bits<3> ZAn;
+ let Inst{7-5} = ZAn;
+ }
+
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0,!cast<Instruction>(NAME # _B),
+ ZZ_b_mul_r,
+ !if(v, TileVectorOpV8,
+ TileVectorOpH8),
+ MatrixIndexGPR32Op12_15,
+ uimm3s2range, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0,!cast<Instruction>(NAME # _H),
+ ZZ_h_mul_r,
+ !if(v, TileVectorOpV16,
+ TileVectorOpH16),
+ MatrixIndexGPR32Op12_15,
+ uimm2s2range, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME # _S),
+ ZZ_s_mul_r,
+ !if(v, TileVectorOpV32,
+ TileVectorOpH32),
+ MatrixIndexGPR32Op12_15,
+ uimm1s2range, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME # _D),
+ ZZ_d_mul_r,
+ !if(v, TileVectorOpV64,
+ TileVectorOpH64),
+ MatrixIndexGPR32Op12_15,
+ uimm0s2range, "movaz">;
+}
+
+multiclass sme2p1_movaz_tile_to_vec_vg2<string mnemonic>{
+ defm _H : sme2p1_movaz_tile_to_vec_vg2_base<0b0, mnemonic>;
+ defm _V : sme2p1_movaz_tile_to_vec_vg2_base<0b1, mnemonic>;
+}
+class sme2_mova_tile_to_vec_vg4_multi_base<bits<2> sz, bit v, bits<6> op,
RegisterOperand vector_ty,
RegisterOperand tile_ty,
Operand index_ty,
string mnemonic>
- : I<(outs vector_ty:$Zd),
+ : I<!if(op{4}, (outs vector_ty:$Zd, tile_ty:$_ZAn), (outs vector_ty:$Zd)),
(ins tile_ty:$ZAn, MatrixIndexGPR32Op12_15:$Rs, index_ty:$imm),
mnemonic,
"\t$Zd, $ZAn[$Rs, $imm, vgx4]",
@@ -3301,15 +3432,18 @@ class sme2_mova_tile_to_vec_vg4_multi_base<bits<2> sz, bit v, bits<3> op,
let Inst{21-16} = 0b000110;
let Inst{15} = v;
let Inst{14-13} = Rs;
- let Inst{12-8} = 0b00100;
- let Inst{7-5} = op;
+ let Inst{12-11} = 0b00;
+ let Inst{10-5} = op{5-0};
let Inst{4-2} = Zd;
let Inst{1-0} = 0b00;
+
+ let Constraints = !if(op{4}, "$ZAn = $_ZAn", "");
}
+// SME2 move tile to vector, four registers
multiclass sme2_mova_tile_to_vec_vg4_multi_base<bit v, string mnemonic> {
- def _B : sme2_mova_tile_to_vec_vg4_multi_base<0b00, v, {0,?,?},
+ def _B : sme2_mova_tile_to_vec_vg4_multi_base<0b00, v, {0b1000,?,?},
ZZZZ_b_mul_r,
!if(v, TileVectorOpV8,
TileVectorOpH8),
@@ -3318,7 +3452,7 @@ multiclass sme2_mova_tile_to_vec_vg4_multi_base<bit v, string mnemonic> {
let Inst{6-5} = imm;
}
- def _H : sme2_mova_tile_to_vec_vg4_multi_base<0b01, v, {0,?,?},
+ def _H : sme2_mova_tile_to_vec_vg4_multi_base<0b01, v, {0b1000,?,?},
ZZZZ_h_mul_r,
!if(v, TileVectorOpV16,
TileVectorOpH16),
@@ -3329,7 +3463,7 @@ multiclass sme2_mova_tile_to_vec_vg4_multi_base<bit v, string mnemonic> {
let Inst{5} = imm;
}
- def _S : sme2_mova_tile_to_vec_vg4_multi_base<0b10, v, {0,?,?},
+ def _S : sme2_mova_tile_to_vec_vg4_multi_base<0b10, v, {0b1000,?,?},
ZZZZ_s_mul_r,
!if(v, TileVectorOpV32,
TileVectorOpH32),
@@ -3338,7 +3472,7 @@ multiclass sme2_mova_tile_to_vec_vg4_multi_base<bit v, string mnemonic> {
let Inst{6-5} = ZAn;
}
- def _D : sme2_mova_tile_to_vec_vg4_multi_base<0b11, v, {?,?,?},
+ def _D : sme2_mova_tile_to_vec_vg4_multi_base<0b11, v, {0b100,?,?,?},
ZZZZ_d_mul_r,
!if(v, TileVectorOpV64,
TileVectorOpH64),
@@ -3348,11 +3482,11 @@ multiclass sme2_mova_tile_to_vec_vg4_multi_base<bit v, string mnemonic> {
}
defm : sme2_mova_tile_or_array_to_vec_aliases<1, !cast<Instruction>(NAME # _B),
- ZZZZ_b_mul_r,
- !if(v, TileVectorOpV8,
+ ZZZZ_b_mul_r,
+ !if(v, TileVectorOpV8,
TileVectorOpH8),
- MatrixIndexGPR32Op12_15,
- uimm2s4range, "mov">;
+ MatrixIndexGPR32Op12_15,
+ uimm2s4range, "mov">;
defm : sme2_mova_tile_or_array_to_vec_aliases<1, !cast<Instruction>(NAME # _H),
ZZZZ_h_mul_r,
!if(v, TileVectorOpV16,
@@ -3403,11 +3537,74 @@ multiclass sme2_mova_tile_to_vec_vg4_multi<string mnemonic>{
defm _V : sme2_mova_tile_to_vec_vg4_multi_base<0b1, mnemonic>;
}
-// SME Move from Array
-class sme2_mova_array_to_vec_vg24_multi<bits<4> op, RegisterOperand vector_ty,
+// SME2.1 zeroing move tile to vector, four registers
+multiclass sme2p1_movaz_tile_to_vec_vg4_base<bit v, string mnemonic> {
+ def _B : sme2_mova_tile_to_vec_vg4_multi_base<0b00, v, {0b1100,?,?}, ZZZZ_b_mul_r,
+ !if(v, TileVectorOpV8, TileVectorOpH8),
+ uimm2s4range, mnemonic> {
+ bits<2> imm;
+ let Inst{6-5} = imm;
+ }
+
+ def _H : sme2_mova_tile_to_vec_vg4_multi_base<0b01, v, {0b1100,?,?}, ZZZZ_h_mul_r,
+ !if(v, TileVectorOpV16, TileVectorOpH16),
+ uimm1s4range, mnemonic> {
+ bits<1> ZAn;
+ bits<1> imm;
+ let Inst{6} = ZAn;
+ let Inst{5} = imm;
+ }
+
+ def _S : sme2_mova_tile_to_vec_vg4_multi_base<0b10, v, {0b1100,?,?}, ZZZZ_s_mul_r,
+ !if(v, TileVectorOpV32, TileVectorOpH32),
+ uimm0s4range, mnemonic> {
+ bits<2> ZAn;
+ let Inst{6-5} = ZAn;
+ }
+
+ def _D : sme2_mova_tile_to_vec_vg4_multi_base<0b11, v, {0b110,?,?,?}, ZZZZ_d_mul_r,
+ !if(v, TileVectorOpV64, TileVectorOpH64),
+ uimm0s4range, mnemonic> {
+ bits<3> ZAn;
+ let Inst{7-5} = ZAn;
+ }
+
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME # _B),
+ ZZZZ_b_mul_r,
+ !if(v, TileVectorOpV8,
+ TileVectorOpH8),
+ MatrixIndexGPR32Op12_15,
+ uimm2s4range, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME # _H),
+ ZZZZ_h_mul_r,
+ !if(v, TileVectorOpV16,
+ TileVectorOpH16),
+ MatrixIndexGPR32Op12_15,
+ uimm1s4range, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME # _S),
+ ZZZZ_s_mul_r,
+ !if(v, TileVectorOpV32,
+ TileVectorOpH32),
+ MatrixIndexGPR32Op12_15,
+ uimm0s4range, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME # _D),
+ ZZZZ_d_mul_r,
+ !if(v, TileVectorOpV64,
+ TileVectorOpH64),
+ MatrixIndexGPR32Op12_15,
+ uimm0s4range, "movaz">;
+}
+
+multiclass sme2p1_movaz_tile_to_vec_vg4<string mnemonic>{
+ defm _H : sme2p1_movaz_tile_to_vec_vg4_base<0b0, mnemonic>;
+ defm _V : sme2p1_movaz_tile_to_vec_vg4_base<0b1, mnemonic>;
+}
+
+
+class sme2_mova_array_to_vec_vg24_multi<bits<4>op, RegisterOperand vector_ty,
RegisterOperand array_ty,
string mnemonic, string vg_acronym>
- : I<(outs vector_ty:$Zd),
+ : I<!if(op{2}, (outs vector_ty:$Zd, array_ty:$_ZAn), (outs vector_ty:$Zd)),
(ins array_ty:$ZAn, MatrixIndexGPR32Op8_11:$Rs, sme_elm_idx0_7:$imm),
mnemonic,
"\t$Zd, $ZAn[$Rs, $imm, " # vg_acronym # "]",
@@ -3417,20 +3614,26 @@ class sme2_mova_array_to_vec_vg24_multi<bits<4> op, RegisterOperand vector_ty,
let Inst{31-15} = 0b11000000000001100;
let Inst{14-13} = Rs;
let Inst{12-11} = 0b01;
- let Inst{10} = op{3};
- let Inst{9-8} = op{2-1};
+ let Inst{10-8} = op{3-1};
let Inst{7-5} = imm;
let Inst{1} = op{0};
let Inst{0} = 0b0;
+ let Constraints = !if(op{2}, "$ZAn = $_ZAn", "");
+}
+
+class sme2_mova_array_to_vec_vg2_multi<bits<4> op, RegisterOperand vector_ty,
+ RegisterOperand array_ty,
+ string mnemonic>
+ : sme2_mova_array_to_vec_vg24_multi<op, vector_ty, array_ty, mnemonic,
+ "vgx2"> {
+ bits<4> Zd;
+ let Inst{4-1} = Zd;
}
// MOVA (array to vector, two registers)
multiclass sme2_mova_array_to_vec_vg2_multi<string mnemonic> {
- def NAME : sme2_mova_array_to_vec_vg24_multi<{0b000,?}, ZZ_d_mul_r, MatrixOp64,
- mnemonic, "vgx2">{
- bits<4> Zd;
- let Inst{4-1} = Zd;
- }
+ def NAME : sme2_mova_array_to_vec_vg2_multi<{0b000,?}, ZZ_d_mul_r, MatrixOp64,
+ mnemonic>;
defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
ZZ_b_mul_r, MatrixOp8,
@@ -3497,13 +3700,55 @@ multiclass sme2_mova_array_to_vec_vg2_multi<string mnemonic> {
sme_elm_idx0_7, "mov", "vgx2">;
}
+// SME2.1 MOVAZ (array to vector, two registers)
+multiclass sme2p1_movaz_array_to_vec_vg2<string mnemonic> {
+ def NAME : sme2_mova_array_to_vec_vg2_multi<{0b010,?}, ZZ_d_mul_r, MatrixOp64,
+ mnemonic>;
+
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZ_b_mul_r, MatrixOp8,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZ_h_mul_r, MatrixOp16,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZ_s_mul_r, MatrixOp32,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZ_d_mul_r, MatrixOp64,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz">;
+
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZ_b_mul_r, MatrixOp8,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz", "vgx2">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZ_h_mul_r, MatrixOp16,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz", "vgx2">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZ_s_mul_r, MatrixOp32,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz", "vgx2">;
+}
+
+class sme2_mova_array_to_vec_vg4_multi<bits<4> op, RegisterOperand vector_ty,
+ RegisterOperand array_ty,
+ string mnemonic>
+ : sme2_mova_array_to_vec_vg24_multi<op, vector_ty, array_ty, mnemonic,
+ "vgx4"> {
+ bits<3> Zd;
+ let Inst{4-2} = Zd;
+}
+
// MOVA (array to vector, four registers)
multiclass sme2_mova_array_to_vec_vg4_multi<string mnemonic> {
- def NAME : sme2_mova_array_to_vec_vg24_multi<0b1000, ZZZZ_d_mul_r, MatrixOp64,
- mnemonic, "vgx4">{
- bits<3> Zd;
- let Inst{4-2} = Zd;
- }
+ def NAME : sme2_mova_array_to_vec_vg4_multi<0b1000, ZZZZ_d_mul_r, MatrixOp64,
+ mnemonic>;
defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
ZZZZ_b_mul_r, MatrixOp8,
@@ -3570,6 +3815,42 @@ multiclass sme2_mova_array_to_vec_vg4_multi<string mnemonic> {
sme_elm_idx0_7, "mov", "vgx4">;
}
+// SME2.1 MOVAZ (array to vector, four registers)
+multiclass sme2p1_movaz_array_to_vec_vg4<string mnemonic> {
+ def NAME : sme2_mova_array_to_vec_vg4_multi<0b1100, ZZZZ_d_mul_r, MatrixOp64,
+ mnemonic>;
+
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZZZ_b_mul_r, MatrixOp8,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZZZ_h_mul_r, MatrixOp16,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZZZ_s_mul_r, MatrixOp32,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZZZ_d_mul_r, MatrixOp64,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz">;
+
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZZZ_b_mul_r, MatrixOp8,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz", "vgx4">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZZZ_h_mul_r, MatrixOp16,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz", "vgx4">;
+ defm : sme2_mova_tile_or_array_to_vec_aliases<0, !cast<Instruction>(NAME),
+ ZZZZ_s_mul_r, MatrixOp32,
+ MatrixIndexGPR32Op8_11,
+ sme_elm_idx0_7, "movaz", "vgx4">;
+}
+
//===----------------------------------------------------------------------===//
// SME2 multi-vec saturating shift right narrow
class sme2_sat_shift_vector_vg2<string mnemonic, bit op, bit u>
@@ -3897,3 +4178,226 @@ multiclass sme2_st_vector_vg4_multi_scalar_immediate<bits<2> msz, bit n,
def : InstAlias<mnemonic # "\t$Zt, $PNg, [$Rn]",
(!cast<Instruction>(NAME) multi_vector_ty:$Zt, PNRAny_p8to15:$PNg, GPR64sp:$Rn,0), 1>;
}
+
+//===----------------------------------------------------------------------===//
+// SME2.1
+//===----------------------------------------------------------------------===//
+// SME zeroing move array to vector
+class sme2p1_movaz_tile_to_vec_base<bits<2> sz, bit q, bit v, ZPRRegOp vector_ty,
+ RegisterOperand tile_ty, Operand index_ty,
+ string mnemonic>
+ : I<(outs vector_ty:$Zd, tile_ty:$ZAn),
+ (ins tile_ty:$_ZAn, MatrixIndexGPR32Op12_15:$Rs, index_ty:$imm),
+ mnemonic, "\t$Zd, $ZAn[$Rs, $imm]",
+ "", []>, Sched<[]> {
+ bits<2> Rs;
+ bits<5> Zd;
+ let Inst{31-24} = 0b11000000;
+ let Inst{23-22} = sz;
+ let Inst{21-17} = 0b00001;
+ let Inst{16} = q;
+ let Inst{15} = v;
+ let Inst{14-13} = Rs;
+ let Inst{12-9} = 0b0001;
+ let Inst{4-0} = Zd;
+ let Constraints = "$ZAn = $_ZAn";
+}
+
+multiclass sme2p1_movaz_tile_to_vec_base<bit v, string mnemonic> {
+ def _B : sme2p1_movaz_tile_to_vec_base<0b00, 0b0, v, ZPR8,
+ !if(v, TileVectorOpV8, TileVectorOpH8),
+ sme_elm_idx0_15, mnemonic> {
+ bits<4> imm;
+ let Inst{8-5} = imm;
+ }
+
+ def _H : sme2p1_movaz_tile_to_vec_base<0b01, 0b0, v, ZPR16,
+ !if(v, TileVectorOpV16, TileVectorOpH16),
+ sme_elm_idx0_7, mnemonic> {
+ bits<1> ZAn;
+ bits<3> imm;
+ let Inst{8} = ZAn;
+ let Inst{7-5} = imm;
+ }
+
+ def _S : sme2p1_movaz_tile_to_vec_base<0b10, 0b0, v, ZPR32,
+ !if(v, TileVectorOpV32, TileVectorOpH32),
+ sme_elm_idx0_3, mnemonic> {
+ bits<2> ZAn;
+ bits<2> imm;
+ let Inst{8-7} = ZAn;
+ let Inst{6-5} = imm;
+ }
+
+ def _D : sme2p1_movaz_tile_to_vec_base<0b11, 0b0, v, ZPR64,
+ !if(v, TileVectorOpV64, TileVectorOpH64),
+ sme_elm_idx0_1, mnemonic> {
+ bits<3> ZAn;
+ bits<1> imm;
+ let Inst{8-6} = ZAn;
+ let Inst{5} = imm;
+ }
+
+ def _Q : sme2p1_movaz_tile_to_vec_base<0b11, 0b1, v, ZPR128,
+ !if(v, TileVectorOpV128, TileVectorOpH128),
+ sme_elm_idx0_0, mnemonic> {
+ bits<4> ZAn;
+ let Inst{8-5} = ZAn;
+ }
+}
+
+multiclass sme2p1_movaz_tile_to_vec<string mnemonic>{
+ defm _H : sme2p1_movaz_tile_to_vec_base<0b0, mnemonic>;
+ defm _V : sme2p1_movaz_tile_to_vec_base<0b1, mnemonic>;
+}
+
+//===----------------------------------------------------------------------===//
+// SME2.1 multiple vectors zero array
+
+class sme2p1_zero_matrix<bits<6> opc, Operand index_ty, string mnemonic,
+ string vg_acronym="">
+ : I<(outs MatrixOp64:$ZAd),
+ (ins MatrixOp64:$_ZAd, MatrixIndexGPR32Op8_11:$Rv, index_ty:$imm),
+ mnemonic, "\t$ZAd[$Rv, $imm" # !if(!eq(vg_acronym, ""), "", ", " # vg_acronym) # "]",
+ "", []>, Sched<[]> {
+ bits <2> Rv;
+ let Inst{31-18} = 0b11000000000011;
+ let Inst{17-15} = opc{5-3};
+ let Inst{14-13} = Rv;
+ let Inst{12-3} = 0b0000000000;
+ let Inst{2-0} = opc{2-0};
+ let Constraints = "$ZAd = $_ZAd";
+}
+
+multiclass sme2p1_zero_matrix<string mnemonic> {
+ def _VG2_Z : sme2p1_zero_matrix<{0b000,?,?,?}, sme_elm_idx0_7, mnemonic, "vgx2"> {
+ bits<3> imm;
+ let Inst{2-0} = imm;
+ }
+ def _2Z : sme2p1_zero_matrix<{0b001,?,?,?}, uimm3s2range, mnemonic> {
+ bits<3> imm;
+ let Inst{2-0} = imm;
+ }
+ def _VG2_2Z : sme2p1_zero_matrix<{0b0100,?,?}, uimm2s2range, mnemonic, "vgx2"> {
+ bits<2> imm;
+ let Inst{1-0} = imm;
+ }
+ def _VG4_2Z : sme2p1_zero_matrix<{0b0110,?,?}, uimm2s2range, mnemonic, "vgx4"> {
+ bits<2> imm;
+ let Inst{1-0} = imm;
+ }
+ def _VG4_Z : sme2p1_zero_matrix<{0b100,?,?,?}, sme_elm_idx0_7, mnemonic, "vgx4"> {
+ bits<3> imm;
+ let Inst{2-0} = imm;
+ }
+ def _4Z : sme2p1_zero_matrix<{0b1010,?,?}, uimm2s4range, mnemonic> {
+ bits<2> imm;
+ let Inst{1-0} = imm;
+ }
+ def _VG2_4Z :sme2p1_zero_matrix<{0b11000,?}, uimm1s4range, mnemonic, "vgx2"> {
+ bits<1> imm;
+ let Inst{0} = imm;
+ }
+ def _VG4_4Z :sme2p1_zero_matrix<{0b11100,?}, uimm1s4range, mnemonic, "vgx4"> {
+ bits<1> imm;
+ let Inst{0} = imm;
+ }
+}
+
+//===----------------------------------------------------------------------===//
+// SME2.1 lookup table expand two non-contiguous registers
+
+class sme2p1_luti_vector_vg2_index<bits<4> op, bits<2> sz, RegisterOperand vector_ty,
+ AsmVectorIndexOpnd index_ty,
+ string mnemonic>
+ : I<(outs vector_ty:$Zd), (ins ZTR:$ZTt, ZPRAny:$Zn, index_ty:$i),
+ mnemonic, "\t$Zd, $ZTt, $Zn$i",
+ "", []>, Sched<[]> {
+ bits<5> Zn;
+ bits<4> Zd;
+ let Inst{31-19} = 0b1100000010011;
+ let Inst{18-15} = op;
+ let Inst{14} = 0b1;
+ let Inst{13-12} = sz;
+ let Inst{11-10} = 0b00;
+ let Inst{9-5} = Zn;
+ let Inst{4} = Zd{3};
+ let Inst{3} = 0b0;
+ let Inst{2-0} = Zd{2-0};
+}
+
+class sme2p1_luti2_vector_vg2_index<bits<2> sz, RegisterOperand vector_ty,
+ AsmVectorIndexOpnd index_ty,
+ string mnemonic>
+ : sme2p1_luti_vector_vg2_index<{1,?,?,?}, sz, vector_ty, index_ty, mnemonic> {
+ bits<3> i;
+ let Inst{17-15} = i;
+}
+
+multiclass sme2p1_luti2_vector_vg2_index<string mnemonic> {
+ def _B : sme2p1_luti2_vector_vg2_index<0b00, ZZ_b_strided, VectorIndexH,
+ mnemonic>;
+ def _H : sme2p1_luti2_vector_vg2_index<0b01, ZZ_h_strided, VectorIndexH,
+ mnemonic>;
+}
+
+class sme2p1_luti4_vector_vg2_index<bits<2> sz, RegisterOperand vector_ty,
+ AsmVectorIndexOpnd index_ty,
+ string mnemonic>
+ : sme2p1_luti_vector_vg2_index<{0b01,?,?}, sz, vector_ty, index_ty, mnemonic> {
+ bits<2> i;
+ let Inst{16-15} = i;
+}
+multiclass sme2p1_luti4_vector_vg2_index<string mnemonic> {
+ def _B : sme2p1_luti4_vector_vg2_index<0b00, ZZ_b_strided, VectorIndexS,
+ mnemonic>;
+ def _H : sme2p1_luti4_vector_vg2_index<0b01, ZZ_h_strided, VectorIndexS,
+ mnemonic>;
+}
+
+// SME2.1 lookup table expand four non-contiguous registers
+class sme2p1_luti_vector_vg4_index<bits<3> op, bits<2> sz, RegisterOperand vector_ty,
+ AsmVectorIndexOpnd index_ty,
+ string mnemonic>
+ : I<(outs vector_ty:$Zd), (ins ZTR:$ZTt, ZPRAny:$Zn, index_ty:$i),
+ mnemonic, "\t$Zd, $ZTt, $Zn$i",
+ "", []>, Sched<[]> {
+ bits<5> Zn;
+ bits<3> Zd;
+ let Inst{31-19} = 0b1100000010011;
+ let Inst{18-16} = op;
+ let Inst{15-14} = 0b10;
+ let Inst{13-12} = sz;
+ let Inst{11-10} = 0b00;
+ let Inst{9-5} = Zn;
+ let Inst{4} = Zd{2};
+ let Inst{3-2} = 0b00;
+ let Inst{1-0} = Zd{1-0};
+}
+
+class sme2p1_luti2_vector_vg4_index<bits<2> sz, RegisterOperand vector_ty,
+ AsmVectorIndexOpnd index_ty,
+ string mnemonic>
+ : sme2p1_luti_vector_vg4_index<{1,?,?}, sz, vector_ty, index_ty, mnemonic> {
+ bits<2> i;
+ let Inst{17-16} = i;
+}
+
+multiclass sme2p1_luti2_vector_vg4_index<string mnemonic> {
+ def _B : sme2p1_luti2_vector_vg4_index<0b00, ZZZZ_b_strided, VectorIndexS,
+ mnemonic>;
+ def _H : sme2p1_luti2_vector_vg4_index<0b01, ZZZZ_h_strided, VectorIndexS,
+ mnemonic>;
+}
+
+class sme2p1_luti4_vector_vg4_index<bits<2> sz, RegisterOperand vector_ty,
+ AsmVectorIndexOpnd index_ty,
+ string mnemonic>
+ : sme2p1_luti_vector_vg4_index<{0b01,?}, sz, vector_ty, index_ty, mnemonic> {
+ bit i;
+ let Inst{16} = i;
+}
+
+multiclass sme2p1_luti4_vector_vg4_index<string mnemonic> {
+ def _H: sme2p1_luti4_vector_vg4_index<0b01, ZZZZ_h_strided, VectorIndexD, mnemonic>;
+}
diff --git a/llvm/test/MC/AArch64/SME2/fmla-diagnostics.s b/llvm/test/MC/AArch64/SME2/fmla-diagnostics.s
index c8b275ea4dda..d24f6161186e 100644
--- a/llvm/test/MC/AArch64/SME2/fmla-diagnostics.s
+++ b/llvm/test/MC/AArch64/SME2/fmla-diagnostics.s
@@ -44,9 +44,9 @@ fmla za.s[w7, 0], {z0.s - z1.s}, {z2.s - z3.s}
// --------------------------------------------------------------------------//
// Invalid Matrix Operand
-fmla za.h[w8, #0], {z0.h-z3.h}, z4.h
+fmla za.b[w8, #0], {z0.b-z3.b}, z4.b
// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .d
-// CHECK-NEXT: fmla za.h[w8, #0], {z0.h-z3.h}, z4.h
+// CHECK-NEXT: fmla za.b[w8, #0], {z0.b-z3.b}, z4.b
// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
// --------------------------------------------------------------------------//
diff --git a/llvm/test/MC/AArch64/SME2/fmls-diagnostics.s b/llvm/test/MC/AArch64/SME2/fmls-diagnostics.s
index e04785639c1c..0433518858d5 100644
--- a/llvm/test/MC/AArch64/SME2/fmls-diagnostics.s
+++ b/llvm/test/MC/AArch64/SME2/fmls-diagnostics.s
@@ -29,9 +29,9 @@ fmls za.s[w12, 0], {z0.s-z1.s}, z0.s
// --------------------------------------------------------------------------//
// Invalid Matrix Operand
-fmls za.h[w8, #0], {z0.h-z3.h}, z4.h
+fmls za.b[w8, #0], {z0.b-z3.b}, z4.b
// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .d
-// CHECK-NEXT: fmls za.h[w8, #0], {z0.h-z3.h}, z4.h
+// CHECK-NEXT: fmls za.b[w8, #0], {z0.b-z3.b}, z4.b
// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfadd-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfadd-diagnostics.s
new file mode 100644
index 000000000000..bc9a11238eca
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfadd-diagnostics.s
@@ -0,0 +1,53 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Out of range index offset
+
+bfadd za.h[w8, 8], {z20.h-z21.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: bfadd za.h[w8, 8], {z20.h-z21.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfadd za.h[w8, -1, vgx4], {z0.h-z3.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: bfadd za.h[w8, -1, vgx4], {z0.h-z3.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select register
+
+bfadd za.h[w7, 0], {z20.h-z21.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: bfadd za.h[w7, 0], {z20.h-z21.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfadd za.h[w12, 0, vgx4], {z20.h-z23.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: bfadd za.h[w12, 0, vgx4], {z20.h-z23.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+bfadd za.h[w8, 3], {z20.h-z22.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfadd za.h[w8, 3], {z20.h-z22.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfadd za.h[w8, 3, vgx4], {z21.h-z24.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: bfadd za.h[w8, 3, vgx4], {z21.h-z24.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid suffixes
+
+bfadd za.h[w8, 3, vgx4], {z20.s-z23.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfadd za.h[w8, 3, vgx4], {z20.s-z23.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfadd za.d[w8, 3, vgx4], {z20.h-z23.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .h
+// CHECK-NEXT: bfadd za.d[w8, 3, vgx4], {z20.h-z23.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfadd.s b/llvm/test/MC/AArch64/SME2p1/bfadd.s
new file mode 100644
index 000000000000..0fb553d0a1ed
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfadd.s
@@ -0,0 +1,300 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+bfadd za.h[w8, 0, vgx2], {z0.h, z1.h} // 11000001-11100100-00011100-00000000
+// CHECK-INST: bfadd za.h[w8, 0, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x00,0x1c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41c00 <unknown>
+
+bfadd za.h[w8, 0], {z0.h - z1.h} // 11000001-11100100-00011100-00000000
+// CHECK-INST: bfadd za.h[w8, 0, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x00,0x1c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41c00 <unknown>
+
+bfadd za.h[w10, 5, vgx2], {z10.h, z11.h} // 11000001-11100100-01011101-01000101
+// CHECK-INST: bfadd za.h[w10, 5, vgx2], { z10.h, z11.h }
+// CHECK-ENCODING: [0x45,0x5d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45d45 <unknown>
+
+bfadd za.h[w10, 5], {z10.h - z11.h} // 11000001-11100100-01011101-01000101
+// CHECK-INST: bfadd za.h[w10, 5, vgx2], { z10.h, z11.h }
+// CHECK-ENCODING: [0x45,0x5d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45d45 <unknown>
+
+bfadd za.h[w11, 7, vgx2], {z12.h, z13.h} // 11000001-11100100-01111101-10000111
+// CHECK-INST: bfadd za.h[w11, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x87,0x7d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47d87 <unknown>
+
+bfadd za.h[w11, 7], {z12.h - z13.h} // 11000001-11100100-01111101-10000111
+// CHECK-INST: bfadd za.h[w11, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x87,0x7d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47d87 <unknown>
+
+bfadd za.h[w11, 7, vgx2], {z30.h, z31.h} // 11000001-11100100-01111111-11000111
+// CHECK-INST: bfadd za.h[w11, 7, vgx2], { z30.h, z31.h }
+// CHECK-ENCODING: [0xc7,0x7f,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47fc7 <unknown>
+
+bfadd za.h[w11, 7], {z30.h - z31.h} // 11000001-11100100-01111111-11000111
+// CHECK-INST: bfadd za.h[w11, 7, vgx2], { z30.h, z31.h }
+// CHECK-ENCODING: [0xc7,0x7f,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47fc7 <unknown>
+
+bfadd za.h[w8, 5, vgx2], {z16.h, z17.h} // 11000001-11100100-00011110-00000101
+// CHECK-INST: bfadd za.h[w8, 5, vgx2], { z16.h, z17.h }
+// CHECK-ENCODING: [0x05,0x1e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41e05 <unknown>
+
+bfadd za.h[w8, 5], {z16.h - z17.h} // 11000001-11100100-00011110-00000101
+// CHECK-INST: bfadd za.h[w8, 5, vgx2], { z16.h, z17.h }
+// CHECK-ENCODING: [0x05,0x1e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41e05 <unknown>
+
+bfadd za.h[w8, 1, vgx2], {z0.h, z1.h} // 11000001-11100100-00011100-00000001
+// CHECK-INST: bfadd za.h[w8, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x01,0x1c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41c01 <unknown>
+
+bfadd za.h[w8, 1], {z0.h - z1.h} // 11000001-11100100-00011100-00000001
+// CHECK-INST: bfadd za.h[w8, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x01,0x1c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41c01 <unknown>
+
+bfadd za.h[w10, 0, vgx2], {z18.h, z19.h} // 11000001-11100100-01011110-01000000
+// CHECK-INST: bfadd za.h[w10, 0, vgx2], { z18.h, z19.h }
+// CHECK-ENCODING: [0x40,0x5e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45e40 <unknown>
+
+bfadd za.h[w10, 0], {z18.h - z19.h} // 11000001-11100100-01011110-01000000
+// CHECK-INST: bfadd za.h[w10, 0, vgx2], { z18.h, z19.h }
+// CHECK-ENCODING: [0x40,0x5e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45e40 <unknown>
+
+bfadd za.h[w8, 0, vgx2], {z12.h, z13.h} // 11000001-11100100-00011101-10000000
+// CHECK-INST: bfadd za.h[w8, 0, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x80,0x1d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41d80 <unknown>
+
+bfadd za.h[w8, 0], {z12.h - z13.h} // 11000001-11100100-00011101-10000000
+// CHECK-INST: bfadd za.h[w8, 0, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x80,0x1d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41d80 <unknown>
+
+bfadd za.h[w10, 1, vgx2], {z0.h, z1.h} // 11000001-11100100-01011100-00000001
+// CHECK-INST: bfadd za.h[w10, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x01,0x5c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45c01 <unknown>
+
+bfadd za.h[w10, 1], {z0.h - z1.h} // 11000001-11100100-01011100-00000001
+// CHECK-INST: bfadd za.h[w10, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x01,0x5c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45c01 <unknown>
+
+bfadd za.h[w8, 5, vgx2], {z22.h, z23.h} // 11000001-11100100-00011110-11000101
+// CHECK-INST: bfadd za.h[w8, 5, vgx2], { z22.h, z23.h }
+// CHECK-ENCODING: [0xc5,0x1e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41ec5 <unknown>
+
+bfadd za.h[w8, 5], {z22.h - z23.h} // 11000001-11100100-00011110-11000101
+// CHECK-INST: bfadd za.h[w8, 5, vgx2], { z22.h, z23.h }
+// CHECK-ENCODING: [0xc5,0x1e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41ec5 <unknown>
+
+bfadd za.h[w11, 2, vgx2], {z8.h, z9.h} // 11000001-11100100-01111101-00000010
+// CHECK-INST: bfadd za.h[w11, 2, vgx2], { z8.h, z9.h }
+// CHECK-ENCODING: [0x02,0x7d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47d02 <unknown>
+
+bfadd za.h[w11, 2], {z8.h - z9.h} // 11000001-11100100-01111101-00000010
+// CHECK-INST: bfadd za.h[w11, 2, vgx2], { z8.h, z9.h }
+// CHECK-ENCODING: [0x02,0x7d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47d02 <unknown>
+
+bfadd za.h[w9, 7, vgx2], {z12.h, z13.h} // 11000001-11100100-00111101-10000111
+// CHECK-INST: bfadd za.h[w9, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x87,0x3d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e43d87 <unknown>
+
+bfadd za.h[w9, 7], {z12.h - z13.h} // 11000001-11100100-00111101-10000111
+// CHECK-INST: bfadd za.h[w9, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x87,0x3d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e43d87 <unknown>
+
+bfadd za.h[w8, 0, vgx4], {z0.h - z3.h} // 11000001-11100101-00011100-00000000
+// CHECK-INST: bfadd za.h[w8, 0, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x00,0x1c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51c00 <unknown>
+
+bfadd za.h[w8, 0], {z0.h - z3.h} // 11000001-11100101-00011100-00000000
+// CHECK-INST: bfadd za.h[w8, 0, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x00,0x1c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51c00 <unknown>
+
+bfadd za.h[w10, 5, vgx4], {z8.h - z11.h} // 11000001-11100101-01011101-00000101
+// CHECK-INST: bfadd za.h[w10, 5, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x05,0x5d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55d05 <unknown>
+
+bfadd za.h[w10, 5], {z8.h - z11.h} // 11000001-11100101-01011101-00000101
+// CHECK-INST: bfadd za.h[w10, 5, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x05,0x5d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55d05 <unknown>
+
+bfadd za.h[w11, 7, vgx4], {z12.h - z15.h} // 11000001-11100101-01111101-10000111
+// CHECK-INST: bfadd za.h[w11, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x87,0x7d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57d87 <unknown>
+
+bfadd za.h[w11, 7], {z12.h - z15.h} // 11000001-11100101-01111101-10000111
+// CHECK-INST: bfadd za.h[w11, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x87,0x7d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57d87 <unknown>
+
+bfadd za.h[w11, 7, vgx4], {z28.h - z31.h} // 11000001-11100101-01111111-10000111
+// CHECK-INST: bfadd za.h[w11, 7, vgx4], { z28.h - z31.h }
+// CHECK-ENCODING: [0x87,0x7f,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57f87 <unknown>
+
+bfadd za.h[w11, 7], {z28.h - z31.h} // 11000001-11100101-01111111-10000111
+// CHECK-INST: bfadd za.h[w11, 7, vgx4], { z28.h - z31.h }
+// CHECK-ENCODING: [0x87,0x7f,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57f87 <unknown>
+
+bfadd za.h[w8, 5, vgx4], {z16.h - z19.h} // 11000001-11100101-00011110-00000101
+// CHECK-INST: bfadd za.h[w8, 5, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x05,0x1e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51e05 <unknown>
+
+bfadd za.h[w8, 5], {z16.h - z19.h} // 11000001-11100101-00011110-00000101
+// CHECK-INST: bfadd za.h[w8, 5, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x05,0x1e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51e05 <unknown>
+
+bfadd za.h[w8, 1, vgx4], {z0.h - z3.h} // 11000001-11100101-00011100-00000001
+// CHECK-INST: bfadd za.h[w8, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x01,0x1c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51c01 <unknown>
+
+bfadd za.h[w8, 1], {z0.h - z3.h} // 11000001-11100101-00011100-00000001
+// CHECK-INST: bfadd za.h[w8, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x01,0x1c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51c01 <unknown>
+
+bfadd za.h[w10, 0, vgx4], {z16.h - z19.h} // 11000001-11100101-01011110-00000000
+// CHECK-INST: bfadd za.h[w10, 0, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x00,0x5e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55e00 <unknown>
+
+bfadd za.h[w10, 0], {z16.h - z19.h} // 11000001-11100101-01011110-00000000
+// CHECK-INST: bfadd za.h[w10, 0, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x00,0x5e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55e00 <unknown>
+
+bfadd za.h[w8, 0, vgx4], {z12.h - z15.h} // 11000001-11100101-00011101-10000000
+// CHECK-INST: bfadd za.h[w8, 0, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x80,0x1d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51d80 <unknown>
+
+bfadd za.h[w8, 0], {z12.h - z15.h} // 11000001-11100101-00011101-10000000
+// CHECK-INST: bfadd za.h[w8, 0, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x80,0x1d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51d80 <unknown>
+
+bfadd za.h[w10, 1, vgx4], {z0.h - z3.h} // 11000001-11100101-01011100-00000001
+// CHECK-INST: bfadd za.h[w10, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x01,0x5c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55c01 <unknown>
+
+bfadd za.h[w10, 1], {z0.h - z3.h} // 11000001-11100101-01011100-00000001
+// CHECK-INST: bfadd za.h[w10, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x01,0x5c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55c01 <unknown>
+
+bfadd za.h[w8, 5, vgx4], {z20.h - z23.h} // 11000001-11100101-00011110-10000101
+// CHECK-INST: bfadd za.h[w8, 5, vgx4], { z20.h - z23.h }
+// CHECK-ENCODING: [0x85,0x1e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51e85 <unknown>
+
+bfadd za.h[w8, 5], {z20.h - z23.h} // 11000001-11100101-00011110-10000101
+// CHECK-INST: bfadd za.h[w8, 5, vgx4], { z20.h - z23.h }
+// CHECK-ENCODING: [0x85,0x1e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51e85 <unknown>
+
+bfadd za.h[w11, 2, vgx4], {z8.h - z11.h} // 11000001-11100101-01111101-00000010
+// CHECK-INST: bfadd za.h[w11, 2, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x02,0x7d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57d02 <unknown>
+
+bfadd za.h[w11, 2], {z8.h - z11.h} // 11000001-11100101-01111101-00000010
+// CHECK-INST: bfadd za.h[w11, 2, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x02,0x7d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57d02 <unknown>
+
+bfadd za.h[w9, 7, vgx4], {z12.h - z15.h} // 11000001-11100101-00111101-10000111
+// CHECK-INST: bfadd za.h[w9, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x87,0x3d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e53d87 <unknown>
+
+bfadd za.h[w9, 7], {z12.h - z15.h} // 11000001-11100101-00111101-10000111
+// CHECK-INST: bfadd za.h[w9, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x87,0x3d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e53d87 <unknown>
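Note on reading the test vectors in bfadd.s: each entry shows the same 32-bit instruction word three times, as the trailing binary comment, as the little-endian byte list on the CHECK-ENCODING line, and as the hex word on the CHECK-UNKNOWN line. The sketch below is not part of the patch; it is a small illustration (the function names are hypothetical) of how the three forms relate, which can help when cross-checking a new test line by hand.

```python
# Illustrative helper, not part of the patch. All function names are hypothetical.

def word_from_encoding(encoding_bytes):
    """Combine the little-endian bytes from a CHECK-ENCODING line into one 32-bit word."""
    return int.from_bytes(bytes(encoding_bytes), byteorder="little")


def as_unknown_hex(word):
    """Format the word the way it appears on a CHECK-UNKNOWN line."""
    return f"{word:08x}"


def as_binary_comment(word):
    """Format the word like the trailing binary comment: four 8-bit groups, MSB first."""
    bits = f"{word:032b}"
    return "-".join(bits[i:i + 8] for i in range(0, 32, 8))


if __name__ == "__main__":
    # Bytes taken from the first bfadd test: [0x00,0x1c,0xe4,0xc1]
    word = word_from_encoding([0x00, 0x1C, 0xE4, 0xC1])
    print(as_unknown_hex(word))     # c1e41c00
    print(as_binary_comment(word))  # 11000001-11100100-00011100-00000000
```

The byte list is simply the hex word reversed byte by byte, since AArch64 instructions are stored little-endian; the binary comment is the same word written most-significant byte first.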
diff --git a/llvm/test/MC/AArch64/SME2p1/bfclamp-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfclamp-diagnostics.s
new file mode 100644
index 000000000000..ee48a212f568
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfclamp-diagnostics.s
@@ -0,0 +1,33 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+bfclamp {z0.h-z2.h}, z0.h, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfclamp {z0.h-z2.h}, z0.h, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfclamp {z23.h-z24.h}, z13.h, z8.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: bfclamp {z23.h-z24.h}, z13.h, z8.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfclamp {z21.h-z24.h}, z10.h, z21.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: bfclamp {z21.h-z24.h}, z10.h, z21.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+
+// --------------------------------------------------------------------------//
+// Invalid Register Suffix
+
+bfclamp {z0.s-z1.s}, z0.h, z4.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfclamp {z0.s-z1.s}, z0.h, z4.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfclamp {z0.h-z3.h}, z5.d, z6.d
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid element width
+// CHECK-NEXT: bfclamp {z0.h-z3.h}, z5.d, z6.d
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfclamp.s b/llvm/test/MC/AArch64/SME2p1/bfclamp.s
new file mode 100644
index 000000000000..ebf8500eac19
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfclamp.s
@@ -0,0 +1,60 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+bfclamp {z0.h, z1.h}, z0.h, z0.h // 11000001-00100000-11000000-00000000
+// CHECK-INST: bfclamp { z0.h, z1.h }, z0.h, z0.h
+// CHECK-ENCODING: [0x00,0xc0,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120c000 <unknown>
+
+bfclamp {z20.h, z21.h}, z10.h, z21.h // 11000001-00110101-11000001-01010100
+// CHECK-INST: bfclamp { z20.h, z21.h }, z10.h, z21.h
+// CHECK-ENCODING: [0x54,0xc1,0x35,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c135c154 <unknown>
+
+bfclamp {z22.h, z23.h}, z13.h, z8.h // 11000001-00101000-11000001-10110110
+// CHECK-INST: bfclamp { z22.h, z23.h }, z13.h, z8.h
+// CHECK-ENCODING: [0xb6,0xc1,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128c1b6 <unknown>
+
+bfclamp {z30.h, z31.h}, z31.h, z31.h // 11000001-00111111-11000011-11111110
+// CHECK-INST: bfclamp { z30.h, z31.h }, z31.h, z31.h
+// CHECK-ENCODING: [0xfe,0xc3,0x3f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c13fc3fe <unknown>
+
+bfclamp {z0.h - z3.h}, z0.h, z0.h // 11000001-00100000-11001000-00000000
+// CHECK-INST: bfclamp { z0.h - z3.h }, z0.h, z0.h
+// CHECK-ENCODING: [0x00,0xc8,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120c800 <unknown>
+
+bfclamp {z20.h - z23.h}, z10.h, z21.h // 11000001-00110101-11001001-01010100
+// CHECK-INST: bfclamp { z20.h - z23.h }, z10.h, z21.h
+// CHECK-ENCODING: [0x54,0xc9,0x35,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c135c954 <unknown>
+
+bfclamp {z20.h - z23.h}, z13.h, z8.h // 11000001-00101000-11001001-10110100
+// CHECK-INST: bfclamp { z20.h - z23.h }, z13.h, z8.h
+// CHECK-ENCODING: [0xb4,0xc9,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128c9b4 <unknown>
+
+bfclamp {z28.h - z31.h}, z31.h, z31.h // 11000001-00111111-11001011-11111100
+// CHECK-INST: bfclamp { z28.h - z31.h }, z31.h, z31.h
+// CHECK-ENCODING: [0xfc,0xcb,0x3f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c13fcbfc <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmax-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfmax-diagnostics.s
new file mode 100644
index 000000000000..55e5755ea6ec
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmax-diagnostics.s
@@ -0,0 +1,45 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+bfmax {z0.h-z1.h}, {z0.h-z2.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfmax {z0.h-z1.h}, {z0.h-z2.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmax {z1.h-z2.h}, {z0.h-z1.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: bfmax {z1.h-z2.h}, {z0.h-z1.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmax {z1.h-z4.h}, {z0.h-z3.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: bfmax {z1.h-z4.h}, {z0.h-z3.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid single register
+
+bfmax {z0.h-z1.h}, {z2.h-z3.h}, z31.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfmax {z0.h-z1.h}, {z2.h-z3.h}, z31.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid Register Suffix
+
+bfmax {z0.h-z1.h}, {z2.h-z3.h}, z14.d
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfmax {z0.h-z1.h}, {z2.h-z3.h}, z14.d
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmax {z0.h-z1.h}, {z2.s-z3.s}, z14.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfmax {z0.h-z1.h}, {z2.s-z3.s}, z14.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmax {z0.h-z1.h}, {z2.h-z3.s}, z14.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: mismatched register size suffix
+// CHECK-NEXT: bfmax {z0.h-z1.h}, {z2.h-z3.s}, z14.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmax.s b/llvm/test/MC/AArch64/SME2p1/bfmax.s
new file mode 100644
index 000000000000..d5af905c9d49
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmax.s
@@ -0,0 +1,108 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+bfmax {z0.h, z1.h}, {z0.h, z1.h}, z0.h // 11000001-00100000-10100001-00000000
+// CHECK-INST: bfmax { z0.h, z1.h }, { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x00,0xa1,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120a100 <unknown>
+
+bfmax {z20.h, z21.h}, {z20.h, z21.h}, z5.h // 11000001-00100101-10100001-00010100
+// CHECK-INST: bfmax { z20.h, z21.h }, { z20.h, z21.h }, z5.h
+// CHECK-ENCODING: [0x14,0xa1,0x25,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c125a114 <unknown>
+
+bfmax {z22.h, z23.h}, {z22.h, z23.h}, z8.h // 11000001-00101000-10100001-00010110
+// CHECK-INST: bfmax { z22.h, z23.h }, { z22.h, z23.h }, z8.h
+// CHECK-ENCODING: [0x16,0xa1,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128a116 <unknown>
+
+bfmax {z30.h, z31.h}, {z30.h, z31.h}, z15.h // 11000001-00101111-10100001-00011110
+// CHECK-INST: bfmax { z30.h, z31.h }, { z30.h, z31.h }, z15.h
+// CHECK-ENCODING: [0x1e,0xa1,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c12fa11e <unknown>
+
+bfmax {z0.h, z1.h}, {z0.h, z1.h}, {z0.h, z1.h} // 11000001-00100000-10110001-00000000
+// CHECK-INST: bfmax { z0.h, z1.h }, { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x00,0xb1,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120b100 <unknown>
+
+bfmax {z20.h, z21.h}, {z20.h, z21.h}, {z20.h, z21.h} // 11000001-00110100-10110001-00010100
+// CHECK-INST: bfmax { z20.h, z21.h }, { z20.h, z21.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x14,0xb1,0x34,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c134b114 <unknown>
+
+bfmax {z22.h, z23.h}, {z22.h, z23.h}, {z8.h, z9.h} // 11000001-00101000-10110001-00010110
+// CHECK-INST: bfmax { z22.h, z23.h }, { z22.h, z23.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x16,0xb1,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128b116 <unknown>
+
+bfmax {z30.h, z31.h}, {z30.h, z31.h}, {z30.h, z31.h} // 11000001-00111110-10110001-00011110
+// CHECK-INST: bfmax { z30.h, z31.h }, { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x1e,0xb1,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c13eb11e <unknown>
+
+bfmax {z0.h - z3.h}, {z0.h - z3.h}, z0.h // 11000001-00100000-10101001-00000000
+// CHECK-INST: bfmax { z0.h - z3.h }, { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x00,0xa9,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120a900 <unknown>
+
+bfmax {z20.h - z23.h}, {z20.h - z23.h}, z5.h // 11000001-00100101-10101001-00010100
+// CHECK-INST: bfmax { z20.h - z23.h }, { z20.h - z23.h }, z5.h
+// CHECK-ENCODING: [0x14,0xa9,0x25,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c125a914 <unknown>
+
+bfmax {z20.h - z23.h}, {z20.h - z23.h}, z8.h // 11000001-00101000-10101001-00010100
+// CHECK-INST: bfmax { z20.h - z23.h }, { z20.h - z23.h }, z8.h
+// CHECK-ENCODING: [0x14,0xa9,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128a914 <unknown>
+
+bfmax {z28.h - z31.h}, {z28.h - z31.h}, z15.h // 11000001-00101111-10101001-00011100
+// CHECK-INST: bfmax { z28.h - z31.h }, { z28.h - z31.h }, z15.h
+// CHECK-ENCODING: [0x1c,0xa9,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c12fa91c <unknown>
+
+bfmax {z0.h - z3.h}, {z0.h - z3.h}, {z0.h - z3.h} // 11000001-00100000-10111001-00000000
+// CHECK-INST: bfmax { z0.h - z3.h }, { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x00,0xb9,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120b900 <unknown>
+
+bfmax {z20.h - z23.h}, {z20.h - z23.h}, {z20.h - z23.h} // 11000001-00110100-10111001-00010100
+// CHECK-INST: bfmax { z20.h - z23.h }, { z20.h - z23.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x14,0xb9,0x34,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c134b914 <unknown>
+
+bfmax {z20.h - z23.h}, {z20.h - z23.h}, {z8.h - z11.h} // 11000001-00101000-10111001-00010100
+// CHECK-INST: bfmax { z20.h - z23.h }, { z20.h - z23.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x14,0xb9,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128b914 <unknown>
+
+bfmax {z28.h - z31.h}, {z28.h - z31.h}, {z28.h - z31.h} // 11000001-00111100-10111001-00011100
+// CHECK-INST: bfmax { z28.h - z31.h }, { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x1c,0xb9,0x3c,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c13cb91c <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmaxnm-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfmaxnm-diagnostics.s
new file mode 100644
index 000000000000..b1f15112f8e3
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmaxnm-diagnostics.s
@@ -0,0 +1,45 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+bfmaxnm {z0.h-z1.h}, {z0.h-z2.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfmaxnm {z0.h-z1.h}, {z0.h-z2.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmaxnm {z1.h-z2.h}, {z0.h-z1.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: bfmaxnm {z1.h-z2.h}, {z0.h-z1.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmaxnm {z1.h-z4.h}, {z0.h-z3.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: bfmaxnm {z1.h-z4.h}, {z0.h-z3.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid single register
+
+bfmaxnm {z0.h-z1.h}, {z2.h-z3.h}, z31.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfmaxnm {z0.h-z1.h}, {z2.h-z3.h}, z31.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid Register Suffix
+
+bfmaxnm {z0.h-z1.h}, {z2.h-z3.h}, z14.d
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfmaxnm {z0.h-z1.h}, {z2.h-z3.h}, z14.d
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmaxnm {z0.h-z1.h}, {z2.s-z3.s}, z14.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfmaxnm {z0.h-z1.h}, {z2.s-z3.s}, z14.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmaxnm {z0.h-z1.h}, {z2.h-z3.s}, z14.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: mismatched register size suffix
+// CHECK-NEXT: bfmaxnm {z0.h-z1.h}, {z2.h-z3.s}, z14.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmaxnm.s b/llvm/test/MC/AArch64/SME2p1/bfmaxnm.s
new file mode 100644
index 000000000000..e5b97e89e811
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmaxnm.s
@@ -0,0 +1,108 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+bfmaxnm {z0.h, z1.h}, {z0.h, z1.h}, z0.h // 11000001-00100000-10100001-00100000
+// CHECK-INST: bfmaxnm { z0.h, z1.h }, { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x20,0xa1,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120a120 <unknown>
+
+bfmaxnm {z20.h, z21.h}, {z20.h, z21.h}, z5.h // 11000001-00100101-10100001-00110100
+// CHECK-INST: bfmaxnm { z20.h, z21.h }, { z20.h, z21.h }, z5.h
+// CHECK-ENCODING: [0x34,0xa1,0x25,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c125a134 <unknown>
+
+bfmaxnm {z22.h, z23.h}, {z22.h, z23.h}, z8.h // 11000001-00101000-10100001-00110110
+// CHECK-INST: bfmaxnm { z22.h, z23.h }, { z22.h, z23.h }, z8.h
+// CHECK-ENCODING: [0x36,0xa1,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128a136 <unknown>
+
+bfmaxnm {z30.h, z31.h}, {z30.h, z31.h}, z15.h // 11000001-00101111-10100001-00111110
+// CHECK-INST: bfmaxnm { z30.h, z31.h }, { z30.h, z31.h }, z15.h
+// CHECK-ENCODING: [0x3e,0xa1,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c12fa13e <unknown>
+
+bfmaxnm {z0.h, z1.h}, {z0.h, z1.h}, {z0.h, z1.h} // 11000001-00100000-10110001-00100000
+// CHECK-INST: bfmaxnm { z0.h, z1.h }, { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x20,0xb1,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120b120 <unknown>
+
+bfmaxnm {z20.h, z21.h}, {z20.h, z21.h}, {z20.h, z21.h} // 11000001-00110100-10110001-00110100
+// CHECK-INST: bfmaxnm { z20.h, z21.h }, { z20.h, z21.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x34,0xb1,0x34,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c134b134 <unknown>
+
+bfmaxnm {z22.h, z23.h}, {z22.h, z23.h}, {z8.h, z9.h} // 11000001-00101000-10110001-00110110
+// CHECK-INST: bfmaxnm { z22.h, z23.h }, { z22.h, z23.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x36,0xb1,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128b136 <unknown>
+
+bfmaxnm {z30.h, z31.h}, {z30.h, z31.h}, {z30.h, z31.h} // 11000001-00111110-10110001-00111110
+// CHECK-INST: bfmaxnm { z30.h, z31.h }, { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x3e,0xb1,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c13eb13e <unknown>
+
+bfmaxnm {z0.h - z3.h}, {z0.h - z3.h}, z0.h // 11000001-00100000-10101001-00100000
+// CHECK-INST: bfmaxnm { z0.h - z3.h }, { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x20,0xa9,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120a920 <unknown>
+
+bfmaxnm {z20.h - z23.h}, {z20.h - z23.h}, z5.h // 11000001-00100101-10101001-00110100
+// CHECK-INST: bfmaxnm { z20.h - z23.h }, { z20.h - z23.h }, z5.h
+// CHECK-ENCODING: [0x34,0xa9,0x25,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c125a934 <unknown>
+
+bfmaxnm {z20.h - z23.h}, {z20.h - z23.h}, z8.h // 11000001-00101000-10101001-00110100
+// CHECK-INST: bfmaxnm { z20.h - z23.h }, { z20.h - z23.h }, z8.h
+// CHECK-ENCODING: [0x34,0xa9,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128a934 <unknown>
+
+bfmaxnm {z28.h - z31.h}, {z28.h - z31.h}, z15.h // 11000001-00101111-10101001-00111100
+// CHECK-INST: bfmaxnm { z28.h - z31.h }, { z28.h - z31.h }, z15.h
+// CHECK-ENCODING: [0x3c,0xa9,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c12fa93c <unknown>
+
+bfmaxnm {z0.h - z3.h}, {z0.h - z3.h}, {z0.h - z3.h} // 11000001-00100000-10111001-00100000
+// CHECK-INST: bfmaxnm { z0.h - z3.h }, { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x20,0xb9,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120b920 <unknown>
+
+bfmaxnm {z20.h - z23.h}, {z20.h - z23.h}, {z20.h - z23.h} // 11000001-00110100-10111001-00110100
+// CHECK-INST: bfmaxnm { z20.h - z23.h }, { z20.h - z23.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x34,0xb9,0x34,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c134b934 <unknown>
+
+bfmaxnm {z20.h - z23.h}, {z20.h - z23.h}, {z8.h - z11.h} // 11000001-00101000-10111001-00110100
+// CHECK-INST: bfmaxnm { z20.h - z23.h }, { z20.h - z23.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x34,0xb9,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128b934 <unknown>
+
+bfmaxnm {z28.h - z31.h}, {z28.h - z31.h}, {z28.h - z31.h} // 11000001-00111100-10111001-00111100
+// CHECK-INST: bfmaxnm { z28.h - z31.h }, { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x3c,0xb9,0x3c,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c13cb93c <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmin-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfmin-diagnostics.s
new file mode 100644
index 000000000000..72ee8184cf54
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmin-diagnostics.s
@@ -0,0 +1,45 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+bfmin {z0.h-z1.h}, {z0.h-z2.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfmin {z0.h-z1.h}, {z0.h-z2.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmin {z1.h-z2.h}, {z0.h-z1.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: bfmin {z1.h-z2.h}, {z0.h-z1.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmin {z1.h-z4.h}, {z0.h-z3.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: bfmin {z1.h-z4.h}, {z0.h-z3.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid single register
+
+bfmin {z0.h-z1.h}, {z2.h-z3.h}, z31.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfmin {z0.h-z1.h}, {z2.h-z3.h}, z31.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid Register Suffix
+
+bfmin {z0.h-z1.h}, {z2.h-z3.h}, z14.d
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfmin {z0.h-z1.h}, {z2.h-z3.h}, z14.d
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmin {z0.h-z1.h}, {z2.s-z3.s}, z14.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfmin {z0.h-z1.h}, {z2.s-z3.s}, z14.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmin {z0.h-z1.h}, {z2.h-z3.s}, z14.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: mismatched register size suffix
+// CHECK-NEXT: bfmin {z0.h-z1.h}, {z2.h-z3.s}, z14.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmin.s b/llvm/test/MC/AArch64/SME2p1/bfmin.s
new file mode 100644
index 000000000000..3ba2be5e7394
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmin.s
@@ -0,0 +1,108 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+bfmin {z0.h, z1.h}, {z0.h, z1.h}, z0.h // 11000001-00100000-10100001-00000001
+// CHECK-INST: bfmin { z0.h, z1.h }, { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x01,0xa1,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120a101 <unknown>
+
+bfmin {z20.h, z21.h}, {z20.h, z21.h}, z5.h // 11000001-00100101-10100001-00010101
+// CHECK-INST: bfmin { z20.h, z21.h }, { z20.h, z21.h }, z5.h
+// CHECK-ENCODING: [0x15,0xa1,0x25,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c125a115 <unknown>
+
+bfmin {z22.h, z23.h}, {z22.h, z23.h}, z8.h // 11000001-00101000-10100001-00010111
+// CHECK-INST: bfmin { z22.h, z23.h }, { z22.h, z23.h }, z8.h
+// CHECK-ENCODING: [0x17,0xa1,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128a117 <unknown>
+
+bfmin {z30.h, z31.h}, {z30.h, z31.h}, z15.h // 11000001-00101111-10100001-00011111
+// CHECK-INST: bfmin { z30.h, z31.h }, { z30.h, z31.h }, z15.h
+// CHECK-ENCODING: [0x1f,0xa1,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c12fa11f <unknown>
+
+bfmin {z0.h, z1.h}, {z0.h, z1.h}, {z0.h, z1.h} // 11000001-00100000-10110001-00000001
+// CHECK-INST: bfmin { z0.h, z1.h }, { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x01,0xb1,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120b101 <unknown>
+
+bfmin {z20.h, z21.h}, {z20.h, z21.h}, {z20.h, z21.h} // 11000001-00110100-10110001-00010101
+// CHECK-INST: bfmin { z20.h, z21.h }, { z20.h, z21.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x15,0xb1,0x34,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c134b115 <unknown>
+
+bfmin {z22.h, z23.h}, {z22.h, z23.h}, {z8.h, z9.h} // 11000001-00101000-10110001-00010111
+// CHECK-INST: bfmin { z22.h, z23.h }, { z22.h, z23.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x17,0xb1,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128b117 <unknown>
+
+bfmin {z30.h, z31.h}, {z30.h, z31.h}, {z30.h, z31.h} // 11000001-00111110-10110001-00011111
+// CHECK-INST: bfmin { z30.h, z31.h }, { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x1f,0xb1,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c13eb11f <unknown>
+
+bfmin {z0.h - z3.h}, {z0.h - z3.h}, z0.h // 11000001-00100000-10101001-00000001
+// CHECK-INST: bfmin { z0.h - z3.h }, { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x01,0xa9,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120a901 <unknown>
+
+bfmin {z20.h - z23.h}, {z20.h - z23.h}, z5.h // 11000001-00100101-10101001-00010101
+// CHECK-INST: bfmin { z20.h - z23.h }, { z20.h - z23.h }, z5.h
+// CHECK-ENCODING: [0x15,0xa9,0x25,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c125a915 <unknown>
+
+bfmin {z20.h - z23.h}, {z20.h - z23.h}, z8.h // 11000001-00101000-10101001-00010101
+// CHECK-INST: bfmin { z20.h - z23.h }, { z20.h - z23.h }, z8.h
+// CHECK-ENCODING: [0x15,0xa9,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128a915 <unknown>
+
+bfmin {z28.h - z31.h}, {z28.h - z31.h}, z15.h // 11000001-00101111-10101001-00011101
+// CHECK-INST: bfmin { z28.h - z31.h }, { z28.h - z31.h }, z15.h
+// CHECK-ENCODING: [0x1d,0xa9,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c12fa91d <unknown>
+
+bfmin {z0.h - z3.h}, {z0.h - z3.h}, {z0.h - z3.h} // 11000001-00100000-10111001-00000001
+// CHECK-INST: bfmin { z0.h - z3.h }, { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x01,0xb9,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120b901 <unknown>
+
+bfmin {z20.h - z23.h}, {z20.h - z23.h}, {z20.h - z23.h} // 11000001-00110100-10111001-00010101
+// CHECK-INST: bfmin { z20.h - z23.h }, { z20.h - z23.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x15,0xb9,0x34,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c134b915 <unknown>
+
+bfmin {z20.h - z23.h}, {z20.h - z23.h}, {z8.h - z11.h} // 11000001-00101000-10111001-00010101
+// CHECK-INST: bfmin { z20.h - z23.h }, { z20.h - z23.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x15,0xb9,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128b915 <unknown>
+
+bfmin {z28.h - z31.h}, {z28.h - z31.h}, {z28.h - z31.h} // 11000001-00111100-10111001-00011101
+// CHECK-INST: bfmin { z28.h - z31.h }, { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x1d,0xb9,0x3c,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c13cb91d <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/bfminnm-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfminnm-diagnostics.s
new file mode 100644
index 000000000000..2d161bf040f8
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfminnm-diagnostics.s
@@ -0,0 +1,45 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+bfminnm {z0.h-z1.h}, {z0.h-z2.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfminnm {z0.h-z1.h}, {z0.h-z2.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfminnm {z1.h-z2.h}, {z0.h-z1.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: bfminnm {z1.h-z2.h}, {z0.h-z1.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfminnm {z1.h-z4.h}, {z0.h-z3.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: bfminnm {z1.h-z4.h}, {z0.h-z3.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid single register
+
+bfminnm {z0.h-z1.h}, {z2.h-z3.h}, z31.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfminnm {z0.h-z1.h}, {z2.h-z3.h}, z31.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid Register Suffix
+
+bfminnm {z0.h-z1.h}, {z2.h-z3.h}, z14.d
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfminnm {z0.h-z1.h}, {z2.h-z3.h}, z14.d
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfminnm {z0.h-z1.h}, {z2.s-z3.s}, z14.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfminnm {z0.h-z1.h}, {z2.s-z3.s}, z14.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfminnm {z0.h-z1.h}, {z2.h-z3.s}, z14.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: mismatched register size suffix
+// CHECK-NEXT: bfminnm {z0.h-z1.h}, {z2.h-z3.s}, z14.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfminnm.s b/llvm/test/MC/AArch64/SME2p1/bfminnm.s
new file mode 100644
index 000000000000..cfaa3c1c2ad9
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfminnm.s
@@ -0,0 +1,113 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+
+bfminnm {z0.h, z1.h}, {z0.h, z1.h}, z0.h // 11000001-00100000-10100001-00100001
+// CHECK-INST: bfminnm { z0.h, z1.h }, { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x21,0xa1,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120a121 <unknown>
+
+bfminnm {z20.h, z21.h}, {z20.h, z21.h}, z5.h // 11000001-00100101-10100001-00110101
+// CHECK-INST: bfminnm { z20.h, z21.h }, { z20.h, z21.h }, z5.h
+// CHECK-ENCODING: [0x35,0xa1,0x25,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c125a135 <unknown>
+
+bfminnm {z22.h, z23.h}, {z22.h, z23.h}, z8.h // 11000001-00101000-10100001-00110111
+// CHECK-INST: bfminnm { z22.h, z23.h }, { z22.h, z23.h }, z8.h
+// CHECK-ENCODING: [0x37,0xa1,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128a137 <unknown>
+
+bfminnm {z30.h, z31.h}, {z30.h, z31.h}, z15.h // 11000001-00101111-10100001-00111111
+// CHECK-INST: bfminnm { z30.h, z31.h }, { z30.h, z31.h }, z15.h
+// CHECK-ENCODING: [0x3f,0xa1,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c12fa13f <unknown>
+
+
+bfminnm {z0.h, z1.h}, {z0.h, z1.h}, {z0.h, z1.h} // 11000001-00100000-10110001-00100001
+// CHECK-INST: bfminnm { z0.h, z1.h }, { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x21,0xb1,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120b121 <unknown>
+
+bfminnm {z20.h, z21.h}, {z20.h, z21.h}, {z20.h, z21.h} // 11000001-00110100-10110001-00110101
+// CHECK-INST: bfminnm { z20.h, z21.h }, { z20.h, z21.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x35,0xb1,0x34,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c134b135 <unknown>
+
+bfminnm {z22.h, z23.h}, {z22.h, z23.h}, {z8.h, z9.h} // 11000001-00101000-10110001-00110111
+// CHECK-INST: bfminnm { z22.h, z23.h }, { z22.h, z23.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x37,0xb1,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128b137 <unknown>
+
+bfminnm {z30.h, z31.h}, {z30.h, z31.h}, {z30.h, z31.h} // 11000001-00111110-10110001-00111111
+// CHECK-INST: bfminnm { z30.h, z31.h }, { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x3f,0xb1,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c13eb13f <unknown>
+
+
+bfminnm {z0.h - z3.h}, {z0.h - z3.h}, z0.h // 11000001-00100000-10101001-00100001
+// CHECK-INST: bfminnm { z0.h - z3.h }, { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x21,0xa9,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120a921 <unknown>
+
+bfminnm {z20.h - z23.h}, {z20.h - z23.h}, z5.h // 11000001-00100101-10101001-00110101
+// CHECK-INST: bfminnm { z20.h - z23.h }, { z20.h - z23.h }, z5.h
+// CHECK-ENCODING: [0x35,0xa9,0x25,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c125a935 <unknown>
+
+bfminnm {z20.h - z23.h}, {z20.h - z23.h}, z8.h // 11000001-00101000-10101001-00110101
+// CHECK-INST: bfminnm { z20.h - z23.h }, { z20.h - z23.h }, z8.h
+// CHECK-ENCODING: [0x35,0xa9,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128a935 <unknown>
+
+bfminnm {z28.h - z31.h}, {z28.h - z31.h}, z15.h // 11000001-00101111-10101001-00111101
+// CHECK-INST: bfminnm { z28.h - z31.h }, { z28.h - z31.h }, z15.h
+// CHECK-ENCODING: [0x3d,0xa9,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c12fa93d <unknown>
+
+
+bfminnm {z0.h - z3.h}, {z0.h - z3.h}, {z0.h - z3.h} // 11000001-00100000-10111001-00100001
+// CHECK-INST: bfminnm { z0.h - z3.h }, { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x21,0xb9,0x20,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c120b921 <unknown>
+
+bfminnm {z20.h - z23.h}, {z20.h - z23.h}, {z20.h - z23.h} // 11000001-00110100-10111001-00110101
+// CHECK-INST: bfminnm { z20.h - z23.h }, { z20.h - z23.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x35,0xb9,0x34,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c134b935 <unknown>
+
+bfminnm {z20.h - z23.h}, {z20.h - z23.h}, {z8.h - z11.h} // 11000001-00101000-10111001-00110101
+// CHECK-INST: bfminnm { z20.h - z23.h }, { z20.h - z23.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x35,0xb9,0x28,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c128b935 <unknown>
+
+bfminnm {z28.h - z31.h}, {z28.h - z31.h}, {z28.h - z31.h} // 11000001-00111100-10111001-00111101
+// CHECK-INST: bfminnm { z28.h - z31.h }, { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x3d,0xb9,0x3c,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c13cb93d <unknown>
+
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmla-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfmla-diagnostics.s
new file mode 100644
index 000000000000..42bb35da9dcf
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmla-diagnostics.s
@@ -0,0 +1,94 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+bfmla za.h[w11, 2, vgx2], {z12.h-z14.h}, z8.h[3]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfmla za.h[w11, 2, vgx2], {z12.h-z14.h}, z8.h[3]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmla za.h[w11, 2, vgx4], {z12.h-z17.h}, z7.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid number of vectors
+// CHECK-NEXT: bfmla za.h[w11, 2, vgx4], {z12.h-z17.h}, z7.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmla za.h[w10, 3, vgx2], {z10.h-z11.h}, {z21.h-z22.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: bfmla za.h[w10, 3, vgx2], {z10.h-z11.h}, {z21.h-z22.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmla za.h[w11, 7, vgx4], {z12.h-z15.h}, {z9.h-z12.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: bfmla za.h[w11, 7, vgx4], {z12.h-z15.h}, {z9.h-z12.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid indexed-vector or single-vector register
+
+bfmla za.h[w8, 0], {z0.h-z1.h}, z16.h[0]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfmla za.h[w8, 0], {z0.h-z1.h}, z16.h[0]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmla za.h[w8, 1], {z0.h-z3.h}, z16.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfmla za.h[w8, 1], {z0.h-z3.h}, z16.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select register
+
+bfmla za.h[w7, 7, vgx4], {z12.h-z15.h}, {z8.h-z11.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: bfmla za.h[w7, 7, vgx4], {z12.h-z15.h}, {z8.h-z11.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmla za.h[w12, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: bfmla za.h[w12, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select offset
+
+bfmla za.h[w8, -1, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: bfmla za.h[w8, -1, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmla za.h[w8, 8, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: bfmla za.h[w8, 8, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid Register Suffix
+
+bfmla za.d[w8, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .h
+// CHECK-NEXT: bfmla za.d[w8, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector lane index
+
+bfmla za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: bfmla za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmla za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: bfmla za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmla za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: bfmla za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmla za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: bfmla za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmla.s b/llvm/test/MC/AArch64/SME2p1/bfmla.s
new file mode 100644
index 000000000000..4c053fea0ff1
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmla.s
@@ -0,0 +1,876 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+bfmla za.h[w8, 0, vgx2], {z0.h, z1.h}, z0.h // 11000001-01100000-00011100-00000000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x00,0x1c,0x60,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1601c00 <unknown>
+
+bfmla za.h[w8, 0], {z0.h - z1.h}, z0.h // 11000001-01100000-00011100-00000000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x00,0x1c,0x60,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1601c00 <unknown>
+
+bfmla za.h[w10, 5, vgx2], {z10.h, z11.h}, z5.h // 11000001-01100101-01011101-01000101
+// CHECK-INST: bfmla za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h
+// CHECK-ENCODING: [0x45,0x5d,0x65,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1655d45 <unknown>
+
+bfmla za.h[w10, 5], {z10.h - z11.h}, z5.h // 11000001-01100101-01011101-01000101
+// CHECK-INST: bfmla za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h
+// CHECK-ENCODING: [0x45,0x5d,0x65,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1655d45 <unknown>
+
+bfmla za.h[w11, 7, vgx2], {z13.h, z14.h}, z8.h // 11000001-01101000-01111101-10100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z13.h, z14.h }, z8.h
+// CHECK-ENCODING: [0xa7,0x7d,0x68,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1687da7 <unknown>
+
+bfmla za.h[w11, 7], {z13.h - z14.h}, z8.h // 11000001-01101000-01111101-10100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z13.h, z14.h }, z8.h
+// CHECK-ENCODING: [0xa7,0x7d,0x68,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1687da7 <unknown>
+
+bfmla za.h[w11, 7, vgx2], {z31.h, z0.h}, z15.h // 11000001-01101111-01111111-11100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z31.h, z0.h }, z15.h
+// CHECK-ENCODING: [0xe7,0x7f,0x6f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16f7fe7 <unknown>
+
+bfmla za.h[w11, 7], {z31.h - z0.h}, z15.h // 11000001-01101111-01111111-11100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z31.h, z0.h }, z15.h
+// CHECK-ENCODING: [0xe7,0x7f,0x6f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16f7fe7 <unknown>
+
+bfmla za.h[w8, 5, vgx2], {z17.h, z18.h}, z0.h // 11000001-01100000-00011110-00100101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z17.h, z18.h }, z0.h
+// CHECK-ENCODING: [0x25,0x1e,0x60,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1601e25 <unknown>
+
+bfmla za.h[w8, 5], {z17.h - z18.h}, z0.h // 11000001-01100000-00011110-00100101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z17.h, z18.h }, z0.h
+// CHECK-ENCODING: [0x25,0x1e,0x60,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1601e25 <unknown>
+
+bfmla za.h[w8, 1, vgx2], {z1.h, z2.h}, z14.h // 11000001-01101110-00011100-00100001
+// CHECK-INST: bfmla za.h[w8, 1, vgx2], { z1.h, z2.h }, z14.h
+// CHECK-ENCODING: [0x21,0x1c,0x6e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16e1c21 <unknown>
+
+bfmla za.h[w8, 1], {z1.h - z2.h}, z14.h // 11000001-01101110-00011100-00100001
+// CHECK-INST: bfmla za.h[w8, 1, vgx2], { z1.h, z2.h }, z14.h
+// CHECK-ENCODING: [0x21,0x1c,0x6e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16e1c21 <unknown>
+
+bfmla za.h[w10, 0, vgx2], {z19.h, z20.h}, z4.h // 11000001-01100100-01011110-01100000
+// CHECK-INST: bfmla za.h[w10, 0, vgx2], { z19.h, z20.h }, z4.h
+// CHECK-ENCODING: [0x60,0x5e,0x64,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1645e60 <unknown>
+
+bfmla za.h[w10, 0], {z19.h - z20.h}, z4.h // 11000001-01100100-01011110-01100000
+// CHECK-INST: bfmla za.h[w10, 0, vgx2], { z19.h, z20.h }, z4.h
+// CHECK-ENCODING: [0x60,0x5e,0x64,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1645e60 <unknown>
+
+bfmla za.h[w8, 0, vgx2], {z12.h, z13.h}, z2.h // 11000001-01100010-00011101-10000000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h
+// CHECK-ENCODING: [0x80,0x1d,0x62,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1621d80 <unknown>
+
+bfmla za.h[w8, 0], {z12.h - z13.h}, z2.h // 11000001-01100010-00011101-10000000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h
+// CHECK-ENCODING: [0x80,0x1d,0x62,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1621d80 <unknown>
+
+bfmla za.h[w10, 1, vgx2], {z1.h, z2.h}, z10.h // 11000001-01101010-01011100-00100001
+// CHECK-INST: bfmla za.h[w10, 1, vgx2], { z1.h, z2.h }, z10.h
+// CHECK-ENCODING: [0x21,0x5c,0x6a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16a5c21 <unknown>
+
+bfmla za.h[w10, 1], {z1.h - z2.h}, z10.h // 11000001-01101010-01011100-00100001
+// CHECK-INST: bfmla za.h[w10, 1, vgx2], { z1.h, z2.h }, z10.h
+// CHECK-ENCODING: [0x21,0x5c,0x6a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16a5c21 <unknown>
+
+bfmla za.h[w8, 5, vgx2], {z22.h, z23.h}, z14.h // 11000001-01101110-00011110-11000101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h
+// CHECK-ENCODING: [0xc5,0x1e,0x6e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16e1ec5 <unknown>
+
+bfmla za.h[w8, 5], {z22.h - z23.h}, z14.h // 11000001-01101110-00011110-11000101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h
+// CHECK-ENCODING: [0xc5,0x1e,0x6e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16e1ec5 <unknown>
+
+bfmla za.h[w11, 2, vgx2], {z9.h, z10.h}, z1.h // 11000001-01100001-01111101-00100010
+// CHECK-INST: bfmla za.h[w11, 2, vgx2], { z9.h, z10.h }, z1.h
+// CHECK-ENCODING: [0x22,0x7d,0x61,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1617d22 <unknown>
+
+bfmla za.h[w11, 2], {z9.h - z10.h}, z1.h // 11000001-01100001-01111101-00100010
+// CHECK-INST: bfmla za.h[w11, 2, vgx2], { z9.h, z10.h }, z1.h
+// CHECK-ENCODING: [0x22,0x7d,0x61,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1617d22 <unknown>
+
+bfmla za.h[w9, 7, vgx2], {z12.h, z13.h}, z11.h // 11000001-01101011-00111101-10000111
+// CHECK-INST: bfmla za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h
+// CHECK-ENCODING: [0x87,0x3d,0x6b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16b3d87 <unknown>
+
+bfmla za.h[w9, 7], {z12.h - z13.h}, z11.h // 11000001-01101011-00111101-10000111
+// CHECK-INST: bfmla za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h
+// CHECK-ENCODING: [0x87,0x3d,0x6b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16b3d87 <unknown>
+
+bfmla za.h[w8, 0, vgx2], {z0.h, z1.h}, z0.h[0] // 11000001-00010000-00010000-00100000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h[0]
+// CHECK-ENCODING: [0x20,0x10,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1101020 <unknown>
+
+bfmla za.h[w8, 0], {z0.h - z1.h}, z0.h[0] // 11000001-00010000-00010000-00100000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h[0]
+// CHECK-ENCODING: [0x20,0x10,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1101020 <unknown>
+
+bfmla za.h[w10, 5, vgx2], {z10.h, z11.h}, z5.h[2] // 11000001-00010101-01010101-01100101
+// CHECK-INST: bfmla za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x65,0x55,0x15,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1155565 <unknown>
+
+bfmla za.h[w10, 5], {z10.h - z11.h}, z5.h[2] // 11000001-00010101-01010101-01100101
+// CHECK-INST: bfmla za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x65,0x55,0x15,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1155565 <unknown>
+
+bfmla za.h[w11, 7, vgx2], {z12.h, z13.h}, z8.h[6] // 11000001-00011000-01111101-10100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z12.h, z13.h }, z8.h[6]
+// CHECK-ENCODING: [0xa7,0x7d,0x18,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1187da7 <unknown>
+
+bfmla za.h[w11, 7], {z12.h - z13.h}, z8.h[6] // 11000001-00011000-01111101-10100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z12.h, z13.h }, z8.h[6]
+// CHECK-ENCODING: [0xa7,0x7d,0x18,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1187da7 <unknown>
+
+bfmla za.h[w11, 7, vgx2], {z30.h, z31.h}, z15.h[7] // 11000001-00011111-01111111-11101111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z30.h, z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xef,0x7f,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11f7fef <unknown>
+
+bfmla za.h[w11, 7], {z30.h - z31.h}, z15.h[7] // 11000001-00011111-01111111-11101111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z30.h, z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xef,0x7f,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11f7fef <unknown>
+
+bfmla za.h[w8, 5, vgx2], {z16.h, z17.h}, z0.h[6] // 11000001-00010000-00011110-00100101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z16.h, z17.h }, z0.h[6]
+// CHECK-ENCODING: [0x25,0x1e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1101e25 <unknown>
+
+bfmla za.h[w8, 5], {z16.h - z17.h}, z0.h[6] // 11000001-00010000-00011110-00100101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z16.h, z17.h }, z0.h[6]
+// CHECK-ENCODING: [0x25,0x1e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1101e25 <unknown>
+
+bfmla za.h[w8, 1, vgx2], {z0.h, z1.h}, z14.h[2] // 11000001-00011110-00010100-00100001
+// CHECK-INST: bfmla za.h[w8, 1, vgx2], { z0.h, z1.h }, z14.h[2]
+// CHECK-ENCODING: [0x21,0x14,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e1421 <unknown>
+
+bfmla za.h[w8, 1], {z0.h - z1.h}, z14.h[2] // 11000001-00011110-00010100-00100001
+// CHECK-INST: bfmla za.h[w8, 1, vgx2], { z0.h, z1.h }, z14.h[2]
+// CHECK-ENCODING: [0x21,0x14,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e1421 <unknown>
+
+bfmla za.h[w10, 0, vgx2], {z18.h, z19.h}, z4.h[3] // 11000001-00010100-01010110-01101000
+// CHECK-INST: bfmla za.h[w10, 0, vgx2], { z18.h, z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x68,0x56,0x14,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1145668 <unknown>
+
+bfmla za.h[w10, 0], {z18.h - z19.h}, z4.h[3] // 11000001-00010100-01010110-01101000
+// CHECK-INST: bfmla za.h[w10, 0, vgx2], { z18.h, z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x68,0x56,0x14,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1145668 <unknown>
+
+bfmla za.h[w8, 0, vgx2], {z12.h, z13.h}, z2.h[4] // 11000001-00010010-00011001-10100000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h[4]
+// CHECK-ENCODING: [0xa0,0x19,0x12,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11219a0 <unknown>
+
+bfmla za.h[w8, 0], {z12.h - z13.h}, z2.h[4] // 11000001-00010010-00011001-10100000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h[4]
+// CHECK-ENCODING: [0xa0,0x19,0x12,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11219a0 <unknown>
+
+bfmla za.h[w10, 1, vgx2], {z0.h, z1.h}, z10.h[4] // 11000001-00011010-01011000-00100001
+// CHECK-INST: bfmla za.h[w10, 1, vgx2], { z0.h, z1.h }, z10.h[4]
+// CHECK-ENCODING: [0x21,0x58,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11a5821 <unknown>
+
+bfmla za.h[w10, 1], {z0.h - z1.h}, z10.h[4] // 11000001-00011010-01011000-00100001
+// CHECK-INST: bfmla za.h[w10, 1, vgx2], { z0.h, z1.h }, z10.h[4]
+// CHECK-ENCODING: [0x21,0x58,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11a5821 <unknown>
+
+bfmla za.h[w8, 5, vgx2], {z22.h, z23.h}, z14.h[5] // 11000001-00011110-00011010-11101101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xed,0x1a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e1aed <unknown>
+
+bfmla za.h[w8, 5], {z22.h - z23.h}, z14.h[5] // 11000001-00011110-00011010-11101101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xed,0x1a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e1aed <unknown>
+
+bfmla za.h[w11, 2, vgx2], {z8.h, z9.h}, z1.h[2] // 11000001-00010001-01110101-00100010
+// CHECK-INST: bfmla za.h[w11, 2, vgx2], { z8.h, z9.h }, z1.h[2]
+// CHECK-ENCODING: [0x22,0x75,0x11,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1117522 <unknown>
+
+bfmla za.h[w11, 2], {z8.h - z9.h}, z1.h[2] // 11000001-00010001-01110101-00100010
+// CHECK-INST: bfmla za.h[w11, 2, vgx2], { z8.h, z9.h }, z1.h[2]
+// CHECK-ENCODING: [0x22,0x75,0x11,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1117522 <unknown>
+
+bfmla za.h[w9, 7, vgx2], {z12.h, z13.h}, z11.h[4] // 11000001-00011011-00111001-10100111
+// CHECK-INST: bfmla za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h[4]
+// CHECK-ENCODING: [0xa7,0x39,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11b39a7 <unknown>
+
+bfmla za.h[w9, 7], {z12.h - z13.h}, z11.h[4] // 11000001-00011011-00111001-10100111
+// CHECK-INST: bfmla za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h[4]
+// CHECK-ENCODING: [0xa7,0x39,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11b39a7 <unknown>
+
+bfmla za.h[w8, 0, vgx2], {z0.h, z1.h}, {z0.h, z1.h} // 11000001-11100000-00010000-00001000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x08,0x10,0xe0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e01008 <unknown>
+
+bfmla za.h[w8, 0], {z0.h - z1.h}, {z0.h - z1.h} // 11000001-11100000-00010000-00001000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x08,0x10,0xe0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e01008 <unknown>
+
+bfmla za.h[w10, 5, vgx2], {z10.h, z11.h}, {z20.h, z21.h} // 11000001-11110100-01010001-01001101
+// CHECK-INST: bfmla za.h[w10, 5, vgx2], { z10.h, z11.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x4d,0x51,0xf4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f4514d <unknown>
+
+bfmla za.h[w10, 5], {z10.h - z11.h}, {z20.h - z21.h} // 11000001-11110100-01010001-01001101
+// CHECK-INST: bfmla za.h[w10, 5, vgx2], { z10.h, z11.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x4d,0x51,0xf4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f4514d <unknown>
+
+bfmla za.h[w11, 7, vgx2], {z12.h, z13.h}, {z8.h, z9.h} // 11000001-11101000-01110001-10001111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z12.h, z13.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x8f,0x71,0xe8,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e8718f <unknown>
+
+bfmla za.h[w11, 7], {z12.h - z13.h}, {z8.h - z9.h} // 11000001-11101000-01110001-10001111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z12.h, z13.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x8f,0x71,0xe8,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e8718f <unknown>
+
+bfmla za.h[w11, 7, vgx2], {z30.h, z31.h}, {z30.h, z31.h} // 11000001-11111110-01110011-11001111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xcf,0x73,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe73cf <unknown>
+
+bfmla za.h[w11, 7], {z30.h - z31.h}, {z30.h - z31.h} // 11000001-11111110-01110011-11001111
+// CHECK-INST: bfmla za.h[w11, 7, vgx2], { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xcf,0x73,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe73cf <unknown>
+
+bfmla za.h[w8, 5, vgx2], {z16.h, z17.h}, {z16.h, z17.h} // 11000001-11110000-00010010-00001101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z16.h, z17.h }, { z16.h, z17.h }
+// CHECK-ENCODING: [0x0d,0x12,0xf0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f0120d <unknown>
+
+bfmla za.h[w8, 5], {z16.h - z17.h}, {z16.h - z17.h} // 11000001-11110000-00010010-00001101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z16.h, z17.h }, { z16.h, z17.h }
+// CHECK-ENCODING: [0x0d,0x12,0xf0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f0120d <unknown>
+
+bfmla za.h[w8, 1, vgx2], {z0.h, z1.h}, {z30.h, z31.h} // 11000001-11111110-00010000-00001001
+// CHECK-INST: bfmla za.h[w8, 1, vgx2], { z0.h, z1.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x09,0x10,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe1009 <unknown>
+
+bfmla za.h[w8, 1], {z0.h - z1.h}, {z30.h - z31.h} // 11000001-11111110-00010000-00001001
+// CHECK-INST: bfmla za.h[w8, 1, vgx2], { z0.h, z1.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x09,0x10,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe1009 <unknown>
+
+bfmla za.h[w10, 0, vgx2], {z18.h, z19.h}, {z20.h, z21.h} // 11000001-11110100-01010010-01001000
+// CHECK-INST: bfmla za.h[w10, 0, vgx2], { z18.h, z19.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x48,0x52,0xf4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f45248 <unknown>
+
+bfmla za.h[w10, 0], {z18.h - z19.h}, {z20.h - z21.h} // 11000001-11110100-01010010-01001000
+// CHECK-INST: bfmla za.h[w10, 0, vgx2], { z18.h, z19.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x48,0x52,0xf4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f45248 <unknown>
+
+bfmla za.h[w8, 0, vgx2], {z12.h, z13.h}, {z2.h, z3.h} // 11000001-11100010-00010001-10001000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z12.h, z13.h }, { z2.h, z3.h }
+// CHECK-ENCODING: [0x88,0x11,0xe2,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e21188 <unknown>
+
+bfmla za.h[w8, 0], {z12.h - z13.h}, {z2.h - z3.h} // 11000001-11100010-00010001-10001000
+// CHECK-INST: bfmla za.h[w8, 0, vgx2], { z12.h, z13.h }, { z2.h, z3.h }
+// CHECK-ENCODING: [0x88,0x11,0xe2,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e21188 <unknown>
+
+bfmla za.h[w10, 1, vgx2], {z0.h, z1.h}, {z26.h, z27.h} // 11000001-11111010-01010000-00001001
+// CHECK-INST: bfmla za.h[w10, 1, vgx2], { z0.h, z1.h }, { z26.h, z27.h }
+// CHECK-ENCODING: [0x09,0x50,0xfa,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fa5009 <unknown>
+
+bfmla za.h[w10, 1], {z0.h - z1.h}, {z26.h - z27.h} // 11000001-11111010-01010000-00001001
+// CHECK-INST: bfmla za.h[w10, 1, vgx2], { z0.h, z1.h }, { z26.h, z27.h }
+// CHECK-ENCODING: [0x09,0x50,0xfa,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fa5009 <unknown>
+
+bfmla za.h[w8, 5, vgx2], {z22.h, z23.h}, {z30.h, z31.h} // 11000001-11111110-00010010-11001101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z22.h, z23.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xcd,0x12,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe12cd <unknown>
+
+bfmla za.h[w8, 5], {z22.h - z23.h}, {z30.h - z31.h} // 11000001-11111110-00010010-11001101
+// CHECK-INST: bfmla za.h[w8, 5, vgx2], { z22.h, z23.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xcd,0x12,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe12cd <unknown>
+
+bfmla za.h[w11, 2, vgx2], {z8.h, z9.h}, {z0.h, z1.h} // 11000001-11100000-01110001-00001010
+// CHECK-INST: bfmla za.h[w11, 2, vgx2], { z8.h, z9.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x0a,0x71,0xe0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e0710a <unknown>
+
+bfmla za.h[w11, 2], {z8.h - z9.h}, {z0.h - z1.h} // 11000001-11100000-01110001-00001010
+// CHECK-INST: bfmla za.h[w11, 2, vgx2], { z8.h, z9.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x0a,0x71,0xe0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e0710a <unknown>
+
+bfmla za.h[w9, 7, vgx2], {z12.h, z13.h}, {z10.h, z11.h} // 11000001-11101010-00110001-10001111
+// CHECK-INST: bfmla za.h[w9, 7, vgx2], { z12.h, z13.h }, { z10.h, z11.h }
+// CHECK-ENCODING: [0x8f,0x31,0xea,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1ea318f <unknown>
+
+bfmla za.h[w9, 7], {z12.h - z13.h}, {z10.h - z11.h} // 11000001-11101010-00110001-10001111
+// CHECK-INST: bfmla za.h[w9, 7, vgx2], { z12.h, z13.h }, { z10.h, z11.h }
+// CHECK-ENCODING: [0x8f,0x31,0xea,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1ea318f <unknown>
+
+bfmla za.h[w8, 0, vgx4], {z0.h - z3.h}, z0.h // 11000001-01110000-00011100-00000000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x00,0x1c,0x70,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1701c00 <unknown>
+
+bfmla za.h[w8, 0], {z0.h - z3.h}, z0.h // 11000001-01110000-00011100-00000000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x00,0x1c,0x70,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1701c00 <unknown>
+
+bfmla za.h[w10, 5, vgx4], {z10.h - z13.h}, z5.h // 11000001-01110101-01011101-01000101
+// CHECK-INST: bfmla za.h[w10, 5, vgx4], { z10.h - z13.h }, z5.h
+// CHECK-ENCODING: [0x45,0x5d,0x75,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1755d45 <unknown>
+
+bfmla za.h[w10, 5], {z10.h - z13.h}, z5.h // 11000001-01110101-01011101-01000101
+// CHECK-INST: bfmla za.h[w10, 5, vgx4], { z10.h - z13.h }, z5.h
+// CHECK-ENCODING: [0x45,0x5d,0x75,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1755d45 <unknown>
+
+bfmla za.h[w11, 7, vgx4], {z13.h - z16.h}, z8.h // 11000001-01111000-01111101-10100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z13.h - z16.h }, z8.h
+// CHECK-ENCODING: [0xa7,0x7d,0x78,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1787da7 <unknown>
+
+bfmla za.h[w11, 7], {z13.h - z16.h}, z8.h // 11000001-01111000-01111101-10100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z13.h - z16.h }, z8.h
+// CHECK-ENCODING: [0xa7,0x7d,0x78,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1787da7 <unknown>
+
+bfmla za.h[w11, 7, vgx4], {z31.h, z0.h, z1.h, z2.h}, z15.h // 11000001-01111111-01111111-11100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z31.h, z0.h, z1.h, z2.h }, z15.h
+// CHECK-ENCODING: [0xe7,0x7f,0x7f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17f7fe7 <unknown>
+
+bfmla za.h[w11, 7], {z31.h, z0.h, z1.h, z2.h}, z15.h // 11000001-01111111-01111111-11100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z31.h, z0.h, z1.h, z2.h }, z15.h
+// CHECK-ENCODING: [0xe7,0x7f,0x7f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17f7fe7 <unknown>
+
+bfmla za.h[w8, 5, vgx4], {z17.h - z20.h}, z0.h // 11000001-01110000-00011110-00100101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z17.h - z20.h }, z0.h
+// CHECK-ENCODING: [0x25,0x1e,0x70,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1701e25 <unknown>
+
+bfmla za.h[w8, 5], {z17.h - z20.h}, z0.h // 11000001-01110000-00011110-00100101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z17.h - z20.h }, z0.h
+// CHECK-ENCODING: [0x25,0x1e,0x70,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1701e25 <unknown>
+
+bfmla za.h[w8, 1, vgx4], {z1.h - z4.h}, z14.h // 11000001-01111110-00011100-00100001
+// CHECK-INST: bfmla za.h[w8, 1, vgx4], { z1.h - z4.h }, z14.h
+// CHECK-ENCODING: [0x21,0x1c,0x7e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17e1c21 <unknown>
+
+bfmla za.h[w8, 1], {z1.h - z4.h}, z14.h // 11000001-01111110-00011100-00100001
+// CHECK-INST: bfmla za.h[w8, 1, vgx4], { z1.h - z4.h }, z14.h
+// CHECK-ENCODING: [0x21,0x1c,0x7e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17e1c21 <unknown>
+
+bfmla za.h[w10, 0, vgx4], {z19.h - z22.h}, z4.h // 11000001-01110100-01011110-01100000
+// CHECK-INST: bfmla za.h[w10, 0, vgx4], { z19.h - z22.h }, z4.h
+// CHECK-ENCODING: [0x60,0x5e,0x74,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1745e60 <unknown>
+
+bfmla za.h[w10, 0], {z19.h - z22.h}, z4.h // 11000001-01110100-01011110-01100000
+// CHECK-INST: bfmla za.h[w10, 0, vgx4], { z19.h - z22.h }, z4.h
+// CHECK-ENCODING: [0x60,0x5e,0x74,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1745e60 <unknown>
+
+bfmla za.h[w8, 0, vgx4], {z12.h - z15.h}, z2.h // 11000001-01110010-00011101-10000000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h
+// CHECK-ENCODING: [0x80,0x1d,0x72,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1721d80 <unknown>
+
+bfmla za.h[w8, 0], {z12.h - z15.h}, z2.h // 11000001-01110010-00011101-10000000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h
+// CHECK-ENCODING: [0x80,0x1d,0x72,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1721d80 <unknown>
+
+bfmla za.h[w10, 1, vgx4], {z1.h - z4.h}, z10.h // 11000001-01111010-01011100-00100001
+// CHECK-INST: bfmla za.h[w10, 1, vgx4], { z1.h - z4.h }, z10.h
+// CHECK-ENCODING: [0x21,0x5c,0x7a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17a5c21 <unknown>
+
+bfmla za.h[w10, 1], {z1.h - z4.h}, z10.h // 11000001-01111010-01011100-00100001
+// CHECK-INST: bfmla za.h[w10, 1, vgx4], { z1.h - z4.h }, z10.h
+// CHECK-ENCODING: [0x21,0x5c,0x7a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17a5c21 <unknown>
+
+bfmla za.h[w8, 5, vgx4], {z22.h - z25.h}, z14.h // 11000001-01111110-00011110-11000101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z22.h - z25.h }, z14.h
+// CHECK-ENCODING: [0xc5,0x1e,0x7e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17e1ec5 <unknown>
+
+bfmla za.h[w8, 5], {z22.h - z25.h}, z14.h // 11000001-01111110-00011110-11000101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z22.h - z25.h }, z14.h
+// CHECK-ENCODING: [0xc5,0x1e,0x7e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17e1ec5 <unknown>
+
+bfmla za.h[w11, 2, vgx4], {z9.h - z12.h}, z1.h // 11000001-01110001-01111101-00100010
+// CHECK-INST: bfmla za.h[w11, 2, vgx4], { z9.h - z12.h }, z1.h
+// CHECK-ENCODING: [0x22,0x7d,0x71,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1717d22 <unknown>
+
+bfmla za.h[w11, 2], {z9.h - z12.h}, z1.h // 11000001-01110001-01111101-00100010
+// CHECK-INST: bfmla za.h[w11, 2, vgx4], { z9.h - z12.h }, z1.h
+// CHECK-ENCODING: [0x22,0x7d,0x71,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1717d22 <unknown>
+
+bfmla za.h[w9, 7, vgx4], {z12.h - z15.h}, z11.h // 11000001-01111011-00111101-10000111
+// CHECK-INST: bfmla za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h
+// CHECK-ENCODING: [0x87,0x3d,0x7b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17b3d87 <unknown>
+
+bfmla za.h[w9, 7], {z12.h - z15.h}, z11.h // 11000001-01111011-00111101-10000111
+// CHECK-INST: bfmla za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h
+// CHECK-ENCODING: [0x87,0x3d,0x7b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17b3d87 <unknown>
+
+bfmla za.h[w8, 0, vgx4], {z0.h - z3.h}, z0.h[0] // 11000001-00010000-10010000-00100000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h[0]
+// CHECK-ENCODING: [0x20,0x90,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1109020 <unknown>
+
+bfmla za.h[w8, 0], {z0.h - z3.h}, z0.h[0] // 11000001-00010000-10010000-00100000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h[0]
+// CHECK-ENCODING: [0x20,0x90,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1109020 <unknown>
+
+bfmla za.h[w10, 5, vgx4], {z8.h - z11.h}, z5.h[2] // 11000001-00010101-11010101-00100101
+// CHECK-INST: bfmla za.h[w10, 5, vgx4], { z8.h - z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x25,0xd5,0x15,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c115d525 <unknown>
+
+bfmla za.h[w10, 5], {z8.h - z11.h}, z5.h[2] // 11000001-00010101-11010101-00100101
+// CHECK-INST: bfmla za.h[w10, 5, vgx4], { z8.h - z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x25,0xd5,0x15,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c115d525 <unknown>
+
+bfmla za.h[w11, 7, vgx4], {z12.h - z15.h}, z8.h[6] // 11000001-00011000-11111101-10100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z12.h - z15.h }, z8.h[6]
+// CHECK-ENCODING: [0xa7,0xfd,0x18,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c118fda7 <unknown>
+
+bfmla za.h[w11, 7], {z12.h - z15.h}, z8.h[6] // 11000001-00011000-11111101-10100111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z12.h - z15.h }, z8.h[6]
+// CHECK-ENCODING: [0xa7,0xfd,0x18,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c118fda7 <unknown>
+
+bfmla za.h[w11, 7, vgx4], {z28.h - z31.h}, z15.h[7] // 11000001-00011111-11111111-10101111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z28.h - z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xaf,0xff,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11fffaf <unknown>
+
+bfmla za.h[w11, 7], {z28.h - z31.h}, z15.h[7] // 11000001-00011111-11111111-10101111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z28.h - z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xaf,0xff,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11fffaf <unknown>
+
+bfmla za.h[w8, 5, vgx4], {z16.h - z19.h}, z0.h[6] // 11000001-00010000-10011110-00100101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z16.h - z19.h }, z0.h[6]
+// CHECK-ENCODING: [0x25,0x9e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1109e25 <unknown>
+
+bfmla za.h[w8, 5], {z16.h - z19.h}, z0.h[6] // 11000001-00010000-10011110-00100101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z16.h - z19.h }, z0.h[6]
+// CHECK-ENCODING: [0x25,0x9e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1109e25 <unknown>
+
+bfmla za.h[w8, 1, vgx4], {z0.h - z3.h}, z14.h[2] // 11000001-00011110-10010100-00100001
+// CHECK-INST: bfmla za.h[w8, 1, vgx4], { z0.h - z3.h }, z14.h[2]
+// CHECK-ENCODING: [0x21,0x94,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e9421 <unknown>
+
+bfmla za.h[w8, 1], {z0.h - z3.h}, z14.h[2] // 11000001-00011110-10010100-00100001
+// CHECK-INST: bfmla za.h[w8, 1, vgx4], { z0.h - z3.h }, z14.h[2]
+// CHECK-ENCODING: [0x21,0x94,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e9421 <unknown>
+
+bfmla za.h[w10, 0, vgx4], {z16.h - z19.h}, z4.h[3] // 11000001-00010100-11010110-00101000
+// CHECK-INST: bfmla za.h[w10, 0, vgx4], { z16.h - z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x28,0xd6,0x14,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c114d628 <unknown>
+
+bfmla za.h[w10, 0], {z16.h - z19.h}, z4.h[3] // 11000001-00010100-11010110-00101000
+// CHECK-INST: bfmla za.h[w10, 0, vgx4], { z16.h - z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x28,0xd6,0x14,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c114d628 <unknown>
+
+bfmla za.h[w8, 0, vgx4], {z12.h - z15.h}, z2.h[4] // 11000001-00010010-10011001-10100000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h[4]
+// CHECK-ENCODING: [0xa0,0x99,0x12,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11299a0 <unknown>
+
+bfmla za.h[w8, 0], {z12.h - z15.h}, z2.h[4] // 11000001-00010010-10011001-10100000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h[4]
+// CHECK-ENCODING: [0xa0,0x99,0x12,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11299a0 <unknown>
+
+bfmla za.h[w10, 1, vgx4], {z0.h - z3.h}, z10.h[4] // 11000001-00011010-11011000-00100001
+// CHECK-INST: bfmla za.h[w10, 1, vgx4], { z0.h - z3.h }, z10.h[4]
+// CHECK-ENCODING: [0x21,0xd8,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11ad821 <unknown>
+
+bfmla za.h[w10, 1], {z0.h - z3.h}, z10.h[4] // 11000001-00011010-11011000-00100001
+// CHECK-INST: bfmla za.h[w10, 1, vgx4], { z0.h - z3.h }, z10.h[4]
+// CHECK-ENCODING: [0x21,0xd8,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11ad821 <unknown>
+
+bfmla za.h[w8, 5, vgx4], {z20.h - z23.h}, z14.h[5] // 11000001-00011110-10011010-10101101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z20.h - z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xad,0x9a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e9aad <unknown>
+
+bfmla za.h[w8, 5], {z20.h - z23.h}, z14.h[5] // 11000001-00011110-10011010-10101101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z20.h - z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xad,0x9a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e9aad <unknown>
+
+bfmla za.h[w11, 2, vgx4], {z8.h - z11.h}, z1.h[2] // 11000001-00010001-11110101-00100010
+// CHECK-INST: bfmla za.h[w11, 2, vgx4], { z8.h - z11.h }, z1.h[2]
+// CHECK-ENCODING: [0x22,0xf5,0x11,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c111f522 <unknown>
+
+bfmla za.h[w11, 2], {z8.h - z11.h}, z1.h[2] // 11000001-00010001-11110101-00100010
+// CHECK-INST: bfmla za.h[w11, 2, vgx4], { z8.h - z11.h }, z1.h[2]
+// CHECK-ENCODING: [0x22,0xf5,0x11,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c111f522 <unknown>
+
+bfmla za.h[w9, 7, vgx4], {z12.h - z15.h}, z11.h[4] // 11000001-00011011-10111001-10100111
+// CHECK-INST: bfmla za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h[4]
+// CHECK-ENCODING: [0xa7,0xb9,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11bb9a7 <unknown>
+
+bfmla za.h[w9, 7], {z12.h - z15.h}, z11.h[4] // 11000001-00011011-10111001-10100111
+// CHECK-INST: bfmla za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h[4]
+// CHECK-ENCODING: [0xa7,0xb9,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11bb9a7 <unknown>
+
+bfmla za.h[w8, 0, vgx4], {z0.h - z3.h}, {z0.h - z3.h} // 11000001-11100001-00010000-00001000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x08,0x10,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e11008 <unknown>
+
+bfmla za.h[w8, 0], {z0.h - z3.h}, {z0.h - z3.h} // 11000001-11100001-00010000-00001000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x08,0x10,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e11008 <unknown>
+
+bfmla za.h[w10, 5, vgx4], {z8.h - z11.h}, {z20.h - z23.h} // 11000001-11110101-01010001-00001101
+// CHECK-INST: bfmla za.h[w10, 5, vgx4], { z8.h - z11.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x0d,0x51,0xf5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f5510d <unknown>
+
+bfmla za.h[w10, 5], {z8.h - z11.h}, {z20.h - z23.h} // 11000001-11110101-01010001-00001101
+// CHECK-INST: bfmla za.h[w10, 5, vgx4], { z8.h - z11.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x0d,0x51,0xf5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f5510d <unknown>
+
+bfmla za.h[w11, 7, vgx4], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-11101001-01110001-10001111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x8f,0x71,0xe9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e9718f <unknown>
+
+bfmla za.h[w11, 7], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-11101001-01110001-10001111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x8f,0x71,0xe9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e9718f <unknown>
+
+bfmla za.h[w11, 7, vgx4], {z28.h - z31.h}, {z28.h - z31.h} // 11000001-11111101-01110011-10001111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x8f,0x73,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd738f <unknown>
+
+bfmla za.h[w11, 7], {z28.h - z31.h}, {z28.h - z31.h} // 11000001-11111101-01110011-10001111
+// CHECK-INST: bfmla za.h[w11, 7, vgx4], { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x8f,0x73,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd738f <unknown>
+
+bfmla za.h[w8, 5, vgx4], {z16.h - z19.h}, {z16.h - z19.h} // 11000001-11110001-00010010-00001101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z16.h - z19.h }, { z16.h - z19.h }
+// CHECK-ENCODING: [0x0d,0x12,0xf1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f1120d <unknown>
+
+bfmla za.h[w8, 5], {z16.h - z19.h}, {z16.h - z19.h} // 11000001-11110001-00010010-00001101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z16.h - z19.h }, { z16.h - z19.h }
+// CHECK-ENCODING: [0x0d,0x12,0xf1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f1120d <unknown>
+
+bfmla za.h[w8, 1, vgx4], {z0.h - z3.h}, {z28.h - z31.h} // 11000001-11111101-00010000-00001001
+// CHECK-INST: bfmla za.h[w8, 1, vgx4], { z0.h - z3.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x09,0x10,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd1009 <unknown>
+
+bfmla za.h[w8, 1], {z0.h - z3.h}, {z28.h - z31.h} // 11000001-11111101-00010000-00001001
+// CHECK-INST: bfmla za.h[w8, 1, vgx4], { z0.h - z3.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x09,0x10,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd1009 <unknown>
+
+bfmla za.h[w10, 0, vgx4], {z16.h - z19.h}, {z20.h - z23.h} // 11000001-11110101-01010010-00001000
+// CHECK-INST: bfmla za.h[w10, 0, vgx4], { z16.h - z19.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x08,0x52,0xf5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f55208 <unknown>
+
+bfmla za.h[w10, 0], {z16.h - z19.h}, {z20.h - z23.h} // 11000001-11110101-01010010-00001000
+// CHECK-INST: bfmla za.h[w10, 0, vgx4], { z16.h - z19.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x08,0x52,0xf5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f55208 <unknown>
+
+bfmla za.h[w8, 0, vgx4], {z12.h - z15.h}, {z0.h - z3.h} // 11000001-11100001-00010001-10001000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z12.h - z15.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x88,0x11,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e11188 <unknown>
+
+bfmla za.h[w8, 0], {z12.h - z15.h}, {z0.h - z3.h} // 11000001-11100001-00010001-10001000
+// CHECK-INST: bfmla za.h[w8, 0, vgx4], { z12.h - z15.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x88,0x11,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e11188 <unknown>
+
+bfmla za.h[w10, 1, vgx4], {z0.h - z3.h}, {z24.h - z27.h} // 11000001-11111001-01010000-00001001
+// CHECK-INST: bfmla za.h[w10, 1, vgx4], { z0.h - z3.h }, { z24.h - z27.h }
+// CHECK-ENCODING: [0x09,0x50,0xf9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f95009 <unknown>
+
+bfmla za.h[w10, 1], {z0.h - z3.h}, {z24.h - z27.h} // 11000001-11111001-01010000-00001001
+// CHECK-INST: bfmla za.h[w10, 1, vgx4], { z0.h - z3.h }, { z24.h - z27.h }
+// CHECK-ENCODING: [0x09,0x50,0xf9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f95009 <unknown>
+
+bfmla za.h[w8, 5, vgx4], {z20.h - z23.h}, {z28.h - z31.h} // 11000001-11111101-00010010-10001101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z20.h - z23.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x8d,0x12,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd128d <unknown>
+
+bfmla za.h[w8, 5], {z20.h - z23.h}, {z28.h - z31.h} // 11000001-11111101-00010010-10001101
+// CHECK-INST: bfmla za.h[w8, 5, vgx4], { z20.h - z23.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x8d,0x12,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd128d <unknown>
+
+bfmla za.h[w11, 2, vgx4], {z8.h - z11.h}, {z0.h - z3.h} // 11000001-11100001-01110001-00001010
+// CHECK-INST: bfmla za.h[w11, 2, vgx4], { z8.h - z11.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x0a,0x71,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e1710a <unknown>
+
+bfmla za.h[w11, 2], {z8.h - z11.h}, {z0.h - z3.h} // 11000001-11100001-01110001-00001010
+// CHECK-INST: bfmla za.h[w11, 2, vgx4], { z8.h - z11.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x0a,0x71,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e1710a <unknown>
+
+bfmla za.h[w9, 7, vgx4], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-11101001-00110001-10001111
+// CHECK-INST: bfmla za.h[w9, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x8f,0x31,0xe9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e9318f <unknown>
+
+bfmla za.h[w9, 7], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-11101001-00110001-10001111
+// CHECK-INST: bfmla za.h[w9, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x8f,0x31,0xe9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e9318f <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmls-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfmls-diagnostics.s
new file mode 100644
index 000000000000..4174e244d1a4
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmls-diagnostics.s
@@ -0,0 +1,94 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+bfmls za.h[w11, 2, vgx2], {z12.h-z14.h}, z8.h[3]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfmls za.h[w11, 2, vgx2], {z12.h-z14.h}, z8.h[3]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmls za.h[w11, 2, vgx4], {z12.h-z17.h}, z7.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid number of vectors
+// CHECK-NEXT: bfmls za.h[w11, 2, vgx4], {z12.h-z17.h}, z7.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmls za.h[w10, 3, vgx2], {z10.h-z11.h}, {z21.h-z22.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: bfmls za.h[w10, 3, vgx2], {z10.h-z11.h}, {z21.h-z22.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmls za.h[w11, 7, vgx4], {z12.h-z15.h}, {z9.h-z12.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: bfmls za.h[w11, 7, vgx4], {z12.h-z15.h}, {z9.h-z12.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid indexed-vector or single-vector register
+
+bfmls za.h[w8, 0], {z0.h-z1.h}, z16.h[0]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfmls za.h[w8, 0], {z0.h-z1.h}, z16.h[0]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmls za.h[w8, 1], {z0.h-z3.h}, z16.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: bfmls za.h[w8, 1], {z0.h-z3.h}, z16.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select register
+
+bfmls za.h[w7, 7, vgx4], {z12.h-z15.h}, {z8.h-z11.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: bfmls za.h[w7, 7, vgx4], {z12.h-z15.h}, {z8.h-z11.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmls za.h[w12, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: bfmls za.h[w12, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select offset
+
+bfmls za.h[w8, -1, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: bfmls za.h[w8, -1, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmls za.h[w8, 8, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: bfmls za.h[w8, 8, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid register suffix
+
+bfmls za.d[w8, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .h
+// CHECK-NEXT: bfmls za.d[w8, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector lane index
+
+bfmls za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: bfmls za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmls za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: bfmls za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmls za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: bfmls za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmls za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: bfmls za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmls.s b/llvm/test/MC/AArch64/SME2p1/bfmls.s
new file mode 100644
index 000000000000..631da1e5058d
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmls.s
@@ -0,0 +1,876 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+bfmls za.h[w8, 0, vgx2], {z0.h, z1.h}, z0.h // 11000001-01100000-00011100-00001000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x08,0x1c,0x60,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1601c08 <unknown>
+
+bfmls za.h[w8, 0], {z0.h - z1.h}, z0.h // 11000001-01100000-00011100-00001000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x08,0x1c,0x60,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1601c08 <unknown>
+
+bfmls za.h[w10, 5, vgx2], {z10.h, z11.h}, z5.h // 11000001-01100101-01011101-01001101
+// CHECK-INST: bfmls za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h
+// CHECK-ENCODING: [0x4d,0x5d,0x65,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1655d4d <unknown>
+
+bfmls za.h[w10, 5], {z10.h - z11.h}, z5.h // 11000001-01100101-01011101-01001101
+// CHECK-INST: bfmls za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h
+// CHECK-ENCODING: [0x4d,0x5d,0x65,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1655d4d <unknown>
+
+bfmls za.h[w11, 7, vgx2], {z13.h, z14.h}, z8.h // 11000001-01101000-01111101-10101111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z13.h, z14.h }, z8.h
+// CHECK-ENCODING: [0xaf,0x7d,0x68,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1687daf <unknown>
+
+bfmls za.h[w11, 7], {z13.h - z14.h}, z8.h // 11000001-01101000-01111101-10101111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z13.h, z14.h }, z8.h
+// CHECK-ENCODING: [0xaf,0x7d,0x68,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1687daf <unknown>
+
+bfmls za.h[w11, 7, vgx2], {z31.h, z0.h}, z15.h // 11000001-01101111-01111111-11101111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z31.h, z0.h }, z15.h
+// CHECK-ENCODING: [0xef,0x7f,0x6f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16f7fef <unknown>
+
+bfmls za.h[w11, 7], {z31.h - z0.h}, z15.h // 11000001-01101111-01111111-11101111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z31.h, z0.h }, z15.h
+// CHECK-ENCODING: [0xef,0x7f,0x6f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16f7fef <unknown>
+
+bfmls za.h[w8, 5, vgx2], {z17.h, z18.h}, z0.h // 11000001-01100000-00011110-00101101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z17.h, z18.h }, z0.h
+// CHECK-ENCODING: [0x2d,0x1e,0x60,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1601e2d <unknown>
+
+bfmls za.h[w8, 5], {z17.h - z18.h}, z0.h // 11000001-01100000-00011110-00101101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z17.h, z18.h }, z0.h
+// CHECK-ENCODING: [0x2d,0x1e,0x60,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1601e2d <unknown>
+
+bfmls za.h[w8, 1, vgx2], {z1.h, z2.h}, z14.h // 11000001-01101110-00011100-00101001
+// CHECK-INST: bfmls za.h[w8, 1, vgx2], { z1.h, z2.h }, z14.h
+// CHECK-ENCODING: [0x29,0x1c,0x6e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16e1c29 <unknown>
+
+bfmls za.h[w8, 1], {z1.h - z2.h}, z14.h // 11000001-01101110-00011100-00101001
+// CHECK-INST: bfmls za.h[w8, 1, vgx2], { z1.h, z2.h }, z14.h
+// CHECK-ENCODING: [0x29,0x1c,0x6e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16e1c29 <unknown>
+
+bfmls za.h[w10, 0, vgx2], {z19.h, z20.h}, z4.h // 11000001-01100100-01011110-01101000
+// CHECK-INST: bfmls za.h[w10, 0, vgx2], { z19.h, z20.h }, z4.h
+// CHECK-ENCODING: [0x68,0x5e,0x64,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1645e68 <unknown>
+
+bfmls za.h[w10, 0], {z19.h - z20.h}, z4.h // 11000001-01100100-01011110-01101000
+// CHECK-INST: bfmls za.h[w10, 0, vgx2], { z19.h, z20.h }, z4.h
+// CHECK-ENCODING: [0x68,0x5e,0x64,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1645e68 <unknown>
+
+bfmls za.h[w8, 0, vgx2], {z12.h, z13.h}, z2.h // 11000001-01100010-00011101-10001000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h
+// CHECK-ENCODING: [0x88,0x1d,0x62,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1621d88 <unknown>
+
+bfmls za.h[w8, 0], {z12.h - z13.h}, z2.h // 11000001-01100010-00011101-10001000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h
+// CHECK-ENCODING: [0x88,0x1d,0x62,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1621d88 <unknown>
+
+bfmls za.h[w10, 1, vgx2], {z1.h, z2.h}, z10.h // 11000001-01101010-01011100-00101001
+// CHECK-INST: bfmls za.h[w10, 1, vgx2], { z1.h, z2.h }, z10.h
+// CHECK-ENCODING: [0x29,0x5c,0x6a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16a5c29 <unknown>
+
+bfmls za.h[w10, 1], {z1.h - z2.h}, z10.h // 11000001-01101010-01011100-00101001
+// CHECK-INST: bfmls za.h[w10, 1, vgx2], { z1.h, z2.h }, z10.h
+// CHECK-ENCODING: [0x29,0x5c,0x6a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16a5c29 <unknown>
+
+bfmls za.h[w8, 5, vgx2], {z22.h, z23.h}, z14.h // 11000001-01101110-00011110-11001101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h
+// CHECK-ENCODING: [0xcd,0x1e,0x6e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16e1ecd <unknown>
+
+bfmls za.h[w8, 5], {z22.h - z23.h}, z14.h // 11000001-01101110-00011110-11001101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h
+// CHECK-ENCODING: [0xcd,0x1e,0x6e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16e1ecd <unknown>
+
+bfmls za.h[w11, 2, vgx2], {z9.h, z10.h}, z1.h // 11000001-01100001-01111101-00101010
+// CHECK-INST: bfmls za.h[w11, 2, vgx2], { z9.h, z10.h }, z1.h
+// CHECK-ENCODING: [0x2a,0x7d,0x61,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1617d2a <unknown>
+
+bfmls za.h[w11, 2], {z9.h - z10.h}, z1.h // 11000001-01100001-01111101-00101010
+// CHECK-INST: bfmls za.h[w11, 2, vgx2], { z9.h, z10.h }, z1.h
+// CHECK-ENCODING: [0x2a,0x7d,0x61,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1617d2a <unknown>
+
+bfmls za.h[w9, 7, vgx2], {z12.h, z13.h}, z11.h // 11000001-01101011-00111101-10001111
+// CHECK-INST: bfmls za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h
+// CHECK-ENCODING: [0x8f,0x3d,0x6b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16b3d8f <unknown>
+
+bfmls za.h[w9, 7], {z12.h - z13.h}, z11.h // 11000001-01101011-00111101-10001111
+// CHECK-INST: bfmls za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h
+// CHECK-ENCODING: [0x8f,0x3d,0x6b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c16b3d8f <unknown>
+
+bfmls za.h[w8, 0, vgx2], {z0.h, z1.h}, z0.h[0] // 11000001-00010000-00010000-00110000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h[0]
+// CHECK-ENCODING: [0x30,0x10,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1101030 <unknown>
+
+bfmls za.h[w8, 0], {z0.h - z1.h}, z0.h[0] // 11000001-00010000-00010000-00110000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h[0]
+// CHECK-ENCODING: [0x30,0x10,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1101030 <unknown>
+
+bfmls za.h[w10, 5, vgx2], {z10.h, z11.h}, z5.h[2] // 11000001-00010101-01010101-01110101
+// CHECK-INST: bfmls za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x75,0x55,0x15,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1155575 <unknown>
+
+bfmls za.h[w10, 5], {z10.h - z11.h}, z5.h[2] // 11000001-00010101-01010101-01110101
+// CHECK-INST: bfmls za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x75,0x55,0x15,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1155575 <unknown>
+
+bfmls za.h[w11, 7, vgx2], {z12.h, z13.h}, z8.h[6] // 11000001-00011000-01111101-10110111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z12.h, z13.h }, z8.h[6]
+// CHECK-ENCODING: [0xb7,0x7d,0x18,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1187db7 <unknown>
+
+bfmls za.h[w11, 7], {z12.h - z13.h}, z8.h[6] // 11000001-00011000-01111101-10110111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z12.h, z13.h }, z8.h[6]
+// CHECK-ENCODING: [0xb7,0x7d,0x18,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1187db7 <unknown>
+
+bfmls za.h[w11, 7, vgx2], {z30.h, z31.h}, z15.h[7] // 11000001-00011111-01111111-11111111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z30.h, z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xff,0x7f,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11f7fff <unknown>
+
+bfmls za.h[w11, 7], {z30.h - z31.h}, z15.h[7] // 11000001-00011111-01111111-11111111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z30.h, z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xff,0x7f,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11f7fff <unknown>
+
+bfmls za.h[w8, 5, vgx2], {z16.h, z17.h}, z0.h[6] // 11000001-00010000-00011110-00110101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z16.h, z17.h }, z0.h[6]
+// CHECK-ENCODING: [0x35,0x1e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1101e35 <unknown>
+
+bfmls za.h[w8, 5], {z16.h - z17.h}, z0.h[6] // 11000001-00010000-00011110-00110101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z16.h, z17.h }, z0.h[6]
+// CHECK-ENCODING: [0x35,0x1e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1101e35 <unknown>
+
+bfmls za.h[w8, 1, vgx2], {z0.h, z1.h}, z14.h[2] // 11000001-00011110-00010100-00110001
+// CHECK-INST: bfmls za.h[w8, 1, vgx2], { z0.h, z1.h }, z14.h[2]
+// CHECK-ENCODING: [0x31,0x14,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e1431 <unknown>
+
+bfmls za.h[w8, 1], {z0.h - z1.h}, z14.h[2] // 11000001-00011110-00010100-00110001
+// CHECK-INST: bfmls za.h[w8, 1, vgx2], { z0.h, z1.h }, z14.h[2]
+// CHECK-ENCODING: [0x31,0x14,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e1431 <unknown>
+
+bfmls za.h[w10, 0, vgx2], {z18.h, z19.h}, z4.h[3] // 11000001-00010100-01010110-01111000
+// CHECK-INST: bfmls za.h[w10, 0, vgx2], { z18.h, z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x78,0x56,0x14,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1145678 <unknown>
+
+bfmls za.h[w10, 0], {z18.h - z19.h}, z4.h[3] // 11000001-00010100-01010110-01111000
+// CHECK-INST: bfmls za.h[w10, 0, vgx2], { z18.h, z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x78,0x56,0x14,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1145678 <unknown>
+
+bfmls za.h[w8, 0, vgx2], {z12.h, z13.h}, z2.h[4] // 11000001-00010010-00011001-10110000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h[4]
+// CHECK-ENCODING: [0xb0,0x19,0x12,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11219b0 <unknown>
+
+bfmls za.h[w8, 0], {z12.h - z13.h}, z2.h[4] // 11000001-00010010-00011001-10110000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h[4]
+// CHECK-ENCODING: [0xb0,0x19,0x12,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11219b0 <unknown>
+
+bfmls za.h[w10, 1, vgx2], {z0.h, z1.h}, z10.h[4] // 11000001-00011010-01011000-00110001
+// CHECK-INST: bfmls za.h[w10, 1, vgx2], { z0.h, z1.h }, z10.h[4]
+// CHECK-ENCODING: [0x31,0x58,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11a5831 <unknown>
+
+bfmls za.h[w10, 1], {z0.h - z1.h}, z10.h[4] // 11000001-00011010-01011000-00110001
+// CHECK-INST: bfmls za.h[w10, 1, vgx2], { z0.h, z1.h }, z10.h[4]
+// CHECK-ENCODING: [0x31,0x58,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11a5831 <unknown>
+
+bfmls za.h[w8, 5, vgx2], {z22.h, z23.h}, z14.h[5] // 11000001-00011110-00011010-11111101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xfd,0x1a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e1afd <unknown>
+
+bfmls za.h[w8, 5], {z22.h - z23.h}, z14.h[5] // 11000001-00011110-00011010-11111101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xfd,0x1a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e1afd <unknown>
+
+bfmls za.h[w11, 2, vgx2], {z8.h, z9.h}, z1.h[2] // 11000001-00010001-01110101-00110010
+// CHECK-INST: bfmls za.h[w11, 2, vgx2], { z8.h, z9.h }, z1.h[2]
+// CHECK-ENCODING: [0x32,0x75,0x11,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1117532 <unknown>
+
+bfmls za.h[w11, 2], {z8.h - z9.h}, z1.h[2] // 11000001-00010001-01110101-00110010
+// CHECK-INST: bfmls za.h[w11, 2, vgx2], { z8.h, z9.h }, z1.h[2]
+// CHECK-ENCODING: [0x32,0x75,0x11,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1117532 <unknown>
+
+bfmls za.h[w9, 7, vgx2], {z12.h, z13.h}, z11.h[4] // 11000001-00011011-00111001-10110111
+// CHECK-INST: bfmls za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h[4]
+// CHECK-ENCODING: [0xb7,0x39,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11b39b7 <unknown>
+
+bfmls za.h[w9, 7], {z12.h - z13.h}, z11.h[4] // 11000001-00011011-00111001-10110111
+// CHECK-INST: bfmls za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h[4]
+// CHECK-ENCODING: [0xb7,0x39,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11b39b7 <unknown>
+
+bfmls za.h[w8, 0, vgx2], {z0.h, z1.h}, {z0.h, z1.h} // 11000001-11100000-00010000-00011000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x18,0x10,0xe0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e01018 <unknown>
+
+bfmls za.h[w8, 0], {z0.h - z1.h}, {z0.h - z1.h} // 11000001-11100000-00010000-00011000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x18,0x10,0xe0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e01018 <unknown>
+
+bfmls za.h[w10, 5, vgx2], {z10.h, z11.h}, {z20.h, z21.h} // 11000001-11110100-01010001-01011101
+// CHECK-INST: bfmls za.h[w10, 5, vgx2], { z10.h, z11.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x5d,0x51,0xf4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f4515d <unknown>
+
+bfmls za.h[w10, 5], {z10.h - z11.h}, {z20.h - z21.h} // 11000001-11110100-01010001-01011101
+// CHECK-INST: bfmls za.h[w10, 5, vgx2], { z10.h, z11.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x5d,0x51,0xf4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f4515d <unknown>
+
+bfmls za.h[w11, 7, vgx2], {z12.h, z13.h}, {z8.h, z9.h} // 11000001-11101000-01110001-10011111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z12.h, z13.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x9f,0x71,0xe8,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e8719f <unknown>
+
+bfmls za.h[w11, 7], {z12.h - z13.h}, {z8.h - z9.h} // 11000001-11101000-01110001-10011111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z12.h, z13.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x9f,0x71,0xe8,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e8719f <unknown>
+
+bfmls za.h[w11, 7, vgx2], {z30.h, z31.h}, {z30.h, z31.h} // 11000001-11111110-01110011-11011111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xdf,0x73,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe73df <unknown>
+
+bfmls za.h[w11, 7], {z30.h - z31.h}, {z30.h - z31.h} // 11000001-11111110-01110011-11011111
+// CHECK-INST: bfmls za.h[w11, 7, vgx2], { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xdf,0x73,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe73df <unknown>
+
+bfmls za.h[w8, 5, vgx2], {z16.h, z17.h}, {z16.h, z17.h} // 11000001-11110000-00010010-00011101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z16.h, z17.h }, { z16.h, z17.h }
+// CHECK-ENCODING: [0x1d,0x12,0xf0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f0121d <unknown>
+
+bfmls za.h[w8, 5], {z16.h - z17.h}, {z16.h - z17.h} // 11000001-11110000-00010010-00011101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z16.h, z17.h }, { z16.h, z17.h }
+// CHECK-ENCODING: [0x1d,0x12,0xf0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f0121d <unknown>
+
+bfmls za.h[w8, 1, vgx2], {z0.h, z1.h}, {z30.h, z31.h} // 11000001-11111110-00010000-00011001
+// CHECK-INST: bfmls za.h[w8, 1, vgx2], { z0.h, z1.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x19,0x10,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe1019 <unknown>
+
+bfmls za.h[w8, 1], {z0.h - z1.h}, {z30.h - z31.h} // 11000001-11111110-00010000-00011001
+// CHECK-INST: bfmls za.h[w8, 1, vgx2], { z0.h, z1.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x19,0x10,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe1019 <unknown>
+
+bfmls za.h[w10, 0, vgx2], {z18.h, z19.h}, {z20.h, z21.h} // 11000001-11110100-01010010-01011000
+// CHECK-INST: bfmls za.h[w10, 0, vgx2], { z18.h, z19.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x58,0x52,0xf4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f45258 <unknown>
+
+bfmls za.h[w10, 0], {z18.h - z19.h}, {z20.h - z21.h} // 11000001-11110100-01010010-01011000
+// CHECK-INST: bfmls za.h[w10, 0, vgx2], { z18.h, z19.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x58,0x52,0xf4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f45258 <unknown>
+
+bfmls za.h[w8, 0, vgx2], {z12.h, z13.h}, {z2.h, z3.h} // 11000001-11100010-00010001-10011000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z12.h, z13.h }, { z2.h, z3.h }
+// CHECK-ENCODING: [0x98,0x11,0xe2,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e21198 <unknown>
+
+bfmls za.h[w8, 0], {z12.h - z13.h}, {z2.h - z3.h} // 11000001-11100010-00010001-10011000
+// CHECK-INST: bfmls za.h[w8, 0, vgx2], { z12.h, z13.h }, { z2.h, z3.h }
+// CHECK-ENCODING: [0x98,0x11,0xe2,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e21198 <unknown>
+
+bfmls za.h[w10, 1, vgx2], {z0.h, z1.h}, {z26.h, z27.h} // 11000001-11111010-01010000-00011001
+// CHECK-INST: bfmls za.h[w10, 1, vgx2], { z0.h, z1.h }, { z26.h, z27.h }
+// CHECK-ENCODING: [0x19,0x50,0xfa,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fa5019 <unknown>
+
+bfmls za.h[w10, 1], {z0.h - z1.h}, {z26.h - z27.h} // 11000001-11111010-01010000-00011001
+// CHECK-INST: bfmls za.h[w10, 1, vgx2], { z0.h, z1.h }, { z26.h, z27.h }
+// CHECK-ENCODING: [0x19,0x50,0xfa,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fa5019 <unknown>
+
+bfmls za.h[w8, 5, vgx2], {z22.h, z23.h}, {z30.h, z31.h} // 11000001-11111110-00010010-11011101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z22.h, z23.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xdd,0x12,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe12dd <unknown>
+
+bfmls za.h[w8, 5], {z22.h - z23.h}, {z30.h - z31.h} // 11000001-11111110-00010010-11011101
+// CHECK-INST: bfmls za.h[w8, 5, vgx2], { z22.h, z23.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xdd,0x12,0xfe,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fe12dd <unknown>
+
+bfmls za.h[w11, 2, vgx2], {z8.h, z9.h}, {z0.h, z1.h} // 11000001, 11100000-01110001-00011010
+// CHECK-INST: bfmls za.h[w11, 2, vgx2], { z8.h, z9.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x1a,0x71,0xe0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e0711a <unknown>
+
+bfmls za.h[w11, 2], {z8.h - z9.h}, {z0.h - z1.h} // 11000001-11100000-01110001-00011010
+// CHECK-INST: bfmls za.h[w11, 2, vgx2], { z8.h, z9.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x1a,0x71,0xe0,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e0711a <unknown>
+
+bfmls za.h[w9, 7, vgx2], {z12.h, z13.h}, {z10.h, z11.h} // 11000001, 11101010-00110001-10011111
+// CHECK-INST: bfmls za.h[w9, 7, vgx2], { z12.h, z13.h }, { z10.h, z11.h }
+// CHECK-ENCODING: [0x9f,0x31,0xea,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1ea319f <unknown>
+
+bfmls za.h[w9, 7], {z12.h - z13.h}, {z10.h - z11.h} // 11000001-11101010-00110001-10011111
+// CHECK-INST: bfmls za.h[w9, 7, vgx2], { z12.h, z13.h }, { z10.h, z11.h }
+// CHECK-ENCODING: [0x9f,0x31,0xea,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1ea319f <unknown>
+
+bfmls za.h[w8, 0, vgx4], {z0.h - z3.h}, z0.h // 11000001-01110000-00011100-00001000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x08,0x1c,0x70,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1701c08 <unknown>
+
+bfmls za.h[w8, 0], {z0.h - z3.h}, z0.h // 11000001-01110000-00011100-00001000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x08,0x1c,0x70,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1701c08 <unknown>
+
+bfmls za.h[w10, 5, vgx4], {z10.h - z13.h}, z5.h // 11000001-01110101-01011101-01001101
+// CHECK-INST: bfmls za.h[w10, 5, vgx4], { z10.h - z13.h }, z5.h
+// CHECK-ENCODING: [0x4d,0x5d,0x75,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1755d4d <unknown>
+
+bfmls za.h[w10, 5], {z10.h - z13.h}, z5.h // 11000001-01110101-01011101-01001101
+// CHECK-INST: bfmls za.h[w10, 5, vgx4], { z10.h - z13.h }, z5.h
+// CHECK-ENCODING: [0x4d,0x5d,0x75,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1755d4d <unknown>
+
+bfmls za.h[w11, 7, vgx4], {z13.h - z16.h}, z8.h // 11000001-01111000-01111101-10101111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z13.h - z16.h }, z8.h
+// CHECK-ENCODING: [0xaf,0x7d,0x78,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1787daf <unknown>
+
+bfmls za.h[w11, 7], {z13.h - z16.h}, z8.h // 11000001-01111000-01111101-10101111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z13.h - z16.h }, z8.h
+// CHECK-ENCODING: [0xaf,0x7d,0x78,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1787daf <unknown>
+
+bfmls za.h[w11, 7, vgx4], {z31.h, z0.h, z1.h, z2.h}, z15.h // 11000001-01111111-01111111-11101111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z31.h, z0.h, z1.h, z2.h }, z15.h
+// CHECK-ENCODING: [0xef,0x7f,0x7f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17f7fef <unknown>
+
+bfmls za.h[w11, 7], {z31.h, z0.h, z1.h, z2.h}, z15.h // 11000001-01111111-01111111-11101111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z31.h, z0.h, z1.h, z2.h }, z15.h
+// CHECK-ENCODING: [0xef,0x7f,0x7f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17f7fef <unknown>
+
+bfmls za.h[w8, 5, vgx4], {z17.h - z20.h}, z0.h // 11000001-01110000-00011110-00101101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z17.h - z20.h }, z0.h
+// CHECK-ENCODING: [0x2d,0x1e,0x70,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1701e2d <unknown>
+
+bfmls za.h[w8, 5], {z17.h - z20.h}, z0.h // 11000001-01110000-00011110-00101101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z17.h - z20.h }, z0.h
+// CHECK-ENCODING: [0x2d,0x1e,0x70,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1701e2d <unknown>
+
+bfmls za.h[w8, 1, vgx4], {z1.h - z4.h}, z14.h // 11000001-01111110-00011100-00101001
+// CHECK-INST: bfmls za.h[w8, 1, vgx4], { z1.h - z4.h }, z14.h
+// CHECK-ENCODING: [0x29,0x1c,0x7e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17e1c29 <unknown>
+
+bfmls za.h[w8, 1], {z1.h - z4.h}, z14.h // 11000001-01111110-00011100-00101001
+// CHECK-INST: bfmls za.h[w8, 1, vgx4], { z1.h - z4.h }, z14.h
+// CHECK-ENCODING: [0x29,0x1c,0x7e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17e1c29 <unknown>
+
+bfmls za.h[w10, 0, vgx4], {z19.h - z22.h}, z4.h // 11000001-01110100-01011110-01101000
+// CHECK-INST: bfmls za.h[w10, 0, vgx4], { z19.h - z22.h }, z4.h
+// CHECK-ENCODING: [0x68,0x5e,0x74,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1745e68 <unknown>
+
+bfmls za.h[w10, 0], {z19.h - z22.h}, z4.h // 11000001-01110100-01011110-01101000
+// CHECK-INST: bfmls za.h[w10, 0, vgx4], { z19.h - z22.h }, z4.h
+// CHECK-ENCODING: [0x68,0x5e,0x74,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1745e68 <unknown>
+
+bfmls za.h[w8, 0, vgx4], {z12.h - z15.h}, z2.h // 11000001-01110010-00011101-10001000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h
+// CHECK-ENCODING: [0x88,0x1d,0x72,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1721d88 <unknown>
+
+bfmls za.h[w8, 0], {z12.h - z15.h}, z2.h // 11000001-01110010-00011101-10001000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h
+// CHECK-ENCODING: [0x88,0x1d,0x72,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1721d88 <unknown>
+
+bfmls za.h[w10, 1, vgx4], {z1.h - z4.h}, z10.h // 11000001-01111010-01011100-00101001
+// CHECK-INST: bfmls za.h[w10, 1, vgx4], { z1.h - z4.h }, z10.h
+// CHECK-ENCODING: [0x29,0x5c,0x7a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17a5c29 <unknown>
+
+bfmls za.h[w10, 1], {z1.h - z4.h}, z10.h // 11000001-01111010-01011100-00101001
+// CHECK-INST: bfmls za.h[w10, 1, vgx4], { z1.h - z4.h }, z10.h
+// CHECK-ENCODING: [0x29,0x5c,0x7a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17a5c29 <unknown>
+
+bfmls za.h[w8, 5, vgx4], {z22.h - z25.h}, z14.h // 11000001-01111110-00011110-11001101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z22.h - z25.h }, z14.h
+// CHECK-ENCODING: [0xcd,0x1e,0x7e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17e1ecd <unknown>
+
+bfmls za.h[w8, 5], {z22.h - z25.h}, z14.h // 11000001-01111110-00011110-11001101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z22.h - z25.h }, z14.h
+// CHECK-ENCODING: [0xcd,0x1e,0x7e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17e1ecd <unknown>
+
+bfmls za.h[w11, 2, vgx4], {z9.h - z12.h}, z1.h // 11000001-01110001-01111101-00101010
+// CHECK-INST: bfmls za.h[w11, 2, vgx4], { z9.h - z12.h }, z1.h
+// CHECK-ENCODING: [0x2a,0x7d,0x71,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1717d2a <unknown>
+
+bfmls za.h[w11, 2], {z9.h - z12.h}, z1.h // 11000001-01110001-01111101-00101010
+// CHECK-INST: bfmls za.h[w11, 2, vgx4], { z9.h - z12.h }, z1.h
+// CHECK-ENCODING: [0x2a,0x7d,0x71,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1717d2a <unknown>
+
+bfmls za.h[w9, 7, vgx4], {z12.h - z15.h}, z11.h // 11000001-01111011-00111101-10001111
+// CHECK-INST: bfmls za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h
+// CHECK-ENCODING: [0x8f,0x3d,0x7b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17b3d8f <unknown>
+
+bfmls za.h[w9, 7], {z12.h - z15.h}, z11.h // 11000001-01111011-00111101-10001111
+// CHECK-INST: bfmls za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h
+// CHECK-ENCODING: [0x8f,0x3d,0x7b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c17b3d8f <unknown>
+
+bfmls za.h[w8, 0, vgx4], {z0.h - z3.h}, z0.h[0] // 11000001-00010000-10010000-00110000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h[0]
+// CHECK-ENCODING: [0x30,0x90,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1109030 <unknown>
+
+bfmls za.h[w8, 0], {z0.h - z3.h}, z0.h[0] // 11000001-00010000-10010000-00110000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h[0]
+// CHECK-ENCODING: [0x30,0x90,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1109030 <unknown>
+
+bfmls za.h[w10, 5, vgx4], {z8.h - z11.h}, z5.h[2] // 11000001-00010101-11010101-00110101
+// CHECK-INST: bfmls za.h[w10, 5, vgx4], { z8.h - z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x35,0xd5,0x15,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c115d535 <unknown>
+
+bfmls za.h[w10, 5], {z8.h - z11.h}, z5.h[2] // 11000001-00010101-11010101-00110101
+// CHECK-INST: bfmls za.h[w10, 5, vgx4], { z8.h - z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x35,0xd5,0x15,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c115d535 <unknown>
+
+bfmls za.h[w11, 7, vgx4], {z12.h - z15.h}, z8.h[6] // 11000001-00011000-11111101-10110111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z12.h - z15.h }, z8.h[6]
+// CHECK-ENCODING: [0xb7,0xfd,0x18,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c118fdb7 <unknown>
+
+bfmls za.h[w11, 7], {z12.h - z15.h}, z8.h[6] // 11000001-00011000-11111101-10110111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z12.h - z15.h }, z8.h[6]
+// CHECK-ENCODING: [0xb7,0xfd,0x18,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c118fdb7 <unknown>
+
+bfmls za.h[w11, 7, vgx4], {z28.h - z31.h}, z15.h[7] // 11000001-00011111-11111111-10111111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z28.h - z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xbf,0xff,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11fffbf <unknown>
+
+bfmls za.h[w11, 7], {z28.h - z31.h}, z15.h[7] // 11000001-00011111-11111111-10111111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z28.h - z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xbf,0xff,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11fffbf <unknown>
+
+bfmls za.h[w8, 5, vgx4], {z16.h - z19.h}, z0.h[6] // 11000001-00010000-10011110-00110101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z16.h - z19.h }, z0.h[6]
+// CHECK-ENCODING: [0x35,0x9e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1109e35 <unknown>
+
+bfmls za.h[w8, 5], {z16.h - z19.h}, z0.h[6] // 11000001-00010000-10011110-00110101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z16.h - z19.h }, z0.h[6]
+// CHECK-ENCODING: [0x35,0x9e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1109e35 <unknown>
+
+bfmls za.h[w8, 1, vgx4], {z0.h - z3.h}, z14.h[2] // 11000001-00011110-10010100-00110001
+// CHECK-INST: bfmls za.h[w8, 1, vgx4], { z0.h - z3.h }, z14.h[2]
+// CHECK-ENCODING: [0x31,0x94,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e9431 <unknown>
+
+bfmls za.h[w8, 1], {z0.h - z3.h}, z14.h[2] // 11000001-00011110-10010100-00110001
+// CHECK-INST: bfmls za.h[w8, 1, vgx4], { z0.h - z3.h }, z14.h[2]
+// CHECK-ENCODING: [0x31,0x94,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e9431 <unknown>
+
+bfmls za.h[w10, 0, vgx4], {z16.h - z19.h}, z4.h[3] // 11000001-00010100-11010110-00111000
+// CHECK-INST: bfmls za.h[w10, 0, vgx4], { z16.h - z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x38,0xd6,0x14,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c114d638 <unknown>
+
+bfmls za.h[w10, 0], {z16.h - z19.h}, z4.h[3] // 11000001-00010100-11010110-00111000
+// CHECK-INST: bfmls za.h[w10, 0, vgx4], { z16.h - z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x38,0xd6,0x14,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c114d638 <unknown>
+
+bfmls za.h[w8, 0, vgx4], {z12.h - z15.h}, z2.h[4] // 11000001-00010010-10011001-10110000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h[4]
+// CHECK-ENCODING: [0xb0,0x99,0x12,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11299b0 <unknown>
+
+bfmls za.h[w8, 0], {z12.h - z15.h}, z2.h[4] // 11000001-00010010-10011001-10110000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h[4]
+// CHECK-ENCODING: [0xb0,0x99,0x12,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11299b0 <unknown>
+
+bfmls za.h[w10, 1, vgx4], {z0.h - z3.h}, z10.h[4] // 11000001-00011010-11011000-00110001
+// CHECK-INST: bfmls za.h[w10, 1, vgx4], { z0.h - z3.h }, z10.h[4]
+// CHECK-ENCODING: [0x31,0xd8,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11ad831 <unknown>
+
+bfmls za.h[w10, 1], {z0.h - z3.h}, z10.h[4] // 11000001-00011010-11011000-00110001
+// CHECK-INST: bfmls za.h[w10, 1, vgx4], { z0.h - z3.h }, z10.h[4]
+// CHECK-ENCODING: [0x31,0xd8,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11ad831 <unknown>
+
+bfmls za.h[w8, 5, vgx4], {z20.h - z23.h}, z14.h[5] // 11000001-00011110-10011010-10111101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z20.h - z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xbd,0x9a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e9abd <unknown>
+
+bfmls za.h[w8, 5], {z20.h - z23.h}, z14.h[5] // 11000001-00011110-10011010-10111101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z20.h - z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xbd,0x9a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11e9abd <unknown>
+
+bfmls za.h[w11, 2, vgx4], {z8.h - z11.h}, z1.h[2] // 11000001-00010001-11110101-00110010
+// CHECK-INST: bfmls za.h[w11, 2, vgx4], { z8.h - z11.h }, z1.h[2]
+// CHECK-ENCODING: [0x32,0xf5,0x11,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c111f532 <unknown>
+
+bfmls za.h[w11, 2], {z8.h - z11.h}, z1.h[2] // 11000001-00010001-11110101-00110010
+// CHECK-INST: bfmls za.h[w11, 2, vgx4], { z8.h - z11.h }, z1.h[2]
+// CHECK-ENCODING: [0x32,0xf5,0x11,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c111f532 <unknown>
+
+bfmls za.h[w9, 7, vgx4], {z12.h - z15.h}, z11.h[4] // 11000001-00011011-10111001-10110111
+// CHECK-INST: bfmls za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h[4]
+// CHECK-ENCODING: [0xb7,0xb9,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11bb9b7 <unknown>
+
+bfmls za.h[w9, 7], {z12.h - z15.h}, z11.h[4] // 11000001-00011011-10111001-10110111
+// CHECK-INST: bfmls za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h[4]
+// CHECK-ENCODING: [0xb7,0xb9,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c11bb9b7 <unknown>
+
+bfmls za.h[w8, 0, vgx4], {z0.h - z3.h}, {z0.h - z3.h} // 11000001-11100001-00010000-00011000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x18,0x10,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e11018 <unknown>
+
+bfmls za.h[w8, 0], {z0.h - z3.h}, {z0.h - z3.h} // 11000001-11100001-00010000-00011000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x18,0x10,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e11018 <unknown>
+
+bfmls za.h[w10, 5, vgx4], {z8.h - z11.h}, {z20.h - z23.h} // 11000001-11110101-01010001-00011101
+// CHECK-INST: bfmls za.h[w10, 5, vgx4], { z8.h - z11.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x1d,0x51,0xf5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f5511d <unknown>
+
+bfmls za.h[w10, 5], {z8.h - z11.h}, {z20.h - z23.h} // 11000001-11110101-01010001-00011101
+// CHECK-INST: bfmls za.h[w10, 5, vgx4], { z8.h - z11.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x1d,0x51,0xf5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f5511d <unknown>
+
+bfmls za.h[w11, 7, vgx4], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-11101001-01110001-10011111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x9f,0x71,0xe9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e9719f <unknown>
+
+bfmls za.h[w11, 7], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-11101001-01110001-10011111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x9f,0x71,0xe9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e9719f <unknown>
+
+bfmls za.h[w11, 7, vgx4], {z28.h - z31.h}, {z28.h - z31.h} // 11000001-11111101-01110011-10011111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x9f,0x73,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd739f <unknown>
+
+bfmls za.h[w11, 7], {z28.h - z31.h}, {z28.h - z31.h} // 11000001-11111101-01110011-10011111
+// CHECK-INST: bfmls za.h[w11, 7, vgx4], { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x9f,0x73,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd739f <unknown>
+
+bfmls za.h[w8, 5, vgx4], {z16.h - z19.h}, {z16.h - z19.h} // 11000001-11110001-00010010-00011101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z16.h - z19.h }, { z16.h - z19.h }
+// CHECK-ENCODING: [0x1d,0x12,0xf1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f1121d <unknown>
+
+bfmls za.h[w8, 5], {z16.h - z19.h}, {z16.h - z19.h} // 11000001-11110001-00010010-00011101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z16.h - z19.h }, { z16.h - z19.h }
+// CHECK-ENCODING: [0x1d,0x12,0xf1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f1121d <unknown>
+
+bfmls za.h[w8, 1, vgx4], {z0.h - z3.h}, {z28.h - z31.h} // 11000001-11111101-00010000-00011001
+// CHECK-INST: bfmls za.h[w8, 1, vgx4], { z0.h - z3.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x19,0x10,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd1019 <unknown>
+
+bfmls za.h[w8, 1], {z0.h - z3.h}, {z28.h - z31.h} // 11000001-11111101-00010000-00011001
+// CHECK-INST: bfmls za.h[w8, 1, vgx4], { z0.h - z3.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x19,0x10,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd1019 <unknown>
+
+bfmls za.h[w10, 0, vgx4], {z16.h - z19.h}, {z20.h - z23.h} // 11000001-11110101-01010010-00011000
+// CHECK-INST: bfmls za.h[w10, 0, vgx4], { z16.h - z19.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x18,0x52,0xf5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f55218 <unknown>
+
+bfmls za.h[w10, 0], {z16.h - z19.h}, {z20.h - z23.h} // 11000001-11110101-01010010-00011000
+// CHECK-INST: bfmls za.h[w10, 0, vgx4], { z16.h - z19.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x18,0x52,0xf5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f55218 <unknown>
+
+bfmls za.h[w8, 0, vgx4], {z12.h - z15.h}, {z0.h - z3.h} // 11000001-11100001-00010001-10011000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z12.h - z15.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x98,0x11,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e11198 <unknown>
+
+bfmls za.h[w8, 0], {z12.h - z15.h}, {z0.h - z3.h} // 11000001-11100001-00010001-10011000
+// CHECK-INST: bfmls za.h[w8, 0, vgx4], { z12.h - z15.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x98,0x11,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e11198 <unknown>
+
+bfmls za.h[w10, 1, vgx4], {z0.h - z3.h}, {z24.h - z27.h} // 11000001-11111001-01010000-00011001
+// CHECK-INST: bfmls za.h[w10, 1, vgx4], { z0.h - z3.h }, { z24.h - z27.h }
+// CHECK-ENCODING: [0x19,0x50,0xf9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f95019 <unknown>
+
+bfmls za.h[w10, 1], {z0.h - z3.h}, {z24.h - z27.h} // 11000001-11111001-01010000-00011001
+// CHECK-INST: bfmls za.h[w10, 1, vgx4], { z0.h - z3.h }, { z24.h - z27.h }
+// CHECK-ENCODING: [0x19,0x50,0xf9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1f95019 <unknown>
+
+bfmls za.h[w8, 5, vgx4], {z20.h - z23.h}, {z28.h - z31.h} // 11000001-11111101-00010010-10011101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z20.h - z23.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x9d,0x12,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd129d <unknown>
+
+bfmls za.h[w8, 5], {z20.h - z23.h}, {z28.h - z31.h} // 11000001-11111101-00010010-10011101
+// CHECK-INST: bfmls za.h[w8, 5, vgx4], { z20.h - z23.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x9d,0x12,0xfd,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1fd129d <unknown>
+
+bfmls za.h[w11, 2, vgx4], {z8.h - z11.h}, {z0.h - z3.h} // 11000001-11100001-01110001-00011010
+// CHECK-INST: bfmls za.h[w11, 2, vgx4], { z8.h - z11.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x1a,0x71,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e1711a <unknown>
+
+bfmls za.h[w11, 2], {z8.h - z11.h}, {z0.h - z3.h} // 11000001-11100001-01110001-00011010
+// CHECK-INST: bfmls za.h[w11, 2, vgx4], { z8.h - z11.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x1a,0x71,0xe1,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e1711a <unknown>
+
+bfmls za.h[w9, 7, vgx4], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-11101001-00110001-10011111
+// CHECK-INST: bfmls za.h[w9, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x9f,0x31,0xe9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e9319f <unknown>
+
+bfmls za.h[w9, 7], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-11101001-00110001-10011111
+// CHECK-INST: bfmls za.h[w9, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x9f,0x31,0xe9,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e9319f <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmopa-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfmopa-diagnostics.s
new file mode 100644
index 000000000000..8b418f4a78cf
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmopa-diagnostics.s
@@ -0,0 +1,35 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid predicate register
+
+bfmopa za1.h, p8/m, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: bfmopa za1.h, p8/m, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmopa za1.h, p5/m, p8/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: bfmopa za1.h, p5/m, p8/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmopa za1.h, p5.h, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: bfmopa za1.h, p5.h, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid matrix operand
+
+bfmopa za2.h, p5/m, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfmopa za2.h, p5/m, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid register suffixes
+
+bfmopa za1.h, p5/m, p5/m, z12.h, z11.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid element width
+// CHECK-NEXT: bfmopa za1.h, p5/m, p5/m, z12.h, z11.b
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmopa.s b/llvm/test/MC/AArch64/SME2p1/bfmopa.s
new file mode 100644
index 000000000000..7a08185f1896
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmopa.s
@@ -0,0 +1,84 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+bfmopa za0.h, p0/m, p0/m, z0.h, z0.h // 10000001-10100000-00000000-00001000
+// CHECK-INST: bfmopa za0.h, p0/m, p0/m, z0.h, z0.h
+// CHECK-ENCODING: [0x08,0x00,0xa0,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81a00008 <unknown>
+
+bfmopa za1.h, p5/m, p2/m, z10.h, z21.h // 10000001-10110101-01010101-01001001
+// CHECK-INST: bfmopa za1.h, p5/m, p2/m, z10.h, z21.h
+// CHECK-ENCODING: [0x49,0x55,0xb5,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81b55549 <unknown>
+
+bfmopa za1.h, p3/m, p7/m, z13.h, z8.h // 10000001-10101000-11101101-10101001
+// CHECK-INST: bfmopa za1.h, p3/m, p7/m, z13.h, z8.h
+// CHECK-ENCODING: [0xa9,0xed,0xa8,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81a8eda9 <unknown>
+
+bfmopa za1.h, p7/m, p7/m, z31.h, z31.h // 10000001-10111111-11111111-11101001
+// CHECK-INST: bfmopa za1.h, p7/m, p7/m, z31.h, z31.h
+// CHECK-ENCODING: [0xe9,0xff,0xbf,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81bfffe9 <unknown>
+
+bfmopa za1.h, p3/m, p0/m, z17.h, z16.h // 10000001-10110000-00001110-00101001
+// CHECK-INST: bfmopa za1.h, p3/m, p0/m, z17.h, z16.h
+// CHECK-ENCODING: [0x29,0x0e,0xb0,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81b00e29 <unknown>
+
+bfmopa za1.h, p1/m, p4/m, z1.h, z30.h // 10000001-10111110-10000100-00101001
+// CHECK-INST: bfmopa za1.h, p1/m, p4/m, z1.h, z30.h
+// CHECK-ENCODING: [0x29,0x84,0xbe,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81be8429 <unknown>
+
+bfmopa za0.h, p5/m, p2/m, z19.h, z20.h // 10000001-10110100-01010110-01101000
+// CHECK-INST: bfmopa za0.h, p5/m, p2/m, z19.h, z20.h
+// CHECK-ENCODING: [0x68,0x56,0xb4,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81b45668 <unknown>
+
+bfmopa za0.h, p6/m, p0/m, z12.h, z2.h // 10000001-10100010-00011001-10001000
+// CHECK-INST: bfmopa za0.h, p6/m, p0/m, z12.h, z2.h
+// CHECK-ENCODING: [0x88,0x19,0xa2,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81a21988 <unknown>
+
+bfmopa za1.h, p2/m, p6/m, z1.h, z26.h // 10000001-10111010-11001000-00101001
+// CHECK-INST: bfmopa za1.h, p2/m, p6/m, z1.h, z26.h
+// CHECK-ENCODING: [0x29,0xc8,0xba,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81bac829 <unknown>
+
+bfmopa za1.h, p2/m, p0/m, z22.h, z30.h // 10000001-10111110-00001010-11001001
+// CHECK-INST: bfmopa za1.h, p2/m, p0/m, z22.h, z30.h
+// CHECK-ENCODING: [0xc9,0x0a,0xbe,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81be0ac9 <unknown>
+
+bfmopa za0.h, p5/m, p7/m, z9.h, z1.h // 10000001-10100001-11110101-00101000
+// CHECK-INST: bfmopa za0.h, p5/m, p7/m, z9.h, z1.h
+// CHECK-ENCODING: [0x28,0xf5,0xa1,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81a1f528 <unknown>
+
+bfmopa za1.h, p2/m, p5/m, z12.h, z11.h // 10000001-10101011-10101001-10001001
+// CHECK-INST: bfmopa za1.h, p2/m, p5/m, z12.h, z11.h
+// CHECK-ENCODING: [0x89,0xa9,0xab,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81aba989 <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmops-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfmops-diagnostics.s
new file mode 100644
index 000000000000..84275aff7091
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmops-diagnostics.s
@@ -0,0 +1,35 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid predicate register
+
+bfmops za1.h, p8/m, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: bfmops za1.h, p8/m, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmops za1.h, p5/m, p8/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: bfmops za1.h, p5/m, p8/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfmops za1.h, p5.h, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: bfmops za1.h, p5.h, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid matrix operand
+
+bfmops za2.h, p5/m, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfmops za2.h, p5/m, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid register suffixes
+
+bfmops za1.h, p5/m, p5/m, z12.h, z11.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid element width
+// CHECK-NEXT: bfmops za1.h, p5/m, p5/m, z12.h, z11.b
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/bfmops.s b/llvm/test/MC/AArch64/SME2p1/bfmops.s
new file mode 100644
index 000000000000..380c65f1c7cc
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfmops.s
@@ -0,0 +1,84 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+bfmops za0.h, p0/m, p0/m, z0.h, z0.h // 10000001-10100000-00000000-00011000
+// CHECK-INST: bfmops za0.h, p0/m, p0/m, z0.h, z0.h
+// CHECK-ENCODING: [0x18,0x00,0xa0,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81a00018 <unknown>
+
+bfmops za1.h, p5/m, p2/m, z10.h, z21.h // 10000001-10110101-01010101-01011001
+// CHECK-INST: bfmops za1.h, p5/m, p2/m, z10.h, z21.h
+// CHECK-ENCODING: [0x59,0x55,0xb5,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81b55559 <unknown>
+
+bfmops za1.h, p3/m, p7/m, z13.h, z8.h // 10000001-10101000-11101101-10111001
+// CHECK-INST: bfmops za1.h, p3/m, p7/m, z13.h, z8.h
+// CHECK-ENCODING: [0xb9,0xed,0xa8,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81a8edb9 <unknown>
+
+bfmops za1.h, p7/m, p7/m, z31.h, z31.h // 10000001-10111111-11111111-11111001
+// CHECK-INST: bfmops za1.h, p7/m, p7/m, z31.h, z31.h
+// CHECK-ENCODING: [0xf9,0xff,0xbf,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81bffff9 <unknown>
+
+bfmops za1.h, p3/m, p0/m, z17.h, z16.h // 10000001-10110000-00001110-00111001
+// CHECK-INST: bfmops za1.h, p3/m, p0/m, z17.h, z16.h
+// CHECK-ENCODING: [0x39,0x0e,0xb0,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81b00e39 <unknown>
+
+bfmops za1.h, p1/m, p4/m, z1.h, z30.h // 10000001-10111110-10000100-00111001
+// CHECK-INST: bfmops za1.h, p1/m, p4/m, z1.h, z30.h
+// CHECK-ENCODING: [0x39,0x84,0xbe,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81be8439 <unknown>
+
+bfmops za0.h, p5/m, p2/m, z19.h, z20.h // 10000001-10110100-01010110-01111000
+// CHECK-INST: bfmops za0.h, p5/m, p2/m, z19.h, z20.h
+// CHECK-ENCODING: [0x78,0x56,0xb4,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81b45678 <unknown>
+
+bfmops za0.h, p6/m, p0/m, z12.h, z2.h // 10000001-10100010-00011001-10011000
+// CHECK-INST: bfmops za0.h, p6/m, p0/m, z12.h, z2.h
+// CHECK-ENCODING: [0x98,0x19,0xa2,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81a21998 <unknown>
+
+bfmops za1.h, p2/m, p6/m, z1.h, z26.h // 10000001-10111010-11001000-00111001
+// CHECK-INST: bfmops za1.h, p2/m, p6/m, z1.h, z26.h
+// CHECK-ENCODING: [0x39,0xc8,0xba,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81bac839 <unknown>
+
+bfmops za1.h, p2/m, p0/m, z22.h, z30.h // 10000001-10111110-00001010-11011001
+// CHECK-INST: bfmops za1.h, p2/m, p0/m, z22.h, z30.h
+// CHECK-ENCODING: [0xd9,0x0a,0xbe,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81be0ad9 <unknown>
+
+bfmops za0.h, p5/m, p7/m, z9.h, z1.h // 10000001-10100001-11110101-00111000
+// CHECK-INST: bfmops za0.h, p5/m, p7/m, z9.h, z1.h
+// CHECK-ENCODING: [0x38,0xf5,0xa1,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81a1f538 <unknown>
+
+bfmops za1.h, p2/m, p5/m, z12.h, z11.h // 10000001-10101011-10101001-10011001
+// CHECK-INST: bfmops za1.h, p2/m, p5/m, z12.h, z11.h
+// CHECK-ENCODING: [0x99,0xa9,0xab,0x81]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: 81aba999 <unknown>
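Note on reading the test vectors above: each instruction carries three views of the same 32-bit word. These are the binary comment (most significant byte first), the CHECK-ENCODING byte list (little-endian, as emitted by -show-encoding), and the CHECK-UNKNOWN hex word printed by llvm-objdump. The sketch below cross-checks the three for the first bfmops vector. It is not part of the patch, and the helper names are hypothetical.

```python
# Illustrative only: helper names are made up for this note, not LLVM APIs.

def word_from_bits(bits: str) -> int:
    """Parse a comment such as '10000001-10100000-...' (MSB first)."""
    return int(bits.replace("-", ""), 2)


def word_from_encoding(enc: str) -> int:
    """Parse a CHECK-ENCODING list such as '[0x18,0x00,0xa0,0x81]'.

    The bytes are listed in memory order, which is little-endian here.
    """
    data = bytes(int(b, 16) for b in enc.strip("[]").split(","))
    return int.from_bytes(data, "little")


bits = "10000001-10100000-00000000-00011000"
encoding = "[0x18,0x00,0xa0,0x81]"
unknown = "81a00018"  # hex word from the CHECK-UNKNOWN line

assert word_from_bits(bits) == word_from_encoding(encoding) == int(unknown, 16)
```

The same check can be applied to any other vector in these files by substituting its comment, byte list, and hex word.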
diff --git a/llvm/test/MC/AArch64/SME2p1/bfsub-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/bfsub-diagnostics.s
new file mode 100644
index 000000000000..9d3680ea560e
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfsub-diagnostics.s
@@ -0,0 +1,53 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Out of range index offset
+
+bfsub za.h[w8, 8], {z20.h-z21.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: bfsub za.h[w8, 8], {z20.h-z21.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfsub za.h[w8, -1, vgx4], {z0.h-z3.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: bfsub za.h[w8, -1, vgx4], {z0.h-z3.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select register
+
+bfsub za.h[w7, 0], {z20.h-z21.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: bfsub za.h[w7, 0], {z20.h-z21.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfsub za.h[w12, 0, vgx4], {z20.h-z23.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: bfsub za.h[w12, 0, vgx4], {z20.h-z23.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+bfsub za.h[w8, 3], {z20.h-z22.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfsub za.h[w8, 3], {z20.h-z22.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfsub za.h[w8, 3, vgx4], {z21.h-z24.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: bfsub za.h[w8, 3, vgx4], {z21.h-z24.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid suffixes
+
+bfsub za.h[w8, 3, vgx4], {z20.s-z23.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: bfsub za.h[w8, 3, vgx4], {z20.s-z23.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+bfsub za.d[w8, 3, vgx4], {z20.h-z23.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .h
+// CHECK-NEXT: bfsub za.d[w8, 3, vgx4], {z20.h-z23.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
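Note on the operand rules exercised above: the diagnostics cover three constraints on the ZA array operand and its vector list. The vector select register must be w8 to w11, the offset immediate must be 0 to 7, and a list of N vectors must start at a register number that is a multiple of N. The sketch below restates those rules as a small checker. It is an assumption-laden illustration, not the logic in the AArch64 assembly parser: the function name and the shortened vector-list message are hypothetical, and the real assembler reports one diagnostic per instruction rather than a list.

```python
# Illustrative only: mirrors the constraints the tests exercise, not LLVM code.

def check_za_vector_operands(select_reg, offset, first_vec, count):
    """Return the list of rule violations for za.h[wN, off, vgxN], {zA - zB}.

    select_reg: number of the w register used as vector select
    offset:     immediate offset into the ZA array
    first_vec:  number of the first z register in the list
    count:      number of vectors in the list (2 or 4)
    """
    errors = []
    if not 8 <= select_reg <= 11:
        errors.append("operand must be a register in range [w8, w11]")
    if not 0 <= offset <= 7:
        errors.append("immediate must be an integer in range [0, 7].")
    if count not in (2, 4) or first_vec % count != 0:
        # Shortened, hypothetical wording for the vector-list diagnostics.
        errors.append("invalid vector list")
    return errors


# bfsub za.h[w8, 3, vgx4], {z21.h-z24.h}: z21 is not a multiple of 4.
print(check_za_vector_operands(8, 3, 21, 4))
```

Element-size suffix checks (for example za.d or z20.s with a .h instruction) are deliberately left out of this sketch.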
diff --git a/llvm/test/MC/AArch64/SME2p1/bfsub.s b/llvm/test/MC/AArch64/SME2p1/bfsub.s
new file mode 100644
index 000000000000..dac5f97c9617
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/bfsub.s
@@ -0,0 +1,300 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+b16b16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+b16b16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+b16b16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+bfsub za.h[w8, 0, vgx2], {z0.h, z1.h} // 11000001-11100100-00011100-00001000
+// CHECK-INST: bfsub za.h[w8, 0, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x08,0x1c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41c08 <unknown>
+
+bfsub za.h[w8, 0], {z0.h - z1.h} // 11000001-11100100-00011100-00001000
+// CHECK-INST: bfsub za.h[w8, 0, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x08,0x1c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41c08 <unknown>
+
+bfsub za.h[w10, 5, vgx2], {z10.h, z11.h} // 11000001-11100100-01011101-01001101
+// CHECK-INST: bfsub za.h[w10, 5, vgx2], { z10.h, z11.h }
+// CHECK-ENCODING: [0x4d,0x5d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45d4d <unknown>
+
+bfsub za.h[w10, 5], {z10.h - z11.h} // 11000001-11100100-01011101-01001101
+// CHECK-INST: bfsub za.h[w10, 5, vgx2], { z10.h, z11.h }
+// CHECK-ENCODING: [0x4d,0x5d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45d4d <unknown>
+
+bfsub za.h[w11, 7, vgx2], {z12.h, z13.h} // 11000001-11100100-01111101-10001111
+// CHECK-INST: bfsub za.h[w11, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x8f,0x7d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47d8f <unknown>
+
+bfsub za.h[w11, 7], {z12.h - z13.h} // 11000001-11100100-01111101-10001111
+// CHECK-INST: bfsub za.h[w11, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x8f,0x7d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47d8f <unknown>
+
+bfsub za.h[w11, 7, vgx2], {z30.h, z31.h} // 11000001-11100100-01111111-11001111
+// CHECK-INST: bfsub za.h[w11, 7, vgx2], { z30.h, z31.h }
+// CHECK-ENCODING: [0xcf,0x7f,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47fcf <unknown>
+
+bfsub za.h[w11, 7], {z30.h - z31.h} // 11000001-11100100-01111111-11001111
+// CHECK-INST: bfsub za.h[w11, 7, vgx2], { z30.h, z31.h }
+// CHECK-ENCODING: [0xcf,0x7f,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47fcf <unknown>
+
+bfsub za.h[w8, 5, vgx2], {z16.h, z17.h} // 11000001-11100100-00011110-00001101
+// CHECK-INST: bfsub za.h[w8, 5, vgx2], { z16.h, z17.h }
+// CHECK-ENCODING: [0x0d,0x1e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41e0d <unknown>
+
+bfsub za.h[w8, 5], {z16.h - z17.h} // 11000001-11100100-00011110-00001101
+// CHECK-INST: bfsub za.h[w8, 5, vgx2], { z16.h, z17.h }
+// CHECK-ENCODING: [0x0d,0x1e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41e0d <unknown>
+
+bfsub za.h[w8, 1, vgx2], {z0.h, z1.h} // 11000001-11100100-00011100-00001001
+// CHECK-INST: bfsub za.h[w8, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x09,0x1c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41c09 <unknown>
+
+bfsub za.h[w8, 1], {z0.h - z1.h} // 11000001-11100100-00011100-00001001
+// CHECK-INST: bfsub za.h[w8, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x09,0x1c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41c09 <unknown>
+
+bfsub za.h[w10, 0, vgx2], {z18.h, z19.h} // 11000001-11100100-01011110-01001000
+// CHECK-INST: bfsub za.h[w10, 0, vgx2], { z18.h, z19.h }
+// CHECK-ENCODING: [0x48,0x5e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45e48 <unknown>
+
+bfsub za.h[w10, 0], {z18.h - z19.h} // 11000001-11100100-01011110-01001000
+// CHECK-INST: bfsub za.h[w10, 0, vgx2], { z18.h, z19.h }
+// CHECK-ENCODING: [0x48,0x5e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45e48 <unknown>
+
+bfsub za.h[w8, 0, vgx2], {z12.h, z13.h} // 11000001-11100100-00011101-10001000
+// CHECK-INST: bfsub za.h[w8, 0, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x88,0x1d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41d88 <unknown>
+
+bfsub za.h[w8, 0], {z12.h - z13.h} // 11000001-11100100-00011101-10001000
+// CHECK-INST: bfsub za.h[w8, 0, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x88,0x1d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41d88 <unknown>
+
+bfsub za.h[w10, 1, vgx2], {z0.h, z1.h} // 11000001-11100100-01011100-00001001
+// CHECK-INST: bfsub za.h[w10, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x09,0x5c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45c09 <unknown>
+
+bfsub za.h[w10, 1], {z0.h - z1.h} // 11000001-11100100-01011100-00001001
+// CHECK-INST: bfsub za.h[w10, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x09,0x5c,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e45c09 <unknown>
+
+bfsub za.h[w8, 5, vgx2], {z22.h, z23.h} // 11000001-11100100-00011110-11001101
+// CHECK-INST: bfsub za.h[w8, 5, vgx2], { z22.h, z23.h }
+// CHECK-ENCODING: [0xcd,0x1e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41ecd <unknown>
+
+bfsub za.h[w8, 5], {z22.h - z23.h} // 11000001-11100100-00011110-11001101
+// CHECK-INST: bfsub za.h[w8, 5, vgx2], { z22.h, z23.h }
+// CHECK-ENCODING: [0xcd,0x1e,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e41ecd <unknown>
+
+bfsub za.h[w11, 2, vgx2], {z8.h, z9.h} // 11000001-11100100-01111101-00001010
+// CHECK-INST: bfsub za.h[w11, 2, vgx2], { z8.h, z9.h }
+// CHECK-ENCODING: [0x0a,0x7d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47d0a <unknown>
+
+bfsub za.h[w11, 2], {z8.h - z9.h} // 11000001-11100100-01111101-00001010
+// CHECK-INST: bfsub za.h[w11, 2, vgx2], { z8.h, z9.h }
+// CHECK-ENCODING: [0x0a,0x7d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e47d0a <unknown>
+
+bfsub za.h[w9, 7, vgx2], {z12.h, z13.h} // 11000001-11100100-00111101-10001111
+// CHECK-INST: bfsub za.h[w9, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x8f,0x3d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e43d8f <unknown>
+
+bfsub za.h[w9, 7], {z12.h - z13.h} // 11000001-11100100-00111101-10001111
+// CHECK-INST: bfsub za.h[w9, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x8f,0x3d,0xe4,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e43d8f <unknown>
+
+bfsub za.h[w8, 0, vgx4], {z0.h - z3.h} // 11000001-11100101-00011100-00001000
+// CHECK-INST: bfsub za.h[w8, 0, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x08,0x1c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51c08 <unknown>
+
+bfsub za.h[w8, 0], {z0.h - z3.h} // 11000001-11100101-00011100-00001000
+// CHECK-INST: bfsub za.h[w8, 0, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x08,0x1c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51c08 <unknown>
+
+bfsub za.h[w10, 5, vgx4], {z8.h - z11.h} // 11000001-11100101-01011101-00001101
+// CHECK-INST: bfsub za.h[w10, 5, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x0d,0x5d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55d0d <unknown>
+
+bfsub za.h[w10, 5], {z8.h - z11.h} // 11000001-11100101-01011101-00001101
+// CHECK-INST: bfsub za.h[w10, 5, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x0d,0x5d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55d0d <unknown>
+
+bfsub za.h[w11, 7, vgx4], {z12.h - z15.h} // 11000001-11100101-01111101-10001111
+// CHECK-INST: bfsub za.h[w11, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x8f,0x7d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57d8f <unknown>
+
+bfsub za.h[w11, 7], {z12.h - z15.h} // 11000001-11100101-01111101-10001111
+// CHECK-INST: bfsub za.h[w11, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x8f,0x7d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57d8f <unknown>
+
+bfsub za.h[w11, 7, vgx4], {z28.h - z31.h} // 11000001-11100101-01111111-10001111
+// CHECK-INST: bfsub za.h[w11, 7, vgx4], { z28.h - z31.h }
+// CHECK-ENCODING: [0x8f,0x7f,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57f8f <unknown>
+
+bfsub za.h[w11, 7], {z28.h - z31.h} // 11000001-11100101-01111111-10001111
+// CHECK-INST: bfsub za.h[w11, 7, vgx4], { z28.h - z31.h }
+// CHECK-ENCODING: [0x8f,0x7f,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57f8f <unknown>
+
+bfsub za.h[w8, 5, vgx4], {z16.h - z19.h} // 11000001-11100101-00011110-00001101
+// CHECK-INST: bfsub za.h[w8, 5, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x0d,0x1e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51e0d <unknown>
+
+bfsub za.h[w8, 5], {z16.h - z19.h} // 11000001-11100101-00011110-00001101
+// CHECK-INST: bfsub za.h[w8, 5, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x0d,0x1e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51e0d <unknown>
+
+bfsub za.h[w8, 1, vgx4], {z0.h - z3.h} // 11000001-11100101-00011100-00001001
+// CHECK-INST: bfsub za.h[w8, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x09,0x1c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51c09 <unknown>
+
+bfsub za.h[w8, 1], {z0.h - z3.h} // 11000001-11100101-00011100-00001001
+// CHECK-INST: bfsub za.h[w8, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x09,0x1c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51c09 <unknown>
+
+bfsub za.h[w10, 0, vgx4], {z16.h - z19.h} // 11000001-11100101-01011110-00001000
+// CHECK-INST: bfsub za.h[w10, 0, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x08,0x5e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55e08 <unknown>
+
+bfsub za.h[w10, 0], {z16.h - z19.h} // 11000001-11100101-01011110-00001000
+// CHECK-INST: bfsub za.h[w10, 0, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x08,0x5e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55e08 <unknown>
+
+bfsub za.h[w8, 0, vgx4], {z12.h - z15.h} // 11000001-11100101-00011101-10001000
+// CHECK-INST: bfsub za.h[w8, 0, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x88,0x1d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51d88 <unknown>
+
+bfsub za.h[w8, 0], {z12.h - z15.h} // 11000001-11100101-00011101-10001000
+// CHECK-INST: bfsub za.h[w8, 0, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x88,0x1d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51d88 <unknown>
+
+bfsub za.h[w10, 1, vgx4], {z0.h - z3.h} // 11000001-11100101-01011100-00001001
+// CHECK-INST: bfsub za.h[w10, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x09,0x5c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55c09 <unknown>
+
+bfsub za.h[w10, 1], {z0.h - z3.h} // 11000001-11100101-01011100-00001001
+// CHECK-INST: bfsub za.h[w10, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x09,0x5c,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e55c09 <unknown>
+
+bfsub za.h[w8, 5, vgx4], {z20.h - z23.h} // 11000001-11100101-00011110-10001101
+// CHECK-INST: bfsub za.h[w8, 5, vgx4], { z20.h - z23.h }
+// CHECK-ENCODING: [0x8d,0x1e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51e8d <unknown>
+
+bfsub za.h[w8, 5], {z20.h - z23.h} // 11000001-11100101-00011110-10001101
+// CHECK-INST: bfsub za.h[w8, 5, vgx4], { z20.h - z23.h }
+// CHECK-ENCODING: [0x8d,0x1e,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e51e8d <unknown>
+
+bfsub za.h[w11, 2, vgx4], {z8.h - z11.h} // 11000001-11100101-01111101-00001010
+// CHECK-INST: bfsub za.h[w11, 2, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x0a,0x7d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57d0a <unknown>
+
+bfsub za.h[w11, 2], {z8.h - z11.h} // 11000001-11100101-01111101-00001010
+// CHECK-INST: bfsub za.h[w11, 2, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x0a,0x7d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e57d0a <unknown>
+
+bfsub za.h[w9, 7, vgx4], {z12.h - z15.h} // 11000001-11100101-00111101-10001111
+// CHECK-INST: bfsub za.h[w9, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x8f,0x3d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e53d8f <unknown>
+
+bfsub za.h[w9, 7], {z12.h - z15.h} // 11000001-11100101-00111101-10001111
+// CHECK-INST: bfsub za.h[w9, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x8f,0x3d,0xe5,0xc1]
+// CHECK-ERROR: instruction requires: b16b16 sme2p1
+// CHECK-UNKNOWN: c1e53d8f <unknown>
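Note on the bfsub vgx2 encodings above: comparing the test vectors suggests where the operands sit in the instruction word. The vector select register appears in bits 14:13 (as an offset from w8), the offset immediate in bits 2:0, and the first source register divided by two in bits 9:6. This layout is inferred from the vectors in this test file only, not taken from the architecture specification or from the TableGen definitions, so treat the sketch below as an assumption. The function name is hypothetical and the opcode bits are not validated.

```python
# Illustrative only: field positions are inferred from the test vectors.

def decode_bfsub_vgx2(word: int) -> str:
    """Render the operands of a two-vector bfsub from its 32-bit word."""
    select = 8 + ((word >> 13) & 0b11)   # w8..w11
    offset = word & 0b111                # 0..7
    first = ((word >> 6) & 0b1111) * 2   # even-numbered first z register
    return (
        f"bfsub za.h[w{select}, {offset}, vgx2], "
        f"{{ z{first}.h, z{first + 1}.h }}"
    )


# Word taken from the CHECK-UNKNOWN line of one of the vectors above.
print(decode_bfsub_vgx2(0xC1E45D4D))
```

Feeding in the hex word from any vgx2 CHECK-UNKNOWN line should reproduce the matching CHECK-INST text; the vgx4 forms use a different register field width and are not handled here.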
diff --git a/llvm/test/MC/AArch64/SME2p1/fadd-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/fadd-diagnostics.s
new file mode 100644
index 000000000000..c13a1be05b1c
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fadd-diagnostics.s
@@ -0,0 +1,45 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Out of range index offset
+
+fadd za.d[w8, 8, vgx2], {z0.d-z1.d}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .s
+// CHECK-NEXT: fadd za.d[w8, 8, vgx2], {z0.d-z1.d}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fadd za.s[w8, -1, vgx4], {z0.s-z3.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: fadd za.s[w8, -1, vgx4], {z0.s-z3.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select register
+
+fadd za.h[w7, 7, vgx4], {z0.h-z3.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: fadd za.h[w7, 7, vgx4], {z0.h-z3.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fadd za.s[w12, 7, vgx2], {z0.s-z1.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: fadd za.s[w12, 7, vgx2], {z0.s-z1.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+fadd za.d[w8, 0, vgx4], {z0.d-z4.d}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid number of vectors
+// CHECK-NEXT: fadd za.d[w8, 0, vgx4], {z0.d-z4.d}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fadd za.h[w8, 0, vgx2], {z1.h-z2.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: fadd za.h[w8, 0, vgx2], {z1.h-z2.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fadd za.s[w8, 0, vgx4], {z1.s-z4.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: fadd za.s[w8, 0, vgx4], {z1.s-z4.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/fadd.s b/llvm/test/MC/AArch64/SME2p1/fadd.s
new file mode 100644
index 000000000000..a8e64a63dbdb
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fadd.s
@@ -0,0 +1,300 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+sme-f16f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+sme-f16f16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+fadd za.h[w8, 0, vgx2], {z0.h, z1.h} // 11000001-10100100-00011100-00000000
+// CHECK-INST: fadd za.h[w8, 0, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x00,0x1c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41c00 <unknown>
+
+fadd za.h[w8, 0], {z0.h - z1.h} // 11000001-10100100-00011100-00000000
+// CHECK-INST: fadd za.h[w8, 0, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x00,0x1c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41c00 <unknown>
+
+fadd za.h[w10, 5, vgx2], {z10.h, z11.h} // 11000001-10100100-01011101-01000101
+// CHECK-INST: fadd za.h[w10, 5, vgx2], { z10.h, z11.h }
+// CHECK-ENCODING: [0x45,0x5d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45d45 <unknown>
+
+fadd za.h[w10, 5], {z10.h - z11.h} // 11000001-10100100-01011101-01000101
+// CHECK-INST: fadd za.h[w10, 5, vgx2], { z10.h, z11.h }
+// CHECK-ENCODING: [0x45,0x5d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45d45 <unknown>
+
+fadd za.h[w11, 7, vgx2], {z12.h, z13.h} // 11000001-10100100-01111101-10000111
+// CHECK-INST: fadd za.h[w11, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x87,0x7d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47d87 <unknown>
+
+fadd za.h[w11, 7], {z12.h - z13.h} // 11000001-10100100-01111101-10000111
+// CHECK-INST: fadd za.h[w11, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x87,0x7d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47d87 <unknown>
+
+fadd za.h[w11, 7, vgx2], {z30.h, z31.h} // 11000001-10100100-01111111-11000111
+// CHECK-INST: fadd za.h[w11, 7, vgx2], { z30.h, z31.h }
+// CHECK-ENCODING: [0xc7,0x7f,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47fc7 <unknown>
+
+fadd za.h[w11, 7], {z30.h - z31.h} // 11000001-10100100-01111111-11000111
+// CHECK-INST: fadd za.h[w11, 7, vgx2], { z30.h, z31.h }
+// CHECK-ENCODING: [0xc7,0x7f,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47fc7 <unknown>
+
+fadd za.h[w8, 5, vgx2], {z16.h, z17.h} // 11000001-10100100-00011110-00000101
+// CHECK-INST: fadd za.h[w8, 5, vgx2], { z16.h, z17.h }
+// CHECK-ENCODING: [0x05,0x1e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41e05 <unknown>
+
+fadd za.h[w8, 5], {z16.h - z17.h} // 11000001-10100100-00011110-00000101
+// CHECK-INST: fadd za.h[w8, 5, vgx2], { z16.h, z17.h }
+// CHECK-ENCODING: [0x05,0x1e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41e05 <unknown>
+
+fadd za.h[w8, 1, vgx2], {z0.h, z1.h} // 11000001-10100100-00011100-00000001
+// CHECK-INST: fadd za.h[w8, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x01,0x1c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41c01 <unknown>
+
+fadd za.h[w8, 1], {z0.h - z1.h} // 11000001-10100100-00011100-00000001
+// CHECK-INST: fadd za.h[w8, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x01,0x1c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41c01 <unknown>
+
+fadd za.h[w10, 0, vgx2], {z18.h, z19.h} // 11000001-10100100-01011110-01000000
+// CHECK-INST: fadd za.h[w10, 0, vgx2], { z18.h, z19.h }
+// CHECK-ENCODING: [0x40,0x5e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45e40 <unknown>
+
+fadd za.h[w10, 0], {z18.h - z19.h} // 11000001-10100100-01011110-01000000
+// CHECK-INST: fadd za.h[w10, 0, vgx2], { z18.h, z19.h }
+// CHECK-ENCODING: [0x40,0x5e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45e40 <unknown>
+
+fadd za.h[w8, 0, vgx2], {z12.h, z13.h} // 11000001-10100100-00011101-10000000
+// CHECK-INST: fadd za.h[w8, 0, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x80,0x1d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41d80 <unknown>
+
+fadd za.h[w8, 0], {z12.h - z13.h} // 11000001-10100100-00011101-10000000
+// CHECK-INST: fadd za.h[w8, 0, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x80,0x1d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41d80 <unknown>
+
+fadd za.h[w10, 1, vgx2], {z0.h, z1.h} // 11000001-10100100-01011100-00000001
+// CHECK-INST: fadd za.h[w10, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x01,0x5c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45c01 <unknown>
+
+fadd za.h[w10, 1], {z0.h - z1.h} // 11000001-10100100-01011100-00000001
+// CHECK-INST: fadd za.h[w10, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x01,0x5c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45c01 <unknown>
+
+fadd za.h[w8, 5, vgx2], {z22.h, z23.h} // 11000001-10100100-00011110-11000101
+// CHECK-INST: fadd za.h[w8, 5, vgx2], { z22.h, z23.h }
+// CHECK-ENCODING: [0xc5,0x1e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41ec5 <unknown>
+
+fadd za.h[w8, 5], {z22.h - z23.h} // 11000001-10100100-00011110-11000101
+// CHECK-INST: fadd za.h[w8, 5, vgx2], { z22.h, z23.h }
+// CHECK-ENCODING: [0xc5,0x1e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41ec5 <unknown>
+
+fadd za.h[w11, 2, vgx2], {z8.h, z9.h} // 11000001-10100100-01111101-00000010
+// CHECK-INST: fadd za.h[w11, 2, vgx2], { z8.h, z9.h }
+// CHECK-ENCODING: [0x02,0x7d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47d02 <unknown>
+
+fadd za.h[w11, 2], {z8.h - z9.h} // 11000001-10100100-01111101-00000010
+// CHECK-INST: fadd za.h[w11, 2, vgx2], { z8.h, z9.h }
+// CHECK-ENCODING: [0x02,0x7d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47d02 <unknown>
+
+fadd za.h[w9, 7, vgx2], {z12.h, z13.h} // 11000001-10100100-00111101-10000111
+// CHECK-INST: fadd za.h[w9, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x87,0x3d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a43d87 <unknown>
+
+fadd za.h[w9, 7], {z12.h - z13.h} // 11000001-10100100-00111101-10000111
+// CHECK-INST: fadd za.h[w9, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x87,0x3d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a43d87 <unknown>
+
+fadd za.h[w8, 0, vgx4], {z0.h - z3.h} // 11000001-10100101-00011100-00000000
+// CHECK-INST: fadd za.h[w8, 0, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x00,0x1c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51c00 <unknown>
+
+fadd za.h[w8, 0], {z0.h - z3.h} // 11000001-10100101-00011100-00000000
+// CHECK-INST: fadd za.h[w8, 0, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x00,0x1c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51c00 <unknown>
+
+fadd za.h[w10, 5, vgx4], {z8.h - z11.h} // 11000001-10100101-01011101-00000101
+// CHECK-INST: fadd za.h[w10, 5, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x05,0x5d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55d05 <unknown>
+
+fadd za.h[w10, 5], {z8.h - z11.h} // 11000001-10100101-01011101-00000101
+// CHECK-INST: fadd za.h[w10, 5, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x05,0x5d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55d05 <unknown>
+
+fadd za.h[w11, 7, vgx4], {z12.h - z15.h} // 11000001-10100101-01111101-10000111
+// CHECK-INST: fadd za.h[w11, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x87,0x7d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57d87 <unknown>
+
+fadd za.h[w11, 7], {z12.h - z15.h} // 11000001-10100101-01111101-10000111
+// CHECK-INST: fadd za.h[w11, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x87,0x7d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57d87 <unknown>
+
+fadd za.h[w11, 7, vgx4], {z28.h - z31.h} // 11000001-10100101-01111111-10000111
+// CHECK-INST: fadd za.h[w11, 7, vgx4], { z28.h - z31.h }
+// CHECK-ENCODING: [0x87,0x7f,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57f87 <unknown>
+
+fadd za.h[w11, 7], {z28.h - z31.h} // 11000001-10100101-01111111-10000111
+// CHECK-INST: fadd za.h[w11, 7, vgx4], { z28.h - z31.h }
+// CHECK-ENCODING: [0x87,0x7f,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57f87 <unknown>
+
+fadd za.h[w8, 5, vgx4], {z16.h - z19.h} // 11000001-10100101-00011110-00000101
+// CHECK-INST: fadd za.h[w8, 5, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x05,0x1e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51e05 <unknown>
+
+fadd za.h[w8, 5], {z16.h - z19.h} // 11000001-10100101-00011110-00000101
+// CHECK-INST: fadd za.h[w8, 5, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x05,0x1e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51e05 <unknown>
+
+fadd za.h[w8, 1, vgx4], {z0.h - z3.h} // 11000001-10100101-00011100-00000001
+// CHECK-INST: fadd za.h[w8, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x01,0x1c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51c01 <unknown>
+
+fadd za.h[w8, 1], {z0.h - z3.h} // 11000001-10100101-00011100-00000001
+// CHECK-INST: fadd za.h[w8, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x01,0x1c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51c01 <unknown>
+
+fadd za.h[w10, 0, vgx4], {z16.h - z19.h} // 11000001-10100101-01011110-00000000
+// CHECK-INST: fadd za.h[w10, 0, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x00,0x5e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55e00 <unknown>
+
+fadd za.h[w10, 0], {z16.h - z19.h} // 11000001-10100101-01011110-00000000
+// CHECK-INST: fadd za.h[w10, 0, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x00,0x5e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55e00 <unknown>
+
+fadd za.h[w8, 0, vgx4], {z12.h - z15.h} // 11000001-10100101-00011101-10000000
+// CHECK-INST: fadd za.h[w8, 0, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x80,0x1d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51d80 <unknown>
+
+fadd za.h[w8, 0], {z12.h - z15.h} // 11000001-10100101-00011101-10000000
+// CHECK-INST: fadd za.h[w8, 0, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x80,0x1d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51d80 <unknown>
+
+fadd za.h[w10, 1, vgx4], {z0.h - z3.h} // 11000001-10100101-01011100-00000001
+// CHECK-INST: fadd za.h[w10, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x01,0x5c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55c01 <unknown>
+
+fadd za.h[w10, 1], {z0.h - z3.h} // 11000001-10100101-01011100-00000001
+// CHECK-INST: fadd za.h[w10, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x01,0x5c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55c01 <unknown>
+
+fadd za.h[w8, 5, vgx4], {z20.h - z23.h} // 11000001-10100101-00011110-10000101
+// CHECK-INST: fadd za.h[w8, 5, vgx4], { z20.h - z23.h }
+// CHECK-ENCODING: [0x85,0x1e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51e85 <unknown>
+
+fadd za.h[w8, 5], {z20.h - z23.h} // 11000001-10100101-00011110-10000101
+// CHECK-INST: fadd za.h[w8, 5, vgx4], { z20.h - z23.h }
+// CHECK-ENCODING: [0x85,0x1e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51e85 <unknown>
+
+fadd za.h[w11, 2, vgx4], {z8.h - z11.h} // 11000001-10100101-01111101-00000010
+// CHECK-INST: fadd za.h[w11, 2, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x02,0x7d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57d02 <unknown>
+
+fadd za.h[w11, 2], {z8.h - z11.h} // 11000001-10100101-01111101-00000010
+// CHECK-INST: fadd za.h[w11, 2, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x02,0x7d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57d02 <unknown>
+
+fadd za.h[w9, 7, vgx4], {z12.h - z15.h} // 11000001-10100101-00111101-10000111
+// CHECK-INST: fadd za.h[w9, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x87,0x3d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a53d87 <unknown>
+
+fadd za.h[w9, 7], {z12.h - z15.h} // 11000001-10100101-00111101-10000111
+// CHECK-INST: fadd za.h[w9, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x87,0x3d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a53d87 <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/fcvt-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/fcvt-diagnostics.s
new file mode 100644
index 000000000000..005e9847f6c5
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fcvt-diagnostics.s
@@ -0,0 +1,47 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+fcvt z0.h, {z0.s-z2.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fcvt z0.h, {z0.s-z2.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fcvt {z0.s-z2.s}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fcvt {z0.s-z2.s}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fcvt z0.h, {z1.s-z2.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: fcvt z0.h, {z1.s-z2.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fcvt {z1.s-z2.s}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: fcvt {z1.s-z2.s}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid Register Suffix
+
+fcvt z0.s, {z0.s-z1.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fcvt z0.s, {z0.s-z1.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fcvt z0.h, {z0.h-z1.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fcvt z0.h, {z0.h-z1.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fcvt {z0.s-z1.s}, z0.s
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid element width
+// CHECK-NEXT: fcvt {z0.s-z1.s}, z0.s
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fcvt {z0.h-z1.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fcvt {z0.h-z1.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/fcvt.s b/llvm/test/MC/AArch64/SME2p1/fcvt.s
new file mode 100644
index 000000000000..b5707bad0a24
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fcvt.s
@@ -0,0 +1,36 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+sme-f16f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+sme-f16f16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+fcvt {z0.s, z1.s}, z0.h // 11000001-10100000-11100000-00000000
+// CHECK-INST: fcvt { z0.s, z1.s }, z0.h
+// CHECK-ENCODING: [0x00,0xe0,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0e000 <unknown>
+
+fcvt {z20.s, z21.s}, z10.h // 11000001-10100000-11100001-01010100
+// CHECK-INST: fcvt { z20.s, z21.s }, z10.h
+// CHECK-ENCODING: [0x54,0xe1,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0e154 <unknown>
+
+fcvt {z22.s, z23.s}, z13.h // 11000001-10100000-11100001-10110110
+// CHECK-INST: fcvt { z22.s, z23.s }, z13.h
+// CHECK-ENCODING: [0xb6,0xe1,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0e1b6 <unknown>
+
+fcvt {z30.s, z31.s}, z31.h // 11000001-10100000-11100011-11111110
+// CHECK-INST: fcvt { z30.s, z31.s }, z31.h
+// CHECK-ENCODING: [0xfe,0xe3,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0e3fe <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/fcvtl-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/fcvtl-diagnostics.s
new file mode 100644
index 000000000000..a723d2fc6f3a
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fcvtl-diagnostics.s
@@ -0,0 +1,32 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+fcvtl {z0.s-z2.s}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fcvtl {z0.s-z2.s}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fcvtl z0.h, {z1.s-z2.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector register expected
+// CHECK-NEXT: fcvtl z0.h, {z1.s-z2.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fcvtl {z1.s-z2.s}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: fcvtl {z1.s-z2.s}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid Register Suffix
+
+fcvtl {z0.s-z1.s}, z0.s
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid element width
+// CHECK-NEXT: fcvtl {z0.s-z1.s}, z0.s
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fcvtl {z0.h-z1.h}, z0.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fcvtl {z0.h-z1.h}, z0.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/fcvtl.s b/llvm/test/MC/AArch64/SME2p1/fcvtl.s
new file mode 100644
index 000000000000..31cf90d03796
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fcvtl.s
@@ -0,0 +1,36 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+sme-f16f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+sme-f16f16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+fcvtl {z0.s, z1.s}, z0.h // 11000001-10100000-11100000-00000001
+// CHECK-INST: fcvtl { z0.s, z1.s }, z0.h
+// CHECK-ENCODING: [0x01,0xe0,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0e001 <unknown>
+
+fcvtl {z20.s, z21.s}, z10.h // 11000001-10100000-11100001-01010101
+// CHECK-INST: fcvtl { z20.s, z21.s }, z10.h
+// CHECK-ENCODING: [0x55,0xe1,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0e155 <unknown>
+
+fcvtl {z22.s, z23.s}, z13.h // 11000001-10100000-11100001-10110111
+// CHECK-INST: fcvtl { z22.s, z23.s }, z13.h
+// CHECK-ENCODING: [0xb7,0xe1,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0e1b7 <unknown>
+
+fcvtl {z30.s, z31.s}, z31.h // 11000001-10100000-11100011-11111111
+// CHECK-INST: fcvtl { z30.s, z31.s }, z31.h
+// CHECK-ENCODING: [0xff,0xe3,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0e3ff <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/fmla-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/fmla-diagnostics.s
new file mode 100644
index 000000000000..d32f795728a2
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fmla-diagnostics.s
@@ -0,0 +1,94 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+fmla za.h[w11, 2, vgx2], {z12.h-z14.h}, z8.h[3]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fmla za.h[w11, 2, vgx2], {z12.h-z14.h}, z8.h[3]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmla za.h[w11, 2, vgx4], {z12.h-z17.h}, z7.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid number of vectors
+// CHECK-NEXT: fmla za.h[w11, 2, vgx4], {z12.h-z17.h}, z7.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmla za.h[w10, 3, vgx2], {z10.h-z11.h}, {z21.h-z22.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: fmla za.h[w10, 3, vgx2], {z10.h-z11.h}, {z21.h-z22.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmla za.h[w11, 7, vgx4], {z12.h-z15.h}, {z9.h-z12.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: fmla za.h[w11, 7, vgx4], {z12.h-z15.h}, {z9.h-z12.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid indexed-vector or single-vector register
+
+fmla za.h[w8, 0], {z0.h-z1.h}, z16.h[0]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: fmla za.h[w8, 0], {z0.h-z1.h}, z16.h[0]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmla za.h[w8, 1], {z0.h-z3.h}, z16.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: fmla za.h[w8, 1], {z0.h-z3.h}, z16.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select register
+
+fmla za.h[w7, 7, vgx4], {z12.h-z15.h}, {z8.h-z11.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: fmla za.h[w7, 7, vgx4], {z12.h-z15.h}, {z8.h-z11.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmla za.h[w12, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: fmla za.h[w12, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select offset
+
+fmla za.h[w8, -1, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: fmla za.h[w8, -1, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmla za.h[w8, 8, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: fmla za.h[w8, 8, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid Register Suffix
+
+fmla za.d[w8, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .s
+// CHECK-NEXT: fmla za.d[w8, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector lane index
+
+fmla za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: fmla za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmla za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: fmla za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmla za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: fmla za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmla za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: fmla za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/fmla.s b/llvm/test/MC/AArch64/SME2p1/fmla.s
new file mode 100644
index 000000000000..10529d81eed6
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fmla.s
@@ -0,0 +1,877 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+sme-f16f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+sme-f16f16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+fmla za.h[w8, 0, vgx2], {z0.h, z1.h}, z0.h // 11000001-00100000-00011100-00000000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x00,0x1c,0x20,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1201c00 <unknown>
+
+fmla za.h[w8, 0], {z0.h - z1.h}, z0.h // 11000001-00100000-00011100-00000000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x00,0x1c,0x20,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1201c00 <unknown>
+
+fmla za.h[w10, 5, vgx2], {z10.h, z11.h}, z5.h // 11000001-00100101-01011101-01000101
+// CHECK-INST: fmla za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h
+// CHECK-ENCODING: [0x45,0x5d,0x25,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1255d45 <unknown>
+
+fmla za.h[w10, 5], {z10.h - z11.h}, z5.h // 11000001-00100101-01011101-01000101
+// CHECK-INST: fmla za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h
+// CHECK-ENCODING: [0x45,0x5d,0x25,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1255d45 <unknown>
+
+fmla za.h[w11, 7, vgx2], {z13.h, z14.h}, z8.h // 11000001-00101000-01111101-10100111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z13.h, z14.h }, z8.h
+// CHECK-ENCODING: [0xa7,0x7d,0x28,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1287da7 <unknown>
+
+fmla za.h[w11, 7], {z13.h - z14.h}, z8.h // 11000001-00101000-01111101-10100111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z13.h, z14.h }, z8.h
+// CHECK-ENCODING: [0xa7,0x7d,0x28,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1287da7 <unknown>
+
+fmla za.h[w11, 7, vgx2], {z31.h, z0.h}, z15.h // 11000001-00101111-01111111-11100111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z31.h, z0.h }, z15.h
+// CHECK-ENCODING: [0xe7,0x7f,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12f7fe7 <unknown>
+
+fmla za.h[w11, 7], {z31.h - z0.h}, z15.h // 11000001-00101111-01111111-11100111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z31.h, z0.h }, z15.h
+// CHECK-ENCODING: [0xe7,0x7f,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12f7fe7 <unknown>
+
+fmla za.h[w8, 5, vgx2], {z17.h, z18.h}, z0.h // 11000001-00100000-00011110-00100101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z17.h, z18.h }, z0.h
+// CHECK-ENCODING: [0x25,0x1e,0x20,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1201e25 <unknown>
+
+fmla za.h[w8, 5], {z17.h - z18.h}, z0.h // 11000001-00100000-00011110-00100101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z17.h, z18.h }, z0.h
+// CHECK-ENCODING: [0x25,0x1e,0x20,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1201e25 <unknown>
+
+fmla za.h[w8, 1, vgx2], {z1.h, z2.h}, z14.h // 11000001-00101110-00011100-00100001
+// CHECK-INST: fmla za.h[w8, 1, vgx2], { z1.h, z2.h }, z14.h
+// CHECK-ENCODING: [0x21,0x1c,0x2e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12e1c21 <unknown>
+
+fmla za.h[w8, 1], {z1.h - z2.h}, z14.h // 11000001-00101110-00011100-00100001
+// CHECK-INST: fmla za.h[w8, 1, vgx2], { z1.h, z2.h }, z14.h
+// CHECK-ENCODING: [0x21,0x1c,0x2e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12e1c21 <unknown>
+
+fmla za.h[w10, 0, vgx2], {z19.h, z20.h}, z4.h // 11000001-00100100-01011110-01100000
+// CHECK-INST: fmla za.h[w10, 0, vgx2], { z19.h, z20.h }, z4.h
+// CHECK-ENCODING: [0x60,0x5e,0x24,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1245e60 <unknown>
+
+fmla za.h[w10, 0], {z19.h - z20.h}, z4.h // 11000001-00100100-01011110-01100000
+// CHECK-INST: fmla za.h[w10, 0, vgx2], { z19.h, z20.h }, z4.h
+// CHECK-ENCODING: [0x60,0x5e,0x24,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1245e60 <unknown>
+
+fmla za.h[w8, 0, vgx2], {z12.h, z13.h}, z2.h // 11000001-00100010-00011101-10000000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h
+// CHECK-ENCODING: [0x80,0x1d,0x22,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1221d80 <unknown>
+
+fmla za.h[w8, 0], {z12.h - z13.h}, z2.h // 11000001-00100010-00011101-10000000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h
+// CHECK-ENCODING: [0x80,0x1d,0x22,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1221d80 <unknown>
+
+fmla za.h[w10, 1, vgx2], {z1.h, z2.h}, z10.h // 11000001-00101010-01011100-00100001
+// CHECK-INST: fmla za.h[w10, 1, vgx2], { z1.h, z2.h }, z10.h
+// CHECK-ENCODING: [0x21,0x5c,0x2a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12a5c21 <unknown>
+
+fmla za.h[w10, 1], {z1.h - z2.h}, z10.h // 11000001-00101010-01011100-00100001
+// CHECK-INST: fmla za.h[w10, 1, vgx2], { z1.h, z2.h }, z10.h
+// CHECK-ENCODING: [0x21,0x5c,0x2a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12a5c21 <unknown>
+
+fmla za.h[w8, 5, vgx2], {z22.h, z23.h}, z14.h // 11000001-00101110-00011110-11000101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h
+// CHECK-ENCODING: [0xc5,0x1e,0x2e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12e1ec5 <unknown>
+
+fmla za.h[w8, 5], {z22.h - z23.h}, z14.h // 11000001-00101110-00011110-11000101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h
+// CHECK-ENCODING: [0xc5,0x1e,0x2e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12e1ec5 <unknown>
+
+fmla za.h[w11, 2, vgx2], {z9.h, z10.h}, z1.h // 11000001-00100001-01111101-00100010
+// CHECK-INST: fmla za.h[w11, 2, vgx2], { z9.h, z10.h }, z1.h
+// CHECK-ENCODING: [0x22,0x7d,0x21,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1217d22 <unknown>
+
+fmla za.h[w11, 2], {z9.h - z10.h}, z1.h // 11000001-00100001-01111101-00100010
+// CHECK-INST: fmla za.h[w11, 2, vgx2], { z9.h, z10.h }, z1.h
+// CHECK-ENCODING: [0x22,0x7d,0x21,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1217d22 <unknown>
+
+fmla za.h[w9, 7, vgx2], {z12.h, z13.h}, z11.h // 11000001-00101011-00111101-10000111
+// CHECK-INST: fmla za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h
+// CHECK-ENCODING: [0x87,0x3d,0x2b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12b3d87 <unknown>
+
+fmla za.h[w9, 7], {z12.h - z13.h}, z11.h // 11000001-00101011-00111101-10000111
+// CHECK-INST: fmla za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h
+// CHECK-ENCODING: [0x87,0x3d,0x2b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12b3d87 <unknown>
+
+fmla za.h[w8, 0, vgx2], {z0.h, z1.h}, z0.h[0] // 11000001-00010000-00010000-00000000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h[0]
+// CHECK-ENCODING: [0x00,0x10,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1101000 <unknown>
+
+fmla za.h[w8, 0], {z0.h - z1.h}, z0.h[0] // 11000001-00010000-00010000-00000000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h[0]
+// CHECK-ENCODING: [0x00,0x10,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1101000 <unknown>
+
+fmla za.h[w10, 5, vgx2], {z10.h, z11.h}, z5.h[2] // 11000001-00010101-01010101-01000101
+// CHECK-INST: fmla za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x45,0x55,0x15,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1155545 <unknown>
+
+fmla za.h[w10, 5], {z10.h - z11.h}, z5.h[2] // 11000001-00010101-01010101-01000101
+// CHECK-INST: fmla za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x45,0x55,0x15,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1155545 <unknown>
+
+fmla za.h[w11, 7, vgx2], {z12.h, z13.h}, z8.h[6] // 11000001-00011000-01111101-10000111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z12.h, z13.h }, z8.h[6]
+// CHECK-ENCODING: [0x87,0x7d,0x18,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1187d87 <unknown>
+
+fmla za.h[w11, 7], {z12.h - z13.h}, z8.h[6] // 11000001-00011000-01111101-10000111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z12.h, z13.h }, z8.h[6]
+// CHECK-ENCODING: [0x87,0x7d,0x18,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1187d87 <unknown>
+
+fmla za.h[w11, 7, vgx2], {z30.h, z31.h}, z15.h[7] // 11000001-00011111-01111111-11001111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z30.h, z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xcf,0x7f,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11f7fcf <unknown>
+
+fmla za.h[w11, 7], {z30.h - z31.h}, z15.h[7] // 11000001-00011111-01111111-11001111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z30.h, z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xcf,0x7f,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11f7fcf <unknown>
+
+fmla za.h[w8, 5, vgx2], {z16.h, z17.h}, z0.h[6] // 11000001-00010000-00011110-00000101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z16.h, z17.h }, z0.h[6]
+// CHECK-ENCODING: [0x05,0x1e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1101e05 <unknown>
+
+fmla za.h[w8, 5], {z16.h - z17.h}, z0.h[6] // 11000001-00010000-00011110-00000101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z16.h, z17.h }, z0.h[6]
+// CHECK-ENCODING: [0x05,0x1e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1101e05 <unknown>
+
+fmla za.h[w8, 1, vgx2], {z0.h, z1.h}, z14.h[2] // 11000001-00011110-00010100-00000001
+// CHECK-INST: fmla za.h[w8, 1, vgx2], { z0.h, z1.h }, z14.h[2]
+// CHECK-ENCODING: [0x01,0x14,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e1401 <unknown>
+
+fmla za.h[w8, 1], {z0.h - z1.h}, z14.h[2] // 11000001-00011110-00010100-00000001
+// CHECK-INST: fmla za.h[w8, 1, vgx2], { z0.h, z1.h }, z14.h[2]
+// CHECK-ENCODING: [0x01,0x14,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e1401 <unknown>
+
+fmla za.h[w10, 0, vgx2], {z18.h, z19.h}, z4.h[3] // 11000001-00010100-01010110-01001000
+// CHECK-INST: fmla za.h[w10, 0, vgx2], { z18.h, z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x48,0x56,0x14,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1145648 <unknown>
+
+fmla za.h[w10, 0], {z18.h - z19.h}, z4.h[3] // 11000001-00010100-01010110-01001000
+// CHECK-INST: fmla za.h[w10, 0, vgx2], { z18.h, z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x48,0x56,0x14,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1145648 <unknown>
+
+fmla za.h[w8, 0, vgx2], {z12.h, z13.h}, z2.h[4] // 11000001-00010010-00011001-10000000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h[4]
+// CHECK-ENCODING: [0x80,0x19,0x12,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1121980 <unknown>
+
+fmla za.h[w8, 0], {z12.h - z13.h}, z2.h[4] // 11000001-00010010-00011001-10000000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h[4]
+// CHECK-ENCODING: [0x80,0x19,0x12,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1121980 <unknown>
+
+fmla za.h[w10, 1, vgx2], {z0.h, z1.h}, z10.h[4] // 11000001-00011010-01011000-00000001
+// CHECK-INST: fmla za.h[w10, 1, vgx2], { z0.h, z1.h }, z10.h[4]
+// CHECK-ENCODING: [0x01,0x58,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11a5801 <unknown>
+
+fmla za.h[w10, 1], {z0.h - z1.h}, z10.h[4] // 11000001-00011010-01011000-00000001
+// CHECK-INST: fmla za.h[w10, 1, vgx2], { z0.h, z1.h }, z10.h[4]
+// CHECK-ENCODING: [0x01,0x58,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11a5801 <unknown>
+
+fmla za.h[w8, 5, vgx2], {z22.h, z23.h}, z14.h[5] // 11000001-00011110-00011010-11001101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xcd,0x1a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e1acd <unknown>
+
+fmla za.h[w8, 5], {z22.h - z23.h}, z14.h[5] // 11000001-00011110-00011010-11001101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xcd,0x1a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e1acd <unknown>
+
+fmla za.h[w11, 2, vgx2], {z8.h, z9.h}, z1.h[2] // 11000001-00010001-01110101-00000010
+// CHECK-INST: fmla za.h[w11, 2, vgx2], { z8.h, z9.h }, z1.h[2]
+// CHECK-ENCODING: [0x02,0x75,0x11,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1117502 <unknown>
+
+fmla za.h[w11, 2], {z8.h - z9.h}, z1.h[2] // 11000001-00010001-01110101-00000010
+// CHECK-INST: fmla za.h[w11, 2, vgx2], { z8.h, z9.h }, z1.h[2]
+// CHECK-ENCODING: [0x02,0x75,0x11,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1117502 <unknown>
+
+fmla za.h[w9, 7, vgx2], {z12.h, z13.h}, z11.h[4] // 11000001-00011011-00111001-10000111
+// CHECK-INST: fmla za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h[4]
+// CHECK-ENCODING: [0x87,0x39,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11b3987 <unknown>
+
+fmla za.h[w9, 7], {z12.h - z13.h}, z11.h[4] // 11000001-00011011-00111001-10000111
+// CHECK-INST: fmla za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h[4]
+// CHECK-ENCODING: [0x87,0x39,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11b3987 <unknown>
+
+fmla za.h[w8, 0, vgx2], {z0.h, z1.h}, {z0.h, z1.h} // 11000001-10100000-00010000-00001000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x08,0x10,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a01008 <unknown>
+
+fmla za.h[w8, 0], {z0.h - z1.h}, {z0.h - z1.h} // 11000001-10100000-00010000-00001000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x08,0x10,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a01008 <unknown>
+
+fmla za.h[w10, 5, vgx2], {z10.h, z11.h}, {z20.h, z21.h} // 11000001-10110100-01010001-01001101
+// CHECK-INST: fmla za.h[w10, 5, vgx2], { z10.h, z11.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x4d,0x51,0xb4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b4514d <unknown>
+
+fmla za.h[w10, 5], {z10.h - z11.h}, {z20.h - z21.h} // 11000001-10110100-01010001-01001101
+// CHECK-INST: fmla za.h[w10, 5, vgx2], { z10.h, z11.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x4d,0x51,0xb4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b4514d <unknown>
+
+fmla za.h[w11, 7, vgx2], {z12.h, z13.h}, {z8.h, z9.h} // 11000001-10101000-01110001-10001111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z12.h, z13.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x8f,0x71,0xa8,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a8718f <unknown>
+
+fmla za.h[w11, 7], {z12.h - z13.h}, {z8.h - z9.h} // 11000001-10101000-01110001-10001111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z12.h, z13.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x8f,0x71,0xa8,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a8718f <unknown>
+
+fmla za.h[w11, 7, vgx2], {z30.h, z31.h}, {z30.h, z31.h} // 11000001-10111110-01110011-11001111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xcf,0x73,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be73cf <unknown>
+
+fmla za.h[w11, 7], {z30.h - z31.h}, {z30.h - z31.h} // 11000001-10111110-01110011-11001111
+// CHECK-INST: fmla za.h[w11, 7, vgx2], { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xcf,0x73,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be73cf <unknown>
+
+fmla za.h[w8, 5, vgx2], {z16.h, z17.h}, {z16.h, z17.h} // 11000001-10110000-00010010-00001101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z16.h, z17.h }, { z16.h, z17.h }
+// CHECK-ENCODING: [0x0d,0x12,0xb0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b0120d <unknown>
+
+fmla za.h[w8, 5], {z16.h - z17.h}, {z16.h - z17.h} // 11000001-10110000-00010010-00001101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z16.h, z17.h }, { z16.h, z17.h }
+// CHECK-ENCODING: [0x0d,0x12,0xb0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b0120d <unknown>
+
+fmla za.h[w8, 1, vgx2], {z0.h, z1.h}, {z30.h, z31.h} // 11000001-10111110-00010000-00001001
+// CHECK-INST: fmla za.h[w8, 1, vgx2], { z0.h, z1.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x09,0x10,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be1009 <unknown>
+
+fmla za.h[w8, 1], {z0.h - z1.h}, {z30.h - z31.h} // 11000001-10111110-00010000-00001001
+// CHECK-INST: fmla za.h[w8, 1, vgx2], { z0.h, z1.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x09,0x10,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be1009 <unknown>
+
+fmla za.h[w10, 0, vgx2], {z18.h, z19.h}, {z20.h, z21.h} // 11000001-10110100-01010010-01001000
+// CHECK-INST: fmla za.h[w10, 0, vgx2], { z18.h, z19.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x48,0x52,0xb4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b45248 <unknown>
+
+fmla za.h[w10, 0], {z18.h - z19.h}, {z20.h - z21.h} // 11000001-10110100-01010010-01001000
+// CHECK-INST: fmla za.h[w10, 0, vgx2], { z18.h, z19.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x48,0x52,0xb4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b45248 <unknown>
+
+fmla za.h[w8, 0, vgx2], {z12.h, z13.h}, {z2.h, z3.h} // 11000001-10100010-00010001-10001000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z12.h, z13.h }, { z2.h, z3.h }
+// CHECK-ENCODING: [0x88,0x11,0xa2,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a21188 <unknown>
+
+fmla za.h[w8, 0], {z12.h - z13.h}, {z2.h - z3.h} // 11000001-10100010-00010001-10001000
+// CHECK-INST: fmla za.h[w8, 0, vgx2], { z12.h, z13.h }, { z2.h, z3.h }
+// CHECK-ENCODING: [0x88,0x11,0xa2,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a21188 <unknown>
+
+fmla za.h[w10, 1, vgx2], {z0.h, z1.h}, {z26.h, z27.h} // 11000001-10111010-01010000-00001001
+// CHECK-INST: fmla za.h[w10, 1, vgx2], { z0.h, z1.h }, { z26.h, z27.h }
+// CHECK-ENCODING: [0x09,0x50,0xba,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1ba5009 <unknown>
+
+fmla za.h[w10, 1], {z0.h - z1.h}, {z26.h - z27.h} // 11000001-10111010-01010000-00001001
+// CHECK-INST: fmla za.h[w10, 1, vgx2], { z0.h, z1.h }, { z26.h, z27.h }
+// CHECK-ENCODING: [0x09,0x50,0xba,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1ba5009 <unknown>
+
+fmla za.h[w8, 5, vgx2], {z22.h, z23.h}, {z30.h, z31.h} // 11000001-10111110-00010010-11001101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z22.h, z23.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xcd,0x12,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be12cd <unknown>
+
+fmla za.h[w8, 5], {z22.h - z23.h}, {z30.h - z31.h} // 11000001-10111110-00010010-11001101
+// CHECK-INST: fmla za.h[w8, 5, vgx2], { z22.h, z23.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xcd,0x12,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be12cd <unknown>
+
+fmla za.h[w11, 2, vgx2], {z8.h, z9.h}, {z0.h, z1.h} // 11000001-10100000-01110001-00001010
+// CHECK-INST: fmla za.h[w11, 2, vgx2], { z8.h, z9.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x0a,0x71,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0710a <unknown>
+
+fmla za.h[w11, 2], {z8.h - z9.h}, {z0.h - z1.h} // 11000001-10100000-01110001-00001010
+// CHECK-INST: fmla za.h[w11, 2, vgx2], { z8.h, z9.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x0a,0x71,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0710a <unknown>
+
+fmla za.h[w9, 7, vgx2], {z12.h, z13.h}, {z10.h, z11.h} // 11000001-10101010-00110001-10001111
+// CHECK-INST: fmla za.h[w9, 7, vgx2], { z12.h, z13.h }, { z10.h, z11.h }
+// CHECK-ENCODING: [0x8f,0x31,0xaa,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1aa318f <unknown>
+
+fmla za.h[w9, 7], {z12.h - z13.h}, {z10.h - z11.h} // 11000001-10101010-00110001-10001111
+// CHECK-INST: fmla za.h[w9, 7, vgx2], { z12.h, z13.h }, { z10.h, z11.h }
+// CHECK-ENCODING: [0x8f,0x31,0xaa,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1aa318f <unknown>
+
+
+fmla za.h[w8, 0, vgx4], {z0.h - z3.h}, z0.h // 11000001-00110000-00011100-00000000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x00,0x1c,0x30,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1301c00 <unknown>
+
+fmla za.h[w8, 0], {z0.h - z3.h}, z0.h // 11000001-00110000-00011100-00000000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x00,0x1c,0x30,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1301c00 <unknown>
+
+fmla za.h[w10, 5, vgx4], {z10.h - z13.h}, z5.h // 11000001-00110101-01011101-01000101
+// CHECK-INST: fmla za.h[w10, 5, vgx4], { z10.h - z13.h }, z5.h
+// CHECK-ENCODING: [0x45,0x5d,0x35,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1355d45 <unknown>
+
+fmla za.h[w10, 5], {z10.h - z13.h}, z5.h // 11000001-00110101-01011101-01000101
+// CHECK-INST: fmla za.h[w10, 5, vgx4], { z10.h - z13.h }, z5.h
+// CHECK-ENCODING: [0x45,0x5d,0x35,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1355d45 <unknown>
+
+fmla za.h[w11, 7, vgx4], {z13.h - z16.h}, z8.h // 11000001-00111000-01111101-10100111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z13.h - z16.h }, z8.h
+// CHECK-ENCODING: [0xa7,0x7d,0x38,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1387da7 <unknown>
+
+fmla za.h[w11, 7], {z13.h - z16.h}, z8.h // 11000001-00111000-01111101-10100111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z13.h - z16.h }, z8.h
+// CHECK-ENCODING: [0xa7,0x7d,0x38,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1387da7 <unknown>
+
+fmla za.h[w11, 7, vgx4], {z31.h, z0.h, z1.h, z2.h}, z15.h // 11000001-00111111-01111111-11100111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z31.h, z0.h, z1.h, z2.h }, z15.h
+// CHECK-ENCODING: [0xe7,0x7f,0x3f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13f7fe7 <unknown>
+
+fmla za.h[w11, 7], {z31.h, z0.h, z1.h, z2.h}, z15.h // 11000001-00111111-01111111-11100111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z31.h, z0.h, z1.h, z2.h }, z15.h
+// CHECK-ENCODING: [0xe7,0x7f,0x3f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13f7fe7 <unknown>
+
+fmla za.h[w8, 5, vgx4], {z17.h - z20.h}, z0.h // 11000001-00110000-00011110-00100101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z17.h - z20.h }, z0.h
+// CHECK-ENCODING: [0x25,0x1e,0x30,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1301e25 <unknown>
+
+fmla za.h[w8, 5], {z17.h - z20.h}, z0.h // 11000001-00110000-00011110-00100101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z17.h - z20.h }, z0.h
+// CHECK-ENCODING: [0x25,0x1e,0x30,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1301e25 <unknown>
+
+fmla za.h[w8, 1, vgx4], {z1.h - z4.h}, z14.h // 11000001-00111110-00011100-00100001
+// CHECK-INST: fmla za.h[w8, 1, vgx4], { z1.h - z4.h }, z14.h
+// CHECK-ENCODING: [0x21,0x1c,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13e1c21 <unknown>
+
+fmla za.h[w8, 1], {z1.h - z4.h}, z14.h // 11000001-00111110-00011100-00100001
+// CHECK-INST: fmla za.h[w8, 1, vgx4], { z1.h - z4.h }, z14.h
+// CHECK-ENCODING: [0x21,0x1c,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13e1c21 <unknown>
+
+fmla za.h[w10, 0, vgx4], {z19.h - z22.h}, z4.h // 11000001-00110100-01011110-01100000
+// CHECK-INST: fmla za.h[w10, 0, vgx4], { z19.h - z22.h }, z4.h
+// CHECK-ENCODING: [0x60,0x5e,0x34,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1345e60 <unknown>
+
+fmla za.h[w10, 0], {z19.h - z22.h}, z4.h // 11000001-00110100-01011110-01100000
+// CHECK-INST: fmla za.h[w10, 0, vgx4], { z19.h - z22.h }, z4.h
+// CHECK-ENCODING: [0x60,0x5e,0x34,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1345e60 <unknown>
+
+fmla za.h[w8, 0, vgx4], {z12.h - z15.h}, z2.h // 11000001-00110010-00011101-10000000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h
+// CHECK-ENCODING: [0x80,0x1d,0x32,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1321d80 <unknown>
+
+fmla za.h[w8, 0], {z12.h - z15.h}, z2.h // 11000001-00110010-00011101-10000000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h
+// CHECK-ENCODING: [0x80,0x1d,0x32,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1321d80 <unknown>
+
+fmla za.h[w10, 1, vgx4], {z1.h - z4.h}, z10.h // 11000001-00111010-01011100-00100001
+// CHECK-INST: fmla za.h[w10, 1, vgx4], { z1.h - z4.h }, z10.h
+// CHECK-ENCODING: [0x21,0x5c,0x3a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13a5c21 <unknown>
+
+fmla za.h[w10, 1], {z1.h - z4.h}, z10.h // 11000001-00111010-01011100-00100001
+// CHECK-INST: fmla za.h[w10, 1, vgx4], { z1.h - z4.h }, z10.h
+// CHECK-ENCODING: [0x21,0x5c,0x3a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13a5c21 <unknown>
+
+fmla za.h[w8, 5, vgx4], {z22.h - z25.h}, z14.h // 11000001-00111110-00011110-11000101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z22.h - z25.h }, z14.h
+// CHECK-ENCODING: [0xc5,0x1e,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13e1ec5 <unknown>
+
+fmla za.h[w8, 5], {z22.h - z25.h}, z14.h // 11000001-00111110-00011110-11000101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z22.h - z25.h }, z14.h
+// CHECK-ENCODING: [0xc5,0x1e,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13e1ec5 <unknown>
+
+fmla za.h[w11, 2, vgx4], {z9.h - z12.h}, z1.h // 11000001-00110001-01111101-00100010
+// CHECK-INST: fmla za.h[w11, 2, vgx4], { z9.h - z12.h }, z1.h
+// CHECK-ENCODING: [0x22,0x7d,0x31,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1317d22 <unknown>
+
+fmla za.h[w11, 2], {z9.h - z12.h}, z1.h // 11000001-00110001-01111101-00100010
+// CHECK-INST: fmla za.h[w11, 2, vgx4], { z9.h - z12.h }, z1.h
+// CHECK-ENCODING: [0x22,0x7d,0x31,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1317d22 <unknown>
+
+fmla za.h[w9, 7, vgx4], {z12.h - z15.h}, z11.h // 11000001-00111011-00111101-10000111
+// CHECK-INST: fmla za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h
+// CHECK-ENCODING: [0x87,0x3d,0x3b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13b3d87 <unknown>
+
+fmla za.h[w9, 7], {z12.h - z15.h}, z11.h // 11000001-00111011-00111101-10000111
+// CHECK-INST: fmla za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h
+// CHECK-ENCODING: [0x87,0x3d,0x3b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13b3d87 <unknown>
+
+fmla za.h[w8, 0, vgx4], {z0.h - z3.h}, z0.h[0] // 11000001-00010000-10010000-00000000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h[0]
+// CHECK-ENCODING: [0x00,0x90,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1109000 <unknown>
+
+fmla za.h[w8, 0], {z0.h - z3.h}, z0.h[0] // 11000001-00010000-10010000-00000000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h[0]
+// CHECK-ENCODING: [0x00,0x90,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1109000 <unknown>
+
+fmla za.h[w10, 5, vgx4], {z8.h - z11.h}, z5.h[2] // 11000001-00010101-11010101-00000101
+// CHECK-INST: fmla za.h[w10, 5, vgx4], { z8.h - z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x05,0xd5,0x15,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c115d505 <unknown>
+
+fmla za.h[w10, 5], {z8.h - z11.h}, z5.h[2] // 11000001-00010101-11010101-00000101
+// CHECK-INST: fmla za.h[w10, 5, vgx4], { z8.h - z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x05,0xd5,0x15,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c115d505 <unknown>
+
+fmla za.h[w11, 7, vgx4], {z12.h - z15.h}, z8.h[6] // 11000001-00011000-11111101-10000111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z12.h - z15.h }, z8.h[6]
+// CHECK-ENCODING: [0x87,0xfd,0x18,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c118fd87 <unknown>
+
+fmla za.h[w11, 7], {z12.h - z15.h}, z8.h[6] // 11000001-00011000-11111101-10000111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z12.h - z15.h }, z8.h[6]
+// CHECK-ENCODING: [0x87,0xfd,0x18,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c118fd87 <unknown>
+
+fmla za.h[w11, 7, vgx4], {z28.h - z31.h}, z15.h[7] // 11000001-00011111-11111111-10001111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z28.h - z31.h }, z15.h[7]
+// CHECK-ENCODING: [0x8f,0xff,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11fff8f <unknown>
+
+fmla za.h[w11, 7], {z28.h - z31.h}, z15.h[7] // 11000001-00011111-11111111-10001111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z28.h - z31.h }, z15.h[7]
+// CHECK-ENCODING: [0x8f,0xff,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11fff8f <unknown>
+
+fmla za.h[w8, 5, vgx4], {z16.h - z19.h}, z0.h[6] // 11000001-00010000-10011110-00000101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z16.h - z19.h }, z0.h[6]
+// CHECK-ENCODING: [0x05,0x9e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1109e05 <unknown>
+
+fmla za.h[w8, 5], {z16.h - z19.h}, z0.h[6] // 11000001-00010000-10011110-00000101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z16.h - z19.h }, z0.h[6]
+// CHECK-ENCODING: [0x05,0x9e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1109e05 <unknown>
+
+fmla za.h[w8, 1, vgx4], {z0.h - z3.h}, z14.h[2] // 11000001-00011110-10010100-00000001
+// CHECK-INST: fmla za.h[w8, 1, vgx4], { z0.h - z3.h }, z14.h[2]
+// CHECK-ENCODING: [0x01,0x94,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e9401 <unknown>
+
+fmla za.h[w8, 1], {z0.h - z3.h}, z14.h[2] // 11000001-00011110-10010100-00000001
+// CHECK-INST: fmla za.h[w8, 1, vgx4], { z0.h - z3.h }, z14.h[2]
+// CHECK-ENCODING: [0x01,0x94,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e9401 <unknown>
+
+fmla za.h[w10, 0, vgx4], {z16.h - z19.h}, z4.h[3] // 11000001-00010100-11010110-00001000
+// CHECK-INST: fmla za.h[w10, 0, vgx4], { z16.h - z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x08,0xd6,0x14,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c114d608 <unknown>
+
+fmla za.h[w10, 0], {z16.h - z19.h}, z4.h[3] // 11000001-00010100-11010110-00001000
+// CHECK-INST: fmla za.h[w10, 0, vgx4], { z16.h - z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x08,0xd6,0x14,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c114d608 <unknown>
+
+fmla za.h[w8, 0, vgx4], {z12.h - z15.h}, z2.h[4] // 11000001-00010010-10011001-10000000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h[4]
+// CHECK-ENCODING: [0x80,0x99,0x12,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1129980 <unknown>
+
+fmla za.h[w8, 0], {z12.h - z15.h}, z2.h[4] // 11000001-00010010-10011001-10000000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h[4]
+// CHECK-ENCODING: [0x80,0x99,0x12,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1129980 <unknown>
+
+fmla za.h[w10, 1, vgx4], {z0.h - z3.h}, z10.h[4] // 11000001-00011010-11011000-00000001
+// CHECK-INST: fmla za.h[w10, 1, vgx4], { z0.h - z3.h }, z10.h[4]
+// CHECK-ENCODING: [0x01,0xd8,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11ad801 <unknown>
+
+fmla za.h[w10, 1], {z0.h - z3.h}, z10.h[4] // 11000001-00011010-11011000-00000001
+// CHECK-INST: fmla za.h[w10, 1, vgx4], { z0.h - z3.h }, z10.h[4]
+// CHECK-ENCODING: [0x01,0xd8,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11ad801 <unknown>
+
+fmla za.h[w8, 5, vgx4], {z20.h - z23.h}, z14.h[5] // 11000001-00011110-10011010-10001101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z20.h - z23.h }, z14.h[5]
+// CHECK-ENCODING: [0x8d,0x9a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e9a8d <unknown>
+
+fmla za.h[w8, 5], {z20.h - z23.h}, z14.h[5] // 11000001-00011110-10011010-10001101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z20.h - z23.h }, z14.h[5]
+// CHECK-ENCODING: [0x8d,0x9a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e9a8d <unknown>
+
+fmla za.h[w11, 2, vgx4], {z8.h - z11.h}, z1.h[2] // 11000001-00010001-11110101-00000010
+// CHECK-INST: fmla za.h[w11, 2, vgx4], { z8.h - z11.h }, z1.h[2]
+// CHECK-ENCODING: [0x02,0xf5,0x11,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c111f502 <unknown>
+
+fmla za.h[w11, 2], {z8.h - z11.h}, z1.h[2] // 11000001-00010001-11110101-00000010
+// CHECK-INST: fmla za.h[w11, 2, vgx4], { z8.h - z11.h }, z1.h[2]
+// CHECK-ENCODING: [0x02,0xf5,0x11,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c111f502 <unknown>
+
+fmla za.h[w9, 7, vgx4], {z12.h - z15.h}, z11.h[4] // 11000001-00011011-10111001-10000111
+// CHECK-INST: fmla za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h[4]
+// CHECK-ENCODING: [0x87,0xb9,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11bb987 <unknown>
+
+fmla za.h[w9, 7], {z12.h - z15.h}, z11.h[4] // 11000001-00011011-10111001-10000111
+// CHECK-INST: fmla za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h[4]
+// CHECK-ENCODING: [0x87,0xb9,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11bb987 <unknown>
+
+fmla za.h[w8, 0, vgx4], {z0.h - z3.h}, {z0.h - z3.h} // 11000001-10100001-00010000-00001000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x08,0x10,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a11008 <unknown>
+
+fmla za.h[w8, 0], {z0.h - z3.h}, {z0.h - z3.h} // 11000001-10100001-00010000-00001000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x08,0x10,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a11008 <unknown>
+
+fmla za.h[w10, 5, vgx4], {z8.h - z11.h}, {z20.h - z23.h} // 11000001-10110101-01010001-00001101
+// CHECK-INST: fmla za.h[w10, 5, vgx4], { z8.h - z11.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x0d,0x51,0xb5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b5510d <unknown>
+
+fmla za.h[w10, 5], {z8.h - z11.h}, {z20.h - z23.h} // 11000001-10110101-01010001-00001101
+// CHECK-INST: fmla za.h[w10, 5, vgx4], { z8.h - z11.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x0d,0x51,0xb5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b5510d <unknown>
+
+fmla za.h[w11, 7, vgx4], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-10101001-01110001-10001111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x8f,0x71,0xa9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a9718f <unknown>
+
+fmla za.h[w11, 7], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-10101001-01110001-10001111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x8f,0x71,0xa9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a9718f <unknown>
+
+fmla za.h[w11, 7, vgx4], {z28.h - z31.h}, {z28.h - z31.h} // 11000001-10111101-01110011-10001111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x8f,0x73,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd738f <unknown>
+
+fmla za.h[w11, 7], {z28.h - z31.h}, {z28.h - z31.h} // 11000001-10111101-01110011-10001111
+// CHECK-INST: fmla za.h[w11, 7, vgx4], { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x8f,0x73,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd738f <unknown>
+
+fmla za.h[w8, 5, vgx4], {z16.h - z19.h}, {z16.h - z19.h} // 11000001-10110001-00010010-00001101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z16.h - z19.h }, { z16.h - z19.h }
+// CHECK-ENCODING: [0x0d,0x12,0xb1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b1120d <unknown>
+
+fmla za.h[w8, 5], {z16.h - z19.h}, {z16.h - z19.h} // 11000001-10110001-00010010-00001101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z16.h - z19.h }, { z16.h - z19.h }
+// CHECK-ENCODING: [0x0d,0x12,0xb1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b1120d <unknown>
+
+fmla za.h[w8, 1, vgx4], {z0.h - z3.h}, {z28.h - z31.h} // 11000001-10111101-00010000-00001001
+// CHECK-INST: fmla za.h[w8, 1, vgx4], { z0.h - z3.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x09,0x10,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd1009 <unknown>
+
+fmla za.h[w8, 1], {z0.h - z3.h}, {z28.h - z31.h} // 11000001-10111101-00010000-00001001
+// CHECK-INST: fmla za.h[w8, 1, vgx4], { z0.h - z3.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x09,0x10,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd1009 <unknown>
+
+fmla za.h[w10, 0, vgx4], {z16.h - z19.h}, {z20.h - z23.h} // 11000001-10110101-01010010-00001000
+// CHECK-INST: fmla za.h[w10, 0, vgx4], { z16.h - z19.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x08,0x52,0xb5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b55208 <unknown>
+
+fmla za.h[w10, 0], {z16.h - z19.h}, {z20.h - z23.h} // 11000001-10110101-01010010-00001000
+// CHECK-INST: fmla za.h[w10, 0, vgx4], { z16.h - z19.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x08,0x52,0xb5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b55208 <unknown>
+
+fmla za.h[w8, 0, vgx4], {z12.h - z15.h}, {z0.h - z3.h} // 11000001-10100001-00010001-10001000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z12.h - z15.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x88,0x11,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a11188 <unknown>
+
+fmla za.h[w8, 0], {z12.h - z15.h}, {z0.h - z3.h} // 11000001-10100001-00010001-10001000
+// CHECK-INST: fmla za.h[w8, 0, vgx4], { z12.h - z15.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x88,0x11,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a11188 <unknown>
+
+fmla za.h[w10, 1, vgx4], {z0.h - z3.h}, {z24.h - z27.h} // 11000001-10111001-01010000-00001001
+// CHECK-INST: fmla za.h[w10, 1, vgx4], { z0.h - z3.h }, { z24.h - z27.h }
+// CHECK-ENCODING: [0x09,0x50,0xb9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b95009 <unknown>
+
+fmla za.h[w10, 1], {z0.h - z3.h}, {z24.h - z27.h} // 11000001-10111001-01010000-00001001
+// CHECK-INST: fmla za.h[w10, 1, vgx4], { z0.h - z3.h }, { z24.h - z27.h }
+// CHECK-ENCODING: [0x09,0x50,0xb9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b95009 <unknown>
+
+fmla za.h[w8, 5, vgx4], {z20.h - z23.h}, {z28.h - z31.h} // 11000001-10111101-00010010-10001101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z20.h - z23.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x8d,0x12,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd128d <unknown>
+
+fmla za.h[w8, 5], {z20.h - z23.h}, {z28.h - z31.h} // 11000001-10111101-00010010-10001101
+// CHECK-INST: fmla za.h[w8, 5, vgx4], { z20.h - z23.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x8d,0x12,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd128d <unknown>
+
+fmla za.h[w11, 2, vgx4], {z8.h - z11.h}, {z0.h - z3.h} // 11000001-10100001-01110001-00001010
+// CHECK-INST: fmla za.h[w11, 2, vgx4], { z8.h - z11.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x0a,0x71,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a1710a <unknown>
+
+fmla za.h[w11, 2], {z8.h - z11.h}, {z0.h - z3.h} // 11000001-10100001-01110001-00001010
+// CHECK-INST: fmla za.h[w11, 2, vgx4], { z8.h - z11.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x0a,0x71,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a1710a <unknown>
+
+fmla za.h[w9, 7, vgx4], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-10101001-00110001-10001111
+// CHECK-INST: fmla za.h[w9, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x8f,0x31,0xa9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a9318f <unknown>
+
+fmla za.h[w9, 7], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-10101001-00110001-10001111
+// CHECK-INST: fmla za.h[w9, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x8f,0x31,0xa9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a9318f <unknown>
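
Each stanza in the test above gives the same 32-bit instruction word in three forms: the dash-separated bit pattern in the trailing comment, the CHECK-ENCODING byte list, and the CHECK-UNKNOWN hex word. The byte list is the word in little-endian order, least significant byte first. The following is a minimal Python sketch of that relationship and is not part of the patch; the helper name `encoding_views` is hypothetical and used only for illustration.

```python
def encoding_views(bits):
    """Map a dash-separated bit pattern (as in the test comments) to the
    CHECK-ENCODING byte list and the CHECK-UNKNOWN hex word."""
    # Drop the dashes and read the 32 bits as one unsigned integer.
    word = int(bits.replace("-", ""), 2)
    # The encoding list shows the least significant byte first
    # (little-endian), so the byte order is the reverse of the comment.
    le_bytes = word.to_bytes(4, "little")
    encoding = "[" + ",".join(f"0x{b:02x}" for b in le_bytes) + "]"
    # The CHECK-UNKNOWN lines show the same value as a single 32-bit word.
    return encoding, f"{word:08x}"


if __name__ == "__main__":
    enc, word = encoding_views("11000001-10101001-00110001-10001111")
    print(enc, word)
```

Applied to the bit pattern of the last stanza, the helper reproduces the byte list and hex word in that stanza's CHECK-ENCODING and CHECK-UNKNOWN lines.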
diff --git a/llvm/test/MC/AArch64/SME2p1/fmls-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/fmls-diagnostics.s
new file mode 100644
index 000000000000..2174e4202ba0
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fmls-diagnostics.s
@@ -0,0 +1,94 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+fmls za.h[w11, 2, vgx2], {z12.h-z14.h}, z8.h[3]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fmls za.h[w11, 2, vgx2], {z12.h-z14.h}, z8.h[3]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmls za.h[w11, 2, vgx4], {z12.h-z17.h}, z7.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid number of vectors
+// CHECK-NEXT: fmls za.h[w11, 2, vgx4], {z12.h-z17.h}, z7.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmls za.h[w10, 3, vgx2], {z10.h-z11.h}, {z21.h-z22.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: fmls za.h[w10, 3, vgx2], {z10.h-z11.h}, {z21.h-z22.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmls za.h[w11, 7, vgx4], {z12.h-z15.h}, {z9.h-z12.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: fmls za.h[w11, 7, vgx4], {z12.h-z15.h}, {z9.h-z12.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid indexed-vector or single-vector register
+
+fmls za.h[w8, 0], {z0.h-z1.h}, z16.h[0]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: fmls za.h[w8, 0], {z0.h-z1.h}, z16.h[0]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmls za.h[w8, 1], {z0.h-z3.h}, z16.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid restricted vector register, expected z0.h..z15.h
+// CHECK-NEXT: fmls za.h[w8, 1], {z0.h-z3.h}, z16.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select register
+
+fmls za.h[w7, 7, vgx4], {z12.h-z15.h}, {z8.h-z11.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: fmls za.h[w7, 7, vgx4], {z12.h-z15.h}, {z8.h-z11.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmls za.h[w12, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: fmls za.h[w12, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select offset
+
+fmls za.h[w8, -1, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: fmls za.h[w8, -1, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmls za.h[w8, 8, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: fmls za.h[w8, 8, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid Register Suffix
+
+fmls za.d[w8, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .s
+// CHECK-NEXT: fmls za.d[w8, 7, vgx2], {z12.h-z13.h}, {z8.h-z9.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector lane index
+
+fmls za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: fmls za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmls za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: fmls za.h[w11, 6, vgx2], {z12.h-z13.h}, z8.h[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmls za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: fmls za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmls za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: fmls za.h[w11, 7, vgx4], {z12.h-z15.h}, z8.h[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/fmls.s b/llvm/test/MC/AArch64/SME2p1/fmls.s
new file mode 100644
index 000000000000..9bbb21869e37
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fmls.s
@@ -0,0 +1,878 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+sme-f16f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+sme-f16f16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+fmls za.h[w8, 0, vgx2], {z0.h, z1.h}, z0.h // 11000001-00100000-00011100-00001000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x08,0x1c,0x20,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1201c08 <unknown>
+
+fmls za.h[w8, 0], {z0.h - z1.h}, z0.h // 11000001-00100000-00011100-00001000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h
+// CHECK-ENCODING: [0x08,0x1c,0x20,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1201c08 <unknown>
+
+fmls za.h[w10, 5, vgx2], {z10.h, z11.h}, z5.h // 11000001-00100101-01011101-01001101
+// CHECK-INST: fmls za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h
+// CHECK-ENCODING: [0x4d,0x5d,0x25,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1255d4d <unknown>
+
+fmls za.h[w10, 5], {z10.h - z11.h}, z5.h // 11000001-00100101-01011101-01001101
+// CHECK-INST: fmls za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h
+// CHECK-ENCODING: [0x4d,0x5d,0x25,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1255d4d <unknown>
+
+fmls za.h[w11, 7, vgx2], {z13.h, z14.h}, z8.h // 11000001-00101000-01111101-10101111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z13.h, z14.h }, z8.h
+// CHECK-ENCODING: [0xaf,0x7d,0x28,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1287daf <unknown>
+
+fmls za.h[w11, 7], {z13.h - z14.h}, z8.h // 11000001-00101000-01111101-10101111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z13.h, z14.h }, z8.h
+// CHECK-ENCODING: [0xaf,0x7d,0x28,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1287daf <unknown>
+
+fmls za.h[w11, 7, vgx2], {z31.h, z0.h}, z15.h // 11000001-00101111-01111111-11101111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z31.h, z0.h }, z15.h
+// CHECK-ENCODING: [0xef,0x7f,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12f7fef <unknown>
+
+fmls za.h[w11, 7], {z31.h - z0.h}, z15.h // 11000001-00101111-01111111-11101111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z31.h, z0.h }, z15.h
+// CHECK-ENCODING: [0xef,0x7f,0x2f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12f7fef <unknown>
+
+fmls za.h[w8, 5, vgx2], {z17.h, z18.h}, z0.h // 11000001-00100000-00011110-00101101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z17.h, z18.h }, z0.h
+// CHECK-ENCODING: [0x2d,0x1e,0x20,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1201e2d <unknown>
+
+fmls za.h[w8, 5], {z17.h - z18.h}, z0.h // 11000001-00100000-00011110-00101101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z17.h, z18.h }, z0.h
+// CHECK-ENCODING: [0x2d,0x1e,0x20,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1201e2d <unknown>
+
+fmls za.h[w8, 1, vgx2], {z1.h, z2.h}, z14.h // 11000001-00101110-00011100-00101001
+// CHECK-INST: fmls za.h[w8, 1, vgx2], { z1.h, z2.h }, z14.h
+// CHECK-ENCODING: [0x29,0x1c,0x2e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12e1c29 <unknown>
+
+fmls za.h[w8, 1], {z1.h - z2.h}, z14.h // 11000001-00101110-00011100-00101001
+// CHECK-INST: fmls za.h[w8, 1, vgx2], { z1.h, z2.h }, z14.h
+// CHECK-ENCODING: [0x29,0x1c,0x2e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12e1c29 <unknown>
+
+fmls za.h[w10, 0, vgx2], {z19.h, z20.h}, z4.h // 11000001-00100100-01011110-01101000
+// CHECK-INST: fmls za.h[w10, 0, vgx2], { z19.h, z20.h }, z4.h
+// CHECK-ENCODING: [0x68,0x5e,0x24,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1245e68 <unknown>
+
+fmls za.h[w10, 0], {z19.h - z20.h}, z4.h // 11000001-00100100-01011110-01101000
+// CHECK-INST: fmls za.h[w10, 0, vgx2], { z19.h, z20.h }, z4.h
+// CHECK-ENCODING: [0x68,0x5e,0x24,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1245e68 <unknown>
+
+fmls za.h[w8, 0, vgx2], {z12.h, z13.h}, z2.h // 11000001-00100010-00011101-10001000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h
+// CHECK-ENCODING: [0x88,0x1d,0x22,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1221d88 <unknown>
+
+fmls za.h[w8, 0], {z12.h - z13.h}, z2.h // 11000001-00100010-00011101-10001000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h
+// CHECK-ENCODING: [0x88,0x1d,0x22,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1221d88 <unknown>
+
+fmls za.h[w10, 1, vgx2], {z1.h, z2.h}, z10.h // 11000001-00101010-01011100-00101001
+// CHECK-INST: fmls za.h[w10, 1, vgx2], { z1.h, z2.h }, z10.h
+// CHECK-ENCODING: [0x29,0x5c,0x2a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12a5c29 <unknown>
+
+fmls za.h[w10, 1], {z1.h - z2.h}, z10.h // 11000001-00101010-01011100-00101001
+// CHECK-INST: fmls za.h[w10, 1, vgx2], { z1.h, z2.h }, z10.h
+// CHECK-ENCODING: [0x29,0x5c,0x2a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12a5c29 <unknown>
+
+fmls za.h[w8, 5, vgx2], {z22.h, z23.h}, z14.h // 11000001-00101110-00011110-11001101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h
+// CHECK-ENCODING: [0xcd,0x1e,0x2e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12e1ecd <unknown>
+
+fmls za.h[w8, 5], {z22.h - z23.h}, z14.h // 11000001-00101110-00011110-11001101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h
+// CHECK-ENCODING: [0xcd,0x1e,0x2e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12e1ecd <unknown>
+
+fmls za.h[w11, 2, vgx2], {z9.h, z10.h}, z1.h // 11000001-00100001-01111101-00101010
+// CHECK-INST: fmls za.h[w11, 2, vgx2], { z9.h, z10.h }, z1.h
+// CHECK-ENCODING: [0x2a,0x7d,0x21,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1217d2a <unknown>
+
+fmls za.h[w11, 2], {z9.h - z10.h}, z1.h // 11000001-00100001-01111101-00101010
+// CHECK-INST: fmls za.h[w11, 2, vgx2], { z9.h, z10.h }, z1.h
+// CHECK-ENCODING: [0x2a,0x7d,0x21,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1217d2a <unknown>
+
+fmls za.h[w9, 7, vgx2], {z12.h, z13.h}, z11.h // 11000001-00101011-00111101-10001111
+// CHECK-INST: fmls za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h
+// CHECK-ENCODING: [0x8f,0x3d,0x2b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12b3d8f <unknown>
+
+fmls za.h[w9, 7], {z12.h - z13.h}, z11.h // 11000001-00101011-00111101-10001111
+// CHECK-INST: fmls za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h
+// CHECK-ENCODING: [0x8f,0x3d,0x2b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c12b3d8f <unknown>
+
+
+fmls za.h[w8, 0, vgx2], {z0.h, z1.h}, z0.h[0] // 11000001-00010000-00010000-00010000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h[0]
+// CHECK-ENCODING: [0x10,0x10,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1101010 <unknown>
+
+fmls za.h[w8, 0], {z0.h - z1.h}, z0.h[0] // 11000001-00010000-00010000-00010000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z0.h, z1.h }, z0.h[0]
+// CHECK-ENCODING: [0x10,0x10,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1101010 <unknown>
+
+fmls za.h[w10, 5, vgx2], {z10.h, z11.h}, z5.h[2] // 11000001-00010101-01010101-01010101
+// CHECK-INST: fmls za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x55,0x55,0x15,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1155555 <unknown>
+
+fmls za.h[w10, 5], {z10.h - z11.h}, z5.h[2] // 11000001-00010101-01010101-01010101
+// CHECK-INST: fmls za.h[w10, 5, vgx2], { z10.h, z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x55,0x55,0x15,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1155555 <unknown>
+
+fmls za.h[w11, 7, vgx2], {z12.h, z13.h}, z8.h[6] // 11000001-00011000-01111101-10010111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z12.h, z13.h }, z8.h[6]
+// CHECK-ENCODING: [0x97,0x7d,0x18,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1187d97 <unknown>
+
+fmls za.h[w11, 7], {z12.h - z13.h}, z8.h[6] // 11000001-00011000-01111101-10010111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z12.h, z13.h }, z8.h[6]
+// CHECK-ENCODING: [0x97,0x7d,0x18,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1187d97 <unknown>
+
+fmls za.h[w11, 7, vgx2], {z30.h, z31.h}, z15.h[7] // 11000001-00011111-01111111-11011111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z30.h, z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xdf,0x7f,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11f7fdf <unknown>
+
+fmls za.h[w11, 7], {z30.h - z31.h}, z15.h[7] // 11000001-00011111-01111111-11011111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z30.h, z31.h }, z15.h[7]
+// CHECK-ENCODING: [0xdf,0x7f,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11f7fdf <unknown>
+
+fmls za.h[w8, 5, vgx2], {z16.h, z17.h}, z0.h[6] // 11000001-00010000-00011110-00010101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z16.h, z17.h }, z0.h[6]
+// CHECK-ENCODING: [0x15,0x1e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1101e15 <unknown>
+
+fmls za.h[w8, 5], {z16.h - z17.h}, z0.h[6] // 11000001-00010000-00011110-00010101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z16.h, z17.h }, z0.h[6]
+// CHECK-ENCODING: [0x15,0x1e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1101e15 <unknown>
+
+fmls za.h[w8, 1, vgx2], {z0.h, z1.h}, z14.h[2] // 11000001-00011110-00010100-00010001
+// CHECK-INST: fmls za.h[w8, 1, vgx2], { z0.h, z1.h }, z14.h[2]
+// CHECK-ENCODING: [0x11,0x14,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e1411 <unknown>
+
+fmls za.h[w8, 1], {z0.h - z1.h}, z14.h[2] // 11000001-00011110-00010100-00010001
+// CHECK-INST: fmls za.h[w8, 1, vgx2], { z0.h, z1.h }, z14.h[2]
+// CHECK-ENCODING: [0x11,0x14,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e1411 <unknown>
+
+fmls za.h[w10, 0, vgx2], {z18.h, z19.h}, z4.h[3] // 11000001-00010100-01010110-01011000
+// CHECK-INST: fmls za.h[w10, 0, vgx2], { z18.h, z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x58,0x56,0x14,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1145658 <unknown>
+
+fmls za.h[w10, 0], {z18.h - z19.h}, z4.h[3] // 11000001-00010100-01010110-01011000
+// CHECK-INST: fmls za.h[w10, 0, vgx2], { z18.h, z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x58,0x56,0x14,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1145658 <unknown>
+
+fmls za.h[w8, 0, vgx2], {z12.h, z13.h}, z2.h[4] // 11000001-00010010-00011001-10010000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h[4]
+// CHECK-ENCODING: [0x90,0x19,0x12,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1121990 <unknown>
+
+fmls za.h[w8, 0], {z12.h - z13.h}, z2.h[4] // 11000001-00010010-00011001-10010000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z12.h, z13.h }, z2.h[4]
+// CHECK-ENCODING: [0x90,0x19,0x12,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1121990 <unknown>
+
+fmls za.h[w10, 1, vgx2], {z0.h, z1.h}, z10.h[4] // 11000001-00011010-01011000-00010001
+// CHECK-INST: fmls za.h[w10, 1, vgx2], { z0.h, z1.h }, z10.h[4]
+// CHECK-ENCODING: [0x11,0x58,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11a5811 <unknown>
+
+fmls za.h[w10, 1], {z0.h - z1.h}, z10.h[4] // 11000001-00011010-01011000-00010001
+// CHECK-INST: fmls za.h[w10, 1, vgx2], { z0.h, z1.h }, z10.h[4]
+// CHECK-ENCODING: [0x11,0x58,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11a5811 <unknown>
+
+fmls za.h[w8, 5, vgx2], {z22.h, z23.h}, z14.h[5] // 11000001-00011110-00011010-11011101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xdd,0x1a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e1add <unknown>
+
+fmls za.h[w8, 5], {z22.h - z23.h}, z14.h[5] // 11000001-00011110-00011010-11011101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z22.h, z23.h }, z14.h[5]
+// CHECK-ENCODING: [0xdd,0x1a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e1add <unknown>
+
+fmls za.h[w11, 2, vgx2], {z8.h, z9.h}, z1.h[2] // 11000001-00010001-01110101-00010010
+// CHECK-INST: fmls za.h[w11, 2, vgx2], { z8.h, z9.h }, z1.h[2]
+// CHECK-ENCODING: [0x12,0x75,0x11,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1117512 <unknown>
+
+fmls za.h[w11, 2], {z8.h - z9.h}, z1.h[2] // 11000001-00010001-01110101-00010010
+// CHECK-INST: fmls za.h[w11, 2, vgx2], { z8.h, z9.h }, z1.h[2]
+// CHECK-ENCODING: [0x12,0x75,0x11,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1117512 <unknown>
+
+fmls za.h[w9, 7, vgx2], {z12.h, z13.h}, z11.h[4] // 11000001-00011011-00111001-10010111
+// CHECK-INST: fmls za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h[4]
+// CHECK-ENCODING: [0x97,0x39,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11b3997 <unknown>
+
+fmls za.h[w9, 7], {z12.h - z13.h}, z11.h[4] // 11000001-00011011-00111001-10010111
+// CHECK-INST: fmls za.h[w9, 7, vgx2], { z12.h, z13.h }, z11.h[4]
+// CHECK-ENCODING: [0x97,0x39,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11b3997 <unknown>
+
+
+fmls za.h[w8, 0, vgx2], {z0.h, z1.h}, {z0.h, z1.h} // 11000001-10100000-00010000-00011000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x18,0x10,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a01018 <unknown>
+
+fmls za.h[w8, 0], {z0.h - z1.h}, {z0.h - z1.h} // 11000001-10100000-00010000-00011000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z0.h, z1.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x18,0x10,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a01018 <unknown>
+
+fmls za.h[w10, 5, vgx2], {z10.h, z11.h}, {z20.h, z21.h} // 11000001-10110100-01010001-01011101
+// CHECK-INST: fmls za.h[w10, 5, vgx2], { z10.h, z11.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x5d,0x51,0xb4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b4515d <unknown>
+
+fmls za.h[w10, 5], {z10.h - z11.h}, {z20.h - z21.h} // 11000001-10110100-01010001-01011101
+// CHECK-INST: fmls za.h[w10, 5, vgx2], { z10.h, z11.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x5d,0x51,0xb4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b4515d <unknown>
+
+fmls za.h[w11, 7, vgx2], {z12.h, z13.h}, {z8.h, z9.h} // 11000001-10101000-01110001-10011111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z12.h, z13.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x9f,0x71,0xa8,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a8719f <unknown>
+
+fmls za.h[w11, 7], {z12.h - z13.h}, {z8.h - z9.h} // 11000001-10101000-01110001-10011111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z12.h, z13.h }, { z8.h, z9.h }
+// CHECK-ENCODING: [0x9f,0x71,0xa8,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a8719f <unknown>
+
+fmls za.h[w11, 7, vgx2], {z30.h, z31.h}, {z30.h, z31.h} // 11000001-10111110-01110011-11011111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xdf,0x73,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be73df <unknown>
+
+fmls za.h[w11, 7], {z30.h - z31.h}, {z30.h - z31.h} // 11000001-10111110-01110011-11011111
+// CHECK-INST: fmls za.h[w11, 7, vgx2], { z30.h, z31.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xdf,0x73,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be73df <unknown>
+
+fmls za.h[w8, 5, vgx2], {z16.h, z17.h}, {z16.h, z17.h} // 11000001-10110000-00010010-00011101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z16.h, z17.h }, { z16.h, z17.h }
+// CHECK-ENCODING: [0x1d,0x12,0xb0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b0121d <unknown>
+
+fmls za.h[w8, 5], {z16.h - z17.h}, {z16.h - z17.h} // 11000001-10110000-00010010-00011101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z16.h, z17.h }, { z16.h, z17.h }
+// CHECK-ENCODING: [0x1d,0x12,0xb0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b0121d <unknown>
+
+fmls za.h[w8, 1, vgx2], {z0.h, z1.h}, {z30.h, z31.h} // 11000001-10111110-00010000-00011001
+// CHECK-INST: fmls za.h[w8, 1, vgx2], { z0.h, z1.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x19,0x10,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be1019 <unknown>
+
+fmls za.h[w8, 1], {z0.h - z1.h}, {z30.h - z31.h} // 11000001-10111110-00010000-00011001
+// CHECK-INST: fmls za.h[w8, 1, vgx2], { z0.h, z1.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0x19,0x10,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be1019 <unknown>
+
+fmls za.h[w10, 0, vgx2], {z18.h, z19.h}, {z20.h, z21.h} // 11000001-10110100-01010010-01011000
+// CHECK-INST: fmls za.h[w10, 0, vgx2], { z18.h, z19.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x58,0x52,0xb4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b45258 <unknown>
+
+fmls za.h[w10, 0], {z18.h - z19.h}, {z20.h - z21.h} // 11000001-10110100-01010010-01011000
+// CHECK-INST: fmls za.h[w10, 0, vgx2], { z18.h, z19.h }, { z20.h, z21.h }
+// CHECK-ENCODING: [0x58,0x52,0xb4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b45258 <unknown>
+
+fmls za.h[w8, 0, vgx2], {z12.h, z13.h}, {z2.h, z3.h} // 11000001-10100010-00010001-10011000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z12.h, z13.h }, { z2.h, z3.h }
+// CHECK-ENCODING: [0x98,0x11,0xa2,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a21198 <unknown>
+
+fmls za.h[w8, 0], {z12.h - z13.h}, {z2.h - z3.h} // 11000001-10100010-00010001-10011000
+// CHECK-INST: fmls za.h[w8, 0, vgx2], { z12.h, z13.h }, { z2.h, z3.h }
+// CHECK-ENCODING: [0x98,0x11,0xa2,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a21198 <unknown>
+
+fmls za.h[w10, 1, vgx2], {z0.h, z1.h}, {z26.h, z27.h} // 11000001-10111010-01010000-00011001
+// CHECK-INST: fmls za.h[w10, 1, vgx2], { z0.h, z1.h }, { z26.h, z27.h }
+// CHECK-ENCODING: [0x19,0x50,0xba,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1ba5019 <unknown>
+
+fmls za.h[w10, 1], {z0.h - z1.h}, {z26.h - z27.h} // 11000001-10111010-01010000-00011001
+// CHECK-INST: fmls za.h[w10, 1, vgx2], { z0.h, z1.h }, { z26.h, z27.h }
+// CHECK-ENCODING: [0x19,0x50,0xba,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1ba5019 <unknown>
+
+fmls za.h[w8, 5, vgx2], {z22.h, z23.h}, {z30.h, z31.h} // 11000001-10111110-00010010-11011101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z22.h, z23.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xdd,0x12,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be12dd <unknown>
+
+fmls za.h[w8, 5], {z22.h - z23.h}, {z30.h - z31.h} // 11000001-10111110-00010010-11011101
+// CHECK-INST: fmls za.h[w8, 5, vgx2], { z22.h, z23.h }, { z30.h, z31.h }
+// CHECK-ENCODING: [0xdd,0x12,0xbe,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1be12dd <unknown>
+
+fmls za.h[w11, 2, vgx2], {z8.h, z9.h}, {z0.h, z1.h} // 11000001-10100000-01110001-00011010
+// CHECK-INST: fmls za.h[w11, 2, vgx2], { z8.h, z9.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x1a,0x71,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0711a <unknown>
+
+fmls za.h[w11, 2], {z8.h - z9.h}, {z0.h - z1.h} // 11000001-10100000-01110001-00011010
+// CHECK-INST: fmls za.h[w11, 2, vgx2], { z8.h, z9.h }, { z0.h, z1.h }
+// CHECK-ENCODING: [0x1a,0x71,0xa0,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a0711a <unknown>
+
+fmls za.h[w9, 7, vgx2], {z12.h, z13.h}, {z10.h, z11.h} // 11000001-10101010-00110001-10011111
+// CHECK-INST: fmls za.h[w9, 7, vgx2], { z12.h, z13.h }, { z10.h, z11.h }
+// CHECK-ENCODING: [0x9f,0x31,0xaa,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1aa319f <unknown>
+
+fmls za.h[w9, 7], {z12.h - z13.h}, {z10.h - z11.h} // 11000001-10101010-00110001-10011111
+// CHECK-INST: fmls za.h[w9, 7, vgx2], { z12.h, z13.h }, { z10.h, z11.h }
+// CHECK-ENCODING: [0x9f,0x31,0xaa,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1aa319f <unknown>
+
+fmls za.h[w8, 0, vgx4], {z0.h - z3.h}, z0.h // 11000001-00110000-00011100-00001000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x08,0x1c,0x30,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1301c08 <unknown>
+
+fmls za.h[w8, 0], {z0.h - z3.h}, z0.h // 11000001-00110000-00011100-00001000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h
+// CHECK-ENCODING: [0x08,0x1c,0x30,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1301c08 <unknown>
+
+fmls za.h[w10, 5, vgx4], {z10.h - z13.h}, z5.h // 11000001-00110101-01011101-01001101
+// CHECK-INST: fmls za.h[w10, 5, vgx4], { z10.h - z13.h }, z5.h
+// CHECK-ENCODING: [0x4d,0x5d,0x35,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1355d4d <unknown>
+
+fmls za.h[w10, 5], {z10.h - z13.h}, z5.h // 11000001-00110101-01011101-01001101
+// CHECK-INST: fmls za.h[w10, 5, vgx4], { z10.h - z13.h }, z5.h
+// CHECK-ENCODING: [0x4d,0x5d,0x35,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1355d4d <unknown>
+
+fmls za.h[w11, 7, vgx4], {z13.h - z16.h}, z8.h // 11000001-00111000-01111101-10101111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z13.h - z16.h }, z8.h
+// CHECK-ENCODING: [0xaf,0x7d,0x38,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1387daf <unknown>
+
+fmls za.h[w11, 7], {z13.h - z16.h}, z8.h // 11000001-00111000-01111101-10101111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z13.h - z16.h }, z8.h
+// CHECK-ENCODING: [0xaf,0x7d,0x38,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1387daf <unknown>
+
+fmls za.h[w11, 7, vgx4], {z31.h, z0.h, z1.h, z2.h}, z15.h // 11000001-00111111-01111111-11101111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z31.h, z0.h, z1.h, z2.h }, z15.h
+// CHECK-ENCODING: [0xef,0x7f,0x3f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13f7fef <unknown>
+
+fmls za.h[w11, 7], {z31.h, z0.h, z1.h, z2.h}, z15.h // 11000001-00111111-01111111-11101111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z31.h, z0.h, z1.h, z2.h }, z15.h
+// CHECK-ENCODING: [0xef,0x7f,0x3f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13f7fef <unknown>
+
+fmls za.h[w8, 5, vgx4], {z17.h - z20.h}, z0.h // 11000001-00110000-00011110-00101101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z17.h - z20.h }, z0.h
+// CHECK-ENCODING: [0x2d,0x1e,0x30,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1301e2d <unknown>
+
+fmls za.h[w8, 5], {z17.h - z20.h}, z0.h // 11000001-00110000-00011110-00101101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z17.h - z20.h }, z0.h
+// CHECK-ENCODING: [0x2d,0x1e,0x30,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1301e2d <unknown>
+
+fmls za.h[w8, 1, vgx4], {z1.h - z4.h}, z14.h // 11000001-00111110-00011100-00101001
+// CHECK-INST: fmls za.h[w8, 1, vgx4], { z1.h - z4.h }, z14.h
+// CHECK-ENCODING: [0x29,0x1c,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13e1c29 <unknown>
+
+fmls za.h[w8, 1], {z1.h - z4.h}, z14.h // 11000001-00111110-00011100-00101001
+// CHECK-INST: fmls za.h[w8, 1, vgx4], { z1.h - z4.h }, z14.h
+// CHECK-ENCODING: [0x29,0x1c,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13e1c29 <unknown>
+
+fmls za.h[w10, 0, vgx4], {z19.h - z22.h}, z4.h // 11000001-00110100-01011110-01101000
+// CHECK-INST: fmls za.h[w10, 0, vgx4], { z19.h - z22.h }, z4.h
+// CHECK-ENCODING: [0x68,0x5e,0x34,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1345e68 <unknown>
+
+fmls za.h[w10, 0], {z19.h - z22.h}, z4.h // 11000001-00110100-01011110-01101000
+// CHECK-INST: fmls za.h[w10, 0, vgx4], { z19.h - z22.h }, z4.h
+// CHECK-ENCODING: [0x68,0x5e,0x34,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1345e68 <unknown>
+
+fmls za.h[w8, 0, vgx4], {z12.h - z15.h}, z2.h // 11000001-00110010-00011101-10001000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h
+// CHECK-ENCODING: [0x88,0x1d,0x32,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1321d88 <unknown>
+
+fmls za.h[w8, 0], {z12.h - z15.h}, z2.h // 11000001-00110010-00011101-10001000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h
+// CHECK-ENCODING: [0x88,0x1d,0x32,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1321d88 <unknown>
+
+fmls za.h[w10, 1, vgx4], {z1.h - z4.h}, z10.h // 11000001-00111010-01011100-00101001
+// CHECK-INST: fmls za.h[w10, 1, vgx4], { z1.h - z4.h }, z10.h
+// CHECK-ENCODING: [0x29,0x5c,0x3a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13a5c29 <unknown>
+
+fmls za.h[w10, 1], {z1.h - z4.h}, z10.h // 11000001-00111010-01011100-00101001
+// CHECK-INST: fmls za.h[w10, 1, vgx4], { z1.h - z4.h }, z10.h
+// CHECK-ENCODING: [0x29,0x5c,0x3a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13a5c29 <unknown>
+
+fmls za.h[w8, 5, vgx4], {z22.h - z25.h}, z14.h // 11000001-00111110-00011110-11001101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z22.h - z25.h }, z14.h
+// CHECK-ENCODING: [0xcd,0x1e,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13e1ecd <unknown>
+
+fmls za.h[w8, 5], {z22.h - z25.h}, z14.h // 11000001-00111110-00011110-11001101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z22.h - z25.h }, z14.h
+// CHECK-ENCODING: [0xcd,0x1e,0x3e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13e1ecd <unknown>
+
+fmls za.h[w11, 2, vgx4], {z9.h - z12.h}, z1.h // 11000001-00110001-01111101-00101010
+// CHECK-INST: fmls za.h[w11, 2, vgx4], { z9.h - z12.h }, z1.h
+// CHECK-ENCODING: [0x2a,0x7d,0x31,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1317d2a <unknown>
+
+fmls za.h[w11, 2], {z9.h - z12.h}, z1.h // 11000001-00110001-01111101-00101010
+// CHECK-INST: fmls za.h[w11, 2, vgx4], { z9.h - z12.h }, z1.h
+// CHECK-ENCODING: [0x2a,0x7d,0x31,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1317d2a <unknown>
+
+fmls za.h[w9, 7, vgx4], {z12.h - z15.h}, z11.h // 11000001-00111011-00111101-10001111
+// CHECK-INST: fmls za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h
+// CHECK-ENCODING: [0x8f,0x3d,0x3b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13b3d8f <unknown>
+
+fmls za.h[w9, 7], {z12.h - z15.h}, z11.h // 11000001-00111011-00111101-10001111
+// CHECK-INST: fmls za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h
+// CHECK-ENCODING: [0x8f,0x3d,0x3b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c13b3d8f <unknown>
+
+fmls za.h[w8, 0, vgx4], {z0.h - z3.h}, z0.h[0] // 11000001-00010000-10010000-00010000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h[0]
+// CHECK-ENCODING: [0x10,0x90,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1109010 <unknown>
+
+fmls za.h[w8, 0], {z0.h - z3.h}, z0.h[0] // 11000001-00010000-10010000-00010000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z0.h - z3.h }, z0.h[0]
+// CHECK-ENCODING: [0x10,0x90,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1109010 <unknown>
+
+fmls za.h[w10, 5, vgx4], {z8.h - z11.h}, z5.h[2] // 11000001-00010101-11010101-00010101
+// CHECK-INST: fmls za.h[w10, 5, vgx4], { z8.h - z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x15,0xd5,0x15,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c115d515 <unknown>
+
+fmls za.h[w10, 5], {z8.h - z11.h}, z5.h[2] // 11000001-00010101-11010101-00010101
+// CHECK-INST: fmls za.h[w10, 5, vgx4], { z8.h - z11.h }, z5.h[2]
+// CHECK-ENCODING: [0x15,0xd5,0x15,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c115d515 <unknown>
+
+fmls za.h[w11, 7, vgx4], {z12.h - z15.h}, z8.h[6] // 11000001-00011000-11111101-10010111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z12.h - z15.h }, z8.h[6]
+// CHECK-ENCODING: [0x97,0xfd,0x18,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c118fd97 <unknown>
+
+fmls za.h[w11, 7], {z12.h - z15.h}, z8.h[6] // 11000001-00011000-11111101-10010111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z12.h - z15.h }, z8.h[6]
+// CHECK-ENCODING: [0x97,0xfd,0x18,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c118fd97 <unknown>
+
+fmls za.h[w11, 7, vgx4], {z28.h - z31.h}, z15.h[7] // 11000001-00011111-11111111-10011111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z28.h - z31.h }, z15.h[7]
+// CHECK-ENCODING: [0x9f,0xff,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11fff9f <unknown>
+
+fmls za.h[w11, 7], {z28.h - z31.h}, z15.h[7] // 11000001-00011111-11111111-10011111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z28.h - z31.h }, z15.h[7]
+// CHECK-ENCODING: [0x9f,0xff,0x1f,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11fff9f <unknown>
+
+fmls za.h[w8, 5, vgx4], {z16.h - z19.h}, z0.h[6] // 11000001-00010000-10011110-00010101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z16.h - z19.h }, z0.h[6]
+// CHECK-ENCODING: [0x15,0x9e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1109e15 <unknown>
+
+fmls za.h[w8, 5], {z16.h - z19.h}, z0.h[6] // 11000001-00010000-10011110-00010101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z16.h - z19.h }, z0.h[6]
+// CHECK-ENCODING: [0x15,0x9e,0x10,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1109e15 <unknown>
+
+fmls za.h[w8, 1, vgx4], {z0.h - z3.h}, z14.h[2] // 11000001-00011110-10010100-00010001
+// CHECK-INST: fmls za.h[w8, 1, vgx4], { z0.h - z3.h }, z14.h[2]
+// CHECK-ENCODING: [0x11,0x94,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e9411 <unknown>
+
+fmls za.h[w8, 1], {z0.h - z3.h}, z14.h[2] // 11000001-00011110-10010100-00010001
+// CHECK-INST: fmls za.h[w8, 1, vgx4], { z0.h - z3.h }, z14.h[2]
+// CHECK-ENCODING: [0x11,0x94,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e9411 <unknown>
+
+fmls za.h[w10, 0, vgx4], {z16.h - z19.h}, z4.h[3] // 11000001-00010100-11010110-00011000
+// CHECK-INST: fmls za.h[w10, 0, vgx4], { z16.h - z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x18,0xd6,0x14,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c114d618 <unknown>
+
+fmls za.h[w10, 0], {z16.h - z19.h}, z4.h[3] // 11000001-00010100-11010110-00011000
+// CHECK-INST: fmls za.h[w10, 0, vgx4], { z16.h - z19.h }, z4.h[3]
+// CHECK-ENCODING: [0x18,0xd6,0x14,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c114d618 <unknown>
+
+fmls za.h[w8, 0, vgx4], {z12.h - z15.h}, z2.h[4] // 11000001-00010010-10011001-10010000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h[4]
+// CHECK-ENCODING: [0x90,0x99,0x12,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1129990 <unknown>
+
+fmls za.h[w8, 0], {z12.h - z15.h}, z2.h[4] // 11000001-00010010-10011001-10010000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z12.h - z15.h }, z2.h[4]
+// CHECK-ENCODING: [0x90,0x99,0x12,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1129990 <unknown>
+
+fmls za.h[w10, 1, vgx4], {z0.h - z3.h}, z10.h[4] // 11000001-00011010-11011000-00010001
+// CHECK-INST: fmls za.h[w10, 1, vgx4], { z0.h - z3.h }, z10.h[4]
+// CHECK-ENCODING: [0x11,0xd8,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11ad811 <unknown>
+
+fmls za.h[w10, 1], {z0.h - z3.h}, z10.h[4] // 11000001-00011010-11011000-00010001
+// CHECK-INST: fmls za.h[w10, 1, vgx4], { z0.h - z3.h }, z10.h[4]
+// CHECK-ENCODING: [0x11,0xd8,0x1a,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11ad811 <unknown>
+
+fmls za.h[w8, 5, vgx4], {z20.h - z23.h}, z14.h[5] // 11000001-00011110-10011010-10011101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z20.h - z23.h }, z14.h[5]
+// CHECK-ENCODING: [0x9d,0x9a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e9a9d <unknown>
+
+fmls za.h[w8, 5], {z20.h - z23.h}, z14.h[5] // 11000001-00011110-10011010-10011101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z20.h - z23.h }, z14.h[5]
+// CHECK-ENCODING: [0x9d,0x9a,0x1e,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11e9a9d <unknown>
+
+fmls za.h[w11, 2, vgx4], {z8.h - z11.h}, z1.h[2] // 11000001-00010001-11110101-00010010
+// CHECK-INST: fmls za.h[w11, 2, vgx4], { z8.h - z11.h }, z1.h[2]
+// CHECK-ENCODING: [0x12,0xf5,0x11,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c111f512 <unknown>
+
+fmls za.h[w11, 2], {z8.h - z11.h}, z1.h[2] // 11000001-00010001-11110101-00010010
+// CHECK-INST: fmls za.h[w11, 2, vgx4], { z8.h - z11.h }, z1.h[2]
+// CHECK-ENCODING: [0x12,0xf5,0x11,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c111f512 <unknown>
+
+fmls za.h[w9, 7, vgx4], {z12.h - z15.h}, z11.h[4] // 11000001-00011011-10111001-10010111
+// CHECK-INST: fmls za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h[4]
+// CHECK-ENCODING: [0x97,0xb9,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11bb997 <unknown>
+
+fmls za.h[w9, 7], {z12.h - z15.h}, z11.h[4] // 11000001-00011011-10111001-10010111
+// CHECK-INST: fmls za.h[w9, 7, vgx4], { z12.h - z15.h }, z11.h[4]
+// CHECK-ENCODING: [0x97,0xb9,0x1b,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c11bb997 <unknown>
+
+fmls za.h[w8, 0, vgx4], {z0.h - z3.h}, {z0.h - z3.h} // 11000001-10100001-00010000-00011000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x18,0x10,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a11018 <unknown>
+
+fmls za.h[w8, 0], {z0.h - z3.h}, {z0.h - z3.h} // 11000001-10100001-00010000-00011000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z0.h - z3.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x18,0x10,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a11018 <unknown>
+
+fmls za.h[w10, 5, vgx4], {z8.h - z11.h}, {z20.h - z23.h} // 11000001-10110101-01010001-00011101
+// CHECK-INST: fmls za.h[w10, 5, vgx4], { z8.h - z11.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x1d,0x51,0xb5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b5511d <unknown>
+
+fmls za.h[w10, 5], {z8.h - z11.h}, {z20.h - z23.h} // 11000001-10110101-01010001-00011101
+// CHECK-INST: fmls za.h[w10, 5, vgx4], { z8.h - z11.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x1d,0x51,0xb5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b5511d <unknown>
+
+fmls za.h[w11, 7, vgx4], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-10101001-01110001-10011111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x9f,0x71,0xa9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a9719f <unknown>
+
+fmls za.h[w11, 7], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-10101001-01110001-10011111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x9f,0x71,0xa9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a9719f <unknown>
+
+fmls za.h[w11, 7, vgx4], {z28.h - z31.h}, {z28.h - z31.h} // 11000001-10111101-01110011-10011111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x9f,0x73,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd739f <unknown>
+
+fmls za.h[w11, 7], {z28.h - z31.h}, {z28.h - z31.h} // 11000001-10111101-01110011-10011111
+// CHECK-INST: fmls za.h[w11, 7, vgx4], { z28.h - z31.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x9f,0x73,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd739f <unknown>
+
+fmls za.h[w8, 5, vgx4], {z16.h - z19.h}, {z16.h - z19.h} // 11000001-10110001-00010010-00011101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z16.h - z19.h }, { z16.h - z19.h }
+// CHECK-ENCODING: [0x1d,0x12,0xb1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b1121d <unknown>
+
+fmls za.h[w8, 5], {z16.h - z19.h}, {z16.h - z19.h} // 11000001-10110001-00010010-00011101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z16.h - z19.h }, { z16.h - z19.h }
+// CHECK-ENCODING: [0x1d,0x12,0xb1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b1121d <unknown>
+
+fmls za.h[w8, 1, vgx4], {z0.h - z3.h}, {z28.h - z31.h} // 11000001-10111101-00010000-00011001
+// CHECK-INST: fmls za.h[w8, 1, vgx4], { z0.h - z3.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x19,0x10,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd1019 <unknown>
+
+fmls za.h[w8, 1], {z0.h - z3.h}, {z28.h - z31.h} // 11000001-10111101-00010000-00011001
+// CHECK-INST: fmls za.h[w8, 1, vgx4], { z0.h - z3.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x19,0x10,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd1019 <unknown>
+
+fmls za.h[w10, 0, vgx4], {z16.h - z19.h}, {z20.h - z23.h} // 11000001-10110101-01010010-00011000
+// CHECK-INST: fmls za.h[w10, 0, vgx4], { z16.h - z19.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x18,0x52,0xb5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b55218 <unknown>
+
+fmls za.h[w10, 0], {z16.h - z19.h}, {z20.h - z23.h} // 11000001-10110101-01010010-00011000
+// CHECK-INST: fmls za.h[w10, 0, vgx4], { z16.h - z19.h }, { z20.h - z23.h }
+// CHECK-ENCODING: [0x18,0x52,0xb5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b55218 <unknown>
+
+fmls za.h[w8, 0, vgx4], {z12.h - z15.h}, {z0.h - z3.h} // 11000001-10100001-00010001-10011000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z12.h - z15.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x98,0x11,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a11198 <unknown>
+
+fmls za.h[w8, 0], {z12.h - z15.h}, {z0.h - z3.h} // 11000001-10100001-00010001-10011000
+// CHECK-INST: fmls za.h[w8, 0, vgx4], { z12.h - z15.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x98,0x11,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a11198 <unknown>
+
+fmls za.h[w10, 1, vgx4], {z0.h - z3.h}, {z24.h - z27.h} // 11000001-10111001-01010000-00011001
+// CHECK-INST: fmls za.h[w10, 1, vgx4], { z0.h - z3.h }, { z24.h - z27.h }
+// CHECK-ENCODING: [0x19,0x50,0xb9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b95019 <unknown>
+
+fmls za.h[w10, 1], {z0.h - z3.h}, {z24.h - z27.h} // 11000001-10111001-01010000-00011001
+// CHECK-INST: fmls za.h[w10, 1, vgx4], { z0.h - z3.h }, { z24.h - z27.h }
+// CHECK-ENCODING: [0x19,0x50,0xb9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1b95019 <unknown>
+
+fmls za.h[w8, 5, vgx4], {z20.h - z23.h}, {z28.h - z31.h} // 11000001-10111101-00010010-10011101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z20.h - z23.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x9d,0x12,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd129d <unknown>
+
+fmls za.h[w8, 5], {z20.h - z23.h}, {z28.h - z31.h} // 11000001-10111101-00010010-10011101
+// CHECK-INST: fmls za.h[w8, 5, vgx4], { z20.h - z23.h }, { z28.h - z31.h }
+// CHECK-ENCODING: [0x9d,0x12,0xbd,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1bd129d <unknown>
+
+fmls za.h[w11, 2, vgx4], {z8.h - z11.h}, {z0.h - z3.h} // 11000001-10100001-01110001-00011010
+// CHECK-INST: fmls za.h[w11, 2, vgx4], { z8.h - z11.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x1a,0x71,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a1711a <unknown>
+
+fmls za.h[w11, 2], {z8.h - z11.h}, {z0.h - z3.h} // 11000001-10100001-01110001-00011010
+// CHECK-INST: fmls za.h[w11, 2, vgx4], { z8.h - z11.h }, { z0.h - z3.h }
+// CHECK-ENCODING: [0x1a,0x71,0xa1,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a1711a <unknown>
+
+fmls za.h[w9, 7, vgx4], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-10101001-00110001-10011111
+// CHECK-INST: fmls za.h[w9, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x9f,0x31,0xa9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a9319f <unknown>
+
+fmls za.h[w9, 7], {z12.h - z15.h}, {z8.h - z11.h} // 11000001-10101001-00110001-10011111
+// CHECK-INST: fmls za.h[w9, 7, vgx4], { z12.h - z15.h }, { z8.h - z11.h }
+// CHECK-ENCODING: [0x9f,0x31,0xa9,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: c1a9319f <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/fmopa-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/fmopa-diagnostics.s
new file mode 100644
index 000000000000..def19a316c2a
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fmopa-diagnostics.s
@@ -0,0 +1,35 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid predicate register
+
+fmopa za1.h, p8/m, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: fmopa za1.h, p8/m, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmopa za1.h, p5/m, p8/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: fmopa za1.h, p5/m, p8/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmopa za1.h, p5.h, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: fmopa za1.h, p5.h, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid matrix operand
+
+fmopa za2.h, p5/m, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fmopa za2.h, p5/m, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid register suffixes
+
+fmopa za1.h, p5/m, p5/m, z12.h, z11.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid element width
+// CHECK-NEXT: fmopa za1.h, p5/m, p5/m, z12.h, z11.b
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/fmopa.s b/llvm/test/MC/AArch64/SME2p1/fmopa.s
new file mode 100644
index 000000000000..e53d21244fde
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fmopa.s
@@ -0,0 +1,85 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+sme-f16f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+sme-f16f16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+fmopa za0.h, p0/m, p0/m, z0.h, z0.h // 10000001-10000000-00000000-00001000
+// CHECK-INST: fmopa za0.h, p0/m, p0/m, z0.h, z0.h
+// CHECK-ENCODING: [0x08,0x00,0x80,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 81800008 <unknown>
+
+fmopa za1.h, p5/m, p2/m, z10.h, z21.h // 10000001-10010101-01010101-01001001
+// CHECK-INST: fmopa za1.h, p5/m, p2/m, z10.h, z21.h
+// CHECK-ENCODING: [0x49,0x55,0x95,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 81955549 <unknown>
+
+fmopa za1.h, p3/m, p7/m, z13.h, z8.h // 10000001-10001000-11101101-10101001
+// CHECK-INST: fmopa za1.h, p3/m, p7/m, z13.h, z8.h
+// CHECK-ENCODING: [0xa9,0xed,0x88,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 8188eda9 <unknown>
+
+fmopa za1.h, p7/m, p7/m, z31.h, z31.h // 10000001-10011111-11111111-11101001
+// CHECK-INST: fmopa za1.h, p7/m, p7/m, z31.h, z31.h
+// CHECK-ENCODING: [0xe9,0xff,0x9f,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 819fffe9 <unknown>
+
+fmopa za1.h, p3/m, p0/m, z17.h, z16.h // 10000001-10010000-00001110-00101001
+// CHECK-INST: fmopa za1.h, p3/m, p0/m, z17.h, z16.h
+// CHECK-ENCODING: [0x29,0x0e,0x90,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 81900e29 <unknown>
+
+fmopa za1.h, p1/m, p4/m, z1.h, z30.h // 10000001-10011110-10000100-00101001
+// CHECK-INST: fmopa za1.h, p1/m, p4/m, z1.h, z30.h
+// CHECK-ENCODING: [0x29,0x84,0x9e,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 819e8429 <unknown>
+
+fmopa za0.h, p5/m, p2/m, z19.h, z20.h // 10000001-10010100-01010110-01101000
+// CHECK-INST: fmopa za0.h, p5/m, p2/m, z19.h, z20.h
+// CHECK-ENCODING: [0x68,0x56,0x94,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 81945668 <unknown>
+
+fmopa za0.h, p6/m, p0/m, z12.h, z2.h // 10000001-10000010-00011001-10001000
+// CHECK-INST: fmopa za0.h, p6/m, p0/m, z12.h, z2.h
+// CHECK-ENCODING: [0x88,0x19,0x82,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 81821988 <unknown>
+
+fmopa za1.h, p2/m, p6/m, z1.h, z26.h // 10000001-10011010-11001000-00101001
+// CHECK-INST: fmopa za1.h, p2/m, p6/m, z1.h, z26.h
+// CHECK-ENCODING: [0x29,0xc8,0x9a,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 819ac829 <unknown>
+
+fmopa za1.h, p2/m, p0/m, z22.h, z30.h // 10000001-10011110-00001010-11001001
+// CHECK-INST: fmopa za1.h, p2/m, p0/m, z22.h, z30.h
+// CHECK-ENCODING: [0xc9,0x0a,0x9e,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 819e0ac9 <unknown>
+
+fmopa za0.h, p5/m, p7/m, z9.h, z1.h // 10000001-10000001-11110101-00101000
+// CHECK-INST: fmopa za0.h, p5/m, p7/m, z9.h, z1.h
+// CHECK-ENCODING: [0x28,0xf5,0x81,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 8181f528 <unknown>
+
+fmopa za1.h, p2/m, p5/m, z12.h, z11.h // 10000001-10001011-10101001-10001001
+// CHECK-INST: fmopa za1.h, p2/m, p5/m, z12.h, z11.h
+// CHECK-ENCODING: [0x89,0xa9,0x8b,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 818ba989 <unknown>
+
diff --git a/llvm/test/MC/AArch64/SME2p1/fmops-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/fmops-diagnostics.s
new file mode 100644
index 000000000000..75eea8113262
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fmops-diagnostics.s
@@ -0,0 +1,35 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid predicate register
+
+fmops za1.h, p8/m, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: fmops za1.h, p8/m, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmops za1.h, p5/m, p8/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: fmops za1.h, p5/m, p8/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fmops za1.h, p5.h, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid restricted predicate register, expected p0..p7 (without element suffix)
+// CHECK-NEXT: fmops za1.h, p5.h, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid matrix operand
+
+fmops za2.h, p5/m, p5/m, z12.h, z11.h
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: fmops za2.h, p5/m, p5/m, z12.h, z11.h
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid register suffixes
+
+fmops za1.h, p5/m, p5/m, z12.h, z11.b
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid element width
+// CHECK-NEXT: fmops za1.h, p5/m, p5/m, z12.h, z11.b
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/fmops.s b/llvm/test/MC/AArch64/SME2p1/fmops.s
new file mode 100644
index 000000000000..325d4c125b60
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fmops.s
@@ -0,0 +1,84 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+sme-f16f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+sme-f16f16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+fmops za0.h, p0/m, p0/m, z0.h, z0.h // 10000001-10000000-00000000-00011000
+// CHECK-INST: fmops za0.h, p0/m, p0/m, z0.h, z0.h
+// CHECK-ENCODING: [0x18,0x00,0x80,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 81800018 <unknown>
+
+fmops za1.h, p5/m, p2/m, z10.h, z21.h // 10000001-10010101-01010101-01011001
+// CHECK-INST: fmops za1.h, p5/m, p2/m, z10.h, z21.h
+// CHECK-ENCODING: [0x59,0x55,0x95,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 81955559 <unknown>
+
+fmops za1.h, p3/m, p7/m, z13.h, z8.h // 10000001-10001000-11101101-10111001
+// CHECK-INST: fmops za1.h, p3/m, p7/m, z13.h, z8.h
+// CHECK-ENCODING: [0xb9,0xed,0x88,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 8188edb9 <unknown>
+
+fmops za1.h, p7/m, p7/m, z31.h, z31.h // 10000001-10011111-11111111-11111001
+// CHECK-INST: fmops za1.h, p7/m, p7/m, z31.h, z31.h
+// CHECK-ENCODING: [0xf9,0xff,0x9f,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 819ffff9 <unknown>
+
+fmops za1.h, p3/m, p0/m, z17.h, z16.h // 10000001-10010000-00001110-00111001
+// CHECK-INST: fmops za1.h, p3/m, p0/m, z17.h, z16.h
+// CHECK-ENCODING: [0x39,0x0e,0x90,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 81900e39 <unknown>
+
+fmops za1.h, p1/m, p4/m, z1.h, z30.h // 10000001-10011110-10000100-00111001
+// CHECK-INST: fmops za1.h, p1/m, p4/m, z1.h, z30.h
+// CHECK-ENCODING: [0x39,0x84,0x9e,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 819e8439 <unknown>
+
+fmops za0.h, p5/m, p2/m, z19.h, z20.h // 10000001-10010100-01010110-01111000
+// CHECK-INST: fmops za0.h, p5/m, p2/m, z19.h, z20.h
+// CHECK-ENCODING: [0x78,0x56,0x94,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 81945678 <unknown>
+
+fmops za0.h, p6/m, p0/m, z12.h, z2.h // 10000001-10000010-00011001-10011000
+// CHECK-INST: fmops za0.h, p6/m, p0/m, z12.h, z2.h
+// CHECK-ENCODING: [0x98,0x19,0x82,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 81821998 <unknown>
+
+fmops za1.h, p2/m, p6/m, z1.h, z26.h // 10000001-10011010-11001000-00111001
+// CHECK-INST: fmops za1.h, p2/m, p6/m, z1.h, z26.h
+// CHECK-ENCODING: [0x39,0xc8,0x9a,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 819ac839 <unknown>
+
+fmops za1.h, p2/m, p0/m, z22.h, z30.h // 10000001-10011110-00001010-11011001
+// CHECK-INST: fmops za1.h, p2/m, p0/m, z22.h, z30.h
+// CHECK-ENCODING: [0xd9,0x0a,0x9e,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 819e0ad9 <unknown>
+
+fmops za0.h, p5/m, p7/m, z9.h, z1.h // 10000001-10000001-11110101-00111000
+// CHECK-INST: fmops za0.h, p5/m, p7/m, z9.h, z1.h
+// CHECK-ENCODING: [0x38,0xf5,0x81,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 8181f538 <unknown>
+
+fmops za1.h, p2/m, p5/m, z12.h, z11.h // 10000001-10001011-10101001-10011001
+// CHECK-INST: fmops za1.h, p2/m, p5/m, z12.h, z11.h
+// CHECK-ENCODING: [0x99,0xa9,0x8b,0x81]
+// CHECK-ERROR: instruction requires: sme2p1 sme-f16f16
+// CHECK-UNKNOWN: 818ba999 <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/fsub-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/fsub-diagnostics.s
new file mode 100644
index 000000000000..716427a2f725
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fsub-diagnostics.s
@@ -0,0 +1,45 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Out of range index offset
+
+fsub za.d[w8, 8, vgx2], {z0.d-z1.d}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .s
+// CHECK-NEXT: fsub za.d[w8, 8, vgx2], {z0.d-z1.d}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fsub za.s[w8, -1, vgx4], {z0.s-z3.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: fsub za.s[w8, -1, vgx4], {z0.s-z3.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select register
+
+fsub za.h[w7, 7, vgx4], {z0.h-z3.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: fsub za.h[w7, 7, vgx4], {z0.h-z3.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fsub za.s[w12, 7, vgx2], {z0.s-z1.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: fsub za.s[w12, 7, vgx2], {z0.s-z1.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector list
+
+fsub za.d[w8, 0, vgx4], {z0.d-z4.d}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid number of vectors
+// CHECK-NEXT: fsub za.d[w8, 0, vgx4], {z0.d-z4.d}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fsub za.h[w8, 0, vgx2], {z1.h-z2.h}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 2 consecutive SVE vectors, where the first vector is a multiple of 2 and with matching element types
+// CHECK-NEXT: fsub za.h[w8, 0, vgx2], {z1.h-z2.h}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+fsub za.s[w8, 0, vgx4], {z1.s-z4.s}
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: fsub za.s[w8, 0, vgx4], {z1.s-z4.s}
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/fsub.s b/llvm/test/MC/AArch64/SME2p1/fsub.s
new file mode 100644
index 000000000000..b3735d554765
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/fsub.s
@@ -0,0 +1,296 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=+sme2p1,+sme-f16f16 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | llvm-objdump -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1,+sme-f16f16 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1,+sme-f16f16 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+
+fsub za.h[w8, 0], {z0.h - z1.h} // 11000001-10100100-00011100-00001000
+// CHECK-INST: fsub za.h[w8, 0, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x08,0x1c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41c08 <unknown>
+
+fsub za.h[w10, 5, vgx2], {z10.h, z11.h} // 11000001-10100100-01011101-01001101
+// CHECK-INST: fsub za.h[w10, 5, vgx2], { z10.h, z11.h }
+// CHECK-ENCODING: [0x4d,0x5d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45d4d <unknown>
+
+fsub za.h[w10, 5], {z10.h - z11.h} // 11000001-10100100-01011101-01001101
+// CHECK-INST: fsub za.h[w10, 5, vgx2], { z10.h, z11.h }
+// CHECK-ENCODING: [0x4d,0x5d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45d4d <unknown>
+
+fsub za.h[w11, 7, vgx2], {z12.h, z13.h} // 11000001-10100100-01111101-10001111
+// CHECK-INST: fsub za.h[w11, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x8f,0x7d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47d8f <unknown>
+
+fsub za.h[w11, 7], {z12.h - z13.h} // 11000001-10100100-01111101-10001111
+// CHECK-INST: fsub za.h[w11, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x8f,0x7d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47d8f <unknown>
+
+fsub za.h[w11, 7, vgx2], {z30.h, z31.h} // 11000001-10100100-01111111-11001111
+// CHECK-INST: fsub za.h[w11, 7, vgx2], { z30.h, z31.h }
+// CHECK-ENCODING: [0xcf,0x7f,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47fcf <unknown>
+
+fsub za.h[w11, 7], {z30.h - z31.h} // 11000001-10100100-01111111-11001111
+// CHECK-INST: fsub za.h[w11, 7, vgx2], { z30.h, z31.h }
+// CHECK-ENCODING: [0xcf,0x7f,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47fcf <unknown>
+
+fsub za.h[w8, 5, vgx2], {z16.h, z17.h} // 11000001-10100100-00011110-00001101
+// CHECK-INST: fsub za.h[w8, 5, vgx2], { z16.h, z17.h }
+// CHECK-ENCODING: [0x0d,0x1e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41e0d <unknown>
+
+fsub za.h[w8, 5], {z16.h - z17.h} // 11000001-10100100-00011110-00001101
+// CHECK-INST: fsub za.h[w8, 5, vgx2], { z16.h, z17.h }
+// CHECK-ENCODING: [0x0d,0x1e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41e0d <unknown>
+
+fsub za.h[w8, 1, vgx2], {z0.h, z1.h} // 11000001-10100100-00011100-00001001
+// CHECK-INST: fsub za.h[w8, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x09,0x1c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41c09 <unknown>
+
+fsub za.h[w8, 1], {z0.h - z1.h} // 11000001-10100100-00011100-00001001
+// CHECK-INST: fsub za.h[w8, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x09,0x1c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41c09 <unknown>
+
+fsub za.h[w10, 0, vgx2], {z18.h, z19.h} // 11000001-10100100-01011110-01001000
+// CHECK-INST: fsub za.h[w10, 0, vgx2], { z18.h, z19.h }
+// CHECK-ENCODING: [0x48,0x5e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45e48 <unknown>
+
+fsub za.h[w10, 0], {z18.h - z19.h} // 11000001-10100100-01011110-01001000
+// CHECK-INST: fsub za.h[w10, 0, vgx2], { z18.h, z19.h }
+// CHECK-ENCODING: [0x48,0x5e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45e48 <unknown>
+
+fsub za.h[w8, 0, vgx2], {z12.h, z13.h} // 11000001-10100100-00011101-10001000
+// CHECK-INST: fsub za.h[w8, 0, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x88,0x1d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41d88 <unknown>
+
+fsub za.h[w8, 0], {z12.h - z13.h} // 11000001-10100100-00011101-10001000
+// CHECK-INST: fsub za.h[w8, 0, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x88,0x1d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41d88 <unknown>
+
+fsub za.h[w10, 1, vgx2], {z0.h, z1.h} // 11000001-10100100-01011100-00001001
+// CHECK-INST: fsub za.h[w10, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x09,0x5c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45c09 <unknown>
+
+fsub za.h[w10, 1], {z0.h - z1.h} // 11000001-10100100-01011100-00001001
+// CHECK-INST: fsub za.h[w10, 1, vgx2], { z0.h, z1.h }
+// CHECK-ENCODING: [0x09,0x5c,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a45c09 <unknown>
+
+fsub za.h[w8, 5, vgx2], {z22.h, z23.h} // 11000001-10100100-00011110-11001101
+// CHECK-INST: fsub za.h[w8, 5, vgx2], { z22.h, z23.h }
+// CHECK-ENCODING: [0xcd,0x1e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41ecd <unknown>
+
+fsub za.h[w8, 5], {z22.h - z23.h} // 11000001-10100100-00011110-11001101
+// CHECK-INST: fsub za.h[w8, 5, vgx2], { z22.h, z23.h }
+// CHECK-ENCODING: [0xcd,0x1e,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a41ecd <unknown>
+
+fsub za.h[w11, 2, vgx2], {z8.h, z9.h} // 11000001-10100100-01111101-00001010
+// CHECK-INST: fsub za.h[w11, 2, vgx2], { z8.h, z9.h }
+// CHECK-ENCODING: [0x0a,0x7d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47d0a <unknown>
+
+fsub za.h[w11, 2], {z8.h - z9.h} // 11000001-10100100-01111101-00001010
+// CHECK-INST: fsub za.h[w11, 2, vgx2], { z8.h, z9.h }
+// CHECK-ENCODING: [0x0a,0x7d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a47d0a <unknown>
+
+fsub za.h[w9, 7, vgx2], {z12.h, z13.h} // 11000001-10100100-00111101-10001111
+// CHECK-INST: fsub za.h[w9, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x8f,0x3d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a43d8f <unknown>
+
+fsub za.h[w9, 7], {z12.h - z13.h} // 11000001-10100100-00111101-10001111
+// CHECK-INST: fsub za.h[w9, 7, vgx2], { z12.h, z13.h }
+// CHECK-ENCODING: [0x8f,0x3d,0xa4,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a43d8f <unknown>
+
+
+fsub za.h[w8, 0, vgx4], {z0.h - z3.h} // 11000001-10100101-00011100-00001000
+// CHECK-INST: fsub za.h[w8, 0, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x08,0x1c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51c08 <unknown>
+
+fsub za.h[w8, 0], {z0.h - z3.h} // 11000001-10100101-00011100-00001000
+// CHECK-INST: fsub za.h[w8, 0, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x08,0x1c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51c08 <unknown>
+
+fsub za.h[w10, 5, vgx4], {z8.h - z11.h} // 11000001-10100101-01011101-00001101
+// CHECK-INST: fsub za.h[w10, 5, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x0d,0x5d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55d0d <unknown>
+
+fsub za.h[w10, 5], {z8.h - z11.h} // 11000001-10100101-01011101-00001101
+// CHECK-INST: fsub za.h[w10, 5, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x0d,0x5d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55d0d <unknown>
+
+fsub za.h[w11, 7, vgx4], {z12.h - z15.h} // 11000001-10100101-01111101-10001111
+// CHECK-INST: fsub za.h[w11, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x8f,0x7d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57d8f <unknown>
+
+fsub za.h[w11, 7], {z12.h - z15.h} // 11000001-10100101-01111101-10001111
+// CHECK-INST: fsub za.h[w11, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x8f,0x7d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57d8f <unknown>
+
+fsub za.h[w11, 7, vgx4], {z28.h - z31.h} // 11000001-10100101-01111111-10001111
+// CHECK-INST: fsub za.h[w11, 7, vgx4], { z28.h - z31.h }
+// CHECK-ENCODING: [0x8f,0x7f,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57f8f <unknown>
+
+fsub za.h[w11, 7], {z28.h - z31.h} // 11000001-10100101-01111111-10001111
+// CHECK-INST: fsub za.h[w11, 7, vgx4], { z28.h - z31.h }
+// CHECK-ENCODING: [0x8f,0x7f,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57f8f <unknown>
+
+fsub za.h[w8, 5, vgx4], {z16.h - z19.h} // 11000001-10100101-00011110-00001101
+// CHECK-INST: fsub za.h[w8, 5, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x0d,0x1e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51e0d <unknown>
+
+fsub za.h[w8, 5], {z16.h - z19.h} // 11000001-10100101-00011110-00001101
+// CHECK-INST: fsub za.h[w8, 5, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x0d,0x1e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51e0d <unknown>
+
+fsub za.h[w8, 1, vgx4], {z0.h - z3.h} // 11000001-10100101-00011100-00001001
+// CHECK-INST: fsub za.h[w8, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x09,0x1c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51c09 <unknown>
+
+fsub za.h[w8, 1], {z0.h - z3.h} // 11000001-10100101-00011100-00001001
+// CHECK-INST: fsub za.h[w8, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x09,0x1c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51c09 <unknown>
+
+fsub za.h[w10, 0, vgx4], {z16.h - z19.h} // 11000001-10100101-01011110-00001000
+// CHECK-INST: fsub za.h[w10, 0, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x08,0x5e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55e08 <unknown>
+
+fsub za.h[w10, 0], {z16.h - z19.h} // 11000001-10100101-01011110-00001000
+// CHECK-INST: fsub za.h[w10, 0, vgx4], { z16.h - z19.h }
+// CHECK-ENCODING: [0x08,0x5e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55e08 <unknown>
+
+fsub za.h[w8, 0, vgx4], {z12.h - z15.h} // 11000001-10100101-00011101-10001000
+// CHECK-INST: fsub za.h[w8, 0, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x88,0x1d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51d88 <unknown>
+
+fsub za.h[w8, 0], {z12.h - z15.h} // 11000001-10100101-00011101-10001000
+// CHECK-INST: fsub za.h[w8, 0, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x88,0x1d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51d88 <unknown>
+
+fsub za.h[w10, 1, vgx4], {z0.h - z3.h} // 11000001-10100101-01011100-00001001
+// CHECK-INST: fsub za.h[w10, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x09,0x5c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55c09 <unknown>
+
+fsub za.h[w10, 1], {z0.h - z3.h} // 11000001-10100101-01011100-00001001
+// CHECK-INST: fsub za.h[w10, 1, vgx4], { z0.h - z3.h }
+// CHECK-ENCODING: [0x09,0x5c,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a55c09 <unknown>
+
+fsub za.h[w8, 5, vgx4], {z20.h - z23.h} // 11000001-10100101-00011110-10001101
+// CHECK-INST: fsub za.h[w8, 5, vgx4], { z20.h - z23.h }
+// CHECK-ENCODING: [0x8d,0x1e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51e8d <unknown>
+
+fsub za.h[w8, 5], {z20.h - z23.h} // 11000001-10100101-00011110-10001101
+// CHECK-INST: fsub za.h[w8, 5, vgx4], { z20.h - z23.h }
+// CHECK-ENCODING: [0x8d,0x1e,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a51e8d <unknown>
+
+fsub za.h[w11, 2, vgx4], {z8.h - z11.h} // 11000001-10100101-01111101-00001010
+// CHECK-INST: fsub za.h[w11, 2, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x0a,0x7d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57d0a <unknown>
+
+fsub za.h[w11, 2], {z8.h - z11.h} // 11000001-10100101-01111101-00001010
+// CHECK-INST: fsub za.h[w11, 2, vgx4], { z8.h - z11.h }
+// CHECK-ENCODING: [0x0a,0x7d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a57d0a <unknown>
+
+fsub za.h[w9, 7, vgx4], {z12.h - z15.h} // 11000001-10100101-00111101-10001111
+// CHECK-INST: fsub za.h[w9, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x8f,0x3d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a53d8f <unknown>
+
+fsub za.h[w9, 7], {z12.h - z15.h} // 11000001-10100101-00111101-10001111
+// CHECK-INST: fsub za.h[w9, 7, vgx4], { z12.h - z15.h }
+// CHECK-ENCODING: [0x8f,0x3d,0xa5,0xc1]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c1a53d8f <unknown>
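Reviewer note (not part of the patch): each test case above states the same 32-bit instruction word three ways: the trailing bit-pattern comment, the CHECK-ENCODING byte list, and the CHECK-UNKNOWN hex word. The Python sketch below shows how the three appear to relate, assuming the byte list is the word in little-endian memory order. The helper names are hypothetical and are not part of LLVM.

```python
def parse_bit_comment(comment):
    """Turn '11000001-10100100-01111111-11001111' into a 32-bit integer."""
    bits = comment.replace("-", "")
    if len(bits) != 32 or set(bits) - {"0", "1"}:
        raise ValueError(f"expected 32 binary digits, got {comment!r}")
    return int(bits, 2)


def encoding_bytes(word):
    """Bytes in little-endian order, the order used by CHECK-ENCODING."""
    return list(word.to_bytes(4, "little"))


def objdump_word(word):
    """The 8-digit hex word that the CHECK-UNKNOWN lines match."""
    return f"{word:08x}"


# First fsub case of this chunk: fsub za.h[w11, 7], {z30.h - z31.h}
word = parse_bit_comment("11000001-10100100-01111111-11001111")
print([f"0x{b:02x}" for b in encoding_bytes(word)])  # ['0xcf', '0x7f', '0xa4', '0xc1']
print(objdump_word(word))                            # c1a47fcf
```

A check like this is a quick way to spot a bit comment that drifted from its CHECK lines when a test case was copied and edited.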
diff --git a/llvm/test/MC/AArch64/SME2p1/luti2-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/luti2-diagnostics.s
new file mode 100644
index 000000000000..9f0751157cf8
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/luti2-diagnostics.s
@@ -0,0 +1,70 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid lane indices
+
+luti2 {z0.h, z8.h}, zt0, z0[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: luti2 {z0.h, z8.h}, zt0, z0[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti2 {z0.h, z8.h}, zt0, z0[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: luti2 {z0.h, z8.h}, zt0, z0[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti2 {z0.h, z8.h}, zt0, z0[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: luti2 {z0.h, z8.h}, zt0, z0[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti2 {z0.h, z8.h}, zt0, z0[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: luti2 {z0.h, z8.h}, zt0, z0[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti2 {z0.b, z8.b}, zt0, z0[8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: luti2 {z0.b, z8.b}, zt0, z0[8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti2 {z0.b, z8.b}, zt0, z0[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 7].
+// CHECK-NEXT: luti2 {z0.b, z8.b}, zt0, z0[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti2 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].
+// CHECK-NEXT: luti2 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti2 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].
+// CHECK-NEXT: luti2 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti2 {z19.b, z23.b, z27.b, z31.b}, zt0, z31[4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].
+// CHECK-NEXT: luti2 {z19.b, z23.b, z27.b, z31.b}, zt0, z31[4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti2 {z19.b, z23.b, z27.b, z31.b}, zt0, z31[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].
+// CHECK-NEXT: luti2 {z19.b, z23.b, z27.b, z31.b}, zt0, z31[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector lists
+
+luti2 {z0.h, z9.h}, zt0, z0[2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: luti2 {z0.h, z9.h}, zt0, z0[2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector suffix
+
+luti2 {z0.d, z2.d}, zt0, z0[3]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: luti2 {z0.d, z2.d}, zt0, z0[3]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/luti2.s b/llvm/test/MC/AArch64/SME2p1/luti2.s
new file mode 100644
index 000000000000..99514d5f6abe
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/luti2.s
@@ -0,0 +1,115 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1 < %s \
+// RUN: | llvm-objdump --no-print-imm-hex -d --mattr=+sme2p1 - \
+// RUN: | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1 < %s \
+// RUN: | llvm-objdump --no-print-imm-hex -d --mattr=-sme2p1 - \
+// RUN: | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+
+luti2 {z0.h, z8.h}, zt0, z0[0] // 11000000-10011100-01010000-00000000
+// CHECK-INST: luti2 { z0.h, z8.h }, zt0, z0[0]
+// CHECK-ENCODING: [0x00,0x50,0x9c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09c5000 <unknown>
+
+luti2 {z21.h, z29.h}, zt0, z10[2] // 11000000-10011101-01010001-01010101
+// CHECK-INST: luti2 { z21.h, z29.h }, zt0, z10[2]
+// CHECK-ENCODING: [0x55,0x51,0x9d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09d5155 <unknown>
+
+luti2 {z23.h, z31.h}, zt0, z13[1] // 11000000-10011100-11010001-10110111
+// CHECK-INST: luti2 { z23.h, z31.h }, zt0, z13[1]
+// CHECK-ENCODING: [0xb7,0xd1,0x9c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09cd1b7 <unknown>
+
+luti2 {z23.h, z31.h}, zt0, z31[7] // 11000000-10011111-11010011-11110111
+// CHECK-INST: luti2 { z23.h, z31.h }, zt0, z31[7]
+// CHECK-ENCODING: [0xf7,0xd3,0x9f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09fd3f7 <unknown>
+
+
+luti2 {z0.b, z8.b}, zt0, z0[0] // 11000000-10011100-01000000-00000000
+// CHECK-INST: luti2 { z0.b, z8.b }, zt0, z0[0]
+// CHECK-ENCODING: [0x00,0x40,0x9c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09c4000 <unknown>
+
+luti2 {z21.b, z29.b}, zt0, z10[2] // 11000000-10011101-01000001-01010101
+// CHECK-INST: luti2 { z21.b, z29.b }, zt0, z10[2]
+// CHECK-ENCODING: [0x55,0x41,0x9d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09d4155 <unknown>
+
+luti2 {z23.b, z31.b}, zt0, z13[1] // 11000000-10011100-11000001-10110111
+// CHECK-INST: luti2 { z23.b, z31.b }, zt0, z13[1]
+// CHECK-ENCODING: [0xb7,0xc1,0x9c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09cc1b7 <unknown>
+
+luti2 {z23.b, z31.b}, zt0, z31[7] // 11000000-10011111-11000011-11110111
+// CHECK-INST: luti2 { z23.b, z31.b }, zt0, z31[7]
+// CHECK-ENCODING: [0xf7,0xc3,0x9f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09fc3f7 <unknown>
+
+
+luti2 {z0.h, z4.h, z8.h, z12.h}, zt0, z0[0] // 11000000-10011100-10010000-00000000
+// CHECK-INST: luti2 { z0.h, z4.h, z8.h, z12.h }, zt0, z0[0]
+// CHECK-ENCODING: [0x00,0x90,0x9c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09c9000 <unknown>
+
+luti2 {z17.h, z21.h, z25.h, z29.h}, zt0, z10[1] // 11000000-10011101-10010001-01010001
+// CHECK-INST: luti2 { z17.h, z21.h, z25.h, z29.h }, zt0, z10[1]
+// CHECK-ENCODING: [0x51,0x91,0x9d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09d9151 <unknown>
+
+luti2 {z19.h, z23.h, z27.h, z31.h}, zt0, z13[0] // 11000000-10011100-10010001-10110011
+// CHECK-INST: luti2 { z19.h, z23.h, z27.h, z31.h }, zt0, z13[0]
+// CHECK-ENCODING: [0xb3,0x91,0x9c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09c91b3 <unknown>
+
+luti2 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[3] // 11000000-10011111-10010011-11110011
+// CHECK-INST: luti2 { z19.h, z23.h, z27.h, z31.h }, zt0, z31[3]
+// CHECK-ENCODING: [0xf3,0x93,0x9f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09f93f3 <unknown>
+
+
+luti2 {z0.b, z4.b, z8.b, z12.b}, zt0, z0[0] // 11000000-10011100-10000000-00000000
+// CHECK-INST: luti2 { z0.b, z4.b, z8.b, z12.b }, zt0, z0[0]
+// CHECK-ENCODING: [0x00,0x80,0x9c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09c8000 <unknown>
+
+luti2 {z17.b, z21.b, z25.b, z29.b}, zt0, z10[1] // 11000000-10011101-10000001-01010001
+// CHECK-INST: luti2 { z17.b, z21.b, z25.b, z29.b }, zt0, z10[1]
+// CHECK-ENCODING: [0x51,0x81,0x9d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09d8151 <unknown>
+
+luti2 {z19.b, z23.b, z27.b, z31.b}, zt0, z13[0] // 11000000-10011100-10000001-10110011
+// CHECK-INST: luti2 { z19.b, z23.b, z27.b, z31.b }, zt0, z13[0]
+// CHECK-ENCODING: [0xb3,0x81,0x9c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09c81b3 <unknown>
+
+luti2 {z19.b, z23.b, z27.b, z31.b}, zt0, z31[3] // 11000000-10011111-10000011-11110011
+// CHECK-INST: luti2 { z19.b, z23.b, z27.b, z31.b }, zt0, z31[3]
+// CHECK-ENCODING: [0xf3,0x83,0x9f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09f83f3 <unknown>
+
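Reviewer note (not part of the patch): the LUTI2 tests use strided destination lists. Two-register lists step by 8 ({z0, z8}, {z21, z29}, {z23, z31}) and four-register lists step by 4 ({z0, z4, z8, z12}, {z19, z23, z27, z31}), while {z0.h, z9.h} is rejected in the diagnostics file. The sketch below models that rule in Python. The restriction on the first register (z0-z7 or z16-z23 for two registers, z0-z3 or z16-z19 for four) is an assumption about the architecture that these tests do not exercise directly, and is_strided_list is a hypothetical name.

```python
def is_strided_list(regs):
    """regs: Z register numbers in list order, e.g. [0, 8] or [19, 23, 27, 31]."""
    if len(regs) == 2:
        stride, first_limit = 8, 8
    elif len(regs) == 4:
        stride, first_limit = 4, 4
    else:
        return False
    if not all(0 <= r <= 31 for r in regs):
        return False
    # Assumption: the first register must sit in the low part of either
    # 16-register half of the Z file.
    if regs[0] % 16 >= first_limit:
        return False
    # Every following register must be exactly one stride further on.
    return all(b - a == stride for a, b in zip(regs, regs[1:]))


print(is_strided_list([0, 8]))            # True, as in luti2.s
print(is_strided_list([19, 23, 27, 31]))  # True, as in luti2.s
print(is_strided_list([0, 9]))            # False, as in luti2-diagnostics.s
```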
diff --git a/llvm/test/MC/AArch64/SME2p1/luti4-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/luti4-diagnostics.s
new file mode 100644
index 000000000000..e92446f295ca
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/luti4-diagnostics.s
@@ -0,0 +1,63 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Invalid lane indices
+
+luti4 {z0.h, z8.h}, zt0, z0[4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].
+// CHECK-NEXT: luti4 {z0.h, z8.h}, zt0, z0[4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti4 {z0.h, z8.h}, zt0, z0[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].
+// CHECK-NEXT: luti4 {z0.h, z8.h}, zt0, z0[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti4 {z0.h, z8.h}, zt0, z0[4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].
+// CHECK-NEXT: luti4 {z0.h, z8.h}, zt0, z0[4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti4 {z0.h, z8.h}, zt0, z0[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].
+// CHECK-NEXT: luti4 {z0.h, z8.h}, zt0, z0[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti4 {z0.b, z8.b}, zt0, z0[4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].
+// CHECK-NEXT: luti4 {z0.b, z8.b}, zt0, z0[4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti4 {z0.b, z8.b}, zt0, z0[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 3].
+// CHECK-NEXT: luti4 {z0.b, z8.b}, zt0, z0[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti4 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 1].
+// CHECK-NEXT: luti4 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti4 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 1].
+// CHECK-NEXT: luti4 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti4 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 1].
+// CHECK-NEXT: luti4 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+luti4 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[-1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector lane must be an integer in range [0, 1].
+// CHECK-NEXT: luti4 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[-1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector lists
+
+luti4 {z1.s-z4.s}, zt0, z0[3]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: Invalid vector list, expected list with 4 consecutive SVE vectors, where the first vector is a multiple of 4 and with matching element types
+// CHECK-NEXT: luti4 {z1.s-z4.s}, zt0, z0[3]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
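Reviewer note (not part of the patch): the lane-index diagnostics in luti2-diagnostics.s and luti4-diagnostics.s imply one upper bound per (mnemonic, register count) pair. The Python sketch below collects those bounds in a table and reproduces the message text. The bounds are read off the CHECK lines in this patch rather than taken from the architecture specification, and check_lane is a hypothetical helper, not an LLVM function.

```python
# Upper lane bounds as they appear in the CHECK lines of the two
# diagnostics files; keys are (mnemonic, number of destination registers).
MAX_LANE = {
    ("luti2", 2): 7,
    ("luti2", 4): 3,
    ("luti4", 2): 3,
    ("luti4", 4): 1,
}


def check_lane(mnemonic, num_regs, lane):
    """Return None if the lane is accepted, else the expected diagnostic."""
    upper = MAX_LANE[(mnemonic, num_regs)]
    if 0 <= lane <= upper:
        return None
    return f"vector lane must be an integer in range [0, {upper}]."


print(check_lane("luti4", 2, 4))  # vector lane must be an integer in range [0, 3].
print(check_lane("luti2", 2, 7))  # None
```

In both mnemonics the accepted range halves when the destination list grows from two to four registers, and LUTI4 accepts half the range of LUTI2 for the same list size.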
diff --git a/llvm/test/MC/AArch64/SME2p1/luti4.s b/llvm/test/MC/AArch64/SME2p1/luti4.s
new file mode 100644
index 000000000000..7666e129956b
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/luti4.s
@@ -0,0 +1,89 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1 < %s \
+// RUN: | llvm-objdump --no-print-imm-hex -d --mattr=+sme2p1 - \
+// RUN: | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1 < %s \
+// RUN: | llvm-objdump --no-print-imm-hex -d --mattr=-sme2p1 - \
+// RUN: | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+
+luti4 {z0.h, z8.h}, zt0, z0[0] // 11000000-10011010-01010000-00000000
+// CHECK-INST: luti4 { z0.h, z8.h }, zt0, z0[0]
+// CHECK-ENCODING: [0x00,0x50,0x9a,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09a5000 <unknown>
+
+luti4 {z21.h, z29.h}, zt0, z10[2] // 11000000-10011011-01010001-01010101
+// CHECK-INST: luti4 { z21.h, z29.h }, zt0, z10[2]
+// CHECK-ENCODING: [0x55,0x51,0x9b,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09b5155 <unknown>
+
+luti4 {z23.h, z31.h}, zt0, z13[1] // 11000000-10011010-11010001-10110111
+// CHECK-INST: luti4 { z23.h, z31.h }, zt0, z13[1]
+// CHECK-ENCODING: [0xb7,0xd1,0x9a,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09ad1b7 <unknown>
+
+luti4 {z23.h, z31.h}, zt0, z31[3] // 11000000-10011011-11010011-11110111
+// CHECK-INST: luti4 { z23.h, z31.h }, zt0, z31[3]
+// CHECK-ENCODING: [0xf7,0xd3,0x9b,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09bd3f7 <unknown>
+
+
+luti4 {z0.b, z8.b}, zt0, z0[0] // 11000000-10011010-01000000-00000000
+// CHECK-INST: luti4 { z0.b, z8.b }, zt0, z0[0]
+// CHECK-ENCODING: [0x00,0x40,0x9a,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09a4000 <unknown>
+
+luti4 {z21.b, z29.b}, zt0, z10[2] // 11000000-10011011-01000001-01010101
+// CHECK-INST: luti4 { z21.b, z29.b }, zt0, z10[2]
+// CHECK-ENCODING: [0x55,0x41,0x9b,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09b4155 <unknown>
+
+luti4 {z23.b, z31.b}, zt0, z13[1] // 11000000-10011010-11000001-10110111
+// CHECK-INST: luti4 { z23.b, z31.b }, zt0, z13[1]
+// CHECK-ENCODING: [0xb7,0xc1,0x9a,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09ac1b7 <unknown>
+
+luti4 {z23.b, z31.b}, zt0, z31[3] // 11000000-10011011-11000011-11110111
+// CHECK-INST: luti4 { z23.b, z31.b }, zt0, z31[3]
+// CHECK-ENCODING: [0xf7,0xc3,0x9b,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09bc3f7 <unknown>
+
+
+luti4 {z0.h, z4.h, z8.h, z12.h}, zt0, z0[0] // 11000000-10011010-10010000-00000000
+// CHECK-INST: luti4 { z0.h, z4.h, z8.h, z12.h }, zt0, z0[0]
+// CHECK-ENCODING: [0x00,0x90,0x9a,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09a9000 <unknown>
+
+luti4 {z17.h, z21.h, z25.h, z29.h}, zt0, z10[1] // 11000000-10011011-10010001-01010001
+// CHECK-INST: luti4 { z17.h, z21.h, z25.h, z29.h }, zt0, z10[1]
+// CHECK-ENCODING: [0x51,0x91,0x9b,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09b9151 <unknown>
+
+luti4 {z19.h, z23.h, z27.h, z31.h}, zt0, z13[0] // 11000000-10011010-10010001-10110011
+// CHECK-INST: luti4 { z19.h, z23.h, z27.h, z31.h }, zt0, z13[0]
+// CHECK-ENCODING: [0xb3,0x91,0x9a,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09a91b3 <unknown>
+
+luti4 {z19.h, z23.h, z27.h, z31.h}, zt0, z31[1] // 11000000-10011011-10010011-11110011
+// CHECK-INST: luti4 { z19.h, z23.h, z27.h, z31.h }, zt0, z31[1]
+// CHECK-ENCODING: [0xf3,0x93,0x9b,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c09b93f3 <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/movaz-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/movaz-diagnostics.s
new file mode 100644
index 000000000000..276dd907a9eb
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/movaz-diagnostics.s
@@ -0,0 +1,100 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Out of range index offset
+
+movaz {z0.h-z1.h}, za0h.h[w12, 1:2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector select offset must be an immediate range of the form <immf>:<imml>, where the first immediate is a multiple of 2 in the range [0, 6] or [0, 14] depending on the instruction, and the second immediate is immf + 1.
+// CHECK-NEXT: movaz {z0.h-z1.h}, za0h.h[w12, 1:2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz {z0.b-z3.b}, za0v.b[w12, 1:4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector select offset must be an immediate range of the form <immf>:<imml>, where the first immediate is a multiple of 4 in the range [0, 4] or [0, 12] depending on the instruction, and the second immediate is immf + 3.
+// CHECK-NEXT: movaz {z0.b-z3.b}, za0v.b[w12, 1:4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz {z0.s-z1.s}, za0h.s[w12, 0:2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: movaz {z0.s-z1.s}, za0h.s[w12, 0:2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz {z0.d-z3.d}, za0h.d[w12, 0:4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid operand for instruction
+// CHECK-NEXT: movaz {z0.d-z3.d}, za0h.d[w12, 0:4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz {z4.d-z7.d}, za.d[w9, 8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: movaz {z4.d-z7.d}, za.d[w9, 8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz {z4.d-z7.d}, za.d[w9, -1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: movaz {z4.d-z7.d}, za.d[w9, -1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz z1.q, za1h.q[w12, 1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be 0.
+// CHECK-NEXT: movaz z1.q, za1h.q[w12, 1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz z31.h, za1h.h[w15, 8]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: movaz z31.h, za1h.h[w15, 8]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz z31.h, za1h.h[w15, -1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: movaz z31.h, za1h.h[w15, -1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz z2.b, za0v.b[w15, -1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 15].
+// CHECK-NEXT: movaz z2.b, za0v.b[w15, -1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz z2.b, za0v.b[w15, 16]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 15].
+// CHECK-NEXT: movaz z2.b, za0v.b[w15, 16]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz z31.s, za1h.s[w15, 4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 3].
+// CHECK-NEXT: movaz z31.s, za1h.s[w15, 4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz z31.s, za1h.s[w15, -1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 3].
+// CHECK-NEXT: movaz z31.s, za1h.s[w15, -1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz z31.d, za1v.d[w15, 2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 1].
+// CHECK-NEXT: movaz z31.d, za1v.d[w15, 2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz z31.d, za1h.d[w15, -1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 1].
+// CHECK-NEXT: movaz z31.d, za1h.d[w15, -1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid vector select register
+
+movaz z0.h, za0v.h[w11, 0]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w12, w15]
+// CHECK-NEXT: movaz z0.h, za0v.h[w11, 0]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+movaz z0.h, za0v.h[w16, 0]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w12, w15]
+// CHECK-NEXT: movaz z0.h, za0v.h[w16, 0]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid matrix operand
+
+movaz z31.s, za1h.d[w15, -1]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected za[0-3]h.s or za[0-3]v.s
+// CHECK-NEXT: movaz z31.s, za1h.d[w15, -1]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
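Reviewer note (not part of the patch): for the single-register tile-to-vector form of MOVAZ, the immediate ranges in the diagnostics above follow the element size: [0, 15] for .b, [0, 7] for .h, [0, 3] for .s, [0, 1] for .d, and exactly 0 for .q. That matches 16 divided by the element size in bytes, minus one. The Python sketch below encodes that observation together with the w12-w15 select-register rule. The function names are hypothetical, the multi-register and za-array forms are not modelled, and the order in which the real assembler reports the two errors is an assumption.

```python
ELEMENT_BYTES = {"b": 1, "h": 2, "s": 4, "d": 8, "q": 16}


def max_slice_offset(suffix):
    """Largest slice offset accepted for a given element suffix."""
    return 16 // ELEMENT_BYTES[suffix] - 1


def check_single_movaz(select_reg, suffix, offset):
    """Return None if accepted, else the diagnostic text seen in the tests.

    select_reg is the W register number, e.g. 12 for w12.
    """
    if not 12 <= select_reg <= 15:
        return "operand must be a register in range [w12, w15]"
    upper = max_slice_offset(suffix)
    if 0 <= offset <= upper:
        return None
    if upper == 0:
        return "immediate must be 0."
    return f"immediate must be an integer in range [0, {upper}]."


print(check_single_movaz(15, "h", 8))  # immediate must be an integer in range [0, 7].
print(check_single_movaz(12, "q", 1))  # immediate must be 0.
print(check_single_movaz(11, "h", 0))  # operand must be a register in range [w12, w15]
```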
diff --git a/llvm/test/MC/AArch64/SME2p1/movaz.s b/llvm/test/MC/AArch64/SME2p1/movaz.s
new file mode 100644
index 000000000000..73390070c696
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/movaz.s
@@ -0,0 +1,1022 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1 < %s \
+// RUN: | llvm-objdump --no-print-imm-hex -d --mattr=+sme2p1 - | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1 < %s \
+// RUN: | llvm-objdump --no-print-imm-hex -d --mattr=-sme2p1 - | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+movaz {z0.d, z1.d}, za.d[w8, 0, vgx2] // 11000000-00000110-00001010-00000000
+// CHECK-INST: movaz { z0.d, z1.d }, za.d[w8, 0, vgx2]
+// CHECK-ENCODING: [0x00,0x0a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060a00 <unknown>
+
+movaz {z0.d, z1.d}, za.d[w8, 0] // 11000000-00000110-00001010-00000000
+// CHECK-INST: movaz { z0.d, z1.d }, za.d[w8, 0, vgx2]
+// CHECK-ENCODING: [0x00,0x0a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060a00 <unknown>
+
+movaz {z20.d, z21.d}, za.d[w10, 2, vgx2] // 11000000-00000110-01001010-01010100
+// CHECK-INST: movaz { z20.d, z21.d }, za.d[w10, 2, vgx2]
+// CHECK-ENCODING: [0x54,0x4a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064a54 <unknown>
+
+movaz {z20.d, z21.d}, za.d[w10, 2] // 11000000-00000110-01001010-01010100
+// CHECK-INST: movaz { z20.d, z21.d }, za.d[w10, 2, vgx2]
+// CHECK-ENCODING: [0x54,0x4a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064a54 <unknown>
+
+movaz {z22.d, z23.d}, za.d[w11, 5, vgx2] // 11000000-00000110-01101010-10110110
+// CHECK-INST: movaz { z22.d, z23.d }, za.d[w11, 5, vgx2]
+// CHECK-ENCODING: [0xb6,0x6a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066ab6 <unknown>
+
+movaz {z22.d, z23.d}, za.d[w11, 5] // 11000000-00000110-01101010-10110110
+// CHECK-INST: movaz { z22.d, z23.d }, za.d[w11, 5, vgx2]
+// CHECK-ENCODING: [0xb6,0x6a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066ab6 <unknown>
+
+movaz {z30.d, z31.d}, za.d[w11, 7, vgx2] // 11000000-00000110-01101010-11111110
+// CHECK-INST: movaz { z30.d, z31.d }, za.d[w11, 7, vgx2]
+// CHECK-ENCODING: [0xfe,0x6a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066afe <unknown>
+
+movaz {z30.d, z31.d}, za.d[w11, 7] // 11000000-00000110-01101010-11111110
+// CHECK-INST: movaz { z30.d, z31.d }, za.d[w11, 7, vgx2]
+// CHECK-ENCODING: [0xfe,0x6a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066afe <unknown>
+
+movaz {z4.d, z5.d}, za.d[w8, 1, vgx2] // 11000000-00000110-00001010-00100100
+// CHECK-INST: movaz { z4.d, z5.d }, za.d[w8, 1, vgx2]
+// CHECK-ENCODING: [0x24,0x0a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060a24 <unknown>
+
+movaz {z4.d, z5.d}, za.d[w8, 1] // 11000000-00000110-00001010-00100100
+// CHECK-INST: movaz { z4.d, z5.d }, za.d[w8, 1, vgx2]
+// CHECK-ENCODING: [0x24,0x0a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060a24 <unknown>
+
+movaz {z0.d, z1.d}, za.d[w8, 1, vgx2] // 11000000-00000110-00001010-00100000
+// CHECK-INST: movaz { z0.d, z1.d }, za.d[w8, 1, vgx2]
+// CHECK-ENCODING: [0x20,0x0a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060a20 <unknown>
+
+movaz {z0.d, z1.d}, za.d[w8, 1] // 11000000-00000110-00001010-00100000
+// CHECK-INST: movaz { z0.d, z1.d }, za.d[w8, 1, vgx2]
+// CHECK-ENCODING: [0x20,0x0a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060a20 <unknown>
+
+movaz {z24.d, z25.d}, za.d[w10, 3, vgx2] // 11000000-00000110-01001010-01111000
+// CHECK-INST: movaz { z24.d, z25.d }, za.d[w10, 3, vgx2]
+// CHECK-ENCODING: [0x78,0x4a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064a78 <unknown>
+
+movaz {z24.d, z25.d}, za.d[w10, 3] // 11000000-00000110-01001010-01111000
+// CHECK-INST: movaz { z24.d, z25.d }, za.d[w10, 3, vgx2]
+// CHECK-ENCODING: [0x78,0x4a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064a78 <unknown>
+
+movaz {z0.d, z1.d}, za.d[w8, 4, vgx2] // 11000000-00000110-00001010-10000000
+// CHECK-INST: movaz { z0.d, z1.d }, za.d[w8, 4, vgx2]
+// CHECK-ENCODING: [0x80,0x0a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060a80 <unknown>
+
+movaz {z0.d, z1.d}, za.d[w8, 4] // 11000000-00000110-00001010-10000000
+// CHECK-INST: movaz { z0.d, z1.d }, za.d[w8, 4, vgx2]
+// CHECK-ENCODING: [0x80,0x0a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060a80 <unknown>
+
+movaz {z16.d, z17.d}, za.d[w10, 1, vgx2] // 11000000-00000110-01001010-00110000
+// CHECK-INST: movaz { z16.d, z17.d }, za.d[w10, 1, vgx2]
+// CHECK-ENCODING: [0x30,0x4a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064a30 <unknown>
+
+movaz {z16.d, z17.d}, za.d[w10, 1] // 11000000-00000110-01001010-00110000
+// CHECK-INST: movaz { z16.d, z17.d }, za.d[w10, 1, vgx2]
+// CHECK-ENCODING: [0x30,0x4a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064a30 <unknown>
+
+movaz {z28.d, z29.d}, za.d[w8, 6, vgx2] // 11000000-00000110-00001010-11011100
+// CHECK-INST: movaz { z28.d, z29.d }, za.d[w8, 6, vgx2]
+// CHECK-ENCODING: [0xdc,0x0a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060adc <unknown>
+
+movaz {z28.d, z29.d}, za.d[w8, 6] // 11000000-00000110-00001010-11011100
+// CHECK-INST: movaz { z28.d, z29.d }, za.d[w8, 6, vgx2]
+// CHECK-ENCODING: [0xdc,0x0a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060adc <unknown>
+
+movaz {z2.d, z3.d}, za.d[w11, 1, vgx2] // 11000000-00000110-01101010-00100010
+// CHECK-INST: movaz { z2.d, z3.d }, za.d[w11, 1, vgx2]
+// CHECK-ENCODING: [0x22,0x6a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066a22 <unknown>
+
+movaz {z2.d, z3.d}, za.d[w11, 1] // 11000000-00000110-01101010-00100010
+// CHECK-INST: movaz { z2.d, z3.d }, za.d[w11, 1, vgx2]
+// CHECK-ENCODING: [0x22,0x6a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066a22 <unknown>
+
+movaz {z6.d, z7.d}, za.d[w9, 4, vgx2] // 11000000-00000110-00101010-10000110
+// CHECK-INST: movaz { z6.d, z7.d }, za.d[w9, 4, vgx2]
+// CHECK-ENCODING: [0x86,0x2a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0062a86 <unknown>
+
+movaz {z6.d, z7.d}, za.d[w9, 4] // 11000000-00000110-00101010-10000110
+// CHECK-INST: movaz { z6.d, z7.d }, za.d[w9, 4, vgx2]
+// CHECK-ENCODING: [0x86,0x2a,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0062a86 <unknown>
+
+
+movaz {z0.d - z3.d}, za.d[w8, 0, vgx4] // 11000000-00000110-00001110-00000000
+// CHECK-INST: movaz { z0.d - z3.d }, za.d[w8, 0, vgx4]
+// CHECK-ENCODING: [0x00,0x0e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060e00 <unknown>
+
+movaz {z0.d - z3.d}, za.d[w8, 0] // 11000000-00000110-00001110-00000000
+// CHECK-INST: movaz { z0.d - z3.d }, za.d[w8, 0, vgx4]
+// CHECK-ENCODING: [0x00,0x0e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060e00 <unknown>
+
+movaz {z20.d - z23.d}, za.d[w10, 2, vgx4] // 11000000-00000110-01001110-01010100
+// CHECK-INST: movaz { z20.d - z23.d }, za.d[w10, 2, vgx4]
+// CHECK-ENCODING: [0x54,0x4e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064e54 <unknown>
+
+movaz {z20.d - z23.d}, za.d[w10, 2] // 11000000-00000110-01001110-01010100
+// CHECK-INST: movaz { z20.d - z23.d }, za.d[w10, 2, vgx4]
+// CHECK-ENCODING: [0x54,0x4e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064e54 <unknown>
+
+movaz {z20.d - z23.d}, za.d[w11, 5, vgx4] // 11000000-00000110-01101110-10110100
+// CHECK-INST: movaz { z20.d - z23.d }, za.d[w11, 5, vgx4]
+// CHECK-ENCODING: [0xb4,0x6e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066eb4 <unknown>
+
+movaz {z20.d - z23.d}, za.d[w11, 5] // 11000000-00000110-01101110-10110100
+// CHECK-INST: movaz { z20.d - z23.d }, za.d[w11, 5, vgx4]
+// CHECK-ENCODING: [0xb4,0x6e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066eb4 <unknown>
+
+movaz {z28.d - z31.d}, za.d[w11, 7, vgx4] // 11000000-00000110-01101110-11111100
+// CHECK-INST: movaz { z28.d - z31.d }, za.d[w11, 7, vgx4]
+// CHECK-ENCODING: [0xfc,0x6e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066efc <unknown>
+
+movaz {z28.d - z31.d}, za.d[w11, 7] // 11000000-00000110-01101110-11111100
+// CHECK-INST: movaz { z28.d - z31.d }, za.d[w11, 7, vgx4]
+// CHECK-ENCODING: [0xfc,0x6e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066efc <unknown>
+
+movaz {z4.d - z7.d}, za.d[w8, 1, vgx4] // 11000000-00000110-00001110-00100100
+// CHECK-INST: movaz { z4.d - z7.d }, za.d[w8, 1, vgx4]
+// CHECK-ENCODING: [0x24,0x0e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060e24 <unknown>
+
+movaz {z4.d - z7.d}, za.d[w8, 1] // 11000000-00000110-00001110-00100100
+// CHECK-INST: movaz { z4.d - z7.d }, za.d[w8, 1, vgx4]
+// CHECK-ENCODING: [0x24,0x0e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060e24 <unknown>
+
+movaz {z0.d - z3.d}, za.d[w8, 1, vgx4] // 11000000-00000110-00001110-00100000
+// CHECK-INST: movaz { z0.d - z3.d }, za.d[w8, 1, vgx4]
+// CHECK-ENCODING: [0x20,0x0e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060e20 <unknown>
+
+movaz {z0.d - z3.d}, za.d[w8, 1] // 11000000-00000110-00001110-00100000
+// CHECK-INST: movaz { z0.d - z3.d }, za.d[w8, 1, vgx4]
+// CHECK-ENCODING: [0x20,0x0e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060e20 <unknown>
+
+movaz {z24.d - z27.d}, za.d[w10, 3, vgx4] // 11000000-00000110-01001110-01111000
+// CHECK-INST: movaz { z24.d - z27.d }, za.d[w10, 3, vgx4]
+// CHECK-ENCODING: [0x78,0x4e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064e78 <unknown>
+
+movaz {z24.d - z27.d}, za.d[w10, 3] // 11000000-00000110-01001110-01111000
+// CHECK-INST: movaz { z24.d - z27.d }, za.d[w10, 3, vgx4]
+// CHECK-ENCODING: [0x78,0x4e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064e78 <unknown>
+
+movaz {z0.d - z3.d}, za.d[w8, 4, vgx4] // 11000000-00000110-00001110-10000000
+// CHECK-INST: movaz { z0.d - z3.d }, za.d[w8, 4, vgx4]
+// CHECK-ENCODING: [0x80,0x0e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060e80 <unknown>
+
+movaz {z0.d - z3.d}, za.d[w8, 4] // 11000000-00000110-00001110-10000000
+// CHECK-INST: movaz { z0.d - z3.d }, za.d[w8, 4, vgx4]
+// CHECK-ENCODING: [0x80,0x0e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060e80 <unknown>
+
+movaz {z16.d - z19.d}, za.d[w10, 1, vgx4] // 11000000-00000110-01001110-00110000
+// CHECK-INST: movaz { z16.d - z19.d }, za.d[w10, 1, vgx4]
+// CHECK-ENCODING: [0x30,0x4e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064e30 <unknown>
+
+movaz {z16.d - z19.d}, za.d[w10, 1] // 11000000-00000110-01001110-00110000
+// CHECK-INST: movaz { z16.d - z19.d }, za.d[w10, 1, vgx4]
+// CHECK-ENCODING: [0x30,0x4e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0064e30 <unknown>
+
+movaz {z28.d - z31.d}, za.d[w8, 6, vgx4] // 11000000-00000110-00001110-11011100
+// CHECK-INST: movaz { z28.d - z31.d }, za.d[w8, 6, vgx4]
+// CHECK-ENCODING: [0xdc,0x0e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060edc <unknown>
+
+movaz {z28.d - z31.d}, za.d[w8, 6] // 11000000-00000110-00001110-11011100
+// CHECK-INST: movaz { z28.d - z31.d }, za.d[w8, 6, vgx4]
+// CHECK-ENCODING: [0xdc,0x0e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0060edc <unknown>
+
+movaz {z0.d - z3.d}, za.d[w11, 1, vgx4] // 11000000-00000110-01101110-00100000
+// CHECK-INST: movaz { z0.d - z3.d }, za.d[w11, 1, vgx4]
+// CHECK-ENCODING: [0x20,0x6e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066e20 <unknown>
+
+movaz {z0.d - z3.d}, za.d[w11, 1] // 11000000-00000110-01101110-00100000
+// CHECK-INST: movaz { z0.d - z3.d }, za.d[w11, 1, vgx4]
+// CHECK-ENCODING: [0x20,0x6e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0066e20 <unknown>
+
+movaz {z4.d - z7.d}, za.d[w9, 4, vgx4] // 11000000-00000110-00101110-10000100
+// CHECK-INST: movaz { z4.d - z7.d }, za.d[w9, 4, vgx4]
+// CHECK-ENCODING: [0x84,0x2e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0062e84 <unknown>
+
+movaz {z4.d - z7.d}, za.d[w9, 4] // 11000000-00000110-00101110-10000100
+// CHECK-INST: movaz { z4.d - z7.d }, za.d[w9, 4, vgx4]
+// CHECK-ENCODING: [0x84,0x2e,0x06,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0062e84 <unknown>
+
+
+movaz z0.q, za0h.q[w12, 0] // 11000000-11000011-00000010-00000000
+// CHECK-INST: movaz z0.q, za0h.q[w12, 0]
+// CHECK-ENCODING: [0x00,0x02,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c30200 <unknown>
+
+movaz z21.q, za10h.q[w14, 0] // 11000000-11000011-01000011-01010101
+// CHECK-INST: movaz z21.q, za10h.q[w14, 0]
+// CHECK-ENCODING: [0x55,0x43,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c34355 <unknown>
+
+movaz z23.q, za13h.q[w15, 0] // 11000000-11000011-01100011-10110111
+// CHECK-INST: movaz z23.q, za13h.q[w15, 0]
+// CHECK-ENCODING: [0xb7,0x63,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c363b7 <unknown>
+
+movaz z31.q, za15h.q[w15, 0] // 11000000-11000011-01100011-11111111
+// CHECK-INST: movaz z31.q, za15h.q[w15, 0]
+// CHECK-ENCODING: [0xff,0x63,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c363ff <unknown>
+
+movaz z5.q, za1h.q[w12, 0] // 11000000-11000011-00000010-00100101
+// CHECK-INST: movaz z5.q, za1h.q[w12, 0]
+// CHECK-ENCODING: [0x25,0x02,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c30225 <unknown>
+
+movaz z1.q, za1h.q[w12, 0] // 11000000-11000011-00000010-00100001
+// CHECK-INST: movaz z1.q, za1h.q[w12, 0]
+// CHECK-ENCODING: [0x21,0x02,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c30221 <unknown>
+
+movaz z24.q, za3h.q[w14, 0] // 11000000-11000011-01000010-01111000
+// CHECK-INST: movaz z24.q, za3h.q[w14, 0]
+// CHECK-ENCODING: [0x78,0x42,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c34278 <unknown>
+
+movaz z0.q, za12h.q[w12, 0] // 11000000-11000011-00000011-10000000
+// CHECK-INST: movaz z0.q, za12h.q[w12, 0]
+// CHECK-ENCODING: [0x80,0x03,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c30380 <unknown>
+
+movaz z17.q, za1h.q[w14, 0] // 11000000-11000011-01000010-00110001
+// CHECK-INST: movaz z17.q, za1h.q[w14, 0]
+// CHECK-ENCODING: [0x31,0x42,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c34231 <unknown>
+
+movaz z29.q, za6h.q[w12, 0] // 11000000-11000011-00000010-11011101
+// CHECK-INST: movaz z29.q, za6h.q[w12, 0]
+// CHECK-ENCODING: [0xdd,0x02,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c302dd <unknown>
+
+movaz z2.q, za9h.q[w15, 0] // 11000000-11000011-01100011-00100010
+// CHECK-INST: movaz z2.q, za9h.q[w15, 0]
+// CHECK-ENCODING: [0x22,0x63,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c36322 <unknown>
+
+movaz z7.q, za12h.q[w13, 0] // 11000000-11000011-00100011-10000111
+// CHECK-INST: movaz z7.q, za12h.q[w13, 0]
+// CHECK-ENCODING: [0x87,0x23,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c32387 <unknown>
+
+movaz z0.q, za0v.q[w12, 0] // 11000000-11000011-10000010-00000000
+// CHECK-INST: movaz z0.q, za0v.q[w12, 0]
+// CHECK-ENCODING: [0x00,0x82,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c38200 <unknown>
+
+movaz z21.q, za10v.q[w14, 0] // 11000000-11000011-11000011-01010101
+// CHECK-INST: movaz z21.q, za10v.q[w14, 0]
+// CHECK-ENCODING: [0x55,0xc3,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c3c355 <unknown>
+
+movaz z23.q, za13v.q[w15, 0] // 11000000-11000011-11100011-10110111
+// CHECK-INST: movaz z23.q, za13v.q[w15, 0]
+// CHECK-ENCODING: [0xb7,0xe3,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c3e3b7 <unknown>
+
+movaz z31.q, za15v.q[w15, 0] // 11000000-11000011-11100011-11111111
+// CHECK-INST: movaz z31.q, za15v.q[w15, 0]
+// CHECK-ENCODING: [0xff,0xe3,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c3e3ff <unknown>
+
+movaz z5.q, za1v.q[w12, 0] // 11000000-11000011-10000010-00100101
+// CHECK-INST: movaz z5.q, za1v.q[w12, 0]
+// CHECK-ENCODING: [0x25,0x82,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c38225 <unknown>
+
+movaz z1.q, za1v.q[w12, 0] // 11000000-11000011-10000010-00100001
+// CHECK-INST: movaz z1.q, za1v.q[w12, 0]
+// CHECK-ENCODING: [0x21,0x82,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c38221 <unknown>
+
+movaz z24.q, za3v.q[w14, 0] // 11000000-11000011-11000010-01111000
+// CHECK-INST: movaz z24.q, za3v.q[w14, 0]
+// CHECK-ENCODING: [0x78,0xc2,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c3c278 <unknown>
+
+movaz z0.q, za12v.q[w12, 0] // 11000000-11000011-10000011-10000000
+// CHECK-INST: movaz z0.q, za12v.q[w12, 0]
+// CHECK-ENCODING: [0x80,0x83,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c38380 <unknown>
+
+movaz z17.q, za1v.q[w14, 0] // 11000000-11000011-11000010-00110001
+// CHECK-INST: movaz z17.q, za1v.q[w14, 0]
+// CHECK-ENCODING: [0x31,0xc2,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c3c231 <unknown>
+
+movaz z29.q, za6v.q[w12, 0] // 11000000-11000011-10000010-11011101
+// CHECK-INST: movaz z29.q, za6v.q[w12, 0]
+// CHECK-ENCODING: [0xdd,0x82,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c382dd <unknown>
+
+movaz z2.q, za9v.q[w15, 0] // 11000000-11000011-11100011-00100010
+// CHECK-INST: movaz z2.q, za9v.q[w15, 0]
+// CHECK-ENCODING: [0x22,0xe3,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c3e322 <unknown>
+
+movaz z7.q, za12v.q[w13, 0] // 11000000-11000011-10100011-10000111
+// CHECK-INST: movaz z7.q, za12v.q[w13, 0]
+// CHECK-ENCODING: [0x87,0xa3,0xc3,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c3a387 <unknown>
+
+movaz z0.h, za0h.h[w12, 0] // 11000000-01000010-00000010-00000000
+// CHECK-INST: movaz z0.h, za0h.h[w12, 0]
+// CHECK-ENCODING: [0x00,0x02,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0420200 <unknown>
+
+movaz z21.h, za1h.h[w14, 2] // 11000000-01000010-01000011-01010101
+// CHECK-INST: movaz z21.h, za1h.h[w14, 2]
+// CHECK-ENCODING: [0x55,0x43,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0424355 <unknown>
+
+movaz z23.h, za1h.h[w15, 5] // 11000000-01000010-01100011-10110111
+// CHECK-INST: movaz z23.h, za1h.h[w15, 5]
+// CHECK-ENCODING: [0xb7,0x63,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c04263b7 <unknown>
+
+movaz z31.h, za1h.h[w15, 7] // 11000000-01000010-01100011-11111111
+// CHECK-INST: movaz z31.h, za1h.h[w15, 7]
+// CHECK-ENCODING: [0xff,0x63,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c04263ff <unknown>
+
+movaz z5.h, za0h.h[w12, 1] // 11000000-01000010-00000010-00100101
+// CHECK-INST: movaz z5.h, za0h.h[w12, 1]
+// CHECK-ENCODING: [0x25,0x02,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0420225 <unknown>
+
+movaz z1.h, za0h.h[w12, 1] // 11000000-01000010-00000010-00100001
+// CHECK-INST: movaz z1.h, za0h.h[w12, 1]
+// CHECK-ENCODING: [0x21,0x02,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0420221 <unknown>
+
+movaz z24.h, za0h.h[w14, 3] // 11000000-01000010-01000010-01111000
+// CHECK-INST: movaz z24.h, za0h.h[w14, 3]
+// CHECK-ENCODING: [0x78,0x42,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0424278 <unknown>
+
+movaz z0.h, za1h.h[w12, 4] // 11000000-01000010-00000011-10000000
+// CHECK-INST: movaz z0.h, za1h.h[w12, 4]
+// CHECK-ENCODING: [0x80,0x03,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0420380 <unknown>
+
+movaz z17.h, za0h.h[w14, 1] // 11000000-01000010-01000010-00110001
+// CHECK-INST: movaz z17.h, za0h.h[w14, 1]
+// CHECK-ENCODING: [0x31,0x42,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0424231 <unknown>
+
+movaz z29.h, za0h.h[w12, 6] // 11000000-01000010-00000010-11011101
+// CHECK-INST: movaz z29.h, za0h.h[w12, 6]
+// CHECK-ENCODING: [0xdd,0x02,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c04202dd <unknown>
+
+movaz z2.h, za1h.h[w15, 1] // 11000000-01000010-01100011-00100010
+// CHECK-INST: movaz z2.h, za1h.h[w15, 1]
+// CHECK-ENCODING: [0x22,0x63,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0426322 <unknown>
+
+movaz z7.h, za1h.h[w13, 4] // 11000000-01000010-00100011-10000111
+// CHECK-INST: movaz z7.h, za1h.h[w13, 4]
+// CHECK-ENCODING: [0x87,0x23,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0422387 <unknown>
+
+movaz z0.h, za0v.h[w12, 0] // 11000000-01000010-10000010-00000000
+// CHECK-INST: movaz z0.h, za0v.h[w12, 0]
+// CHECK-ENCODING: [0x00,0x82,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0428200 <unknown>
+
+movaz z21.h, za1v.h[w14, 2] // 11000000-01000010-11000011-01010101
+// CHECK-INST: movaz z21.h, za1v.h[w14, 2]
+// CHECK-ENCODING: [0x55,0xc3,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c042c355 <unknown>
+
+movaz z23.h, za1v.h[w15, 5] // 11000000-01000010-11100011-10110111
+// CHECK-INST: movaz z23.h, za1v.h[w15, 5]
+// CHECK-ENCODING: [0xb7,0xe3,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c042e3b7 <unknown>
+
+movaz z31.h, za1v.h[w15, 7] // 11000000-01000010-11100011-11111111
+// CHECK-INST: movaz z31.h, za1v.h[w15, 7]
+// CHECK-ENCODING: [0xff,0xe3,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c042e3ff <unknown>
+
+movaz z5.h, za0v.h[w12, 1] // 11000000-01000010-10000010-00100101
+// CHECK-INST: movaz z5.h, za0v.h[w12, 1]
+// CHECK-ENCODING: [0x25,0x82,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0428225 <unknown>
+
+movaz z1.h, za0v.h[w12, 1] // 11000000-01000010-10000010-00100001
+// CHECK-INST: movaz z1.h, za0v.h[w12, 1]
+// CHECK-ENCODING: [0x21,0x82,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0428221 <unknown>
+
+movaz z24.h, za0v.h[w14, 3] // 11000000-01000010-11000010-01111000
+// CHECK-INST: movaz z24.h, za0v.h[w14, 3]
+// CHECK-ENCODING: [0x78,0xc2,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c042c278 <unknown>
+
+movaz z0.h, za1v.h[w12, 4] // 11000000-01000010-10000011-10000000
+// CHECK-INST: movaz z0.h, za1v.h[w12, 4]
+// CHECK-ENCODING: [0x80,0x83,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0428380 <unknown>
+
+movaz z17.h, za0v.h[w14, 1] // 11000000-01000010-11000010-00110001
+// CHECK-INST: movaz z17.h, za0v.h[w14, 1]
+// CHECK-ENCODING: [0x31,0xc2,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c042c231 <unknown>
+
+movaz z29.h, za0v.h[w12, 6] // 11000000-01000010-10000010-11011101
+// CHECK-INST: movaz z29.h, za0v.h[w12, 6]
+// CHECK-ENCODING: [0xdd,0x82,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c04282dd <unknown>
+
+movaz z2.h, za1v.h[w15, 1] // 11000000-01000010-11100011-00100010
+// CHECK-INST: movaz z2.h, za1v.h[w15, 1]
+// CHECK-ENCODING: [0x22,0xe3,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c042e322 <unknown>
+
+movaz z7.h, za1v.h[w13, 4] // 11000000-01000010-10100011-10000111
+// CHECK-INST: movaz z7.h, za1v.h[w13, 4]
+// CHECK-ENCODING: [0x87,0xa3,0x42,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c042a387 <unknown>
+
+movaz z0.s, za0h.s[w12, 0] // 11000000-10000010-00000010-00000000
+// CHECK-INST: movaz z0.s, za0h.s[w12, 0]
+// CHECK-ENCODING: [0x00,0x02,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0820200 <unknown>
+
+movaz z21.s, za2h.s[w14, 2] // 11000000-10000010-01000011-01010101
+// CHECK-INST: movaz z21.s, za2h.s[w14, 2]
+// CHECK-ENCODING: [0x55,0x43,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0824355 <unknown>
+
+movaz z23.s, za3h.s[w15, 1] // 11000000-10000010-01100011-10110111
+// CHECK-INST: movaz z23.s, za3h.s[w15, 1]
+// CHECK-ENCODING: [0xb7,0x63,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c08263b7 <unknown>
+
+movaz z31.s, za3h.s[w15, 3] // 11000000-10000010-01100011-11111111
+// CHECK-INST: movaz z31.s, za3h.s[w15, 3]
+// CHECK-ENCODING: [0xff,0x63,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c08263ff <unknown>
+
+movaz z5.s, za0h.s[w12, 1] // 11000000-10000010-00000010-00100101
+// CHECK-INST: movaz z5.s, za0h.s[w12, 1]
+// CHECK-ENCODING: [0x25,0x02,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0820225 <unknown>
+
+movaz z1.s, za0h.s[w12, 1] // 11000000-10000010-00000010-00100001
+// CHECK-INST: movaz z1.s, za0h.s[w12, 1]
+// CHECK-ENCODING: [0x21,0x02,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0820221 <unknown>
+
+movaz z24.s, za0h.s[w14, 3] // 11000000-10000010-01000010-01111000
+// CHECK-INST: movaz z24.s, za0h.s[w14, 3]
+// CHECK-ENCODING: [0x78,0x42,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0824278 <unknown>
+
+movaz z0.s, za3h.s[w12, 0] // 11000000-10000010-00000011-10000000
+// CHECK-INST: movaz z0.s, za3h.s[w12, 0]
+// CHECK-ENCODING: [0x80,0x03,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0820380 <unknown>
+
+movaz z17.s, za0h.s[w14, 1] // 11000000-10000010-01000010-00110001
+// CHECK-INST: movaz z17.s, za0h.s[w14, 1]
+// CHECK-ENCODING: [0x31,0x42,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0824231 <unknown>
+
+movaz z29.s, za1h.s[w12, 2] // 11000000-10000010-00000010-11011101
+// CHECK-INST: movaz z29.s, za1h.s[w12, 2]
+// CHECK-ENCODING: [0xdd,0x02,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c08202dd <unknown>
+
+movaz z2.s, za2h.s[w15, 1] // 11000000-10000010-01100011-00100010
+// CHECK-INST: movaz z2.s, za2h.s[w15, 1]
+// CHECK-ENCODING: [0x22,0x63,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0826322 <unknown>
+
+movaz z7.s, za3h.s[w13, 0] // 11000000-10000010-00100011-10000111
+// CHECK-INST: movaz z7.s, za3h.s[w13, 0]
+// CHECK-ENCODING: [0x87,0x23,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0822387 <unknown>
+
+movaz z0.s, za0v.s[w12, 0] // 11000000-10000010-10000010-00000000
+// CHECK-INST: movaz z0.s, za0v.s[w12, 0]
+// CHECK-ENCODING: [0x00,0x82,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0828200 <unknown>
+
+movaz z21.s, za2v.s[w14, 2] // 11000000-10000010-11000011-01010101
+// CHECK-INST: movaz z21.s, za2v.s[w14, 2]
+// CHECK-ENCODING: [0x55,0xc3,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c082c355 <unknown>
+
+movaz z23.s, za3v.s[w15, 1] // 11000000-10000010-11100011-10110111
+// CHECK-INST: movaz z23.s, za3v.s[w15, 1]
+// CHECK-ENCODING: [0xb7,0xe3,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c082e3b7 <unknown>
+
+movaz z31.s, za3v.s[w15, 3] // 11000000-10000010-11100011-11111111
+// CHECK-INST: movaz z31.s, za3v.s[w15, 3]
+// CHECK-ENCODING: [0xff,0xe3,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c082e3ff <unknown>
+
+movaz z5.s, za0v.s[w12, 1] // 11000000-10000010-10000010-00100101
+// CHECK-INST: movaz z5.s, za0v.s[w12, 1]
+// CHECK-ENCODING: [0x25,0x82,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0828225 <unknown>
+
+movaz z1.s, za0v.s[w12, 1] // 11000000-10000010-10000010-00100001
+// CHECK-INST: movaz z1.s, za0v.s[w12, 1]
+// CHECK-ENCODING: [0x21,0x82,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0828221 <unknown>
+
+movaz z24.s, za0v.s[w14, 3] // 11000000-10000010-11000010-01111000
+// CHECK-INST: movaz z24.s, za0v.s[w14, 3]
+// CHECK-ENCODING: [0x78,0xc2,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c082c278 <unknown>
+
+movaz z0.s, za3v.s[w12, 0] // 11000000-10000010-10000011-10000000
+// CHECK-INST: movaz z0.s, za3v.s[w12, 0]
+// CHECK-ENCODING: [0x80,0x83,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0828380 <unknown>
+
+movaz z17.s, za0v.s[w14, 1] // 11000000-10000010-11000010-00110001
+// CHECK-INST: movaz z17.s, za0v.s[w14, 1]
+// CHECK-ENCODING: [0x31,0xc2,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c082c231 <unknown>
+
+movaz z29.s, za1v.s[w12, 2] // 11000000-10000010-10000010-11011101
+// CHECK-INST: movaz z29.s, za1v.s[w12, 2]
+// CHECK-ENCODING: [0xdd,0x82,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c08282dd <unknown>
+
+movaz z2.s, za2v.s[w15, 1] // 11000000-10000010-11100011-00100010
+// CHECK-INST: movaz z2.s, za2v.s[w15, 1]
+// CHECK-ENCODING: [0x22,0xe3,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c082e322 <unknown>
+
+movaz z7.s, za3v.s[w13, 0] // 11000000-10000010-10100011-10000111
+// CHECK-INST: movaz z7.s, za3v.s[w13, 0]
+// CHECK-ENCODING: [0x87,0xa3,0x82,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c082a387 <unknown>
+
+movaz z0.d, za0h.d[w12, 0] // 11000000-11000010-00000010-00000000
+// CHECK-INST: movaz z0.d, za0h.d[w12, 0]
+// CHECK-ENCODING: [0x00,0x02,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c20200 <unknown>
+
+movaz z21.d, za5h.d[w14, 0] // 11000000-11000010-01000011-01010101
+// CHECK-INST: movaz z21.d, za5h.d[w14, 0]
+// CHECK-ENCODING: [0x55,0x43,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c24355 <unknown>
+
+movaz z23.d, za6h.d[w15, 1] // 11000000-11000010-01100011-10110111
+// CHECK-INST: movaz z23.d, za6h.d[w15, 1]
+// CHECK-ENCODING: [0xb7,0x63,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c263b7 <unknown>
+
+movaz z31.d, za7h.d[w15, 1] // 11000000-11000010-01100011-11111111
+// CHECK-INST: movaz z31.d, za7h.d[w15, 1]
+// CHECK-ENCODING: [0xff,0x63,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c263ff <unknown>
+
+movaz z5.d, za0h.d[w12, 1] // 11000000-11000010-00000010-00100101
+// CHECK-INST: movaz z5.d, za0h.d[w12, 1]
+// CHECK-ENCODING: [0x25,0x02,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c20225 <unknown>
+
+movaz z1.d, za0h.d[w12, 1] // 11000000-11000010-00000010-00100001
+// CHECK-INST: movaz z1.d, za0h.d[w12, 1]
+// CHECK-ENCODING: [0x21,0x02,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c20221 <unknown>
+
+movaz z24.d, za1h.d[w14, 1] // 11000000-11000010-01000010-01111000
+// CHECK-INST: movaz z24.d, za1h.d[w14, 1]
+// CHECK-ENCODING: [0x78,0x42,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c24278 <unknown>
+
+movaz z0.d, za6h.d[w12, 0] // 11000000-11000010-00000011-10000000
+// CHECK-INST: movaz z0.d, za6h.d[w12, 0]
+// CHECK-ENCODING: [0x80,0x03,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c20380 <unknown>
+
+movaz z17.d, za0h.d[w14, 1] // 11000000-11000010-01000010-00110001
+// CHECK-INST: movaz z17.d, za0h.d[w14, 1]
+// CHECK-ENCODING: [0x31,0x42,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c24231 <unknown>
+
+movaz z29.d, za3h.d[w12, 0] // 11000000-11000010-00000010-11011101
+// CHECK-INST: movaz z29.d, za3h.d[w12, 0]
+// CHECK-ENCODING: [0xdd,0x02,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c202dd <unknown>
+
+movaz z2.d, za4h.d[w15, 1] // 11000000-11000010-01100011-00100010
+// CHECK-INST: movaz z2.d, za4h.d[w15, 1]
+// CHECK-ENCODING: [0x22,0x63,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c26322 <unknown>
+
+movaz z7.d, za6h.d[w13, 0] // 11000000-11000010-00100011-10000111
+// CHECK-INST: movaz z7.d, za6h.d[w13, 0]
+// CHECK-ENCODING: [0x87,0x23,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c22387 <unknown>
+
+movaz z0.d, za0v.d[w12, 0] // 11000000-11000010-10000010-00000000
+// CHECK-INST: movaz z0.d, za0v.d[w12, 0]
+// CHECK-ENCODING: [0x00,0x82,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c28200 <unknown>
+
+movaz z21.d, za5v.d[w14, 0] // 11000000-11000010-11000011-01010101
+// CHECK-INST: movaz z21.d, za5v.d[w14, 0]
+// CHECK-ENCODING: [0x55,0xc3,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c2c355 <unknown>
+
+movaz z23.d, za6v.d[w15, 1] // 11000000-11000010-11100011-10110111
+// CHECK-INST: movaz z23.d, za6v.d[w15, 1]
+// CHECK-ENCODING: [0xb7,0xe3,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c2e3b7 <unknown>
+
+movaz z31.d, za7v.d[w15, 1] // 11000000-11000010-11100011-11111111
+// CHECK-INST: movaz z31.d, za7v.d[w15, 1]
+// CHECK-ENCODING: [0xff,0xe3,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c2e3ff <unknown>
+
+movaz z5.d, za0v.d[w12, 1] // 11000000-11000010-10000010-00100101
+// CHECK-INST: movaz z5.d, za0v.d[w12, 1]
+// CHECK-ENCODING: [0x25,0x82,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c28225 <unknown>
+
+movaz z1.d, za0v.d[w12, 1] // 11000000-11000010-10000010-00100001
+// CHECK-INST: movaz z1.d, za0v.d[w12, 1]
+// CHECK-ENCODING: [0x21,0x82,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c28221 <unknown>
+
+movaz z24.d, za1v.d[w14, 1] // 11000000-11000010-11000010-01111000
+// CHECK-INST: movaz z24.d, za1v.d[w14, 1]
+// CHECK-ENCODING: [0x78,0xc2,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c2c278 <unknown>
+
+movaz z0.d, za6v.d[w12, 0] // 11000000-11000010-10000011-10000000
+// CHECK-INST: movaz z0.d, za6v.d[w12, 0]
+// CHECK-ENCODING: [0x80,0x83,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c28380 <unknown>
+
+movaz z17.d, za0v.d[w14, 1] // 11000000-11000010-11000010-00110001
+// CHECK-INST: movaz z17.d, za0v.d[w14, 1]
+// CHECK-ENCODING: [0x31,0xc2,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c2c231 <unknown>
+
+movaz z29.d, za3v.d[w12, 0] // 11000000-11000010-10000010-11011101
+// CHECK-INST: movaz z29.d, za3v.d[w12, 0]
+// CHECK-ENCODING: [0xdd,0x82,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c282dd <unknown>
+
+movaz z2.d, za4v.d[w15, 1] // 11000000-11000010-11100011-00100010
+// CHECK-INST: movaz z2.d, za4v.d[w15, 1]
+// CHECK-ENCODING: [0x22,0xe3,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c2e322 <unknown>
+
+movaz z7.d, za6v.d[w13, 0] // 11000000-11000010-10100011-10000111
+// CHECK-INST: movaz z7.d, za6v.d[w13, 0]
+// CHECK-ENCODING: [0x87,0xa3,0xc2,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0c2a387 <unknown>
+
+movaz z0.b, za0h.b[w12, 0] // 11000000-00000010-00000010-00000000
+// CHECK-INST: movaz z0.b, za0h.b[w12, 0]
+// CHECK-ENCODING: [0x00,0x02,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0020200 <unknown>
+
+movaz z21.b, za0h.b[w14, 10] // 11000000-00000010-01000011-01010101
+// CHECK-INST: movaz z21.b, za0h.b[w14, 10]
+// CHECK-ENCODING: [0x55,0x43,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0024355 <unknown>
+
+movaz z23.b, za0h.b[w15, 13] // 11000000-00000010-01100011-10110111
+// CHECK-INST: movaz z23.b, za0h.b[w15, 13]
+// CHECK-ENCODING: [0xb7,0x63,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00263b7 <unknown>
+
+movaz z31.b, za0h.b[w15, 15] // 11000000-00000010-01100011-11111111
+// CHECK-INST: movaz z31.b, za0h.b[w15, 15]
+// CHECK-ENCODING: [0xff,0x63,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00263ff <unknown>
+
+movaz z5.b, za0h.b[w12, 1] // 11000000-00000010-00000010-00100101
+// CHECK-INST: movaz z5.b, za0h.b[w12, 1]
+// CHECK-ENCODING: [0x25,0x02,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0020225 <unknown>
+
+movaz z1.b, za0h.b[w12, 1] // 11000000-00000010-00000010-00100001
+// CHECK-INST: movaz z1.b, za0h.b[w12, 1]
+// CHECK-ENCODING: [0x21,0x02,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0020221 <unknown>
+
+movaz z24.b, za0h.b[w14, 3] // 11000000-00000010-01000010-01111000
+// CHECK-INST: movaz z24.b, za0h.b[w14, 3]
+// CHECK-ENCODING: [0x78,0x42,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0024278 <unknown>
+
+movaz z0.b, za0h.b[w12, 12] // 11000000-00000010-00000011-10000000
+// CHECK-INST: movaz z0.b, za0h.b[w12, 12]
+// CHECK-ENCODING: [0x80,0x03,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0020380 <unknown>
+
+movaz z17.b, za0h.b[w14, 1] // 11000000-00000010-01000010-00110001
+// CHECK-INST: movaz z17.b, za0h.b[w14, 1]
+// CHECK-ENCODING: [0x31,0x42,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0024231 <unknown>
+
+movaz z29.b, za0h.b[w12, 6] // 11000000-00000010-00000010-11011101
+// CHECK-INST: movaz z29.b, za0h.b[w12, 6]
+// CHECK-ENCODING: [0xdd,0x02,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00202dd <unknown>
+
+movaz z2.b, za0h.b[w15, 9] // 11000000-00000010-01100011-00100010
+// CHECK-INST: movaz z2.b, za0h.b[w15, 9]
+// CHECK-ENCODING: [0x22,0x63,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0026322 <unknown>
+
+movaz z7.b, za0h.b[w13, 12] // 11000000-00000010-00100011-10000111
+// CHECK-INST: movaz z7.b, za0h.b[w13, 12]
+// CHECK-ENCODING: [0x87,0x23,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0022387 <unknown>
+
+movaz z0.b, za0v.b[w12, 0] // 11000000-00000010-10000010-00000000
+// CHECK-INST: movaz z0.b, za0v.b[w12, 0]
+// CHECK-ENCODING: [0x00,0x82,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0028200 <unknown>
+
+movaz z21.b, za0v.b[w14, 10] // 11000000-00000010-11000011-01010101
+// CHECK-INST: movaz z21.b, za0v.b[w14, 10]
+// CHECK-ENCODING: [0x55,0xc3,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c002c355 <unknown>
+
+movaz z23.b, za0v.b[w15, 13] // 11000000-00000010-11100011-10110111
+// CHECK-INST: movaz z23.b, za0v.b[w15, 13]
+// CHECK-ENCODING: [0xb7,0xe3,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c002e3b7 <unknown>
+
+movaz z31.b, za0v.b[w15, 15] // 11000000-00000010-11100011-11111111
+// CHECK-INST: movaz z31.b, za0v.b[w15, 15]
+// CHECK-ENCODING: [0xff,0xe3,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c002e3ff <unknown>
+
+movaz z5.b, za0v.b[w12, 1] // 11000000-00000010-10000010-00100101
+// CHECK-INST: movaz z5.b, za0v.b[w12, 1]
+// CHECK-ENCODING: [0x25,0x82,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0028225 <unknown>
+
+movaz z1.b, za0v.b[w12, 1] // 11000000-00000010-10000010-00100001
+// CHECK-INST: movaz z1.b, za0v.b[w12, 1]
+// CHECK-ENCODING: [0x21,0x82,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0028221 <unknown>
+
+movaz z24.b, za0v.b[w14, 3] // 11000000-00000010-11000010-01111000
+// CHECK-INST: movaz z24.b, za0v.b[w14, 3]
+// CHECK-ENCODING: [0x78,0xc2,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c002c278 <unknown>
+
+movaz z0.b, za0v.b[w12, 12] // 11000000-00000010-10000011-10000000
+// CHECK-INST: movaz z0.b, za0v.b[w12, 12]
+// CHECK-ENCODING: [0x80,0x83,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c0028380 <unknown>
+
+movaz z17.b, za0v.b[w14, 1] // 11000000-00000010-11000010-00110001
+// CHECK-INST: movaz z17.b, za0v.b[w14, 1]
+// CHECK-ENCODING: [0x31,0xc2,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c002c231 <unknown>
+
+movaz z29.b, za0v.b[w12, 6] // 11000000-00000010-10000010-11011101
+// CHECK-INST: movaz z29.b, za0v.b[w12, 6]
+// CHECK-ENCODING: [0xdd,0x82,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00282dd <unknown>
+
+movaz z2.b, za0v.b[w15, 9] // 11000000-00000010-11100011-00100010
+// CHECK-INST: movaz z2.b, za0v.b[w15, 9]
+// CHECK-ENCODING: [0x22,0xe3,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c002e322 <unknown>
+
+movaz z7.b, za0v.b[w13, 12] // 11000000-00000010-10100011-10000111
+// CHECK-INST: movaz z7.b, za0v.b[w13, 12]
+// CHECK-ENCODING: [0x87,0xa3,0x02,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c002a387 <unknown>
diff --git a/llvm/test/MC/AArch64/SME2p1/zero-diagnostics.s b/llvm/test/MC/AArch64/SME2p1/zero-diagnostics.s
new file mode 100644
index 000000000000..ee6010af9939
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/zero-diagnostics.s
@@ -0,0 +1,60 @@
+// RUN: not llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 2>&1 < %s | FileCheck %s
+
+// --------------------------------------------------------------------------//
+// Out of range index offset
+
+zero za.d[w11, 8, vgx2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: zero za.d[w11, 8, vgx2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+zero za.d[w11, -1, vgx4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: zero za.d[w11, -1, vgx4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+zero za.d[w11, 5:8, vgx4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector select offset must be an immediate range of the form <immf>:<imml>, where the first immediate is a multiple of 4 in the range [0, 4] or [0, 12] depending on the instruction, and the second immediate is immf + 3.
+// CHECK-NEXT: zero za.d[w11, 5:8, vgx4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+zero za.d[w11, 5:8, vgx2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: vector select offset must be an immediate range of the form <immf>:<imml>, where the first immediate is a multiple of 4 in the range [0, 4] or [0, 12] depending on the instruction, and the second immediate is immf + 3.
+// CHECK-NEXT: zero za.d[w11, 5:8, vgx2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+zero za.d[w11, 0:4, vgx4]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: zero za.d[w11, 0:4, vgx4]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+zero za.d[w11, 0:4, vgx2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: zero za.d[w11, 0:4, vgx2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+zero za.d[w11, 11:15]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: immediate must be an integer in range [0, 7].
+// CHECK-NEXT: zero za.d[w11, 11:15]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid select register
+
+zero za.d[w7, 7, vgx2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: zero za.d[w7, 7, vgx2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+zero za.d[w12, 7, vgx2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: operand must be a register in range [w8, w11]
+// CHECK-NEXT: zero za.d[w12, 7, vgx2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
+
+// --------------------------------------------------------------------------//
+// Invalid suffix
+
+zero za.s[w11, 7, vgx2]
+// CHECK: [[@LINE-1]]:{{[0-9]+}}: error: invalid matrix operand, expected suffix .d
+// CHECK-NEXT: zero za.s[w11, 7, vgx2]
+// CHECK-NOT: [[@LINE-1]]:{{[0-9]+}}:
diff --git a/llvm/test/MC/AArch64/SME2p1/zero.s b/llvm/test/MC/AArch64/SME2p1/zero.s
new file mode 100644
index 000000000000..a28f9cc6e08b
--- /dev/null
+++ b/llvm/test/MC/AArch64/SME2p1/zero.s
@@ -0,0 +1,394 @@
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 < %s \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+// RUN: not llvm-mc -triple=aarch64 -show-encoding < %s 2>&1 \
+// RUN: | FileCheck %s --check-prefix=CHECK-ERROR
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1 < %s \
+// RUN: | llvm-objdump --no-print-imm-hex -d --mattr=+sme2p1 - \
+// RUN: | FileCheck %s --check-prefix=CHECK-INST
+// RUN: llvm-mc -triple=aarch64 -filetype=obj -mattr=+sme2p1 < %s \
+// RUN: | llvm-objdump --no-print-imm-hex -d --mattr=-sme2p1 - \
+// RUN: | FileCheck %s --check-prefix=CHECK-UNKNOWN
+// RUN: llvm-mc -triple=aarch64 -show-encoding -mattr=+sme2p1 < %s \
+// RUN: | sed '/.text/d' | sed 's/.*encoding: //g' \
+// RUN: | llvm-mc -triple=aarch64 -mattr=+sme2p1 -disassemble -show-encoding \
+// RUN: | FileCheck %s --check-prefixes=CHECK-ENCODING,CHECK-INST
+
+zero za.d[w8, 0:1] // 11000000-00001100-10000000-00000000
+// CHECK-INST: zero za.d[w8, 0:1]
+// CHECK-ENCODING: [0x00,0x80,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c8000 <unknown>
+
+zero za.d[w10, 10:11] // 11000000-00001100-11000000-00000101
+// CHECK-INST: zero za.d[w10, 10:11]
+// CHECK-ENCODING: [0x05,0xc0,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00cc005 <unknown>
+
+zero za.d[w11, 14:15] // 11000000-00001100-11100000-00000111
+// CHECK-INST: zero za.d[w11, 14:15]
+// CHECK-ENCODING: [0x07,0xe0,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00ce007 <unknown>
+
+zero za.d[w8, 10:11] // 11000000-00001100-10000000-00000101
+// CHECK-INST: zero za.d[w8, 10:11]
+// CHECK-ENCODING: [0x05,0x80,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c8005 <unknown>
+
+zero za.d[w8, 2:3] // 11000000-00001100-10000000-00000001
+// CHECK-INST: zero za.d[w8, 2:3]
+// CHECK-ENCODING: [0x01,0x80,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c8001 <unknown>
+
+zero za.d[w10, 0:1] // 11000000-00001100-11000000-00000000
+// CHECK-INST: zero za.d[w10, 0:1]
+// CHECK-ENCODING: [0x00,0xc0,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00cc000 <unknown>
+
+zero za.d[w10, 2:3] // 11000000-00001100-11000000-00000001
+// CHECK-INST: zero za.d[w10, 2:3]
+// CHECK-ENCODING: [0x01,0xc0,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00cc001 <unknown>
+
+zero za.d[w11, 4:5] // 11000000-00001100-11100000-00000010
+// CHECK-INST: zero za.d[w11, 4:5]
+// CHECK-ENCODING: [0x02,0xe0,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00ce002 <unknown>
+
+zero za.d[w9, 14:15] // 11000000-00001100-10100000-00000111
+// CHECK-INST: zero za.d[w9, 14:15]
+// CHECK-ENCODING: [0x07,0xa0,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00ca007 <unknown>
+
+
+zero za.d[w8, 0:3] // 11000000-00001110-10000000-00000000
+// CHECK-INST: zero za.d[w8, 0:3]
+// CHECK-ENCODING: [0x00,0x80,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e8000 <unknown>
+
+zero za.d[w10, 4:7] // 11000000-00001110-11000000-00000001
+// CHECK-INST: zero za.d[w10, 4:7]
+// CHECK-ENCODING: [0x01,0xc0,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00ec001 <unknown>
+
+zero za.d[w11, 12:15] // 11000000-00001110-11100000-00000011
+// CHECK-INST: zero za.d[w11, 12:15]
+// CHECK-ENCODING: [0x03,0xe0,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00ee003 <unknown>
+
+zero za.d[w8, 4:7] // 11000000-00001110-10000000-00000001
+// CHECK-INST: zero za.d[w8, 4:7]
+// CHECK-ENCODING: [0x01,0x80,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e8001 <unknown>
+
+zero za.d[w10, 0:3] // 11000000-00001110-11000000-00000000
+// CHECK-INST: zero za.d[w10, 0:3]
+// CHECK-ENCODING: [0x00,0xc0,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00ec000 <unknown>
+
+zero za.d[w11, 8:11] // 11000000-00001110-11100000-00000010
+// CHECK-INST: zero za.d[w11, 8:11]
+// CHECK-ENCODING: [0x02,0xe0,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00ee002 <unknown>
+
+zero za.d[w9, 12:15] // 11000000-00001110-10100000-00000011
+// CHECK-INST: zero za.d[w9, 12:15]
+// CHECK-ENCODING: [0x03,0xa0,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00ea003 <unknown>
+
+
+zero za.d[w8, 0, vgx2] // 11000000-00001100-00000000-00000000
+// CHECK-INST: zero za.d[w8, 0, vgx2]
+// CHECK-ENCODING: [0x00,0x00,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c0000 <unknown>
+
+zero za.d[w10, 5, vgx2] // 11000000-00001100-01000000-00000101
+// CHECK-INST: zero za.d[w10, 5, vgx2]
+// CHECK-ENCODING: [0x05,0x40,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c4005 <unknown>
+
+zero za.d[w11, 7, vgx2] // 11000000-00001100-01100000-00000111
+// CHECK-INST: zero za.d[w11, 7, vgx2]
+// CHECK-ENCODING: [0x07,0x60,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c6007 <unknown>
+
+zero za.d[w8, 5, vgx2] // 11000000-00001100-00000000-00000101
+// CHECK-INST: zero za.d[w8, 5, vgx2]
+// CHECK-ENCODING: [0x05,0x00,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c0005 <unknown>
+
+zero za.d[w8, 1, vgx2] // 11000000-00001100-00000000-00000001
+// CHECK-INST: zero za.d[w8, 1, vgx2]
+// CHECK-ENCODING: [0x01,0x00,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c0001 <unknown>
+
+zero za.d[w10, 0, vgx2] // 11000000-00001100-01000000-00000000
+// CHECK-INST: zero za.d[w10, 0, vgx2]
+// CHECK-ENCODING: [0x00,0x40,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c4000 <unknown>
+
+zero za.d[w10, 1, vgx2] // 11000000-00001100-01000000-00000001
+// CHECK-INST: zero za.d[w10, 1, vgx2]
+// CHECK-ENCODING: [0x01,0x40,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c4001 <unknown>
+
+zero za.d[w11, 2, vgx2] // 11000000-00001100-01100000-00000010
+// CHECK-INST: zero za.d[w11, 2, vgx2]
+// CHECK-ENCODING: [0x02,0x60,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c6002 <unknown>
+
+zero za.d[w9, 7, vgx2] // 11000000-00001100-00100000-00000111
+// CHECK-INST: zero za.d[w9, 7, vgx2]
+// CHECK-ENCODING: [0x07,0x20,0x0c,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00c2007 <unknown>
+
+
+zero za.d[w8, 0:1, vgx2] // 11000000-00001101-00000000-00000000
+// CHECK-INST: zero za.d[w8, 0:1, vgx2]
+// CHECK-ENCODING: [0x00,0x00,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00d0000 <unknown>
+
+zero za.d[w10, 2:3, vgx2] // 11000000-00001101-01000000-00000001
+// CHECK-INST: zero za.d[w10, 2:3, vgx2]
+// CHECK-ENCODING: [0x01,0x40,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00d4001 <unknown>
+
+zero za.d[w11, 6:7, vgx2] // 11000000-00001101-01100000-00000011
+// CHECK-INST: zero za.d[w11, 6:7, vgx2]
+// CHECK-ENCODING: [0x03,0x60,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00d6003 <unknown>
+
+zero za.d[w8, 2:3, vgx2] // 11000000-00001101-00000000-00000001
+// CHECK-INST: zero za.d[w8, 2:3, vgx2]
+// CHECK-ENCODING: [0x01,0x00,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00d0001 <unknown>
+
+zero za.d[w10, 0:1, vgx2] // 11000000-00001101-01000000-00000000
+// CHECK-INST: zero za.d[w10, 0:1, vgx2]
+// CHECK-ENCODING: [0x00,0x40,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00d4000 <unknown>
+
+zero za.d[w11, 4:5, vgx2] // 11000000-00001101-01100000-00000010
+// CHECK-INST: zero za.d[w11, 4:5, vgx2]
+// CHECK-ENCODING: [0x02,0x60,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00d6002 <unknown>
+
+zero za.d[w9, 6:7, vgx2] // 11000000-00001101-00100000-00000011
+// CHECK-INST: zero za.d[w9, 6:7, vgx2]
+// CHECK-ENCODING: [0x03,0x20,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00d2003 <unknown>
+
+
+zero za.d[w8, 0:3, vgx2] // 11000000-00001111-00000000-00000000
+// CHECK-INST: zero za.d[w8, 0:3, vgx2]
+// CHECK-ENCODING: [0x00,0x00,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00f0000 <unknown>
+
+zero za.d[w10, 4:7, vgx2] // 11000000-00001111-01000000-00000001
+// CHECK-INST: zero za.d[w10, 4:7, vgx2]
+// CHECK-ENCODING: [0x01,0x40,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00f4001 <unknown>
+
+zero za.d[w11, 4:7, vgx2] // 11000000-00001111-01100000-00000001
+// CHECK-INST: zero za.d[w11, 4:7, vgx2]
+// CHECK-ENCODING: [0x01,0x60,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00f6001 <unknown>
+
+zero za.d[w8, 4:7, vgx2] // 11000000-00001111-00000000-00000001
+// CHECK-INST: zero za.d[w8, 4:7, vgx2]
+// CHECK-ENCODING: [0x01,0x00,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00f0001 <unknown>
+
+zero za.d[w10, 0:3, vgx2] // 11000000-00001111-01000000-00000000
+// CHECK-INST: zero za.d[w10, 0:3, vgx2]
+// CHECK-ENCODING: [0x00,0x40,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00f4000 <unknown>
+
+zero za.d[w11, 0:3, vgx2] // 11000000-00001111-01100000-00000000
+// CHECK-INST: zero za.d[w11, 0:3, vgx2]
+// CHECK-ENCODING: [0x00,0x60,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00f6000 <unknown>
+
+zero za.d[w9, 4:7, vgx2] // 11000000-00001111-00100000-00000001
+// CHECK-INST: zero za.d[w9, 4:7, vgx2]
+// CHECK-ENCODING: [0x01,0x20,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00f2001 <unknown>
+
+
+zero za.d[w8, 0, vgx4] // 11000000-00001110-00000000-00000000
+// CHECK-INST: zero za.d[w8, 0, vgx4]
+// CHECK-ENCODING: [0x00,0x00,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e0000 <unknown>
+
+zero za.d[w10, 5, vgx4] // 11000000-00001110-01000000-00000101
+// CHECK-INST: zero za.d[w10, 5, vgx4]
+// CHECK-ENCODING: [0x05,0x40,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e4005 <unknown>
+
+zero za.d[w11, 7, vgx4] // 11000000-00001110-01100000-00000111
+// CHECK-INST: zero za.d[w11, 7, vgx4]
+// CHECK-ENCODING: [0x07,0x60,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e6007 <unknown>
+
+zero za.d[w8, 5, vgx4] // 11000000-00001110-00000000-00000101
+// CHECK-INST: zero za.d[w8, 5, vgx4]
+// CHECK-ENCODING: [0x05,0x00,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e0005 <unknown>
+
+zero za.d[w8, 1, vgx4] // 11000000-00001110-00000000-00000001
+// CHECK-INST: zero za.d[w8, 1, vgx4]
+// CHECK-ENCODING: [0x01,0x00,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e0001 <unknown>
+
+zero za.d[w10, 0, vgx4] // 11000000-00001110-01000000-00000000
+// CHECK-INST: zero za.d[w10, 0, vgx4]
+// CHECK-ENCODING: [0x00,0x40,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e4000 <unknown>
+
+zero za.d[w10, 1, vgx4] // 11000000-00001110-01000000-00000001
+// CHECK-INST: zero za.d[w10, 1, vgx4]
+// CHECK-ENCODING: [0x01,0x40,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e4001 <unknown>
+
+zero za.d[w11, 2, vgx4] // 11000000-00001110-01100000-00000010
+// CHECK-INST: zero za.d[w11, 2, vgx4]
+// CHECK-ENCODING: [0x02,0x60,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e6002 <unknown>
+
+zero za.d[w9, 7, vgx4] // 11000000-00001110-00100000-00000111
+// CHECK-INST: zero za.d[w9, 7, vgx4]
+// CHECK-ENCODING: [0x07,0x20,0x0e,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00e2007 <unknown>
+
+
+zero za.d[w8, 0:1, vgx4] // 11000000-00001101-10000000-00000000
+// CHECK-INST: zero za.d[w8, 0:1, vgx4]
+// CHECK-ENCODING: [0x00,0x80,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00d8000 <unknown>
+
+zero za.d[w10, 2:3, vgx4] // 11000000-00001101-11000000-00000001
+// CHECK-INST: zero za.d[w10, 2:3, vgx4]
+// CHECK-ENCODING: [0x01,0xc0,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00dc001 <unknown>
+
+zero za.d[w11, 6:7, vgx4] // 11000000-00001101-11100000-00000011
+// CHECK-INST: zero za.d[w11, 6:7, vgx4]
+// CHECK-ENCODING: [0x03,0xe0,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00de003 <unknown>
+
+zero za.d[w8, 2:3, vgx4] // 11000000-00001101-10000000-00000001
+// CHECK-INST: zero za.d[w8, 2:3, vgx4]
+// CHECK-ENCODING: [0x01,0x80,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00d8001 <unknown>
+
+zero za.d[w10, 0:1, vgx4] // 11000000-00001101-11000000-00000000
+// CHECK-INST: zero za.d[w10, 0:1, vgx4]
+// CHECK-ENCODING: [0x00,0xc0,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00dc000 <unknown>
+
+zero za.d[w11, 4:5, vgx4] // 11000000-00001101-11100000-00000010
+// CHECK-INST: zero za.d[w11, 4:5, vgx4]
+// CHECK-ENCODING: [0x02,0xe0,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00de002 <unknown>
+
+zero za.d[w9, 6:7, vgx4] // 11000000-00001101-10100000-00000011
+// CHECK-INST: zero za.d[w9, 6:7, vgx4]
+// CHECK-ENCODING: [0x03,0xa0,0x0d,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00da003 <unknown>
+
+
+zero za.d[w8, 0:3, vgx4] // 11000000-00001111-10000000-00000000
+// CHECK-INST: zero za.d[w8, 0:3, vgx4]
+// CHECK-ENCODING: [0x00,0x80,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00f8000 <unknown>
+
+zero za.d[w10, 4:7, vgx4] // 11000000-00001111-11000000-00000001
+// CHECK-INST: zero za.d[w10, 4:7, vgx4]
+// CHECK-ENCODING: [0x01,0xc0,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00fc001 <unknown>
+
+zero za.d[w11, 4:7, vgx4] // 11000000-00001111-11100000-00000001
+// CHECK-INST: zero za.d[w11, 4:7, vgx4]
+// CHECK-ENCODING: [0x01,0xe0,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00fe001 <unknown>
+
+zero za.d[w8, 4:7, vgx4] // 11000000-00001111-10000000-00000001
+// CHECK-INST: zero za.d[w8, 4:7, vgx4]
+// CHECK-ENCODING: [0x01,0x80,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00f8001 <unknown>
+
+zero za.d[w10, 0:3, vgx4] // 11000000-00001111-11000000-00000000
+// CHECK-INST: zero za.d[w10, 0:3, vgx4]
+// CHECK-ENCODING: [0x00,0xc0,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00fc000 <unknown>
+
+zero za.d[w11, 0:3, vgx4] // 11000000-00001111-11100000-00000000
+// CHECK-INST: zero za.d[w11, 0:3, vgx4]
+// CHECK-ENCODING: [0x00,0xe0,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00fe000 <unknown>
+
+zero za.d[w9, 4:7, vgx4] // 11000000-00001111-10100000-00000001
+// CHECK-INST: zero za.d[w9, 4:7, vgx4]
+// CHECK-ENCODING: [0x01,0xa0,0x0f,0xc0]
+// CHECK-ERROR: instruction requires: sme2p1
+// CHECK-UNKNOWN: c00fa001 <unknown>
+
diff --git a/llvm/unittests/Support/TargetParserTest.cpp b/llvm/unittests/Support/TargetParserTest.cpp
index d94005e4a551..8cd6b5854724 100644
--- a/llvm/unittests/Support/TargetParserTest.cpp
+++ b/llvm/unittests/Support/TargetParserTest.cpp
@@ -1598,7 +1598,7 @@ TEST(TargetParserTest, AArch64ExtensionFeatures) {
AArch64::AEK_SME, AArch64::AEK_SMEF64F64, AArch64::AEK_SMEI16I64,
AArch64::AEK_SME2, AArch64::AEK_HBC, AArch64::AEK_MOPS,
AArch64::AEK_PERFMON, AArch64::AEK_SVE2p1, AArch64::AEK_SME2p1,
- AArch64::AEK_B16B16};
+ AArch64::AEK_B16B16, AArch64::AEK_SMEF16F16};
std::vector<StringRef> Features;
@@ -1657,6 +1657,7 @@ TEST(TargetParserTest, AArch64ExtensionFeatures) {
EXPECT_TRUE(llvm::is_contained(Features, "+sme"));
EXPECT_TRUE(llvm::is_contained(Features, "+sme-f64f64"));
EXPECT_TRUE(llvm::is_contained(Features, "+sme-i16i64"));
+ EXPECT_TRUE(llvm::is_contained(Features, "+sme-f16f16"));
EXPECT_TRUE(llvm::is_contained(Features, "+sme2"));
EXPECT_TRUE(llvm::is_contained(Features, "+sme2p1"));
EXPECT_TRUE(llvm::is_contained(Features, "+hbc"));
@@ -1739,6 +1740,7 @@ TEST(TargetParserTest, AArch64ArchExtFeature) {
{"sme", "nosme", "+sme", "-sme"},
{"sme-f64f64", "nosme-f64f64", "+sme-f64f64", "-sme-f64f64"},
{"sme-i16i64", "nosme-i16i64", "+sme-i16i64", "-sme-i16i64"},
+ {"sme-f16f16", "nosme-f16f16", "+sme-f16f16", "-sme-f16f16"},
{"sme2", "nosme2", "+sme2", "-sme2"},
{"sme2p1", "nosme2p1", "+sme2p1", "-sme2p1"},
{"hbc", "nohbc", "+hbc", "-hbc"},
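
Note on reading the MC tests above: each test case states the same instruction word three ways. The trailing comment gives the bits, CHECK-ENCODING gives the bytes in memory order (little-endian), and CHECK-UNKNOWN gives the 32-bit word as printed by llvm-objdump. The following is a minimal sketch, in Python, of how the three forms relate. The helper names `encoding_to_word` and `word_to_bit_groups` are hypothetical and are not part of LLVM or of this patch; the sample values are copied from the `zero za.d[w10, 10:11]` case in zero.s.

```python
def encoding_to_word(encoding_bytes):
    """Combine the little-endian bytes of a CHECK-ENCODING line into a 32-bit word."""
    return int.from_bytes(bytes(encoding_bytes), byteorder="little")


def word_to_bit_groups(word):
    """Format a 32-bit word as four 8-bit groups, most significant byte first."""
    bits = format(word, "032b")
    return "-".join(bits[i:i + 8] for i in range(0, 32, 8))


# Bytes from: // CHECK-ENCODING: [0x05,0xc0,0x0c,0xc0]
encoding = [0x05, 0xC0, 0x0C, 0xC0]
word = encoding_to_word(encoding)

# Hex form, comparable to the CHECK-UNKNOWN line (c00cc005).
print(format(word, "08x"))
# Bit form, comparable to the trailing comment on the instruction line.
print(word_to_bit_groups(word))
```

A check of this kind may help when adding new cases by hand, since a mismatch between the comment, the byte list, and the objdump word usually points to a transcription slip in one of the three.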