[clang] [llvm] [AArch64] Add intrinsics for non-widening FMOPA/FMOPS (PR #88105)

Wed Apr 10 05:56:48 PDT 2024

================
@@ -815,8 +815,8 @@ defm FMLS_VG4_M4Z2Z_H : sme2_dot_mla_add_sub_array_vg4_multi<"fmls", 0b0100011,
 defm FCVT_2ZZ_H  : sme2p1_fp_cvt_vector_vg2_single<"fcvt", 0b0>;
 defm FCVTL_2ZZ_H : sme2p1_fp_cvt_vector_vg2_single<"fcvtl", 0b1>;
 
-defm FMOPA_MPPZZ_H : sme2p1_fmop_tile_fp16<"fmopa", 0b0, 0b0, 0b11, ZPR16>;
-defm FMOPS_MPPZZ_H : sme2p1_fmop_tile_fp16<"fmops", 0b0, 0b1, 0b11, ZPR16>;
+defm FMOPA_MPPZZ_H : sme2p1_fmop_tile_fp16<"fmopa", 0b0, 0b0, 0b11, nxv8f16, int_aarch64_sme_mopa_nonwide>;
----------------
CarolineConcatto wrote:

Not in this patch, but I think all the instructions under 
`let Predicates = [HasSME2p1, HasSMEF16F16]`, 
should be under
`  let Predicates = [HasSME2, HasSMEF16F16].` to be consistent with clang
Otherwise we will have some problem with clang( clang has sme2 and the backend has sme2.1.
TBH when I look at
[FEAT_SME_F16F16](https://developer.arm.com/documentation/ddi0602/2024-03/SME-Instructions/FMOPS--non-widening---Floating-point-outer-product-and-subtract-?lang=en)
It maybe only  need FEAT_SME_F16F16, like

 defm USMOPA_MPPZZ_D : sme_int_outer_product_i64<0b100, "usmopa", int_aarch64_sme_usmopa_wide>;


https://github.com/llvm/llvm-project/pull/88105