[PATCH] D141849: [AArch64][SME2] Add SME2 outer product intrinsics
David Sherwood via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 17 07:25:42 PST 2023
david-arm added inline comments.
================
Comment at: llvm/lib/Target/AArch64/AArch64SMEInstrInfo.td:569
-defm SMOPA_MPPZZ_HtoS : sme2_int_mopx_tile<"smopa", 0b000>;
-defm SMOPS_MPPZZ_HtoS : sme2_int_mopx_tile<"smops", 0b001>;
+defm SMOPA_MPPZZ_HtoS : sme2_int_mopx_tile<"smopa", 0b000, int_aarch64_sme_smopa_za32>;
+defm SMOPS_MPPZZ_HtoS : sme2_int_mopx_tile<"smops", 0b001, int_aarch64_sme_smops_za32>;
----------------
Not saying you should do this as part of this patch, but I wonder if at some point we should change the existing SME smopa/umopa_wide intrinsics to also use int_aarch64_sme_smopa_za32, etc. For example, the smopa (4-way) 8-bit variant.
================
Comment at: llvm/lib/Target/AArch64/SMEInstrFormats.td:2583
-multiclass sme2_bfp_mopx_tile<string mnemonic, bits<3> op> {
- def NAME : sme_outer_product_widening_inst<op, ZPR32, mnemonic>;
----------------
nit: I just realised this class isn't name correctly. It suggests that it's for binary FP, but actually it's used for bmopa and bmops, both of which are "Bitwise exclusive NOR population count outer product" instructions, i.e. have 32-bit integer inputs. Maybe you could rename this to something like sme2_int_bmopx_tile to avoid confusion?
================
Comment at: llvm/test/CodeGen/AArch64/sme2-intrinsics-mop.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=aarch64-linux-gnu -mattr=+sve -mattr=+sme2 -mattr=+bf16 -verify-machineinstrs < %s | FileCheck %s
+
----------------
I don't think we need the -mattr=+sve and -mattr=+bf16 here, right?
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D141849/new/
https://reviews.llvm.org/D141849
More information about the llvm-commits
mailing list