[PATCH] D142656: [SVE][codegen] Add pattern for SVE multiply-add accumulate

Mon Feb 20 11:05:15 PST 2023

paulwalker-arm added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:405
+// pattern for generating pseudo for MLA_ZPmZZ/MAD_ZPmZZ
+def AArch64mlad_pseudo : PatFrags<(ops node:$pred, node:$op1, node:$op2, node:$op3),
+                             [(add node:$op1, (AArch64mul_p_oneuse node:$pred, node:$op2, node:$op3)),
----------------
We have a naming convention here that follows the instruction's name with a letter suffix.  In this instance this should be `AArch64mla_p`.

================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:493
+
+  defm MLA_ZPZZZ : sve_int_3op_pred_pseudo<AArch64mlad_pseudo> ;

----------------
Given you've added all the plumbing for MLS/MSB pseudo nodes please can you add support for `AArch64mls_p` to this patch as well.

================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:499

+let AddedComplexity = 9 in {
+class SVE_4_Op_Psuedo_Pat<ValueType vtd, SDPatternOperator op, ValueType vt1,
----------------
We don't normally use `AddedComplexity` like this unless the containing class has something about it that makes the complexity relevant.  This is how it's used above bit for your usage it's better to just reuse `SVE_4_Op_Pat` and set `AddedComplexity` specifically for the important pattern.

With all that said for this first patch I'd rather not set `AddedComplexity` at all.  This goes hand in hand with my comment relating to the tests.  I'd prefer to get the base MLA/MLS pseudo support added thats solves the main MLA/MLS vs MAD/MSB problem and then have follow on work to solve the more nuanced issue of whether to emit "mul followed by add_imm" or "mov_imm followed by mla".

================
Comment at: llvm/test/CodeGen/AArch64/sve-multiply-add-accumulate.ll:4-6
+define <vscale x 2 x i64> @muladd_i64_positiveAddend(<vscale x 2 x i64> %a, <vscale x 2 x i64> %b)
+; CHECK-LABEL: muladd_i64_positiveAddend:
+; CHECK:       // %bb.0:
----------------
For this patch I'd rather see a set of much simpler patterns that just exercise the new isel, whereby the mla/mad choice is controlled by the function's operand order (i.e. %a is the multiplicand -> mad, %a is the addend -> mla).  You'll see example of this for the equivalent fmla/fmad testing.  This should mean all test ( 16 = 4 instructions * 4 element types) where each only emits a single instruction (ignoring the ret).

The more nuanced cases should form part of a different patch along with the code changes to improve them.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142656/new/

https://reviews.llvm.org/D142656