[PATCH] D142656: [SVE][codegen] Add pattern for SVE multiply-add accumulate

Wed Feb 1 03:19:55 PST 2023

sushgokh added a comment.

In D142656#4096082 <https://reviews.llvm.org/D142656#4096082>, @david-arm wrote:

> Hi @SjoerdMeijer, in @sushgokh's defence there is precedent for some of the changes in this patch - by changing from SVE_4_Op_Pat to SVE_4_Mad_Op_Pat we are able to set the AddedComplexity field to the pattern, which is not dissimilar to SVE_3_Op_Pat_SelZero or SVE_3_Op_Pat_Shift_Imm_SelZero, i.e.
>
>   let AddedComplexity = 1 in {
>   class SVE_3_Op_Pat_SelZero<ValueType vtd, SDPatternOperator op, ValueType vt1,
>                      ValueType vt2, ValueType vt3, Instruction inst>
>   : Pat<(vtd (vtd (op vt1:$Op1, (vselect vt1:$Op1, vt2:$Op2, (SVEDup0)), vt3:$Op3))),
>         (inst $Op1, $Op2, $Op3)>;
>   
>   class SVE_3_Op_Pat_Shift_Imm_SelZero<ValueType vtd, SDPatternOperator op,
>                                        ValueType vt1, ValueType vt2,
>                                        Operand vt3, Instruction inst>
>   : Pat<(vtd (op vt1:$Op1, (vselect vt1:$Op1, vt2:$Op2, (SVEDup0)), (i32 (vt3:$Op3)))),
>         (inst $Op1, $Op2, vt3:$Op3)>;
>   }
>
> What I don't fully understand is why the complexity has to be so high, since it suggests there are multiple competing patterns and it might be useful to understand what they are. I admit that `AArch64mul_p_firstOpndWithSingleUse` looks a bit unusual and I'm not sure that we should be checking for explicit opcodes such as TokenFactor, etc.

@david-arm We dont want mad to be generated in all cases. As far as I remember, without using predicate 'AArch64mul_p_firstOpndWithSingleUse', muladd_i16_test3  generates mad with extra register moves and that becomes inefficient.

================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:491

-  defm MAD_ZPmZZ : sve_int_mladdsub_vvv_pred<0b0, "mad", int_aarch64_sve_mad>;
+  defm MAD_ZPmZZ : sve_int_mladdsub_vvv_pred<0b0, "mad", AArch64mad_m1>;
   defm MSB_ZPmZZ : sve_int_mladdsub_vvv_pred<0b1, "msb", int_aarch64_sve_msb>;
----------------
SjoerdMeijer wrote:
> Is this not the only change we need?
yes

================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:3115

-  def : SVE_4_Op_Pat<nxv16i8, op, nxv16i1, nxv16i8, nxv16i8, nxv16i8, !cast<Instruction>(NAME # _B)>;
-  def : SVE_4_Op_Pat<nxv8i16, op, nxv8i1, nxv8i16, nxv8i16, nxv8i16, !cast<Instruction>(NAME # _H)>;
-  def : SVE_4_Op_Pat<nxv4i32, op, nxv4i1, nxv4i32, nxv4i32, nxv4i32, !cast<Instruction>(NAME # _S)>;
-  def : SVE_4_Op_Pat<nxv2i64, op, nxv2i1, nxv2i64, nxv2i64, nxv2i64, !cast<Instruction>(NAME # _D)>;
+  def : SVE_4_Mad_Op_Pat<nxv16i8, op, nxv16i1, nxv16i8, nxv16i8, nxv16i8, !cast<Instruction>(NAME # _B)>;
+  def : SVE_4_Mad_Op_Pat<nxv8i16, op, nxv8i1, nxv8i16, nxv8i16, nxv8i16, !cast<Instruction>(NAME # _H)>;
----------------
SjoerdMeijer wrote:
> So why do we need these changes?
SVE_4_Mad_Op_Pat is exactly identical to SVE_4_Op_Pat and in fact derived from SVE_4_Op_Pat.

The reason for creating seperate pattern just for MAD is assigning priority only to MAD and not to all instructions using SVE_4_Op_Pat

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142656/new/

https://reviews.llvm.org/D142656