[PATCH] D142656: [SVE][codegen] Add pattern for SVE multiply-add accumulate

Paul Walker via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 21 07:57:41 PST 2023


paulwalker-arm added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:405
+// pattern for generating pseudo for MLA_ZPmZZ/MAD_ZPmZZ
+def AArch64mlad_pseudo : PatFrags<(ops node:$pred, node:$op1, node:$op2, node:$op3),
+                             [(add node:$op1, (AArch64mul_p_oneuse node:$pred, node:$op2, node:$op3)),
----------------
sushgokh wrote:
> paulwalker-arm wrote:
> > We have a naming convention here that follows the instruction's name with a letter suffix.  In this instance this should be `AArch64mla_p`.
> _p is slightly confusing and can confused with using 'predicated version'. To be specific, I used 'pseudo'. But to be consistent at this point, I will change it as you say
I guess the comment at the top of the file can be improved but my request to use `AArch64mla_p` is because this is the 'predicated version'. By which I mean this PatFrags represents all the patterns that can be used to represent a predicated mla where the result of the inactive lanes does not matter. The only different between this and `AArch64fmla_p` is the latter has more representations including a dedicated ISD node, but that'll change, for example I expect int_aarch64_sve_mla_u will exist soon.


================
Comment at: llvm/lib/Target/AArch64/AArch64SVEInstrInfo.td:493
+
+  defm MLA_ZPZZZ : sve_int_3op_pred_pseudo<AArch64mlad_pseudo> ;
 
----------------
sushgokh wrote:
> paulwalker-arm wrote:
> > Given you've added all the plumbing for MLS/MSB pseudo nodes please can you add support for `AArch64mls_p` to this patch as well.
> I was reluctant to do since that's unrelated for this patch. I will take this in next patch if you agree
I don't really see it as unrelated, especially given this patch is updating the pseudo mappings for MLS/MSB, but if that's your preference then fair enough.


================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:499
 
+let AddedComplexity = 9 in {
+class SVE_4_Op_Psuedo_Pat<ValueType vtd, SDPatternOperator op, ValueType vt1,
----------------
sushgokh wrote:
> paulwalker-arm wrote:
> > We don't normally use `AddedComplexity` like this unless the containing class has something about it that makes the complexity relevant.  This is how it's used above bit for your usage it's better to just reuse `SVE_4_Op_Pat` and set `AddedComplexity` specifically for the important pattern.
> > 
> > With all that said for this first patch I'd rather not set `AddedComplexity` at all.  This goes hand in hand with my comment relating to the tests.  I'd prefer to get the base MLA/MLS pseudo support added thats solves the main MLA/MLS vs MAD/MSB problem and then have follow on work to solve the more nuanced issue of whether to emit "mul followed by add_imm" or "mov_imm followed by mla".
> Two points:
> 1. As said above, will make changes for MLS/MSB in the next patch if you agree.
> 2. The main motive of this patch was getting 'mad/mla' generated for 
> 
> ```
> void
> imul(uint8_t * restrict dst, 
>      uint32_t * restrict src1, 
>      uint32_t * restrict src2, 
>      int n)
> {
>   for (int i = 0; i < n; i++) {
>     dst[i] = ( src1[i] * src2[i] + 1 ) ;
>   }
> } 
> 
> ```
> now, coming to complexity issue, I have created custom pattern with increased complexity so that
>     a) mad/mla is generated for above test case
>     a) it has no side-effects for instructions using 'SVE_4_Op_Pat'
> 
> Having said all this, if you say, I am ready to go by following sequence:
> 1. Convert MLA/MLS/MAD/MSB to pseudo
> 2. Looking at which sequence to generate: mad/mla or mul+add/sub 
> 
> Let me know if you still suggest the above sequence.
Yes I prefer to split the task in two given part (1) is always wanted, but (2) feels more specialised and warrants a different testing strategy.

For example, I'm assuming the advantage of using the mla is the mov_imm will be hoisted out of the loop and thus I think that patch will want a test showing this. Whereas this patch only requires simple one line tests to protect the new functionality.

Re SVE_4_Op_Psuedo_Pat: I just don't want to see multiple redefinitions of `SVE_4_Op_Pat` each with a different complexity. The need is specific to MLA and thus I prefer see it closer to its definition.  That said, I suppose you can extend `SVE_4_Op_Pat` to take an extra parameter that defaults to unset or 0 and then set AddedComplexity within SVE_4_Op_Pat? This all belongs in the follow on patch though given it's not necessary for the base MLA support. 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142656/new/

https://reviews.llvm.org/D142656



More information about the llvm-commits mailing list