[PATCH] D142656: [SVE][codegen] Add pattern for SVE multiply-add accumulate

Sushant Gokhale via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Feb 21 10:54:27 PST 2023


sushgokh added inline comments.


================
Comment at: llvm/lib/Target/AArch64/SVEInstrFormats.td:499
 
+let AddedComplexity = 9 in {
+class SVE_4_Op_Psuedo_Pat<ValueType vtd, SDPatternOperator op, ValueType vt1,
----------------
paulwalker-arm wrote:
> sushgokh wrote:
> > paulwalker-arm wrote:
> > > We don't normally use `AddedComplexity` like this unless the containing class has something about it that makes the complexity relevant.  This is how it's used above bit for your usage it's better to just reuse `SVE_4_Op_Pat` and set `AddedComplexity` specifically for the important pattern.
> > > 
> > > With all that said for this first patch I'd rather not set `AddedComplexity` at all.  This goes hand in hand with my comment relating to the tests.  I'd prefer to get the base MLA/MLS pseudo support added thats solves the main MLA/MLS vs MAD/MSB problem and then have follow on work to solve the more nuanced issue of whether to emit "mul followed by add_imm" or "mov_imm followed by mla".
> > Two points:
> > 1. As said above, will make changes for MLS/MSB in the next patch if you agree.
> > 2. The main motive of this patch was getting 'mad/mla' generated for 
> > 
> > ```
> > void
> > imul(uint8_t * restrict dst, 
> >      uint32_t * restrict src1, 
> >      uint32_t * restrict src2, 
> >      int n)
> > {
> >   for (int i = 0; i < n; i++) {
> >     dst[i] = ( src1[i] * src2[i] + 1 ) ;
> >   }
> > } 
> > 
> > ```
> > now, coming to complexity issue, I have created custom pattern with increased complexity so that
> >     a) mad/mla is generated for above test case
> >     a) it has no side-effects for instructions using 'SVE_4_Op_Pat'
> > 
> > Having said all this, if you say, I am ready to go by following sequence:
> > 1. Convert MLA/MLS/MAD/MSB to pseudo
> > 2. Looking at which sequence to generate: mad/mla or mul+add/sub 
> > 
> > Let me know if you still suggest the above sequence.
> Yes I prefer to split the task in two given part (1) is always wanted, but (2) feels more specialised and warrants a different testing strategy.
> 
> For example, I'm assuming the advantage of using the mla is the mov_imm will be hoisted out of the loop and thus I think that patch will want a test showing this. Whereas this patch only requires simple one line tests to protect the new functionality.
> 
> Re SVE_4_Op_Psuedo_Pat: I just don't want to see multiple redefinitions of `SVE_4_Op_Pat` each with a different complexity. The need is specific to MLA and thus I prefer see it closer to its definition.  That said, I suppose you can extend `SVE_4_Op_Pat` to take an extra parameter that defaults to unset or 0 and then set AddedComplexity within SVE_4_Op_Pat? This all belongs in the follow on patch though given it's not necessary for the base MLA support. 
ok thanks. As we agree, will upload a 
1. first patch - convert MLA/MAD/MLS/MSB to pseudo  
2. second patch - get into more complex mul+add patterns that can be converted to mla/mad/etc.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142656/new/

https://reviews.llvm.org/D142656



More information about the llvm-commits mailing list