[PATCH] D134982: [X86] Add support for "light" AVX

Fri Jan 6 13:17:24 PST 2023

TokarIP added inline comments.

================
Comment at: llvm/lib/Target/X86/X86.td:1290
+                                     TuningInsertVZEROUPPER,
+                                     TuningAllowLight256Bit];
   list<SubtargetFeature> ZN2AdditionalFeatures = [FeatureCLWB,
----------------
RKSimon wrote:
> TokarIP wrote:
> > RKSimon wrote:
> > > TokarIP wrote:
> > > > lebedev.ri wrote:
> > > > > TokarIP wrote:
> > > > > > RKSimon wrote:
> > > > > > > I'm not certain Ryzen needs this - even on znver1 with double pumping of 256-bit ops.
> > > > > > I'm not sure I understand this comment. Do you mean that since Ryzen doesn't have any frequency problems, we don't care about prefer-vector-width=128 behavior? This is mostly here for a) completeness (since 256-bit ops don't seem to hurt on Ryzen, we do prefer 256-bit loads/stores) and b) cases where users want znver tuning but still want good performance on Intel, so they pass prefer-vector-width=128.
> > > > > I agree with @RKSimon here. I'm not really sure why anyone would want to
> > > > > use non-full vector width on Ryzens, so I don't think we support it there.
> > > > FWIW, -mtune=znver3 with -mprefer-vector-width=128 often gives the best results for a mixed (Skylake + Rome) server fleet.
> > > Would -mtune=x86-64-v3 not be better for those cases?
> > Not really; x86-64-v3 is basically Haswell, and it seems that Ryzen benefits more from Ryzen tuning than Skylake does from Haswell tuning.
> OK - if you want to include this, then please make sure you add znver test coverage below.
Added a znver case to memcpy-light-avx.ll.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134982/new/

https://reviews.llvm.org/D134982