[PATCH] D124284: [SLP]Try partial store vectorization if supported by target.

Mon May 9 11:26:34 PDT 2022

asbirlea added a comment.

In D124284#3500259 <https://reviews.llvm.org/D124284#3500259>, @ABataev wrote:

> In D124284#3499844 <https://reviews.llvm.org/D124284#3499844>, @asbirlea wrote:
>
>> In D124284#3496381 <https://reviews.llvm.org/D124284#3496381>, @RKSimon wrote:
>>
>>> @asbirlea How are you specifying the SSE/AVX level for your benchmark runs - are you running with -march=native, the x86-64-v* levels, or something else?
>>
>> I believe the runs are using `-target-cpu k8` and `-target-cpu haswell`.
>>
>> The latest performance testing still shows one regression in a benchmark from singlesource, but there are many improvements that offset it in non-public benchmarks. So this latest diff is good to go from my side.
>
> Hi, thanks for the testing. What's the name of the regressed single source test?

Shootout/sieve looks regressed by 20% in the xfdo configuration on Rome, and Shootout/fib2 by 10% in the opt configuration, also on Rome.
Dither has one -10% (floyd 128) and one +10% (floyd 512) on Rome opt and xfdo, but the same floyd cases plus a few others are +15% to +20% on Skylake (opt, thinlto and xfdo) and +5% to +10% on Haswell (opt, thinlto and xfdo). Net win overall.
Eigen, complex benchmark also has many configs with net wins, e.g. +10% to +30% on Haswell xfdo and +10% to +25% on Rome opt.

There are others, but I hope this gives a rough idea.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124284/new/

https://reviews.llvm.org/D124284