[PATCH] D68667: [SLP] respect target register width for GEP vectorization (PR43578)

Wed Oct 9 04:18:39 PDT 2019

spatel added a comment.

In D68667#1701060 <https://reviews.llvm.org/D68667#1701060>, @xbolva00 wrote:

> Generally, I think there are more bugs for -march=haswell. Only in rare cases is the perf of binaries built with -march=haswell better than with plain -O3.
>  I tried this patch with zstd but nothing improved.
>
> Plain -O3
>  ./zstd  -b selesiafiles/* -f
>
>   3# 13 files         : 251919670 ->  97724903 (2.578), 182.0 MB/s , 923.2 MB/s 
>   
>
> -O3 -march=haswell
>  ./zstd  -b selesiafiles/* -f
>
>   3# 13 files         : 251919670 ->  97724903 (2.578), 185.7 MB/s , 866.9 MB/s 
>   
>
> -O3 -march=haswell -mprefer-vector-width=128
>  ./zstd  -b bench/* -f
>
>   3# 13 files         : 251919670 ->  97724903 (2.578), 188.5 MB/s , 806.8 MB/s 
>   
>
> for example gcc-10's results for -march=haswell
>  ./zstd  -b bench/* -f
>
>   3# 13 files         : 251919670 ->  97724903 (2.578), 188.7 MB/s ,1032.8 MB/s 

Thanks for testing! I suspect that this problem (ignoring the target-based register width) is more widespread than only the transform starting from phi, but I want to make sure we have proper tests in place if we change the behavior in other places. Can you file another bug for "zstd"?
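
To put the quoted numbers side by side, here is a rough sketch that parses the four benchmark lines and reports each configuration's change relative to the plain -O3 run. The helper names (parse_speeds, percent_change) and the dictionary labels are made up for illustration and are not part of zstd or LLVM. It assumes the first MB/s figure on each line is compression speed and the second is decompression speed, and the gcc-10 label only repeats what the quote states (its optimization level is not given).

```python
import re

# Benchmark lines copied from the quoted zstd runs; keys are the flag sets
# named in the quote (labels are illustrative only).
RESULTS = {
    "clang -O3":
        "3# 13 files : 251919670 ->  97724903 (2.578), 182.0 MB/s , 923.2 MB/s",
    "clang -O3 -march=haswell":
        "3# 13 files : 251919670 ->  97724903 (2.578), 185.7 MB/s , 866.9 MB/s",
    "clang -O3 -march=haswell -mprefer-vector-width=128":
        "3# 13 files : 251919670 ->  97724903 (2.578), 188.5 MB/s , 806.8 MB/s",
    "gcc-10 -march=haswell":
        "3# 13 files : 251919670 ->  97724903 (2.578), 188.7 MB/s ,1032.8 MB/s",
}

# Matches every number that is directly followed by the "MB/s" unit.
SPEED_RE = re.compile(r"([\d.]+)\s*MB/s")


def parse_speeds(line):
    """Return (compression, decompression) speeds in MB/s from one line.

    Assumption: the first MB/s figure is compression, the second is
    decompression.
    """
    speeds = [float(s) for s in SPEED_RE.findall(line)]
    if len(speeds) != 2:
        raise ValueError(f"expected two MB/s figures, got {len(speeds)}")
    return speeds[0], speeds[1]


def percent_change(value, baseline):
    """Signed percentage difference of value relative to baseline."""
    return 100.0 * (value - baseline) / baseline


if __name__ == "__main__":
    base_comp, base_decomp = parse_speeds(RESULTS["clang -O3"])
    for label, line in RESULTS.items():
        comp, decomp = parse_speeds(line)
        print(f"{label}: "
              f"compress {percent_change(comp, base_comp):+.1f}%, "
              f"decompress {percent_change(decomp, base_decomp):+.1f}%")
```

These are single runs, so small differences may be within noise.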

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68667/new/

https://reviews.llvm.org/D68667