[llvm] [SLP]Represent externally used values as original scalars, if profitable. (PR #100904)

Thu Nov 7 06:51:39 PST 2024

alexey-bataev wrote:

> Hi @alexey-bataev,
> 
> Under the latest commit of LLVM 19.1.3 (`test` column - [ab51ecc](https://github.com/llvm/llvm-project/commit/ab51eccf88f5321e7c60591c5546b254b6afab99)), there is an observable ~8% drop in performance on my x86_64 6c/12t Ubuntu machine for the SPEC CPU 2017 tests `CFP2017speed/638.imagick_s/638.imagick_s.test` and `CFP2017rate/538.imagick_r/538.imagick_r.test` compared to LLVM 18.1.8 (`baseline` column - [3b5b5c1](https://github.com/llvm/llvm-project/commit/3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)):
> 
> ```
> Tests: 11
> Metric: exec_time
> 
> Program                                                                   exec_time             
>                                                                           baseline  test   diff 
>              test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test   0.03      0.03 27.0%
>              test-suite :: MultiSource/Applications/JM/lencod/lencod.test   2.82      2.80 -0.9%
>                   test-suite :: MultiSource/Benchmarks/Bullet/bullet.test   2.28      2.37  4.1%
> test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test   2.24      2.29  2.0%
>           test-suite :: MultiSource/Benchmarks/FreeBench/pifft/pifft.test   0.05      0.05 -0.4%
>                   test-suite :: MultiSource/Benchmarks/nbench/nbench.test   0.88      0.86 -2.7%
>             test-suite :: SPEC/CFP2017rate/510.parest_r/510.parest_r.test  26.08     26.73  2.5%
>             test-suite :: SPEC/CFP2017rate/511.povray_r/511.povray_r.test   3.09      3.15  1.8%
>           test-suite :: SPEC/CFP2017rate/526.blender_r/526.blender_r.test  96.17     96.14 -0.0%
>           test-suite :: SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test  20.92     22.76  8.8%
>          test-suite :: SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test  20.98     22.75  8.5%
>                                                        Geomean difference                   4.3%
>        exec_time                      
> l/r     baseline       test       diff
> count  11.000000  11.000000  11.000000
> mean   15.958291  16.356445  0.046005 
> std    28.342679  28.424672  0.082604 
> min    0.026300   0.033400  -0.026621 
> 25%    1.561100   1.571350  -0.002055 
> 50%    2.821600   2.796500   0.019570 
> 75%    20.950250  22.752750  0.062887 
> max    96.165100  96.142500  0.269962 
> ```
> 
> I bisected this performance regression to the introduction of commit [32994cc](https://github.com/llvm/llvm-project/commit/32994cc0d63513f77223c64148faeeb50aebb702), however, from local testing, the regression was later fixed by this pull request.
> 
> Would it be possible to please backport this pull request ([b10ecfa](https://github.com/llvm/llvm-project/commit/b10ecfa914dd1bc2013584917d0505ba5f15f75c)) and dependent changes to LLVM 19 on the grounds of improved performance? I am requesting your help since I am personally not familiar with the SLP Vectorizer code.
> 
> In an attempt to provide motivation, I identified the following series of commits which allow this pull request to be cleanly picked into LLVM 19 at [ab51ecc](https://github.com/llvm/llvm-project/commit/ab51eccf88f5321e7c60591c5546b254b6afab99):
> 
> * [5fc9502](https://github.com/llvm/llvm-project/commit/5fc9502f19a87f7b1194cf5eadccf5f918bc50ca) – NFC
> * [1e1c8d1](https://github.com/llvm/llvm-project/commit/1e1c8d16153a8b3f53b6a8797a77112ecc289551)
> * [197f4a9](https://github.com/llvm/llvm-project/commit/197f4a90519df308d9bfddcc931f7683a5ae9cb9)
> * [6b1d137](https://github.com/llvm/llvm-project/commit/6b1d13761ac0c9857763e5f4c0ae554f076dd9b7) – Fixed [[SLPVectorizer] Instruction does not dominate all uses! LLVM ERROR: Broken module found, compilation aborted! #101213](https://github.com/llvm/llvm-project/issues/101213) which is present in LLVM 19 (see https://godbolt.org/z/4Wc5Yj3eh)
> * [b5a7d3b](https://github.com/llvm/llvm-project/commit/b5a7d3b6c2169d84f9da749425a38dcef914d1ce)
> * [deb3ecf](https://github.com/llvm/llvm-project/commit/deb3ecf09fa30391bd22f890d2885c3d4816ca15) – NFC
> * [799fd3d](https://github.com/llvm/llvm-project/commit/799fd3d87bb15c37027c9c4451ab8c4dac3ca437)
> * [834ad10](https://github.com/llvm/llvm-project/commit/834ad102c377a4d1cdc6c601d9899b5dc0a1858b) – NFC
> * [daf4a06](https://github.com/llvm/llvm-project/commit/daf4a06e5c5531005b275b72681e04bd08e58fe4)
> * [441f94f](https://github.com/llvm/llvm-project/commit/441f94f4bdf6e2b9747ec12194c595098d9c3a5b)
> * [97743b8](https://github.com/llvm/llvm-project/commit/97743b8be86ab96afb26ba93e1876406c1f4d541)
> * [32c69fa](https://github.com/llvm/llvm-project/commit/32c69faa6ce58333c26293a7708fa3f71991c55c)
> * [b10ecfa](https://github.com/llvm/llvm-project/commit/b10ecfa914dd1bc2013584917d0505ba5f15f75c)
> 
> Using the above set of commits, all LIT tests pass. Here are some results which demonstrate a measurable improvement back to LLVM 18 levels under LLVM 19 for the two `imagick` tests highlighted at least on x86_64:
> 
> ```
> Tests: 11
> Metric: exec_time
> 
> Program                                                                   exec_time              
>                                                                           baseline  test   diff  
>              test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test   0.03      0.02 -25.5%
>              test-suite :: MultiSource/Applications/JM/lencod/lencod.test   2.82      2.79  -1.1%
>                   test-suite :: MultiSource/Benchmarks/Bullet/bullet.test   2.28      2.26  -0.7%
> test-suite :: MultiSource/Benchmarks/DOE-ProxyApps-C++/miniFE/miniFE.test   2.24      2.22  -1.0%
>           test-suite :: MultiSource/Benchmarks/FreeBench/pifft/pifft.test   0.05      0.05  -1.6%
>                   test-suite :: MultiSource/Benchmarks/nbench/nbench.test   0.88      0.85  -3.3%
>             test-suite :: SPEC/CFP2017rate/510.parest_r/510.parest_r.test  26.08     26.07  -0.0%
>             test-suite :: SPEC/CFP2017rate/511.povray_r/511.povray_r.test   3.09      3.09  -0.2%
>           test-suite :: SPEC/CFP2017rate/526.blender_r/526.blender_r.test  96.17     95.06  -1.1%
>           test-suite :: SPEC/CFP2017rate/538.imagick_r/538.imagick_r.test  20.92     20.55  -1.8%
>          test-suite :: SPEC/CFP2017speed/638.imagick_s/638.imagick_s.test  20.98     20.58  -1.9%
>                                                        Geomean difference                   -3.8%
>        exec_time                      
> l/r     baseline       test       diff
> count  11.000000  11.000000  11.000000
> mean   15.958291  15.777064 -0.034704 
> std    28.342679  28.022843  0.073510 
> min    0.026300   0.019600  -0.254753 
> 25%    1.561100   1.535950  -0.018355 
> 50%    2.821600   2.790200  -0.011454 
> 75%    20.950250  20.565700 -0.008460 
> max    96.165100  95.063600 -0.000391 
> ```
> 
> * `test` column - LLVM 19.1.3 @ [ab51ecc](https://github.com/llvm/llvm-project/commit/ab51eccf88f5321e7c60591c5546b254b6afab99) + all commits mentioned above
> * `baseline` column - LLVM 18.1.8 @ [3b5b5c1](https://github.com/llvm/llvm-project/commit/3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
> 
> If any additional information or testing is desired, please let me know. I'd be happy to provide.
> 
> Thank you, Douglas Gliner

Hi, I rather doubt it is a good idea to push so many commits to fix a single regression. Better to ask release maintainers if it worth doing for the 19x release or better to wait for 20

https://github.com/llvm/llvm-project/pull/100904