[llvm] [MCP] Move dependencies if they block copy propagation (PR #105562)

Sun Aug 25 07:55:53 PDT 2024

spaits wrote:

I have done some benchmarking.

I used this cmake command for the benchmarks:
```
cmake -DCMAKE_C_COMPILER=llvm-project/build/bin/clang  -DCMAKE_CXX_COMPILER=llvm-project/build/bin/clang++ -DTEST_SUITE_BENCHMARKING_ONLY=1 .. -GNinja
```
I used this command to compile:
```
ninja -j1
```

This is the llvm-lit invocation:
```
llvm-lit -v -j 1 -o resnew21.json . && /home/spaits/repo/llvm-project/build/bin/llvm-lit -v -j 1 -o resnew22.json .
```
I ran compare with this command:
```
python3 ../utils/compare.py resnew21.json resnew22.json vs resold21.json resold22.json
```

And here are the results for compile time:
```
Tests: 1173
Metric: compile_time

/home/spaits/repo/llvm-test-suite/build/../utils/compare.py:206: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  name0 = names[0]
Program                                       compile_time             
                                              lhs          rhs    diff 
SingleSource/Benchmarks/Misc/pi                 0.14         0.16 13.2%
SingleSource/Benchmarks/Misc/fp-convert         0.24         0.26  6.1%
SingleSour...bench/stencils/fdtd-2d/fdtd-2d     1.63         1.73  6.0%
SingleSour...h/stencils/jacobi-1d/jacobi-1d     0.93         0.98  5.7%
SingleSour...near-algebra/kernels/bicg/bicg     0.81         0.86  5.2%
SingleSour...h/linear-algebra/solvers/lu/lu     1.18         1.22  3.4%
SingleSour...-algebra/solvers/durbin/durbin     0.89         0.92  3.4%
MultiSourc...Benchmarks/Olden/health/health     1.38         1.43  3.4%
SingleSource/Benchmarks/Misc/salsa20            0.56         0.58  3.1%
SingleSour...rks/Polybench/stencils/adi/adi     0.95         0.98  2.8%
SingleSour...bench/medley/nussinov/nussinov     1.13         1.16  2.6%
SingleSource/Benchmarks/Misc/mandel-2           0.28         0.29  2.4%
SingleSour...Benchmarks/Misc/matmul_f64_4x4     0.30         0.31  2.3%
SingleSource/Benchmarks/SmallPT/smallpt         1.95         2.00  2.3%
MultiSourc...arks/FreeBench/distray/distray     1.06         1.08  2.2%
                           Geomean difference                     -1.8%
      compile_time                         
l/r            lhs          rhs        diff
count  1173.000000  1173.000000  284.000000
mean   5.521519     5.432766    -0.018112  
std    35.607681    35.240664    0.026822  
min    0.000000     0.000000    -0.093135  
25%    0.000000     0.000000    -0.033920  
50%    0.000000     0.000000    -0.016928  
75%    0.000000     0.000000    -0.004431  
max    891.948600   888.081400   0.132171  
```

And for execution time:
```
Metric: exec_time

/home/spaits/repo/llvm-test-suite/build/../utils/compare.py:206: FutureWarning: Series.__getitem__ treating keys as positions is deprecated. In a future version, integer keys will always be treated as labels (consistent with DataFrame behavior). To access a value by position, use `ser.iloc[pos]`
  name0 = names[0]
Program                                       exec_time               
                                              lhs       rhs    diff   
SingleSour...Benchmarks/Stanford/Oscar.test     0.00      0.00    inf%
SingleSour...ncils/jacobi-1d/jacobi-1d.test     0.00      0.00    inf%
MultiSourc.../Prolangs-C/bison/mybison.test     0.00      0.00    inf%
MultiSourc...adpcm/rawcaudio/rawcaudio.test     0.00      0.00  140.0%
MultiSourc...abench/jpeg/jpeg-6a/cjpeg.test     0.00      0.00  100.0%
MultiSourc...ks/Prolangs-C++/city/city.test     0.00      0.00   61.5%
MultiSourc...cCat/03-testtrie/testtrie.test     0.00      0.00   60.0%
SingleSour...ootout/Shootout-ackermann.test     0.01      0.01   51.9%
SingleSour...s/BenchmarkGame/recursive.test     0.42      0.59   40.0%
SingleSour...tout-C++/Shootout-C++-ary.test     0.01      0.01   35.0%
SingleSour...++/Shootout-C++-ackermann.test     0.66      0.87   31.0%
SingleSour...out-C++/Shootout-C++-ary2.test     0.01      0.01   25.8%
MicroBench...st:BM_DIFF_PREDICT_LAMBDA/5001    23.13     29.09   25.7%
MultiSourc...telecomm-FFT/telecomm-fft.test     0.01      0.01   24.6%
MultiSourc...lications/ClamAV/clamscan.test     0.04      0.05   21.8%
                           Geomean difference                  -100.0%
/home/spaits/.local/lib/python3.10/site-packages/pandas/core/nanops.py:1016: RuntimeWarning: invalid value encountered in subtract
  sqr = _ensure_numeric((avg - values) ** 2)
           exec_time                            
l/r              lhs            rhs         diff
count  1173.000000    1173.000000    1149.000000
mean   2071.330440    2088.736758    inf        
std    25427.791777   25650.260153  NaN         
min    0.000000       0.000000      -1.000000   
25%    1.204800       1.205000      -0.010028   
50%    5.687519       5.608971       0.000000   
75%    133.613683     132.687198     0.017685   
max    643010.637615  646757.697936  inf 
```
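The `inf%` rows and the `-100.0%` geomean above look like artifacts of tests whose runtime is at or near zero, not real regressions. Below is a minimal Python sketch of that effect. It assumes the `diff` column is the relative change `rhs / lhs - 1`, which I have not verified against `compare.py`. The helper names, the `0.1` second threshold, and the `tiny.test` row are made up for illustration. The other two rows use the rounded values from the table.

```python
import math

def relative_diff(lhs: float, rhs: float) -> float:
    """Relative change of rhs versus lhs (assumed definition: rhs / lhs - 1)."""
    if lhs == 0.0:
        # 0/0 is undefined; x/0 with x > 0 blows up to infinity.
        return math.nan if rhs == 0.0 else math.inf
    return rhs / lhs - 1.0

def filter_short(rows, min_seconds=0.1):
    """Keep only rows where both timings reach min_seconds (hypothetical threshold)."""
    return [(n, l, r) for n, l, r in rows if l >= min_seconds and r >= min_seconds]

def geomean_diff(rows):
    """Geometric mean of rhs/lhs ratios, minus 1. Requires non-zero timings."""
    ratios = [r / l for _, l, r in rows]
    return math.exp(sum(math.log(x) for x in ratios) / len(ratios)) - 1.0

samples = [
    ("tiny.test", 0.0, 0.0004),       # hypothetical near-zero runtime
    ("recursive.test", 0.42, 0.59),   # rounded values from the table
    ("ackermann.test", 0.66, 0.87),   # rounded values from the table
]

for name, lhs, rhs in samples:
    print(f"{name}: {relative_diff(lhs, rhs):.1%}")

kept = filter_short(samples)
print(f"geomean over {len(kept)} tests: {geomean_diff(kept):.1%}")
```

With the near-zero row dropped, the geomean is computed only over tests that run long enough to measure, so a single zero timing can no longer dominate the summary.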

I should do some fine tuning:
- Only enable this optimization from O2 or O3 or Os.
- If O2 or O3 is enabled, then take instruction latencies into account when moving instructions.
- If Os (size optimization, if I am correct) is enabled, then do the optimization regardless of latencies.
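To make the proposed gating concrete, here is a rough Python sketch of the decision only. It is not the pass itself and uses no LLVM API. The function name, the `latency_cost` value, and the `max_latency_cost` threshold are all hypothetical, since how latencies would actually be weighed is still open.

```python
def should_move_dependency(opt_level: str,
                           latency_cost: int,
                           max_latency_cost: int = 0) -> bool:
    """Hypothetical gate for moving a dependency that blocks copy propagation.

    opt_level:        one of "O0", "O1", "O2", "O3", "Os"
    latency_cost:     assumed estimate of extra latency caused by the move
    max_latency_cost: assumed tolerance; 0 means "only move if it is free"
    """
    if opt_level == "Os":
        # Size optimization: do the move regardless of latencies.
        return True
    if opt_level in ("O2", "O3"):
        # Speed optimization: only move when the latency estimate allows it.
        return latency_cost <= max_latency_cost
    # Below O2 the optimization stays disabled.
    return False

for level, cost in [("O0", 0), ("O2", 0), ("O3", 5), ("Os", 5)]:
    print(level, cost, should_move_dependency(level, cost))
```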

What do you think: would this PR be fine for O2, O3, or Os?

https://github.com/llvm/llvm-project/pull/105562