[PATCH] D140542: [MachineCombiner] Support local strategy for traces

Tue Jan 10 08:38:26 PST 2023

asi-sc added a comment.

>> For in-order cores MachineCombiner makes better decisions when the critical path
>
> @asi-sc  did you do any measurements to collect empirical data?

I can show a performance impact only for microbenchmarks, because execution-time fluctuations on SPEC are larger than the performance change. However, here are some statistics, just in case:

  Program                                       machine-combiner.NumInstCombined
                                                results                          min-instr diff
  83.xalancbmk/483.xalancbmk                     243.00                           328.00    35.0%
  64.h264ref/464.h264ref                         878.00                           909.00     3.5%
  00.perlbench/400.perlbench                     272.00                           279.00     2.6%
  03.gcc/403.gcc                                 946.00                           946.00     0.0%
  29.mcf/429.mcf                                   2.00                             2.00     0.0%
  73.astar/473.astar                               5.00                             5.00     0.0%
  45.gobmk/445.gobmk                            1025.00                          1020.00    -0.5%
  62.libquantum/462.libquantum                    27.00                            26.00    -3.7%
  58.sjeng/458.sjeng                              80.00                            76.00    -5.0%
  01.bzip2/401.bzip2                             135.00                           127.00    -5.9%
  71.omnetpp/471.omnetpp                          12.00                            11.00    -8.3%
  56.hmmer/456.hmmer                             429.00                           360.00   -16.1%
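
For readers checking the table, the `diff` column appears to be the change of the second column relative to the first, in percent. This is inferred from the numbers themselves, not from the tool's documentation, and the helper name `percent_diff` is hypothetical. A minimal sketch:

```python
def percent_diff(first: float, second: float) -> float:
    """Relative change of `second` with respect to `first`, in percent.

    Assumption: this mirrors how the `diff` column above was derived;
    it is inferred from the table values only.
    """
    return (second - first) / first * 100.0


# Two rows copied from the table: (results, min-instr).
rows = {
    "xalancbmk": (243.0, 328.0),
    "hmmer": (429.0, 360.0),
}

for name, (first, second) in rows.items():
    print(f"{name}: {percent_diff(first, second):+.1f}%")
```

Under that assumption the sketch gives the same rounded values as the corresponding table rows.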

I cannot share details, but my testing shows that FLOPS module 7 is 1.5% faster on an in-order RISC-V core when the local strategy is used. The test I attached to this patch is a reduced version of a performance problem in a real application, which shows a ~3% performance change between strategies (~1.5% for sifive-u74). `MultiSource/Benchmarks/Ptrdist/bc/bc` from llvm-test-suite also speeds up by 3%. Other tests from llvm-test-suite show no measurable performance difference.

Apart from execution time, the local strategy also reduces compilation time, because traces become smaller. I ran time-report on a few random cases: there are no regressions, and the pass time often improves by up to 20% (the impact on total compile time is hardly noticeable).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D140542/new/

https://reviews.llvm.org/D140542