[PATCH] D140542: [MachineCombiner] Support local strategy for traces
Anton Sidorenko via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Jan 10 08:38:26 PST 2023
asi-sc added a comment.
>> For in-order cores MachineCombiner makes better decisions when the critical path
>
> @asi-sc did you do any measurements to collect empirical data?
I have performance data only for microbenchmarks, because execution-time fluctuations on SPEC are larger than the performance change. However, here are some statistics, just in case:
Program                          machine-combiner.NumInstCombined
                                 results    min-instr    diff
83.xalancbmk/483.xalancbmk        243.00       328.00   35.0%
64.h264ref/464.h264ref            878.00       909.00    3.5%
00.perlbench/400.perlbench        272.00       279.00    2.6%
03.gcc/403.gcc                    946.00       946.00    0.0%
29.mcf/429.mcf                      2.00         2.00    0.0%
73.astar/473.astar                  5.00         5.00    0.0%
45.gobmk/445.gobmk               1025.00      1020.00   -0.5%
62.libquantum/462.libquantum       27.00        26.00   -3.7%
58.sjeng/458.sjeng                 80.00        76.00   -5.0%
01.bzip2/401.bzip2                135.00       127.00   -5.9%
71.omnetpp/471.omnetpp             12.00        11.00   -8.3%
56.hmmer/456.hmmer                429.00       360.00  -16.1%
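For reference, the diff column is consistent with a plain relative change of the second column against the first. A minimal sketch of that arithmetic, assuming this is how the numbers were produced (the helper name `relative_diff` is hypothetical and not part of any LLVM tooling):

```python
def relative_diff(baseline: float, candidate: float) -> float:
    """Relative change of candidate vs. baseline, in percent.

    Assumption: 'baseline' is the 'results' column and 'candidate' is the
    'min-instr' column of the table above.
    """
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / baseline * 100.0


# A few rows copied from the table: (results, min-instr).
rows = {
    "83.xalancbmk/483.xalancbmk": (243.0, 328.0),
    "03.gcc/403.gcc": (946.0, 946.0),
    "56.hmmer/456.hmmer": (429.0, 360.0),
}

for name, (results, min_instr) in rows.items():
    # Print with one decimal place, matching the precision used in the table.
    print(f"{name:30s} {relative_diff(results, min_instr):+.1f}%")
```

Rounded to one decimal place, the values agree with the diff column for these rows.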
I cannot share details, but my testing shows that FLOPS module 7 is 1.5% faster on an in-order RISC-V core when the local strategy is used. The test attached to this patch is a minimized version of a performance problem in a real application, which shows a ~3% performance difference between the strategies (~1.5% for sifive-u74). `MultiSource/Benchmarks/Ptrdist/bc/bc` from llvm-test-suite also speeds up by 3%. Other tests from llvm-test-suite show no measurable performance difference.
Apart from execution time, the local strategy reduces compilation time because traces become smaller. I ran time-report on a few random cases: there are no regressions, and the pass time often improves by up to 20% (the total impact is hardly noticeable).
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D140542/new/
https://reviews.llvm.org/D140542