[llvm-branch-commits] [RISCV] Set DisableLatencyHeuristic to true (PR #115858)
Pengcheng Wang via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Tue Nov 12 21:21:27 PST 2024
wangpc-pp wrote:
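I added two experimental options, `-riscv-disable-latency-heuristic` and `-riscv-should-track-lane-masks`, and compared the register allocator statistics (`regalloc.NumSpills`/`regalloc.NumReloads`) on llvm-test-suite (options: `-O3 -march=rva23u64`). In the tables below, the two digits of a run label are the values of the two options in that order, so `00` is the baseline with both off: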
1. `-riscv-disable-latency-heuristic=true` and `-riscv-should-track-lane-masks=false`:
```
Program regalloc.NumSpills regalloc.NumReloads
00 10 diff 00 10 diff
SingleSour...ce/UnitTests/matrix-types-spec 8823.00 6166.00 -2657.00 15603.00 13403.00 -2200.00
External/S...rate/510.parest_r/510.parest_r 43817.00 43262.00 -555.00 87058.00 87033.00 -25.00
External/S...017speed/625.x264_s/625.x264_s 2373.00 1991.00 -382.00 4808.00 4287.00 -521.00
External/S...2017rate/525.x264_r/525.x264_r 2373.00 1991.00 -382.00 4808.00 4287.00 -521.00
MultiSourc...ks/ASCI_Purple/SMG2000/smg2000 2684.00 2334.00 -350.00 4820.00 4349.00 -471.00
MultiSourc...nchmarks/FreeBench/pifft/pifft 442.00 126.00 -316.00 595.00 281.00 -314.00
MultiSourc.../Applications/JM/ldecod/ldecod 1335.00 1131.00 -204.00 2311.00 2142.00 -169.00
External/S...00.perlbench_s/600.perlbench_s 4354.00 4154.00 -200.00 9615.00 9435.00 -180.00
External/S...00.perlbench_r/500.perlbench_r 4354.00 4154.00 -200.00 9615.00 9435.00 -180.00
MultiSourc.../Applications/JM/lencod/lencod 3368.00 3172.00 -196.00 7261.00 7069.00 -192.00
External/S...te/538.imagick_r/538.imagick_r 4163.00 4000.00 -163.00 10354.00 9964.00 -390.00
MultiSourc...ch/consumer-lame/consumer-lame 722.00 559.00 -163.00 1098.00 994.00 -104.00
External/S...ed/638.imagick_s/638.imagick_s 4163.00 4000.00 -163.00 10354.00 9964.00 -390.00
MultiSource/Applications/oggenc/oggenc 970.00 817.00 -153.00 2327.00 2120.00 -207.00
MultiSourc...e/Applications/ClamAV/clamscan 2072.00 1937.00 -135.00 4836.00 4648.00 -188.00
regalloc.NumSpills regalloc.NumReloads
run 00 10 diff 00 10 diff
mean 87.747460 84.068699 -3.678761 1371.475285 1357.146154 -3.792937
```
2. `-riscv-disable-latency-heuristic=false` and `-riscv-should-track-lane-masks=true`:
```
Program regalloc.NumSpills regalloc.NumReloads
00 01 diff 00 01 diff
SingleSour...ce/UnitTests/matrix-types-spec 8823.00 8233.00 -590.00 15603.00 15020.00 -583.00
MultiSourc...ch/consumer-lame/consumer-lame 722.00 689.00 -33.00 1098.00 1065.00 -33.00
MultiSourc...s/Prolangs-C/football/football 248.00 250.00 2.00 349.00 350.00 1.00
MultiSourc...ench/telecomm-gsm/telecomm-gsm 182.00 181.00 -1.00 196.00 195.00 -1.00
MultiSourc...Benchmarks/7zip/7zip-benchmark 1272.00 1273.00 1.00 2436.00 2437.00 1.00
MicroBench...arks/ImageProcessing/Blur/blur 114.00 113.00 -1.00 136.00 136.00 0.00
MultiSourc...rks/mediabench/gsm/toast/toast 182.00 181.00 -1.00 196.00 195.00 -1.00
MultiSourc...gs-C/TimberWolfMC/timberwolfmc 1196.00 1195.00 -1.00 2036.00 2029.00 -7.00
SingleSour.../execute/GCC-C-execute-pr36321 0.00 0.00 0.00 0.00
SingleSour.../execute/GCC-C-execute-pr36077 0.00 0.00 0.00 0.00
SingleSour...xecute/GCC-C-execute-pr33779-1 0.00 0.00 0.00 0.00
SingleSour.../execute/GCC-C-execute-pr33669 0.00 0.00 0.00 0.00
SingleSour.../execute/GCC-C-execute-pr33631 0.00 0.00 0.00 0.00
SingleSour.../execute/GCC-C-execute-pr33382 0.00 0.00 0.00 0.00
SingleSour.../execute/GCC-C-execute-pr37102 0.00 0.00 0.00 0.00
regalloc.NumSpills regalloc.NumReloads
run 00 01 diff 00 01 diff
mean 87.747460 87.445573 -0.301887 1371.475285 1369.091255 -0.303338
```
3. `-riscv-disable-latency-heuristic=true` and `-riscv-should-track-lane-masks=true`:
```
Program regalloc.NumSpills regalloc.NumReloads
00 11 diff 00 11 diff
SingleSour...ce/UnitTests/matrix-types-spec 8823.00 6320.00 -2503.00 15603.00 13544.00 -2059.00
External/S...rate/510.parest_r/510.parest_r 43817.00 43262.00 -555.00 87058.00 87033.00 -25.00
External/S...017speed/625.x264_s/625.x264_s 2373.00 1991.00 -382.00 4808.00 4287.00 -521.00
External/S...2017rate/525.x264_r/525.x264_r 2373.00 1991.00 -382.00 4808.00 4287.00 -521.00
MultiSourc...ks/ASCI_Purple/SMG2000/smg2000 2684.00 2334.00 -350.00 4820.00 4349.00 -471.00
MultiSourc...nchmarks/FreeBench/pifft/pifft 442.00 126.00 -316.00 595.00 281.00 -314.00
MultiSourc.../Applications/JM/ldecod/ldecod 1335.00 1131.00 -204.00 2311.00 2142.00 -169.00
External/S...00.perlbench_s/600.perlbench_s 4354.00 4154.00 -200.00 9615.00 9435.00 -180.00
External/S...00.perlbench_r/500.perlbench_r 4354.00 4154.00 -200.00 9615.00 9435.00 -180.00
MultiSourc.../Applications/JM/lencod/lencod 3368.00 3172.00 -196.00 7261.00 7069.00 -192.00
External/S...te/538.imagick_r/538.imagick_r 4163.00 4000.00 -163.00 10354.00 9964.00 -390.00
MultiSourc...ch/consumer-lame/consumer-lame 722.00 559.00 -163.00 1098.00 994.00 -104.00
External/S...ed/638.imagick_s/638.imagick_s 4163.00 4000.00 -163.00 10354.00 9964.00 -390.00
MultiSource/Applications/oggenc/oggenc 970.00 817.00 -153.00 2327.00 2120.00 -207.00
MultiSourc...e/Applications/ClamAV/clamscan 2072.00 1937.00 -135.00 4836.00 4648.00 -188.00
regalloc.NumSpills regalloc.NumReloads
run 00 11 diff 00 11 diff
mean 87.747460 84.142235 -3.605225 1371.475285 1357.692308 -3.724238
```
Both options reduce the mean number of spills and reloads. `ShouldTrackLaneMasks` has a smaller effect because only vector registers (which have sub-registers) can benefit from it.
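To put the spill numbers on a common scale, here is a minimal sketch that turns the mean `regalloc.NumSpills` values from the three `mean` rows into relative changes against the `00` baseline. The values are copied from the tables; the helper name `relative_change` and the dictionary layout are only illustrative, and nothing here is part of the llvm-test-suite tooling.

```python
# Mean regalloc.NumSpills per run, copied from the "mean" rows of the tables.
# Run label: first digit = -riscv-disable-latency-heuristic,
#            second digit = -riscv-should-track-lane-masks.
mean_spills = {
    "00": 87.747460,  # baseline, both options off
    "10": 84.068699,
    "01": 87.445573,
    "11": 84.142235,
}


def relative_change(baseline: float, value: float) -> float:
    """Return the change from baseline to value as a percentage of baseline."""
    return (value - baseline) / baseline * 100.0


baseline = mean_spills["00"]
changes = {
    run: relative_change(baseline, value)
    for run, value in mean_spills.items()
    if run != "00"
}

for run, pct in changes.items():
    absolute = mean_spills[run] - baseline
    print(f"run {run}: {absolute:+.6f} spills on average ({pct:+.2f}%)")
```

Under these numbers the latency-heuristic option accounts for roughly a 4% drop in mean spills, while lane-mask tracking alone stays well below 1%, which matches the reading above.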
I didn't run these tests on real hardware, so the data may not be fully convincing. I'd appreciate it if you could evaluate this on some platforms. If this common setting turns out not to suit your microarchitecture, please let me know and we can make it a tune feature. All I want is to unify the common scheduling policy and turn the parts that differ into tune features.
https://github.com/llvm/llvm-project/pull/115858