[llvm-branch-commits] [RISCV] Set DisableLatencyHeuristic to true (PR #115858)

Pengcheng Wang via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Tue Nov 12 21:21:27 PST 2024


wangpc-pp wrote:

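I added two experimental options, `-riscv-disable-latency-heuristic` and `-riscv-should-track-lane-masks`, and evaluated the register allocation statistics (`regalloc.NumSpills`/`regalloc.NumReloads`) on llvm-test-suite (options: `-O3 -march=rva23u64`). In the tables below, the two digits in each column name give the values of the two options in that order, so `00` is the baseline with both set to false: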
1. `-riscv-disable-latency-heuristic=true` and `-riscv-should-track-lane-masks=false`:
```
Program                                       regalloc.NumSpills                   regalloc.NumReloads                  
                                              00                 10       diff     00                  10       diff    
SingleSour...ce/UnitTests/matrix-types-spec    8823.00            6166.00 -2657.00 15603.00            13403.00 -2200.00
External/S...rate/510.parest_r/510.parest_r   43817.00           43262.00  -555.00 87058.00            87033.00   -25.00
External/S...017speed/625.x264_s/625.x264_s    2373.00            1991.00  -382.00  4808.00             4287.00  -521.00
External/S...2017rate/525.x264_r/525.x264_r    2373.00            1991.00  -382.00  4808.00             4287.00  -521.00
MultiSourc...ks/ASCI_Purple/SMG2000/smg2000    2684.00            2334.00  -350.00  4820.00             4349.00  -471.00
MultiSourc...nchmarks/FreeBench/pifft/pifft     442.00             126.00  -316.00   595.00              281.00  -314.00
MultiSourc.../Applications/JM/ldecod/ldecod    1335.00            1131.00  -204.00  2311.00             2142.00  -169.00
External/S...00.perlbench_s/600.perlbench_s    4354.00            4154.00  -200.00  9615.00             9435.00  -180.00
External/S...00.perlbench_r/500.perlbench_r    4354.00            4154.00  -200.00  9615.00             9435.00  -180.00
MultiSourc.../Applications/JM/lencod/lencod    3368.00            3172.00  -196.00  7261.00             7069.00  -192.00
External/S...te/538.imagick_r/538.imagick_r    4163.00            4000.00  -163.00 10354.00             9964.00  -390.00
MultiSourc...ch/consumer-lame/consumer-lame     722.00             559.00  -163.00  1098.00              994.00  -104.00
External/S...ed/638.imagick_s/638.imagick_s    4163.00            4000.00  -163.00 10354.00             9964.00  -390.00
MultiSource/Applications/oggenc/oggenc          970.00             817.00  -153.00  2327.00             2120.00  -207.00
MultiSourc...e/Applications/ClamAV/clamscan    2072.00            1937.00  -135.00  4836.00             4648.00  -188.00
      regalloc.NumSpills                            regalloc.NumReloads                           
run                   00            10         diff                  00            10         diff
mean   87.747460          84.068699    -3.678761     1371.475285         1357.146154  -3.792937   
```
2. `-riscv-disable-latency-heuristic=false` and `-riscv-should-track-lane-masks=true`:
```
Program                                       regalloc.NumSpills                 regalloc.NumReloads                 
                                              00                 01      diff    00                  01       diff   
SingleSour...ce/UnitTests/matrix-types-spec   8823.00            8233.00 -590.00 15603.00            15020.00 -583.00
MultiSourc...ch/consumer-lame/consumer-lame    722.00             689.00  -33.00  1098.00             1065.00  -33.00
MultiSourc...s/Prolangs-C/football/football    248.00             250.00    2.00   349.00              350.00    1.00
MultiSourc...ench/telecomm-gsm/telecomm-gsm    182.00             181.00   -1.00   196.00              195.00   -1.00
MultiSourc...Benchmarks/7zip/7zip-benchmark   1272.00            1273.00    1.00  2436.00             2437.00    1.00
MicroBench...arks/ImageProcessing/Blur/blur    114.00             113.00   -1.00   136.00              136.00    0.00
MultiSourc...rks/mediabench/gsm/toast/toast    182.00             181.00   -1.00   196.00              195.00   -1.00
MultiSourc...gs-C/TimberWolfMC/timberwolfmc   1196.00            1195.00   -1.00  2036.00             2029.00   -7.00
SingleSour.../execute/GCC-C-execute-pr36321      0.00               0.00    0.00                                 0.00
SingleSour.../execute/GCC-C-execute-pr36077      0.00               0.00    0.00                                 0.00
SingleSour...xecute/GCC-C-execute-pr33779-1      0.00               0.00    0.00                                 0.00
SingleSour.../execute/GCC-C-execute-pr33669      0.00               0.00    0.00                                 0.00
SingleSour.../execute/GCC-C-execute-pr33631      0.00               0.00    0.00                                 0.00
SingleSour.../execute/GCC-C-execute-pr33382      0.00               0.00    0.00                                 0.00
SingleSour.../execute/GCC-C-execute-pr37102      0.00               0.00    0.00                                 0.00
      regalloc.NumSpills                            regalloc.NumReloads                           
run                   00            01         diff                  00            01         diff
mean   87.747460          87.445573    -0.301887     1371.475285         1369.091255  -0.303338   
```
3. `-riscv-disable-latency-heuristic=true` and `-riscv-should-track-lane-masks=true`:
```
Program                                       regalloc.NumSpills                   regalloc.NumReloads                  
                                              00                 11       diff     00                  11       diff    
SingleSour...ce/UnitTests/matrix-types-spec    8823.00            6320.00 -2503.00 15603.00            13544.00 -2059.00
External/S...rate/510.parest_r/510.parest_r   43817.00           43262.00  -555.00 87058.00            87033.00   -25.00
External/S...017speed/625.x264_s/625.x264_s    2373.00            1991.00  -382.00  4808.00             4287.00  -521.00
External/S...2017rate/525.x264_r/525.x264_r    2373.00            1991.00  -382.00  4808.00             4287.00  -521.00
MultiSourc...ks/ASCI_Purple/SMG2000/smg2000    2684.00            2334.00  -350.00  4820.00             4349.00  -471.00
MultiSourc...nchmarks/FreeBench/pifft/pifft     442.00             126.00  -316.00   595.00              281.00  -314.00
MultiSourc.../Applications/JM/ldecod/ldecod    1335.00            1131.00  -204.00  2311.00             2142.00  -169.00
External/S...00.perlbench_s/600.perlbench_s    4354.00            4154.00  -200.00  9615.00             9435.00  -180.00
External/S...00.perlbench_r/500.perlbench_r    4354.00            4154.00  -200.00  9615.00             9435.00  -180.00
MultiSourc.../Applications/JM/lencod/lencod    3368.00            3172.00  -196.00  7261.00             7069.00  -192.00
External/S...te/538.imagick_r/538.imagick_r    4163.00            4000.00  -163.00 10354.00             9964.00  -390.00
MultiSourc...ch/consumer-lame/consumer-lame     722.00             559.00  -163.00  1098.00              994.00  -104.00
External/S...ed/638.imagick_s/638.imagick_s    4163.00            4000.00  -163.00 10354.00             9964.00  -390.00
MultiSource/Applications/oggenc/oggenc          970.00             817.00  -153.00  2327.00             2120.00  -207.00
MultiSourc...e/Applications/ClamAV/clamscan    2072.00            1937.00  -135.00  4836.00             4648.00  -188.00
      regalloc.NumSpills                            regalloc.NumReloads                           
run                   00            11         diff                  00            11         diff
mean   87.747460          84.142235    -3.605225     1371.475285         1357.692308  -3.724238   
```

We can see that both options reduce the mean number of spills and reloads. `ShouldTrackLaneMasks` has a smaller influence because only vector registers (which have sub-registers) can benefit from it.
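To make the `diff` columns concrete, here is a minimal sketch of how the per-program differences and their mean can be recomputed. It uses only four `regalloc.NumSpills` rows copied from the first table (`00` vs `10`); the function names are hypothetical, and this is not the script that produced the tables. The mean over this small sample will naturally differ from the full-suite mean reported above.

```python
# Sample (baseline "00", experiment "10") NumSpills values copied from the
# first table. Only four programs are included, for illustration.
rows = {
    "matrix-types-spec": (8823, 6166),
    "510.parest_r": (43817, 43262),
    "625.x264_s": (2373, 1991),
    "pifft": (442, 126),
}


def spill_diffs(rows):
    """Return (program, diff) pairs, largest reduction first.

    diff = experiment - baseline, so a negative value means fewer spills.
    """
    pairs = [(name, new - base) for name, (base, new) in rows.items()]
    return sorted(pairs, key=lambda item: item[1])


def mean_diff(rows):
    """Mean of the per-program diffs over the given sample."""
    diffs = [new - base for base, new in rows.values()]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    for name, diff in spill_diffs(rows):
        print(f"{name:20s} {diff:10.2f}")
    print(f"{'mean (sample only)':20s} {mean_diff(rows):10.2f}")
```

Sorting by the signed diff puts the largest reductions at the top, matching how the first and third tables are ordered.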

I didn't run these tests on real hardware, so the data may not be fully convincing. I'd appreciate it if you could evaluate this on some platforms; that would be helpful. If you find that this common setting is not suitable for your microarchitecture, please let me know and we can make it a tune feature. All I want is to unify the common scheduling policy and turn the parts that need to vary into tune features.

https://github.com/llvm/llvm-project/pull/115858

