[llvm] [MachineScheduler] Experimental option to partially disable pre-ra scheduling. (PR #90181)

Mon May 20 16:17:00 PDT 2024

michaelmaitland wrote:

# Summary

I ran some experiments based on https://github.com/llvm/llvm-project/pull/90181.
I collected dynamic instruction count and number of spills for spec2006int and
spec2006fp.

I only ran spec2006 C/C++ benchmarks.

The experiments use `-nosched-above=350` and compare against the default
value of `~0U`.

In summary, enabling this option causes no significant change in dynamic IC or
spills on the C/C++ spec2006 benchmarks on RISC-V.

# Dynamic IC
## spec2006int

There was no change in dynamic IC for most benchmarks. 400.perlbench and 403.gcc
dynamic IC decreased by 0.0002% and 0.000000002% respectively. 456.hmmer and
483.xalancbmk saw increases in dynamic IC of 0.00000003% and 0.027% respectively.

## spec2006fp

There was no change in dynamic IC for 433.milc, 444.namd, 450.soplex, 470.lbm,
482.sphinx3, and 998.specrand.

447.deal saw a 0.00003% decrease in dynamic IC and 453.povray saw a 0.0007%
increase in dynamic IC.

# Spills
## spec2006int

|config| Number of Spills Inserted | Number of Spilled Live Ranges | Number of Spilled Snippets |
|-|-|-|-|
|default|16101|33827|714|
|-nosched-above=350|16111|33839|709|

## spec2006fp

|config| Number of Spills Inserted | Number of Spilled Live Ranges | Number of Spilled Snippets |
|-|-|-|-|
|default|6585|10433|131|
|-nosched-above=350|6586|10420|131|
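
To put the two spill tables in relative terms, here is a minimal sketch that computes the percent change of each statistic against the default configuration. The numbers are copied from the tables above; the dictionary layout and the helper name `relative_changes` are only illustrative and are not part of any LLVM tooling.

```python
# Spill statistics copied from the two tables above.
# Tuple order: (spills inserted, spilled live ranges, spilled snippets).
SPILLS = {
    "spec2006int": {
        "default": (16101, 33827, 714),
        "-nosched-above=350": (16111, 33839, 709),
    },
    "spec2006fp": {
        "default": (6585, 10433, 131),
        "-nosched-above=350": (6586, 10420, 131),
    },
}

METRICS = ("spills inserted", "spilled live ranges", "spilled snippets")


def relative_changes(suite):
    """Return the percent change of each metric relative to the default config."""
    base = SPILLS[suite]["default"]
    new = SPILLS[suite]["-nosched-above=350"]
    return {m: 100.0 * (n - b) / b for m, b, n in zip(METRICS, base, new)}


if __name__ == "__main__":
    for suite in SPILLS:
        for metric, pct in relative_changes(suite).items():
            print(f"{suite:12s} {metric:20s} {pct:+.3f}%")
```

Every relative change computed this way is below 1% in magnitude, which is consistent with the summary that the option has no significant effect on spills for these benchmarks.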

# Conclusion

I don't mind adding this functionality, especially since it will be behind an option that is
marked experimental. But I think it is important that we agree that it's only
helping as a band-aid for an underlying problem. The long-term solution is to pinpoint
why you are seeing these spills and fix that problem. It is possible that by fixing the
underlying problem, you may see better scheduling on smaller regions too.

## Some ideas

* Have you looked at the instructions that are leading to the performance differences? Are there
  certain kinds of instructions or instruction sequences that are getting impacted?
* Have you looked at what heuristics are causing the ordering that leads to the spills?
* How many instructions are in the `Available` queue during scheduling of the larger
  regions? Is there an opportunity to refine your SchedModel so that the `Available` and
  `Pending` queues are better populated?

https://github.com/llvm/llvm-project/pull/90181