[llvm] [AMDGPU][MachineScheduler] Alternative way to control excess RP. (PR #68004)

Wed Dec 27 09:25:55 PST 2023

alex-t wrote:

> Continuing the discussion here as opposed to email.
> 
> > The UnclusteredHighRP is intended for those regions which have high RP after the scheduling is done.
> > I think that we should run the UnclusteredHighRP only for regions which have excess RP after the scheduling is done.
> 
> The original intent of unclustered scheduling was to increase occupancy in the kernel when it was possible to do so if we tried scheduling without mutations. The extra checks for excess RP and spilling were added later. There were concrete cases that motivated both of these changes.
> 
> That's not to say I don't approve of the new approach, any simplification of the current logic would be welcome, but I think it needs to be supported by performance numbers both on compute and graphics.

I asked CQE to run the extended testing cycle. Here is their response: "We have completed the staging testing (daily cycle coverage) with your patch. It’s Conditional-Go, we didn’t observe any new failures in this cycle (except the staging branch’s known issues)."
Since this change was never expected to bring any performance improvement and only makes the handling of excess-RP regions clearer, I consider it ready for upstream.

https://github.com/llvm/llvm-project/pull/68004