[llvm] [AMDGPU] Prefer lower total register usage in regions with spilling (PR #71882)
Jeffrey Byrnes via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 14 13:42:57 PST 2023
jrbyrnes wrote:
Updated review to reflect some ideas contained in offline discussion.
The primary goal of the weighted cost calculation is to determine which RP (register pressure) results in fewer excess VGPRs, since these will need to be spilled to memory. Since SGPRs can be spilled into VGPRs, we can reformulate excess SGPRs in terms of VGPR pressure as follows: 1 VGPR = WavefrontSize excess SGPRs.
There are some subtleties to consider, however:
1. Any number of excess SGPRs from 1 to WavefrontSize requires the same number of VGPRs (1). In other words, 1 excess SGPR uses the same number of VGPRs as 31 excess SGPRs.
2. N excess SGPRs is worse than N - 1 excess SGPRs, even if they require the same number of VGPRs, because of the spill code inserted (writelane/readlane).
3. (2 excess VGPRs, 0 excess SGPRs) is preferable to (1 excess VGPR, 1 excess SGPR), since the latter needs to insert more spill code.

Subtleties 2 and 3 can also be thought of as one RP being less than the other.
We should see the following behavior, with each pressure written as (ExcessVGPR, ExcessSGPR) and a WavefrontSize of 64:
|Case |Pressure 1 |Pressure 2 |Winner |
|-----------|---------------|-------------------|-----------|
|0 |(2,0) |(1,1) |Pressure 1 |
|1 |(1,1) |(1,63) |Pressure 1 |
|2 |(2,0) |(1,63) |Pressure 1 |
|3 |(2,0) |(1,64) |Pressure 1 |
|4 |(3,0) |(1,64) |Pressure 2 |
|5 |(3,0) |(1,128) |Pressure 1 |
|6 |(3,0) |(0,1280)* |Pressure 2 |
*VGPR usage is not excess even with VGPR spills from SGPRs.
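
The comparison described above can be sketched roughly as follows. This is an illustrative model only, not the code in the PR: the `ExcessPressure` struct and the `vgprsForSGPRSpills`, `vgprEquivalent`, and `isBetter` helpers are hypothetical names, and the tie-break on the excess SGPR count is an assumption that happens to reproduce cases 0 through 5. Case 6 is not modeled, because it depends on how much non-excess VGPR headroom is left, which this sketch does not track.

```cpp
// Hypothetical, simplified view of a region's excess register pressure.
struct ExcessPressure {
  unsigned VGPR; // excess VGPRs, which would be spilled to memory
  unsigned SGPR; // excess SGPRs, which would be spilled into VGPR lanes
};

// Each VGPR holds WavefrontSize spilled SGPRs (one per lane), so the
// number of VGPRs needed is ceil(ExcessSGPR / WavefrontSize).
constexpr unsigned vgprsForSGPRSpills(unsigned ExcessSGPR,
                                      unsigned WavefrontSize) {
  return (ExcessSGPR + WavefrontSize - 1) / WavefrontSize;
}

// Total excess expressed purely in VGPRs.
constexpr unsigned vgprEquivalent(ExcessPressure P, unsigned WavefrontSize) {
  return P.VGPR + vgprsForSGPRSpills(P.SGPR, WavefrontSize);
}

// True if A is strictly preferable to B. The primary key is the
// VGPR-equivalent cost; on a tie, fewer excess SGPRs wins (assumed
// tie-break, standing in for "less writelane/readlane spill code").
constexpr bool isBetter(ExcessPressure A, ExcessPressure B,
                        unsigned WavefrontSize) {
  return vgprEquivalent(A, WavefrontSize) != vgprEquivalent(B, WavefrontSize)
             ? vgprEquivalent(A, WavefrontSize) <
                   vgprEquivalent(B, WavefrontSize)
             : A.SGPR < B.SGPR;
}

// Compile-time checks for cases 0 through 5 of the table (WavefrontSize 64).
static_assert(isBetter(ExcessPressure{2, 0}, ExcessPressure{1, 1}, 64), "case 0");
static_assert(isBetter(ExcessPressure{1, 1}, ExcessPressure{1, 63}, 64), "case 1");
static_assert(isBetter(ExcessPressure{2, 0}, ExcessPressure{1, 63}, 64), "case 2");
static_assert(isBetter(ExcessPressure{2, 0}, ExcessPressure{1, 64}, 64), "case 3");
static_assert(isBetter(ExcessPressure{1, 64}, ExcessPressure{3, 0}, 64), "case 4");
static_assert(isBetter(ExcessPressure{3, 0}, ExcessPressure{1, 128}, 64), "case 5");
```

In this model, cases 0, 1, 2, 3, and 5 are all ties on the VGPR-equivalent cost and are decided by the SGPR tie-break, while case 4 is decided by the cost itself (3 versus 2).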
It is a bit hard to write lit tests for all of these conditions, but I have unit tested the logic and validated the above cases.
Also, I removed the flag that set the relative weighting of excess SGPRs and VGPRs, since the current formulation only really makes sense when using the WavefrontSize.
https://github.com/llvm/llvm-project/pull/71882