[llvm] [AMDGPU] Skip VGPR deallocation for waveslot limited kernels (PR #112765)
Stanislav Mekhanoshin via llvm-commits
llvm-commits at lists.llvm.org
Fri Oct 18 10:16:21 PDT 2024
================
@@ -2606,15 +2606,24 @@ bool SIInsertWaitcnts::runOnMachineFunction(MachineFunction &MF) {
// Insert DEALLOC_VGPR messages before previously identified S_ENDPGM
// instructions.
- for (MachineInstr *MI : ReleaseVGPRInsts) {
- if (ST->requiresNopBeforeDeallocVGPRs()) {
- BuildMI(*MI->getParent(), MI, MI->getDebugLoc(), TII->get(AMDGPU::S_NOP))
- .addImm(0);
+ // Skip deallocation if kernel is waveslot limited vs VGPR limited. A short
+ // waveslot limited kernel runs slower with the deallocation.
+ if (!ReleaseVGPRInsts.empty() &&
+ (MF.getFrameInfo().hasCalls() ||
+ AMDGPU::IsaInfo::getTotalNumVGPRs(ST) /
+ TRI->getNumUsedPhysRegs(*MRI, AMDGPU::VGPR_32RegClass) <
----------------
rampitec wrote:
Yes, this is also simpler. Done.
https://github.com/llvm/llvm-project/pull/112765
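
For readers following the thread, the heuristic described in the new comment can be sketched in isolation roughly as follows. This is a simplified illustration, not the code from the PR: the right-hand side of the comparison is cut off in the hunk above, so the sketch takes the wave-slot limit as a plain parameter (`MaxWaveSlots`), and the function name, the struct, and the zero-usage guard are all hypothetical.

```cpp
// Hypothetical stand-in for the values the pass queries from the subtarget,
// register info and frame info. None of these names exist in LLVM.
struct KernelInfo {
  bool HasReleasePoints; // mirrors !ReleaseVGPRInsts.empty()
  bool HasCalls;         // mirrors MF.getFrameInfo().hasCalls()
  unsigned TotalVGPRs;   // VGPRs available on the target
  unsigned UsedVGPRs;    // VGPRs actually used by the kernel
};

// Returns true if DEALLOC_VGPR messages should be emitted.
// MaxWaveSlots is an assumption: the actual bound is truncated in the hunk.
bool shouldDeallocVGPRs(const KernelInfo &K, unsigned MaxWaveSlots) {
  if (!K.HasReleasePoints)
    return false;
  // With calls the real register usage is not known here, so deallocate.
  if (K.HasCalls)
    return true;
  // Guard added for the sketch only: no VGPRs used means not VGPR limited.
  if (K.UsedVGPRs == 0)
    return false;
  // Integer division gives the number of waves that fit by VGPR budget.
  // If that is below the slot limit, the kernel is VGPR limited and early
  // deallocation can help; otherwise it is waveslot limited and is skipped.
  return K.TotalVGPRs / K.UsedVGPRs < MaxWaveSlots;
}
```

With purely illustrative numbers, a kernel using 24 of 512 VGPRs against a limit of 16 slots gives 512 / 24 = 21, which is not below 16, so deallocation is skipped; a kernel using 128 VGPRs gives 4, so deallocation is kept.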