[llvm] [AMDGPU] Don't DEALLOC_VGPRS from callable functions (PR #72245)

Tue Nov 14 04:17:18 PST 2023

================
@@ -1039,10 +1041,13 @@ bool SIInsertWaitcnts::generateWaitcntInstBefore(MachineInstr &MI,
   // Identify S_ENDPGM instructions which may have to wait for outstanding VMEM
   // stores. In this case it can be useful to send a message to explicitly
   // release all VGPRs before the stores have completed, but it is only safe to
-  // do this if there are no outstanding scratch stores.
+  // do this if:
+  // * there are no outstanding scratch stores
+  // * this is not a callable function
   else if (MI.getOpcode() == AMDGPU::S_ENDPGM ||
            MI.getOpcode() == AMDGPU::S_ENDPGM_SAVED) {
----------------
jasilvanus wrote:

Do we have to care about callers if we are about to terminate the wave with `S_ENDPGM`? Is this mixing up callees and callers?
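
For reference, the two conditions listed in the updated comment can be sketched as a standalone predicate. This is only an illustrative model, not the actual `SIInsertWaitcnts` logic: `EndpgmContext`, its fields, and `mayReleaseVgprsEarly` are hypothetical names, and the real pass may consult additional state.

```cpp
// Hypothetical model of the state consulted when an S_ENDPGM is reached.
// None of these names exist in LLVM; they only mirror the comment in the diff.
struct EndpgmContext {
  bool hasOutstandingVmemStores;    // stores the wave would otherwise wait on
  bool hasOutstandingScratchStores; // scratch stores still in flight
  bool isCallableFunction;          // true for a callable function, false for an entry point
};

// Returns true when, according to the updated comment, releasing all VGPRs
// before the stores complete is both useful and considered safe.
bool mayReleaseVgprsEarly(const EndpgmContext &Ctx) {
  return Ctx.hasOutstandingVmemStores &&     // otherwise nothing is gained
         !Ctx.hasOutstandingScratchStores && // condition 1 from the comment
         !Ctx.isCallableFunction;            // condition 2 added by this patch
}
```

Whether the second condition is needed at an `S_ENDPGM` is exactly what the question above is asking.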

https://github.com/llvm/llvm-project/pull/72245