[llvm] [AMDGPU] Include unused preload kernarg in KD total SGPR count (PR #104743)

Austin Kerbow via llvm-commits llvm-commits at lists.llvm.org
Wed Sep 18 10:28:03 PDT 2024


================
@@ -900,6 +900,14 @@ void AMDGPUAsmPrinter::getSIProgramInfo(SIProgramInfo &ProgInfo,
 
     ProgInfo.NumVGPR = AMDGPUMCExpr::createTotalNumVGPR(
         ProgInfo.NumAccVGPR, ProgInfo.NumArchVGPR, Ctx);
+  } else if (isKernel(F.getCallingConv()) && MFI->getNumKernargPreloadedSGPRs()) {
+    // Consider cases where the total number of UserSGPRs with trailing
+    // allocated preload SGPRs, is greater than the number of explicitly
+    // referenced SGPRs.
+    const MCExpr *MaxUserSGPRs = MCBinaryExpr::createAdd(
+        CreateExpr(MFI->getNumUserSGPRs()), ExtraSGPRs, Ctx);
----------------
kerbowa wrote:

ExtraSGPRs here doesn't refer to anything from my change. It is the set of extra SGPRs that the HW reserves for architected flat scratch, XNACK, and the debugger. I think the name MaxUserSGPRs is a misnomer and confusing, though. What it really computes is the UserSGPRs (including preloads) plus the extra SGPRs reserved by the HW.

These "extra" SGPRs are not UserSGPRs in the way the HW uses or sets them up, or in the way they are encoded in the KD, so they should not be included in our backend's tracking of UserSGPRs. Preload kernarg SGPRs are already included in MFI->getNumUserSGPRs().
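
To make the arithmetic concrete, here is a rough sketch using plain integers instead of MCExpr nodes. The function names userPlusExtraSGPRs and totalSGPRs are hypothetical and not part of the patch. The final "take the larger value" step is an assumption about how the sum is consumed, since the quoted hunk only shows the addition.

```cpp
// Simplified integer model of the calculation discussed above.
// The real code builds MCExpr nodes; every name here is hypothetical
// unless it mirrors an identifier in the quoted diff.

// UserSGPRs (which already include preloaded kernarg SGPRs) plus the extra
// SGPRs reserved by the HW. This mirrors the createAdd() in the diff.
constexpr unsigned userPlusExtraSGPRs(unsigned NumUserSGPRs,
                                      unsigned ExtraSGPRs) {
  return NumUserSGPRs + ExtraSGPRs;
}

// ASSUMPTION: the sum is compared against the SGPR count derived from
// explicitly referenced registers, and the larger value is kept.
constexpr unsigned totalSGPRs(unsigned ExistingNumSGPR, unsigned NumUserSGPRs,
                              unsigned ExtraSGPRs) {
  return ExistingNumSGPR > userPlusExtraSGPRs(NumUserSGPRs, ExtraSGPRs)
             ? ExistingNumSGPR
             : userPlusExtraSGPRs(NumUserSGPRs, ExtraSGPRs);
}

// Trailing preload SGPRs are allocated but unused: the user + extra sum wins.
static_assert(totalSGPRs(8, 10, 6) == 16, "unused preload SGPRs are counted");
// The kernel references more SGPRs than user + extra: the count is unchanged.
static_assert(totalSGPRs(24, 10, 6) == 24, "existing count is preserved");
```

The numbers in the static_asserts are made up for illustration. The point is only that the extra SGPRs are added on top of the UserSGPR count and are never folded into it.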

https://github.com/llvm/llvm-project/pull/104743