[PATCH] D101633: [AMDGPU] Set number vgprs used in PS shaders based on input registers actually used

Wed Jun 16 02:56:58 PDT 2021

dstuttard marked 6 inline comments as done.
dstuttard added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:1136
+    uint32_t InputAddr = 0;
+    unsigned LastEna;
+
----------------
foad wrote:
> Doesn't LastEna need to be initialized here too? I assume all these initializations are just to avoid "may be used uninitialized" warnings.
Yes, I've added an initialization for consistency.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:1140
+      InputEna = MFI->getPSInputEnable();
+      InputAddr = MFI->getPSInputEnable();
+      // We only need to consider input args up to the last used arg
----------------
foad wrote:
> Should this be getPSInputAddr, and if so why didn't the tests you added fail?
Yes, it should be getPSInputAddr.

It is actually very hard to generate a case where this makes a difference! It is when no input args are used, there is an PSInputAddr that sets inputs as allocated, plus there are extra args, which are also unused. Test is added.

The reason is makes little difference is that the line setting the NumVGPRs uses a max of the value calculated here and the actual allocation done for inputs - this usually fixes any discrepancies.
The reason we can't just use the allocation is that when there are extra arguments it will get a different result.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:1142
+      // We only need to consider input args up to the last used arg
+      LastEna = findLastSet(InputEna) + 1;
+    }
----------------
foad wrote:
> Can InputEna ever be zero, and if so what should LastEna be?
InputEna can never be zero - I should probably put in an assert for that.

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:1180
     ProgInfo.NumSGPR = std::max(ProgInfo.NumSGPR, WaveDispatchNumSGPR);
-    ProgInfo.NumVGPR = std::max(ProgInfo.NumVGPR, WaveDispatchNumVGPR);
+    ProgInfo.NumArchVGPR = std::max(ProgInfo.NumVGPR, WaveDispatchNumVGPR);
+    ProgInfo.NumVGPR =
----------------
foad wrote:
> What effect does setting NumArchVGPR here have, given that it was not being set before?
NumArchVGPR was introduced to hold the original NumVGPRs (it's set to that at the start of this function). This is useful on architectures with AGPR support as the NumVGPRs (total vgprs) is adjusted with any AGPR registers that are used.

Since we're altering the original VGPR use with this change - we have to alter that value and set NumVGPRs to a new total with AGPR use (for architectures that support it).

================
Comment at: llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.h:56
     int32_t getTotalNumSGPRs(const GCNSubtarget &ST) const;
+    int32_t getTotalNumVGPRs(const GCNSubtarget &ST, int32_t NumAGPR,
+                             int32_t NumVGPR) const;
----------------
foad wrote:
> This function (and possibly the ones next to it) could do with a comment. It's not obvious to me what TotalNumVGPRs means, given that one of the inputs is already called NumVGPR.
It is slightly confusing. The issue is that there's a difference between "actual number of VGPR registers required" and the NumVGPRs which is the number of explicitly tracked VGPR registers. Depending on the architecture, AGPR registers are included in the total VGPRs - and there are some alignment constraints too, which make the total different.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101633/new/

https://reviews.llvm.org/D101633