[llvm] [AMDGPU] Use correct VGPR threshold for flagging ExcessRP regions in unified register file case (PR #85860)

Stanislav Mekhanoshin via llvm-commits llvm-commits at lists.llvm.org
Thu Mar 21 13:03:23 PDT 2024


================
@@ -1155,12 +1155,16 @@ unsigned getMinNumVGPRs(const MCSubtargetInfo *STI, unsigned WavesPerEU) {
   return std::min(MinNumVGPRs, AddrsableNumVGPRs);
 }
 
-unsigned getMaxNumVGPRs(const MCSubtargetInfo *STI, unsigned WavesPerEU) {
+unsigned getMaxNumVGPRs(const MCSubtargetInfo *STI, unsigned WavesPerEU,
+                        bool WholeRegisterFile) {
   assert(WavesPerEU != 0);
 
-  unsigned MaxNumVGPRs = alignDown(getTotalNumVGPRs(STI) / WavesPerEU,
-                                   getVGPRAllocGranule(STI));
-  unsigned AddressableNumVGPRs = getAddressableNumVGPRs(STI);
+  unsigned MaxNumVGPRs =
+      alignDown(getTotalNumVGPRs(STI, WholeRegisterFile) / WavesPerEU,
----------------
rampitec wrote:

This assumes equal allocation between VGPRs and AGPRs. For example, suppose you want occupancy 2 and use no AGPRs at all: you will end up with a limit of 128 VGPRs, but you can actually use 256.
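
To make the arithmetic concrete, here is a minimal standalone sketch, not the actual LLVM code. The constants (a 512-entry unified register file, an allocation granule of 8, 256 addressable VGPRs) and the helper names `alignDownTo`, `maxVGPRsEqualSplit` and `maxVGPRsGivenAGPRs` are assumptions chosen for illustration so that the numbers match the example above. It also ignores any alignment requirement on the AGPR portion.

```cpp
#include <algorithm>
#include <cassert>

// Assumed, illustrative values for a unified-register-file target.
constexpr unsigned UnifiedFileSize = 512;   // VGPRs + AGPRs, hypothetical
constexpr unsigned AllocGranule = 8;        // hypothetical granule
constexpr unsigned AddressableVGPRs = 256;  // hypothetical addressable limit

// Round Value down to a multiple of Align (Align must be nonzero).
constexpr unsigned alignDownTo(unsigned Value, unsigned Align) {
  return Value - Value % Align;
}

// Per-wave VGPR limit if half of the unified file is always reserved
// for AGPRs, i.e. the "equal allocation" assumption.
unsigned maxVGPRsEqualSplit(unsigned WavesPerEU) {
  assert(WavesPerEU != 0);
  unsigned PerWave =
      alignDownTo((UnifiedFileSize / 2) / WavesPerEU, AllocGranule);
  return std::min(PerWave, AddressableVGPRs);
}

// Per-wave VGPR limit when the wave's actual AGPR usage is known:
// whatever the AGPRs leave free in the wave's share can hold VGPRs.
unsigned maxVGPRsGivenAGPRs(unsigned WavesPerEU, unsigned NumAGPRs) {
  assert(WavesPerEU != 0);
  unsigned PerWave = alignDownTo(UnifiedFileSize / WavesPerEU, AllocGranule);
  unsigned Remaining = PerWave > NumAGPRs ? PerWave - NumAGPRs : 0;
  return std::min(Remaining, AddressableVGPRs);
}
```

Under these assumptions, `maxVGPRsEqualSplit(2)` gives 128, while `maxVGPRsGivenAGPRs(2, 0)` gives 256. The two limits agree only when the wave really uses 128 AGPRs.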

https://github.com/llvm/llvm-project/pull/85860