[PATCH] D141876: [AMDGPU] Tune scheduler on GFX10 and GFX11 for regions with spilling

Mon Jan 16 15:32:00 PST 2023

rampitec created this revision.
rampitec added reviewers: kerbowa, foad.
Herald added subscribers: kosarev, StephenFan, hiraditya, tpr, dstuttard, yaxunl, jvesely, kzhuravl, arsenm.
Herald added a project: All.
rampitec requested review of this revision.
Herald added a subscriber: wdng.
Herald added a project: LLVM.

Unlike older ASICs, GFX10+ targets have a lot of VGPRs. It is therefore
possible to achieve high occupancy even with all or almost all
addressable VGPRs in use. Our scheduler was never tuned for this
scenario. The VGPR Critical Limit threshold always comes out very high,
even if maximum occupancy is targeted. For example, on gfx1100 it is set
to 192 registers even with a requested occupancy of 16. As a result, the
scheduler starts prioritizing register pressure reduction very late and
we easily end up spilling.

This patch makes the VGPR critical limit similar to what we would have
on pre-gfx10 targets, which have a much more limited VGPR budget, while
still trying to maintain occupancy as it does now.

Pre-gfx10 ASICs should not be affected, as the limit should be the same
as before, and on gfx10+ the change should only affect regions where we
have to spill.
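
The arithmetic behind the new limit can be sketched in isolation as
follows. This is a minimal standalone illustration, not the patch itself:
the function names are hypothetical, and the addressable VGPR count, the
allocation granule, and the excess limit are passed in as plain
parameters instead of being queried from the subtarget. The values
mentioned in the comments are illustrative assumptions, not numbers read
from any real target.

```cpp
#include <algorithm>

// Hypothetical helper: round Value down to a multiple of Align.
// It stands in for llvm::alignDown so the sketch has no LLVM dependency.
constexpr unsigned alignDownTo(unsigned Value, unsigned Align) {
  return Value / Align * Align;
}

// Hypothetical helper mirroring the KnownExcessRP branch of the patch:
// split the addressable VGPRs evenly across the target occupancy, round
// down to the allocation granule, never go below one granule, and never
// exceed the excess limit.
constexpr unsigned vgprCriticalLimitForExcessRP(unsigned Addressable,
                                                unsigned Granule,
                                                unsigned TargetOccupancy,
                                                unsigned ExcessLimit) {
  unsigned Budget = alignDownTo(Addressable / TargetOccupancy, Granule);
  Budget = std::max(Budget, Granule);
  return std::min(Budget, ExcessLimit);
}

// Illustrative inputs only: 256 addressable registers, granule of 8,
// occupancy target of 16, excess limit of 256.
static_assert(vgprCriticalLimitForExcessRP(256, 8, 16, 256) == 16,
              "256 / 16 = 16, already a multiple of 8");
```

With these illustrative inputs the budget is 16 registers, far below the
192 quoted above. When the granule is larger than the per-wave share, the
std::max step clamps the budget up to a single granule instead of letting
it round down to zero.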


https://reviews.llvm.org/D141876

Files:
  llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
  llvm/lib/Target/AMDGPU/GCNSchedStrategy.h


Index: llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
===================================================================

--- llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
+++ llvm/lib/Target/AMDGPU/GCNSchedStrategy.h
@@ -75,6 +75,10 @@
   // track register pressure for actual scheduling heuristics.
   bool HasHighPressure;
 
+  // Schedule known to have excess register pressure. Be more conservative in
+  // increasing ILP and preserving VGPRs.
+  bool KnownExcessRP = false;
+
   // An error margin is necessary because of poor performance of the generic RP
   // tracker and can be adjusted up for tuning heuristics to try and more
   // aggressively reduce register pressure.
@@ -296,6 +300,11 @@
   // Returns true if scheduling should be reverted.
   virtual bool shouldRevertScheduling(unsigned WavesAfter);
 
+  // Returns true if current region has known excess pressure.
+  bool isRegionWithExcessRP() const {
+    return DAG.RegionsWithExcessRP[RegionIdx];
+  }
+
   // Returns true if the new schedule may result in more spilling.
   bool mayCauseSpilling(unsigned WavesAfter);
 
Index: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
===================================================================
--- llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
+++ llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp
@@ -70,8 +70,20 @@
   TargetOccupancy = MFI.getOccupancy();
   SGPRCriticalLimit =
       std::min(ST.getMaxNumSGPRs(TargetOccupancy, true), SGPRExcessLimit);
-  VGPRCriticalLimit =
-      std::min(ST.getMaxNumVGPRs(TargetOccupancy), VGPRExcessLimit);
+
+  if (!KnownExcessRP) {
+    VGPRCriticalLimit =
+        std::min(ST.getMaxNumVGPRs(TargetOccupancy), VGPRExcessLimit);
+  } else {
+    // This is similar to ST.getMaxNumVGPRs(TargetOccupancy) result except
+    // returns a reasonably small number for targets with lots of VGPRs, such
+    // as GFX10 and GFX11.
+    unsigned Granule = AMDGPU::IsaInfo::getVGPRAllocGranule(&ST);
+    unsigned Addressable = AMDGPU::IsaInfo::getAddressableNumVGPRs(&ST);
+    unsigned VGPRBudget = alignDown(Addressable / TargetOccupancy, Granule);
+    VGPRBudget = std::max(VGPRBudget, Granule);
+    VGPRCriticalLimit = std::min(VGPRBudget, VGPRExcessLimit);
+  }
 
   // Subtract error margin from register limits and avoid overflow.
   SGPRCriticalLimit =
@@ -603,6 +615,8 @@
     for (auto Region : Regions) {
       RegionBegin = Region.first;
       RegionEnd = Region.second;
+      S.KnownExcessRP = Stage->isRegionWithExcessRP();
+
       // Setup for scheduling the region and check whether it should be skipped.
       if (!Stage->initGCNRegion()) {
         Stage->advanceRegion();
@@ -1135,7 +1149,7 @@
 bool GCNSchedStage::mayCauseSpilling(unsigned WavesAfter) {
   if (WavesAfter <= MFI.getMinWavesPerEU() &&
       !PressureAfter.less(ST, PressureBefore) &&
-      DAG.RegionsWithExcessRP[RegionIdx]) {
+      isRegionWithExcessRP()) {
     LLVM_DEBUG(dbgs() << "New pressure will result in more spilling.\n");
     return true;
   }

