[llvm] [AMDGPU] Optionally Use GCNRPTrackers during scheduling (PR #93090)

Jeffrey Byrnes via llvm-commits llvm-commits at lists.llvm.org
Mon Oct 7 08:54:17 PDT 2024


================
@@ -148,17 +154,44 @@ static bool canUsePressureDiffs(const SUnit &SU) {
   return true;
 }
 
-static void getRegisterPressures(bool AtTop,
-                                 const RegPressureTracker &RPTracker, SUnit *SU,
-                                 std::vector<unsigned> &Pressure,
-                                 std::vector<unsigned> &MaxPressure) {
+static void getRegisterPressures(
+    bool AtTop, const RegPressureTracker &RPTracker, SUnit *SU,
+    std::vector<unsigned> &Pressure, std::vector<unsigned> &MaxPressure,
+    GCNDownwardRPTracker &DownwardTracker, GCNUpwardRPTracker &UpwardTracker,
+    ScheduleDAGMI *DAG, const SIRegisterInfo *SRI) {
   // getDownwardPressure() and getUpwardPressure() make temporary changes to
   // the tracker, so we need to pass those function a non-const copy.
   RegPressureTracker &TempTracker = const_cast<RegPressureTracker &>(RPTracker);
-  if (AtTop)
-    TempTracker.getDownwardPressure(SU->getInstr(), Pressure, MaxPressure);
-  else
-    TempTracker.getUpwardPressure(SU->getInstr(), Pressure, MaxPressure);
+  if (!GCNTrackers) {
+    AtTop
+        ? TempTracker.getDownwardPressure(SU->getInstr(), Pressure, MaxPressure)
+        : TempTracker.getUpwardPressure(SU->getInstr(), Pressure, MaxPressure);
+
+    return;
+  }
+
+  // GCNTrackers
+  Pressure.resize(4, 0);
+  MachineInstr *MI = SU->getInstr();
+  if (AtTop) {
+    GCNDownwardRPTracker TempDownwardTracker(DownwardTracker);
+    TempDownwardTracker.bumpDownwardPressure(MI, SRI);
+    Pressure[AMDGPU::RegisterPressureSets::SReg_32] =
+        TempDownwardTracker.getPressure().getSGPRNum();
+    Pressure[AMDGPU::RegisterPressureSets::VGPR_32] =
+        TempDownwardTracker.getPressure().getArchVGPRNum();
+    Pressure[AMDGPU::RegisterPressureSets::AGPR_32] =
+        TempDownwardTracker.getPressure().getAGPRNum();
+  } else {
+    GCNUpwardRPTracker TempUpwardTracker(UpwardTracker);
----------------
jrbyrnes wrote:

Yes you are correct, there is no issue with this example. I simplified out the problem.

Please consider

```
---
name:            pdiff
tracksRegLiveness: true
body:             |
  bb.0:
  %49:areg_1024 = IMPLICIT_DEF
  undef %63.sub0:vreg_128 = COPY %49.sub0:areg_1024
  %63.sub1:vreg_128 = COPY %49.sub1:areg_1024
  %63.sub2:vreg_128 = COPY %49.sub2:areg_1024
  %63.sub3:vreg_128 = COPY %49.sub3:areg_1024
  
  S_ENDPGM 0, amdgpu_allvgprs
...
```


When speculating the first bottom-up instruction, we will calculate the RP from each of the COPYs. If we do so using recede, we will look at the live lanes of the register based on the LIS ordering (`collectVirtualRegUses` -> `getLiveLaneMask`). For the instruction defining undef %63.sub0, %49 has a LaneMask of 0xFF (all four lanes), The RPTracker would then determine this single lane copy uses 4 VGPRs. Similar problems for the instructions defining sub1 and sub2. Moreover, after scheduling an instruction bottomup, we will still have incorrect difference based RP calculations. bumpUpwardPressure calculates the usemask by doing something similar to what you suggest -- it will calculate the lanemask directly from the subreg being used by the instruction. As a result, it doesn't have these issues.

recede really doesn't work as is for this usecase. As you mentioned, we can adjust it to work for speculation, but I would prefer to contain the changes under the optional flag if possible.

https://github.com/llvm/llvm-project/pull/93090


More information about the llvm-commits mailing list