[llvm] [AMDGPU] Optionally Use GCNRPTrackers during scheduling (PR #93090)

Mon Oct 7 05:13:47 PDT 2024

================
@@ -148,17 +154,44 @@ static bool canUsePressureDiffs(const SUnit &SU) {
   return true;
 }
 
-static void getRegisterPressures(bool AtTop,
-                                 const RegPressureTracker &RPTracker, SUnit *SU,
-                                 std::vector<unsigned> &Pressure,
-                                 std::vector<unsigned> &MaxPressure) {
+static void getRegisterPressures(
+    bool AtTop, const RegPressureTracker &RPTracker, SUnit *SU,
+    std::vector<unsigned> &Pressure, std::vector<unsigned> &MaxPressure,
+    GCNDownwardRPTracker &DownwardTracker, GCNUpwardRPTracker &UpwardTracker,
+    ScheduleDAGMI *DAG, const SIRegisterInfo *SRI) {
   // getDownwardPressure() and getUpwardPressure() make temporary changes to
   // the tracker, so we need to pass those function a non-const copy.
   RegPressureTracker &TempTracker = const_cast<RegPressureTracker &>(RPTracker);
-  if (AtTop)
-    TempTracker.getDownwardPressure(SU->getInstr(), Pressure, MaxPressure);
-  else
-    TempTracker.getUpwardPressure(SU->getInstr(), Pressure, MaxPressure);
+  if (!GCNTrackers) {
+    AtTop
+        ? TempTracker.getDownwardPressure(SU->getInstr(), Pressure, MaxPressure)
+        : TempTracker.getUpwardPressure(SU->getInstr(), Pressure, MaxPressure);
+
+    return;
+  }
+
+  // GCNTrackers
+  Pressure.resize(4, 0);
+  MachineInstr *MI = SU->getInstr();
+  if (AtTop) {
+    GCNDownwardRPTracker TempDownwardTracker(DownwardTracker);
+    TempDownwardTracker.bumpDownwardPressure(MI, SRI);
+    Pressure[AMDGPU::RegisterPressureSets::SReg_32] =
+        TempDownwardTracker.getPressure().getSGPRNum();
+    Pressure[AMDGPU::RegisterPressureSets::VGPR_32] =
+        TempDownwardTracker.getPressure().getArchVGPRNum();
+    Pressure[AMDGPU::RegisterPressureSets::AGPR_32] =
+        TempDownwardTracker.getPressure().getAGPRNum();
+  } else {
+    GCNUpwardRPTracker TempUpwardTracker(UpwardTracker);
----------------
vpykhtin wrote:

In bottom-up order, after the instruction defining %2 we have %0:1100b live, for %0.sub1. After we speculate over the instruction defining %1, we do indeed get %0:1111b live, because both sub0 and sub1 are now live. But when we calculate pressure we account for the mask increase from %0:1100b to %0:1111b, that is, for a 0011b difference, and this seems correct.

It seems, though, that it would be more correct if the mask found by getLiveLaneMask were ANDed with the mask of the subreg actually used in the instruction. This would allow us to find live masks for complicated cases such as sub1_sub0 subregs (where sub0, sub1, or both can be live), while never producing a mask wider than the subreg actually used.
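
To make the mask arithmetic concrete, here is a minimal standalone sketch. It assumes lane masks can be modeled as plain integers rather than llvm::LaneBitmask, and it uses the 4-bit layout from the example above (sub0 = 0011b, sub1 = 1100b). The names lanesAddedByUse and clampToUsedSubReg are hypothetical and do not exist in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for llvm::LaneBitmask (assumption: a plain integer is enough here).
using LaneMask = std::uint64_t;

// Lanes that become live when an instruction reads UseMask while
// LiveBefore is already live. Only these lanes add pressure.
constexpr LaneMask lanesAddedByUse(LaneMask LiveBefore, LaneMask UseMask) {
  return UseMask & ~LiveBefore;
}

// The suggested clamp: never report more live lanes than the subreg
// actually used by the instruction covers.
constexpr LaneMask clampToUsedSubReg(LaneMask FoundLive, LaneMask UsedSubReg) {
  return FoundLive & UsedSubReg;
}

constexpr LaneMask Sub0 = 0b0011; // lanes of %0.sub0 in the example
constexpr LaneMask Sub1 = 0b1100; // lanes of %0.sub1 in the example

// %0.sub1 is live; speculating over a use of %0.sub0 adds exactly 0011b.
static_assert(lanesAddedByUse(Sub1, Sub0) == 0b0011, "0011b difference");
// The union is the full 1111b mask.
static_assert((Sub1 | Sub0) == 0b1111, "both halves live");
// A found mask of 1111b clamped to a sub0 use stays within sub0.
static_assert(clampToUsedSubReg(0b1111, Sub0) == 0b0011, "clamped to sub0");
// A wide use (sub1_sub0) keeps whichever part was found live.
static_assert(clampToUsedSubReg(Sub1, Sub0 | Sub1) == 0b1100, "only sub1 live");
```

The checks are static_asserts, so compiling the file is enough to verify them. The last one covers the sub1_sub0 case: the result can be sub0, sub1, or both, but never wider than the used mask.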


https://github.com/llvm/llvm-project/pull/93090