[PATCH] D158368: [AMDGPU][MISCHED] GCNBalancedSchedStrategy.

Fri Aug 25 12:59:26 PDT 2023

jrbyrnes added inline comments.

================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:1491
+    unsigned TheHighestInBot = 0;
+    for (auto &Succ : TheFurthestPred->Succs) {
+      if (Succ.isAssignedRegDep()) {
----------------
The FurthestPred won't always have the edge that creates the gap.

================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:1500
+    unsigned BotDistance = Bot.getCurrCycle() - TheHighestInBot;
+    TopToBotDistance = TopDistance - BotDistance;
+  }
----------------
TopDistance should be based on successor used in TheHighestInBot calculation.

  auto TopToBotDistance = 0;
  for (auto &Succ : TheFurthestPred->Succs) {
    if (!Succ.isAssignedRegDep) continue;
    unsigned SuccDistance = M.computeInstrLatency(Succ.getSUnit()->getInstr()) + TheFurthestPred->TopReadyCycle;
    if (SuccDistance <= Top.getReadyCycle()) cotninue;
    unsigned BotReadyCycle = Succ.getSUnit()->BotReadyCycle;
    unsigned BotDistance = Bot.getCurrCycle() - BotReadyCycle
    unsigned Gapsize = TopDistance - BotDistance;;
    if (GapSize > MaxGapSize) {
      TopToBotDistance = GapSize;
    }
  }

================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.h:499
+    SchedStages.push_back(GCNSchedStageID::UnclusteredHighRPReschedule);
+    SchedStages.push_back(GCNSchedStageID::ClusteredLowOccupancyReschedule);
+    SchedStages.push_back(GCNSchedStageID::PreRARematerialize);
----------------
alex-t wrote:
> jrbyrnes wrote:
> > Probably shouldn't allow occupancy drops in ClusteredLowOccupancyReschedule, otherwise we will need to rerun the phase. 
> Okay, it may achieve better ILP, sacrificing the occupancy, and the resulting metric will be the same or even better. Did you mean we should have a separate check that reverts the  ClusteredLowOccupancyReschedule if the occupancy has been dropped?
> Did you mean we should have a separate check that reverts the ClusteredLowOccupancyReschedule if the occupancy has been dropped?

Yes something like that -- if ClusteredLowOccupancy drops the occupancy then revert. Phase needs a stable occupancy in order for it to achieve its purpose.

================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.h:498
+    SchedStages.push_back(GCNSchedStageID::OccInitialSchedule);
+    SchedStages.push_back(GCNSchedStageID::UnclusteredHighRPReschedule);
+    SchedStages.push_back(GCNSchedStageID::ClusteredLowOccupancyReschedule);
----------------
alex-t wrote:
> jrbyrnes wrote:
> > I'm not sure I understand the purpose of UnclusteredHighRPReschedule stage in the context of a balanced scheduler.
> As I understood, the UnclusteredHighRPReschedule attempts to achieve better occupancy by removing most mutations, allowing for loose instruction placement.
> Since our metric aims for the balance between better occupancy and better ILP, we might accept the result of the UnclusteredHighRPReschedule phase if it was managed to achieve better occupancy w/o the loss in ILP.
The phase is only run for a subset of regions -- DAG.RegionsWithHighRP are in that subset. If we have high RP such that we are in danger of dropping occupancy, we reschedule the region without mutations to attempt to reduce RP and save our occupancy.

The meta heuristic used to determine which regions need rescheduling is based on register pressure and occupancy -- this doesn't seem in accordance with the spirit of a "balanced" scheduler.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158368/new/

https://reviews.llvm.org/D158368