[PATCH] D158368: [AMDGPU][MISCHED] GCNBalancedSchedStrategy.

Jeffrey Byrnes via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 22 16:28:12 PDT 2023


jrbyrnes added inline comments.


================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:1474
+  bool Result = false;
+  unsigned ScheduleLength = Top.getCurrCycle() + Bot.getCurrCycle();
+#ifndef NDEBUG
----------------
alex-t wrote:
> jrbyrnes wrote:
> > alex-t wrote:
> > > jrbyrnes wrote:
> > > > I think this is a lower bound on what we have previously called ScheduleLength? 
> > > Exactly. I realized that the ratio of the total stall cycles to the total amount of instructions better reflects the metric than the ratio of total stalls to the modeled length (i.e., the total amount of working cycles + the total amount of stalls).
> > That makes sense -- though, I would think we would still want to capture the latency as well. For example, StallTotal of say 15 when the total instruction latency is 30 means something different than if the total instruction latency was 150 (even if number of instructions is the same) -- the second schedule is able to hide latency better, thus has better ILP.
> Oops... I was wrong here.
> The SchedBoundary::bumpNode sets the boundary current cycle considering the scheduled instruction latency. The current cycle is set to the recently scheduled instruction "ready cycle" - the latency is already counted.
> Thus, Top.CurrCycle + Bot.CurrCycle gives us the total instruction latency, indeed.
Yes, bumpNode resolves the required latency for the instruction currently being scheduled, so Top.CurrCycle includes the stalls required before issuing the instructions in the top-down portion of the schedule. The same holds for Bot.CurrCycle and the bottom-up portion.

However, we don't know how many stalls are required between the bottom-up and top-down portions of the schedule. If none are required, then Top.CurrCycle + Bot.CurrCycle is the ScheduleLength. But if, for example, there is a dependency between the last node in TopDown and the last node in BottomUp, the stalls required to resolve that latency won't be accounted for in Top.CurrCycle + Bot.CurrCycle, so the sum is only a lower bound.
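
To make the lower-bound point concrete, here is a rough toy model. It is not the SchedBoundary API: ToyInstr, modeledCycles and exampleSchedule are hypothetical names, and the model assumes a single-issue machine where each instruction has at most one producer. Each zone is modeled in program order for simplicity and only sees dependencies among its own nodes, which is the assumption that makes the cross-zone stall invisible.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy instruction; not an LLVM SUnit.
struct ToyInstr {
  unsigned Latency; // cycles until the result can be consumed
  int Dep;          // index of the producer in program order, or -1 for none
};

// Cycles needed to issue Seq[Begin, End) in order, one instruction per cycle,
// stalling only for producers that lie inside the range. Producers outside
// the range are ignored, mimicking a zone that only tracks its own nodes.
unsigned modeledCycles(const std::vector<ToyInstr> &Seq, std::size_t Begin,
                       std::size_t End) {
  assert(Begin <= End && End <= Seq.size());
  std::vector<unsigned> Issue(Seq.size(), 0);
  unsigned Cycle = 0;
  for (std::size_t I = Begin; I < End; ++I) {
    unsigned Ready = Cycle;
    int D = Seq[I].Dep;
    if (D >= 0 && static_cast<std::size_t>(D) >= Begin &&
        static_cast<std::size_t>(D) < I)
      Ready = std::max(Ready, Issue[D] + Seq[D].Latency);
    Issue[I] = Ready;  // cycle in which instruction I issues
    Cycle = Ready + 1; // next free issue slot
  }
  return Cycle;
}

// Four instructions; the top zone is [0, 2) and the bottom zone is [2, 4).
// Instruction 2 (the bottom-zone node adjacent to the top zone) depends on
// instruction 1 (the last top-zone node), which has a latency of 4.
std::vector<ToyInstr> exampleSchedule() {
  return {{1, -1}, {4, -1}, {1, 1}, {1, -1}};
}
```

With this input, modeledCycles over [0, 2) and over [2, 4) each give 2, so the per-zone sum is 4, while modeledCycles over the whole range [0, 4) gives 7. The 3 missing cycles are the stalls needed to resolve the cross-zone dependency.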


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158368/new/

https://reviews.llvm.org/D158368


