[PATCH] D158368: [AMDGPU][MISCHED] GCNBalancedSchedStrategy.
Jeffrey Byrnes via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Tue Aug 22 16:28:12 PDT 2023
jrbyrnes added inline comments.
================
Comment at: llvm/lib/Target/AMDGPU/GCNSchedStrategy.cpp:1474
+ bool Result = false;
+ unsigned ScheduleLength = Top.getCurrCycle() + Bot.getCurrCycle();
+#ifndef NDEBUG
----------------
alex-t wrote:
> jrbyrnes wrote:
> > alex-t wrote:
> > > jrbyrnes wrote:
> > > > I think this is a lower bound on what we have previously called ScheduleLength?
> > > Exactly. I realized that the ratio of the total stall cycles to the total amount of instructions better reflects the metric than the ratio of total stalls to the modeled length (i.e., the total amount of working cycles + the total amount of stalls).
> > That makes sense -- though, I would think we would still want to capture the latency as well. For example, StallTotal of say 15 when the total instruction latency is 30 means something different than if the total instruction latency was 150 (even if number of instructions is the same) -- the second schedule is able to hide latency better, thus has better ILP.
> Oops... I was wrong here.
> The SchedBoundary::bumpNode sets the boundary current cycle considering the scheduled instruction latency. The current cycle is set to the recently scheduled instruction "ready cycle" - the latency is already counted.
> Thus, Top.CurrCycle + Bot.CurrCycle gives us the total instruction latency, indeed.
Yes, bumpNode resolves the required latency for the instruction currently being scheduled, so Top.CurrCycle includes the stalls required before issuing the instructions in the top-down portion of the schedule -- similar story for Bot.CurrCycle.
However, we don't know how many stalls are required between the instructions in the bottom-up portion of the schedule and those in the top-down portion. If none are required, then the sum of the two CurrCycle values is the ScheduleLength. However, if, for example, there is a dependency between the last node scheduled in TopDown and the last node scheduled in BottomUp, the stalls required to resolve that latency won't be accounted for in Top.CurrCycle + Bot.CurrCycle, so the sum is only a lower bound.
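To make the gap concrete, here is a rough standalone sketch. It is a toy single-issue model, not the actual SchedBoundary logic; the names Node, Dep, modeledLength and makeBoundaryExample are hypothetical and exist only for this illustration. Each zone is measured in isolation, which stands in for what each boundary's CurrCycle can see, and the result is compared with the length of the combined schedule.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy types; not LLVM's SUnit/SDep.
struct Dep {
  std::size_t Pred;  // index of the producing node in the final order
  unsigned Latency;  // cycles from the producer's issue until the result is ready
};
struct Node {
  std::vector<Dep> Deps;
};

// Cycles needed to issue Nodes[Begin, End) in order, one per cycle, stalling
// on dependencies whose producer lies inside the same range. Dependencies on
// nodes outside the range are ignored, which mimics a zone that only knows
// about the nodes it has scheduled itself.
unsigned modeledLength(const std::vector<Node> &Nodes, std::size_t Begin,
                       std::size_t End) {
  assert(Begin <= End && End <= Nodes.size());
  std::vector<unsigned> Issue(Nodes.size(), 0);
  unsigned Next = 0; // first free issue cycle, relative to the zone start
  for (std::size_t I = Begin; I < End; ++I) {
    unsigned Ready = Next;
    for (const Dep &D : Nodes[I].Deps)
      if (D.Pred >= Begin && D.Pred < I)
        Ready = std::max(Ready, Issue[D.Pred] + D.Latency);
    Issue[I] = Ready;
    Next = Ready + 1;
  }
  return Next;
}

// Four nodes: 0 and 1 form the top-down zone, 2 and 3 the bottom-up zone.
// The only dependency crosses the zone boundary: node 2 needs node 1's
// result, with an assumed latency of 4 cycles.
std::vector<Node> makeBoundaryExample() {
  std::vector<Node> Nodes(4);
  Nodes[2].Deps.push_back({1, 4});
  return Nodes;
}
```

In this model modeledLength(Nodes, 0, 2) and modeledLength(Nodes, 2, 4) each give 2, so the per-zone sum is 4, while modeledLength(Nodes, 0, 4) gives 7. The 3 missing cycles are exactly the stalls needed to resolve the cross-boundary latency.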
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D158368/new/
https://reviews.llvm.org/D158368