[PATCH] D98976: [CodeGen] Use ProcResGroup information in SchedBoundary
Andrea Di Biagio via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Thu Apr 15 04:15:41 PDT 2021
andreadb added a comment.
In D98976#2690205 <https://reviews.llvm.org/D98976#2690205>, @dpenry wrote:
> I have added a test case which might clarify how the scheduling improves with the scheduler changes; the t2ADDrr is able to dual-issue with VADDD, but VLDRS is not.
>
> But more fundamentally, I am not sure that I'm understanding the semantics of ProcResGroup in the same way, so I'll just ask a few questions...
>
> Assume:
>
>   let BufferSize = 0 in {
>     def X : ProcResource<1>;
>     def Y : ProcResource<1>;
>     def A : ProcResGroup<[X, Y]>;
>   }
>
>   def MyWideWrite : SchedWriteRes<[X, Y]> {
>     let ResourceCycles = [1, 1];
>   }
>
>   def MyNarrowWrite : SchedWriteRes<[A]> {
>     let ResourceCycles = [1];
>   }
>
>
>
> 1. How many instances of A does the scheduling model intend to say that there are: one or two?
> 2. How many instances of A is MyWideWrite intended to consume?
> 3. How many instances of A is MyNarrowWrite intended to consume?
> 4. The goal is to be able to allow two instructions which use MyNarrowWrite to issue together, but to prevent a pair of instructions, one using MyNarrowWrite and the other using MyWideWrite, from dispatching/issuing together. Is this the right way to represent that restriction?
This is how I see it:
A group behaves like a hardware scheduler. Consuming a group for 1cy is equivalent to consuming one of its underlying sub-units for 1cy.
That is because a group acts as a proxy/dispatcher of uOPs to its sub-units. When a group is consumed by a write, the consumption is effectively redirected towards one of its sub-units.
uOPs issued to a sub-unit always go through the parent group. Basically, a group is always either implicitly or explicitly used when one of its sub-units is consumed.
A group is available if at least one of its sub-units is also available. Conversely, a group is unavailable if all its sub-units are also unavailable. A group becomes available again when at least one of its sub-units becomes available.
How a group selects a "victim" sub-unit is unfortunately unspecified, and is therefore implementation dependent. llvm-mca allows users to customise the selection logic of individual groups for specific targets. By default, it assumes a (pseudo) LRU policy, so that, over time, resource consumption through a group is spread uniformly among all the sub-units. That being said, since this is all unspecified, it is fine for algorithms to just pick the first available sub-unit from the set.
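The rules above can be sketched as a toy model. This is only an illustration under my own assumptions, not code from llvm-mca or the MachineScheduler: the class `Group`, its methods, and the `policy` argument are all hypothetical names, and the LRU here is a plain list rather than whatever a real implementation would use.

```python
# Toy model of a resource group acting as a proxy for its sub-units.
# All names here are hypothetical; this is not LLVM or llvm-mca code.
class Group:
    def __init__(self, units):
        self.units = list(units)             # declaration order
        self.lru = list(units)               # least recently used first
        # First cycle at which each sub-unit is free again.
        self.busy_until = {u: 0 for u in units}

    def is_available(self, cycle):
        # A group is available if at least one sub-unit is available.
        return any(self.busy_until[u] <= cycle for u in self.units)

    def consume_unit(self, unit, cycle, cycles=1):
        # Direct use of a sub-unit; the parent group observes it too.
        if self.busy_until[unit] > cycle:
            raise ValueError(f"{unit} is busy at cycle {cycle}")
        self.busy_until[unit] = cycle + cycles
        self.lru.remove(unit)
        self.lru.append(unit)

    def consume(self, cycle, cycles=1, policy="lru"):
        # Consuming the group is redirected to one free sub-unit.
        # "lru" spreads uses over time; "first" takes declaration order.
        order = self.lru if policy == "lru" else self.units
        for unit in order:
            if self.busy_until[unit] <= cycle:
                self.consume_unit(unit, cycle, cycles)
                return unit
        return None  # every sub-unit is busy, so the group is unavailable


a = Group(["X", "Y"])
first = a.consume(0)     # one sub-unit is taken, the group stays available
second = a.consume(0)    # the other sub-unit is taken
third = a.consume(0)     # None: the group is now unavailable for this cycle
```

Either policy satisfies the semantics described here; they only differ in which sub-unit ends up busy.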
In your example, there is only one instance of group A, and it intercepts all the uses/consumptions of X and Y. So, both MyWideWrite and MyNarrowWrite use A.
Since MyWideWrite consumes both X and Y for 1cy, group A also becomes unavailable for 1cy. MyWideWrite is basically consuming the entire group, effectively stalling the dispatch from A until at least one of X and Y becomes available again.
MyNarrowWrite only consumes one of X and Y for 1cy. That is because consuming A for 1cy is equivalent to consuming `one of {X, Y}` for 1cy.
Group A is still available after the issue of MyNarrowWrite if not all of its sub-units are busy. Note that this can only happen when X and Y are both available before the issue of MyNarrowWrite. Group A is fully consumed for at least 1cy if either X or Y was busy before MyNarrowWrite was issued.
For that reason, the sequence `MyNarrowWrite, MyWideWrite` can never be issued during the same cycle; MyWideWrite requires that both resources are available.
The sequence `MyNarrowWrite, MyNarrowWrite` can be issued during the same cycle. However, this can only happen if both X and Y are available at the beginning of the sequence.
If one of X and Y is unavailable at the beginning of that sequence, then the second MyNarrowWrite will need to wait for one cycle.
The sequence `MyWideWrite, MyNarrowWrite` can never be issued during the same cycle either, because MyWideWrite consumes the entire group A, thus making resource A unavailable for 1cy.
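The same-cycle cases above can be checked with a small sketch. It is a simplification under my own assumptions: the function name `can_issue_same_cycle` and the `busy` parameter are hypothetical, only resource availability within a single cycle is modelled, and the narrow write simply takes the first free sub-unit (which is allowed, since the selection is unspecified).

```python
# Toy check of the X/Y/A example; not MachineScheduler or llvm-mca code.
def can_issue_same_cycle(sequence, busy=()):
    """Return True if every write in `sequence` can issue in one cycle.

    `busy` lists the sub-units already unavailable when the cycle starts.
    """
    free = {"X", "Y"} - set(busy)
    for write in sequence:
        if write == "MyWideWrite":
            # Consumes X and Y directly, so both must be free.
            if free != {"X", "Y"}:
                return False
            free.clear()  # group A is now fully consumed
        elif write == "MyNarrowWrite":
            # Consuming A for 1cy means consuming one free sub-unit.
            if not free:
                return False
            free.remove(min(free))  # first available sub-unit
        else:
            raise ValueError(f"unknown write: {write}")
    return True


narrow_wide = can_issue_same_cycle(["MyNarrowWrite", "MyWideWrite"])
wide_narrow = can_issue_same_cycle(["MyWideWrite", "MyNarrowWrite"])
narrow_narrow = can_issue_same_cycle(["MyNarrowWrite", "MyNarrowWrite"])
narrow_narrow_x_busy = can_issue_same_cycle(
    ["MyNarrowWrite", "MyNarrowWrite"], busy=["X"]
)
```

Only the two-narrow sequence with both sub-units initially free returns True; the other three cases return False.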
-
Back to your patch. I am OK with a simplified approach like yours as long as a) it fixes your particular cases, b) your case is likely to be the most common one, and c) we document all the assumptions made by your algorithm somewhere in the MachineScheduler.
I don't want to block this progress because, as Dave pointed out, it does improve the runtime in some cases.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D98976/new/
https://reviews.llvm.org/D98976