[llvm-bugs] [Bug 42353] New: [SchedModel][MCA] Add the ability to specify a different DispatchWidth and a different number of micro opcodes for the ROB/Dispatch Logic.

Fri Jun 21 08:04:40 PDT 2019

https://bugs.llvm.org/show_bug.cgi?id=42353

            Bug ID: 42353
           Summary: [SchedModel][MCA] Add the ability to specify a
                    different DispatchWidth and a different number of
                    micro opcodes for the ROB/Dispatch Logic.
           Product: tools
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: enhancement
          Priority: P
         Component: llvm-mca
          Assignee: unassignedbugs at nondot.org
          Reporter: andrea.dibiagio at gmail.com
                CC: andrea.dibiagio at gmail.com, llvm-bugs at lists.llvm.org,
                    matthew.davis at sony.com

This problem came up while investigating the quality of llvm-mca reports on
modern Intel processors.

tl;dr: Two points:

1) The IssueWidth from the scheduling models cannot always be used by llvm-mca
to simulate the processor dispatch width. For llvm-mca, we need a way to
specify a different value to model the processor dispatch width.

2) We want to allow users to optionally define a different number of micro
opcodes for the purposes of dispatch and of ROB/scheduler entries consumed.
This information could be used by Intel models to describe fused domains in
Intel processors.

Long story:

Scheduling models were originally introduced to help scheduling algorithms
identify optimal sequences of instructions.

Models didn't need to provide much information about the target processor.
Models simply had to describe the out-of-order backend as a unified reservation
station which "sees" all the instructions from an input scheduling region.
There is basically no concept of "instruction dispatch" in the scheduling
model. Instructions from a code region are all immediately available in the
idealized reservation station that sees all the processor resources.
There is also no need to model the decoder's queue: instructions don't need to
be fetched from a decoder's queue; they simply exist in an ideal (potentially
unbounded) reservation station which internally classifies instructions as
either "pending" or "ready" (based on hazards/data dependencies).
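
The idealized reservation station described above can be sketched roughly as
follows. This is only an illustration, not how LLVM implements it: the Instr
type and the classify() helper are hypothetical names, and only
read-after-write data dependencies are considered (no resource hazards).

```python
from dataclasses import dataclass


@dataclass
class Instr:
    """Hypothetical instruction: one destination, zero or more sources."""
    name: str
    dest: str
    srcs: tuple = ()


def classify(region):
    """Split a scheduling region into 'ready' and 'pending' instructions.

    Every instruction in the region is visible at once. An instruction is
    pending if it reads a register written by an earlier instruction in the
    same region (whose result is assumed not to be available yet).
    """
    produced = set()  # registers written by earlier instructions
    ready, pending = [], []
    for instr in region:
        if produced.intersection(instr.srcs):
            pending.append(instr.name)
        else:
            ready.append(instr.name)
        produced.add(instr.dest)
    return ready, pending


region = [
    Instr("load", "r1", ("r0",)),
    Instr("add", "r2", ("r1", "r3")),  # depends on the load
    Instr("mul", "r4", ("r5", "r6")),  # independent
]
ready, pending = classify(region)
```

Here "load" and "mul" are classified as ready, while "add" is pending on the
result of the load. Note that nothing in this picture models a decoder queue or
a dispatch stage.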

So what, in practice, is the so-called IssueWidth?
The model needed a way to set an upper bound on the number of opcodes issued
per cycle. IssueWidth serves that specific purpose.
It can be seen in practice as a "magic number" (often empirically computed by
running several benchmarks) that ideally summarizes:
- throughput from the decoders
- availability of buffers in the out-of-order backend (notably, the ROB)
- dispatch throughput
- presence or absence of loop buffers, etc.

For most processors, that value (by luck) often matches what we call dispatch
throughput. However, things get complicated when the processor performs micro
fusion.
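
To make the micro-fusion problem concrete, here is a small back-of-the-envelope
sketch. The numbers are made up for illustration (a width of 4, and eight
load-op instructions that are each one opcode in the fused domain but two in
the unfused domain), and dispatch_cycles() is a hypothetical helper that
ignores every other bottleneck.

```python
import math


def dispatch_cycles(opcodes_per_instr, width):
    """Lower bound on cycles needed to dispatch a block of instructions,
    assuming at most `width` opcodes are dispatched per cycle."""
    return math.ceil(sum(opcodes_per_instr) / width)


WIDTH = 4  # assumed value, for illustration only

# Eight micro-fused load-op instructions (hypothetical workload).
unfused_domain = [2] * 8  # opcodes as seen by the execution ports
fused_domain = [1] * 8    # opcodes as seen by the dispatch logic / ROB

cycles_unfused = dispatch_cycles(unfused_domain, WIDTH)  # 4
cycles_fused = dispatch_cycles(fused_domain, WIDTH)      # 2
```

If the simulator charges the unfused count against the dispatch width, it
predicts twice as many dispatch cycles for this block as a processor that
dispatches in the fused domain would need.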

So here is the idea:

In the processor model, we introduce a tablegen class named DispatchLogic
which, to start, declares a single field named 'DispatchWidth'.
If a model defines DispatchLogic, then llvm-mca uses the DispatchWidth field
instead of IssueWidth to model the processor dispatch rate.

Note that this would be opt-in for the targets. In the absence of a
DispatchLogic definition, llvm-mca would fall back to using IssueWidth as the
actual dispatch rate. So, processor models would not need to be changed if we
decide to implement it that way.
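
The opt-in behaviour could look roughly like the sketch below. It is written in
Python purely for illustration; ProcModel and effective_dispatch_width() are
hypothetical names, not existing LLVM APIs, and the widths are made-up values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DispatchLogic:
    """Stand-in for the proposed tablegen class (single field to start)."""
    dispatch_width: int


@dataclass
class ProcModel:
    """Hypothetical view of a processor model as seen by the simulator."""
    issue_width: int
    dispatch_logic: Optional[DispatchLogic] = None  # opt-in


def effective_dispatch_width(model):
    """Use DispatchWidth when the model defines DispatchLogic,
    otherwise fall back to IssueWidth."""
    if model.dispatch_logic is not None:
        return model.dispatch_logic.dispatch_width
    return model.issue_width


legacy = ProcModel(issue_width=6)
opted_in = ProcModel(issue_width=6, dispatch_logic=DispatchLogic(4))
```

With these definitions, effective_dispatch_width(legacy) yields 6 and
effective_dispatch_width(opted_in) yields 4, so existing models keep their
current behaviour.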

I suggest adding two extra (completely optional) fields to scheduling classes
to describe:
 - NumDispatchEntriesConsumed (or NumMicroOpcodesForDispatch)
 - NumROBEntriesConsumed (or NumSchedulerEntries)

Those two extra fields would default to NumMicroOpcodes.
So, users don't need to worry about changing their models if they are already
happy with NumMicroOpcodes.
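
As a rough sketch of the defaulting rule (again in Python, for illustration
only): the SchedClass type below is hypothetical, and the field names simply
mirror the ones proposed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SchedClass:
    """Hypothetical scheduling class with the two proposed optional fields."""
    num_micro_opcodes: int
    num_dispatch_entries_consumed: Optional[int] = None
    num_rob_entries_consumed: Optional[int] = None

    def dispatch_entries(self):
        # Defaults to NumMicroOpcodes when the field is not set.
        if self.num_dispatch_entries_consumed is None:
            return self.num_micro_opcodes
        return self.num_dispatch_entries_consumed

    def rob_entries(self):
        # Defaults to NumMicroOpcodes when the field is not set.
        if self.num_rob_entries_consumed is None:
            return self.num_micro_opcodes
        return self.num_rob_entries_consumed


# An existing definition that only specifies NumMicroOpcodes.
plain = SchedClass(num_micro_opcodes=2)

# A micro-fused load-op (assumed values): two opcodes for execution,
# but a single entry for dispatch and for the ROB.
fused = SchedClass(num_micro_opcodes=2,
                   num_dispatch_entries_consumed=1,
                   num_rob_entries_consumed=1)
```

The unmodified definition reports 2 for both dispatch_entries() and
rob_entries(), while the fused one reports 1 for each and still keeps
num_micro_opcodes at 2.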

We can structure things in the subtarget emitter so that information about
these opcodes is only emitted as "extra processor information".

I think that this may be a good way to solve some issues with simulating Intel
processors. If people agree with this approach, I may start working on a patch
to address this.

What do you think?
