[PATCH] D47676: [X86][Znver1] Specify Register Files, RCU; FP scheduler capacity.

Andrea Di Biagio via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jun 15 03:34:13 PDT 2018


andreadb added inline comments.


================
Comment at: lib/Target/X86/X86ScheduleZnver1.td:96
+// Reference: "Software Optimization Guide for AMD Family 17h Processors"
+def ZnIntegerPRF : RegisterFile<168, [GR8, GR16, GR32, GR64, CCR]>;
+
----------------
GGanesh wrote:
> I am not sure why we include the CCR register class.
As a rule of thumb, a register class should only be added if you want to allow those registers to be renamed. In this case, CCR should be added if you want to count EFLAGS renames against the number of available physical registers in the PRF.
If you don't think EFLAGS renames should be counted against the total, then it is correct to remove CCR. Otherwise, removing it would be incorrect.
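
For illustration only, and assuming EFLAGS renames are not meant to consume entries from the integer PRF, the definition without CCR would look roughly like the sketch below. Whether that assumption matches the hardware is exactly the open question here, so treat it as one of the two options rather than a recommendation.

```
// Sketch only: same 168-entry integer PRF as in the patch, but with CCR
// dropped from the list, so EFLAGS renames are NOT counted against the
// 168 physical registers. This is an assumption about the hardware.
def ZnIntegerPRF : RegisterFile<168, [GR8, GR16, GR32, GR64]>;
```

Keeping `CCR` in the list, as the patch currently does, has the opposite effect: every EFLAGS rename is counted against the same 168 entries.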


================
Comment at: lib/Target/X86/X86ScheduleZnver1.td:111
+// Reference: "Software Optimization Guide for AMD Family 17h Processors"
+def ZnRCU : RetireControlUnit<192, 8>;
+
----------------
lebedev.ri wrote:
> GGanesh wrote:
> > The retire unit is shared between integer and FP ops. In SMT mode it has 96 entries per thread. So, I think we should use 96 entries as a conservative value.
> Aha, I was wondering how SMT was considered here.
> But then what about `MicroOpBufferSize` in `SchedMachineModel`?
> Is that supposed to be kept at `192`?
llvm-mca doesn't make assumptions about whether the CPU is in SMT mode or not. The same is true for the scheduling model, which assumes an optimistic micro-op buffer. For now, it is better to specify the resources available to a single thread running on the CPU (i.e. with no other concurrent threads).
Basically, I think you should not go for a conservative value here.
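
To make the two options concrete, here is a small sketch. The first definition is the one from the patch; the commented-out alternative uses the 96-entry per-thread figure quoted above and is shown only for comparison, not as a suggested change.

```
// Single-thread view, as in the patch: a 192-entry retire queue,
// retiring up to 8 micro-ops per cycle.
def ZnRCU : RetireControlUnit<192, 8>;

// Conservative SMT view (96 entries per thread), for comparison only.
// Using this would make the RCU inconsistent with the optimistic
// single-thread assumption described above.
// def ZnRCU : RetireControlUnit<96, 8>;
```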


================
Comment at: lib/Target/X86/X86ScheduleZnver1.td:113
+
+// FIXME: there are 72 read buffers and 44 write buffers.
+
----------------
lebedev.ri wrote:
> GGanesh wrote:
> > I assume these are the load/store queue entries. The FPU has
> > 1. 44 entry Load Queue
> > 2. 72 Out of Order Loads
> > 3. 44 entry Store Queue
> > So, if we are concerned only with the queues, we can mark just 44 entries each for LD and ST.
> > 
> I think I would prefer to keep this just as a FIXME note for now, since
> while llvm-mca does have options to specify the sizes of these queues,
> it does not seem to read them from the sched model otherwise,
> I guess because that can't currently be expressed here.
In theory, the processor model could specify the load/store queues as buffered processor resources that are consumed by instructions.
In practice, llvm-mca has the concept of `LSUnit`, which tracks those queues itself. There is an open bugzilla about how to better describe load/store queues in llvm-mca and the scheduling model.


Repository:
  rL LLVM

https://reviews.llvm.org/D47676




