[PATCH] D30941: Better testing of schedule model instruction latencies/throughputs

Wed Mar 15 09:11:46 PDT 2017

hfinkel added a comment.

In https://reviews.llvm.org/D30941#701621, @spatel wrote:

> In https://reviews.llvm.org/D30941#701498, @avt77 wrote:
>
> > hfinkel, could you help me? First of all could you give me a link(s) to any doc(s) related to our MCSchedModel except sources?
> >  Next, I was told that ResourceCycles here:
> >  ===============================
> >  class ProcWriteResources<list<ProcResourceKind> resources> {
> >
> >   list<ProcResourceKind> ProcResources = resources;
> >   list<int> ResourceCycles = [];
> >   int Latency = 1;
> >   int NumMicroOps = 1;
> >
> > ================================
> >  could be used as Throughput of the given instruction. Is it right? Does it mean I could include it in generated comment as well? If YES I suppose it should be the max of the Cycles, right?
>
>
> I don't know if there are any docs besides the code and the code comments,

The documentation is primarily in the header and TableGen files (for better or worse).

> but I think you are correct - the max of ResourceCycles is the inverse throughput for the instruction:

That's correct if the instruction can only dispatch through that one resource.

First, for itineraries, I think you can do something like this:

  double Unknown = std::numeric_limits<double>::infinity();
  double Throughput = Unknown;
  if (IID.isEmpty())
    return Throughput;

  for (const InstrStage *IS = IID.beginStage(ItinClassIndx),
             *E = IID.endStage(ItinClassIndx); IS != E; ++IS) {
    unsigned Cycles = IS->getCycles();
    if (!Cycles)
      continue;

    // getUnits() is a bitmask of functional units; count the set bits.
    Throughput = std::min(Throughput, countPopulation(IS->getUnits()) * 1.0/Cycles);
  }

  return Throughput;
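
As a self-contained illustration of the itinerary case, here is a minimal sketch that does not depend on the LLVM headers. Stage is a hypothetical stand-in for InstrStage (a functional-unit bitmask plus a cycle count), and itineraryThroughput is an illustrative name, not an existing API:

```cpp
#include <algorithm>
#include <bitset>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for InstrStage: which units may run the stage
// (one bit per functional unit) and how many cycles the stage holds a unit.
struct Stage {
  unsigned Units;
  unsigned Cycles;
};

// Returns instructions per cycle, or infinity when no stage constrains it.
double itineraryThroughput(const std::vector<Stage> &Stages) {
  double Throughput = std::numeric_limits<double>::infinity();
  for (const Stage &S : Stages) {
    if (!S.Cycles)
      continue; // Zero-cycle stages do not limit throughput.
    assert(S.Units != 0 && "a stage must name at least one unit");
    double NumUnits = static_cast<double>(std::bitset<32>(S.Units).count());
    Throughput = std::min(Throughput, NumUnits / S.Cycles);
  }
  return Throughput;
}
```

For example, a stage that can issue to either of two units for one cycle, followed by a stage that holds a single unit for two cycles, gives min(2/1, 1/2) = 0.5 instructions per cycle.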

For resource descriptions, I think that you want the inverse of ResourceCycles multiplied by the number of applicable resources. Something like this:

  for (const MCWriteProcResEntry *WPR = STI.getWriteProcResBegin(SCClass),
                                 *WEnd = STI.getWriteProcResEnd(SCClass); WPR != WEnd; ++WPR) {
    unsigned Cycles = WPR->Cycles;
    if (!Cycles)
      return Unknown;

    unsigned NumUnits = SCModel->getProcResource(WPR->ProcResourceIdx)->NumUnits;
    Throughput = std::min(Throughput, NumUnits * 1.0/Cycles);
  }
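
To make the arithmetic concrete, here is a standalone sketch of the same calculation. ResourceUse is a hypothetical struct that pairs one write-resource entry's cycle count with the NumUnits of the resource it refers to, and resourceThroughput is an illustrative name rather than anything in the tree:

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Hypothetical stand-in for one MCWriteProcResEntry joined with the NumUnits
// of the processor resource it refers to.
struct ResourceUse {
  unsigned NumUnits;
  unsigned Cycles;
};

// Returns instructions per cycle, or infinity when the value is unknown.
double resourceThroughput(const std::vector<ResourceUse> &Uses) {
  const double Unknown = std::numeric_limits<double>::infinity();
  double Throughput = Unknown;
  for (const ResourceUse &U : Uses) {
    if (!U.Cycles)
      return Unknown; // Resource must be available but is not consumed.
    assert(U.NumUnits > 0 && "a resource must have at least one unit");
    Throughput =
        std::min(Throughput, static_cast<double>(U.NumUnits) / U.Cycles);
  }
  return Throughput;
}
```

With one resource that has two units and is held for one cycle, plus a single-unit resource held for four cycles, the result is min(2/1, 1/4) = 0.25 instructions per cycle, which is an inverse throughput of 4 cycles.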

> This is from include/llvm/Target/TargetSchedule.td :
> 
>   // Optionally, ResourceCycles indicates the number of cycles the
>   // resource is consumed. Each ResourceCycles item is paired with the
>   // ProcResource item at the same position in its list. Since
>   // ResourceCycles are rarely specialized, the list may be
>   // incomplete. By default, resources are consumed for a single cycle,
>   // regardless of latency, which models a fully pipelined processing
>   // unit. A value of 0 for ResourceCycles means that the resource must
>   // be available but is not consumed, which is only relevant for
>   // unbuffered resources.
>    
> 
> And this is in MachineScheduler.cpp:
> 
>   // For reserved resources, record the highest cycle using the resource.
>   // For top-down scheduling, this is the cycle in which we schedule this
>   // instruction plus the number of cycles the operations reserves the
>   // resource.
>    
> 
> We could abbreviate the comment string that you are adding like: [7:2].
>  I'm biased because that's the way I've always formatted [ latency : inverse throughput ], but I think that people that care about CPU timing will recognize that format, so you don't have to print out the words "latency" or "throughput".
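
If the comment uses that abbreviated form, producing it could look roughly like the sketch below. formatTiming is a hypothetical helper, and printing "?" for an unknown (infinite) throughput is an assumption of this sketch, not something settled in this review:

```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <string>

// Formats "[latency:inverse throughput]". Throughput is in instructions per
// cycle, so the printed value is its reciprocal (cycles per instruction).
std::string formatTiming(unsigned Latency, double Throughput) {
  assert(Throughput > 0 && "throughput must be positive");
  std::ostringstream OS;
  OS << '[' << Latency << ':';
  if (std::isinf(Throughput))
    OS << '?'; // Assumption: unknown throughput is shown as '?'.
  else
    OS << 1.0 / Throughput;
  OS << ']';
  return OS.str();
}
```

A latency of 7 with a throughput of 0.5 instructions per cycle is formatted as [7:2].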

https://reviews.llvm.org/D30941