[PATCH] D43235: [SchedModel] Complete models shouldn't match against itineraries when they don't use them (PR35639) (WIP)

Tue Feb 20 16:22:29 PST 2018

> On Feb 16, 2018, at 10:57 AM, Nemanja Ivanovic via Phabricator <reviews at reviews.llvm.org> wrote:
> 
> nemanjai added a comment.
> 
> In https://reviews.llvm.org/D43235#1008022, @atrick wrote:
> 
>> I can confirm that the purpose of ItinRW is to avoid the need to define InstRW for a subtarget when the target already had itinerary classes. (Things would be *a lot* simpler if all the subtargets used the same style of machine description.)
>> 
>> CompleteModel can mean whatever is currently most useful to devs, but we don't want to lose the scheduling-time checks just because a subtarget doesn't define InstRW for everything. Maybe it would work for those subtargets should have InstRW { let Unsupported = 1 }  for opcodes that aren't covered by itinerary classes?
> 
> 
> I am not sure I completely follow what you're suggesting. I'll use Power9 scheduling model in the PPC back end as an example.
> We've implemented the model to try to be as accurate as we can. A vast majority of instructions that were defined with the same itinerary use the same functional units, so we just used `ItinRW` to map instructions with a specific itinerary to specific functional units. For instructions that don't have an itinerary, we've provided an `InstRW` or set the "no scheduling information" bit. Also, for instructions that have an itinerary, but use different functional units from most other instructions with that itinerary, we've overridden the resources that come from `ItinRW` with `InstRW`. This has allowed us to mark the model complete.

What you've done sounds like what I was suggesting.

I was referring to the fact that we check for a complete model in two places. Once statically during TableGen, which you refer to below, and again dynamically during scheduling, at TargetSchedModel::computeOperandLatency. It's common for subtargets to only support a subset of the instruction set. In that case, the machine model may be statically incomplete but dynamically complete. One solution is to explicitly mark all the unsupported instructions, making the model both statically and dynamically complete. It sounds like that's what you've already done, so it's not relevant here.

I would also expect that you're getting scheduling info for all instructions that have an itinerary class covered by ItinRW. Always check the table-generated output to be sure, or trace the scheduler itself. You can see every instruction's scheduling class number in PPCGenInstrInfo.inc, PPCInsts.
class MCInstrDesc {
  unsigned short Opcode;         // The opcode number
  unsigned short NumOperands;    // Num of args (may be more if variable_ops)
  unsigned char NumDefs;         // Num of args that are definitions
  unsigned char Size;            // Number of bytes in encoding.
  unsigned short SchedClass;     // enum identifying instr sched class
  ...

You can see the per-operand scheduling info for each class for your subtarget in PPCGenSubtargetInfo.inc, 
e.g. P9ModelSchedClasses.

You can also use -debug-only=machine-scheduler on some tiny example to confirm that everything is wired up.

Naturally, you do *not* want to mark your subtarget as incomplete, because you would lose verification.

I guess the problem is that the machine model completeness checking may not be able to verify that all instructions are covered for some subtarget by some combination ItinRW and InstRW. However that gets fixed, it should not involve declaring complete subtargets as incomplete.

-Andy

> 
> Since all of our instructions either had an itinerary specified or have been covered in `InstRW` (and all our itineraries covered by `ItinRW`), we assumed that all our instructions have all the processor resources correctly specified. However, judging by this patch and the fact that we don't satisfy the condition on line 1637 (`!SC.Writes.empty()`), it appears that we actually misunderstood what `ItinRW` actually does. In fact, it appears to me that a vast majority of our instructions (not covered by `InstRW`) don't actually provide any information to the scheduler about which functional resources in the CPU they consume. And if so, I am of course very much in favour of this patch as it highlights a problem we didn't know we had.
> 
> Is this an accurate understanding? If so, I'm not sure what the actual semantics of `ItinRW` are. Also, if we have to write `InstRW`'s for all our instructions in our case, we don't mind doing that. I just want to make sure that's what we should be doing before spending time on it.
> 
> 
> Repository:
>  rL LLVM
> 
> https://reviews.llvm.org/D43235
> 
> 
>