[LLVMdev] extra one cycle of getOperandLatency

Thu Dec 19 22:35:15 PST 2013

Hi llvm-dev,

I wonder why there is an extra cycle for getOperandLatency.
It doesn't seem intuitive.

  UseCycle = DefCycle - UseCycle + 1;

The comments in TargetItinerary.td say:

  OperandCycles are optional "cycle counts". They specify the cycle after
  instruction issue the values which correspond to specific operand indices
  are defined or read.

I thought that if an instruction reads its operands in the first cycle
and produces its result in the second cycle, InstrItinData should be
written something like this:

   InstrItinData<IIC_iALUr ,[InstrStage<1, [FU_x]>], [2, 1, 1]>

So I would expect the operand latency from an iALUr output to an iALUr
input to be 1.  However, with the current implementation of
getOperandLatency, this definition yields a latency of 2.  That's not
what I want.
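
To make the arithmetic concrete, here is a minimal standalone C++ sketch of
the expression quoted above.  It is not the LLVM implementation: the name
operandLatency is hypothetical, and anything else getOperandLatency may do
(missing itinerary data, forwarding) is deliberately left out.

```cpp
#include <cassert>

// Hypothetical stand-in for the quoted expression, not LLVM's actual code.
// defCycle: OperandCycles entry of the operand that produces the value.
// useCycle: OperandCycles entry of the operand that consumes the value.
constexpr int operandLatency(int defCycle, int useCycle) {
  return defCycle - useCycle + 1;
}

// OperandCycles [2, 1, 1]: the result entry is 2, both source entries are 1.
static_assert(operandLatency(2, 1) == 2,
              "iALUr -> iALUr comes out as 2, not the expected 1");
```

With this formula, the only way to get a latency of 1 between two such
instructions is to give the result and the sources the same entry.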

After some digging around, I found that the expression
"DefCycle - UseCycle + 1" first appeared in r79425, committed by
David Goodwin, and it seems OperandCycles was initially designed for
the ARM Cortex-A8 (see also r79247 and r79436).

Then I checked "Cortex-A8 Technical Reference Manual - Instruction
Cycle Timing".  It contains tables for instructions, for example:

   Data-processing instructions
   Source1    Source2    Result1
   Rn:E2      Rm:E2      Rd:E2

That means Rn and Rm are read at the beginning of the E2 stage and
Rd is produced at the end of E2, so there is a 1-cycle latency.

And that was implemented in LLVM as follows:

  InstrItinData<IIC_iALUr ,[InstrStage<1, [A8_Pipe0, A8_Pipe1]>], [2, 2, 2]>,
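
One way to see where a "+1" could come from is to model the stage timing
explicitly.  The C++ sketch below is only my reading of the TRM table, under
the assumption that stage N spans the interval from cycle N-1 to cycle N
after issue; the helper names are hypothetical.

```cpp
#include <cassert>

// Assumed timing model: stage N occupies the interval [N-1, N], measured
// in cycles after the instruction issues.
constexpr int beginOfStage(int stage) { return stage - 1; }
constexpr int endOfStage(int stage) { return stage; }

// Minimum issue distance between producer and consumer, assuming the
// producer writes at the end of defStage and the consumer reads at the
// beginning of useStage.
constexpr int stageLatency(int defStage, int useStage) {
  return endOfStage(defStage) - beginOfStage(useStage);
}

// Rd:E2 feeding Rn:E2 gives 1 cycle, the same as 2 - 2 + 1.
static_assert(stageLatency(2, 2) == 1, "E2 result to E2 source");
static_assert(stageLatency(2, 2) == 2 - 2 + 1, "matches Def - Use + 1");
```

Under that assumption the stage numbers from the table can be copied into
OperandCycles unchanged, and the formula supplies the begin/end offset.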

Does that mean OperandCycles and getOperandLatency were simply designed
this way so that it is easier to use the tables from the Cortex-A8 TRM?
So OperandCycles does not actually refer to a "cycle";
for an input operand it means the stage at whose beginning it is read,
and for an output operand it means the stage at whose end it is produced?

If so, are there any other reasons it should be designed this way?
Why not remove the +1 cycle and define the instruction like this?

  InstrItinData<IIC_iALUr ,[InstrStage<1, [A8_Pipe0, A8_Pipe1]>], [3, 2, 2]>,
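
As a quick check that the two encodings describe the same latencies, here is
a small C++ sketch.  Both function names are hypothetical; latencyCurrent
mirrors the expression quoted earlier, and latencyProposed is the variant
without the +1 that I am asking about.

```cpp
#include <cassert>

// Existing convention: result and source entries are stage numbers.
constexpr int latencyCurrent(int defCycle, int useCycle) {
  return defCycle - useCycle + 1;
}

// Proposed convention (hypothetical): no implicit +1, so every result
// entry has to be written one larger instead.
constexpr int latencyProposed(int defCycle, int useCycle) {
  return defCycle - useCycle;
}

// Returns true if bumping each result entry by one under the proposed
// convention reproduces the current latency for all entries up to maxCycle.
inline bool schemesAgree(int maxCycle) {
  for (int def = 1; def <= maxCycle; ++def)
    for (int use = 1; use <= maxCycle; ++use)
      if (latencyCurrent(def, use) != latencyProposed(def + 1, use))
        return false;
  return true;
}
```

So [2, 2, 2] with the +1 and [3, 2, 2] without it both give a latency of 1;
the difference is only where the extra cycle is recorded.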

Thanks

Wei-cheng Wang