[PATCH] D76909: [MachineScheduler] Update available queue on the first mop of a new cycle

Mon Mar 30 03:44:59 PDT 2020

dmgreen marked 2 inline comments as done.
dmgreen added a comment.

In D76909#1946106 <https://reviews.llvm.org/D76909#1946106>, @jsji wrote:

> Looks like this change has more impact to pre-P9 models, especially ppc32/ppc64, which we were using itineraries . (And I know there might be bugs in those itineraries)
>  Although some changes doesn't look good at first glance, we should dig into itineraries of each model to see whether this is problem in itineraries or due to this change.
>
> @steven.zhang Can you help to look into some of them , especially for P6 <https://reviews.llvm.org/P6>-P8 tests. Thanks.
>  @jhibbits FYI, you may want to check whether those changes in G3/SPE are good.

In D76909#1945713 <https://reviews.llvm.org/D76909#1945713>, @steven.zhang wrote:

> Can we guard it with getMicroOpBufferSize() == 0 to avoid the impact of the out of order CPU as you motivate this patch with inorder cpu ?

Thanks for taking a look guys. I believe this patch is improving the modelling of any cpu, given the schedule model it's working with. If I am understanding this correctly then if the last instruction scheduled in a cycle might hazard with others in the available queue, then the start of the next cycle can pick an instruction that should not be available.

I don't have any knowledge of whether it's actually better for these cpu's that have changed here though. Happy enough to add something to turn it off if required. How about an option in MachineSchedPolicy like DisableLatencyHeuristic?

================
Comment at: llvm/test/CodeGen/PowerPC/fp128-bitcast-after-operation.ll:31
 ; PPC32-DAG: lwz [[LO0:[0-9]+]], 16(1)
+; PPC32: rlwinm [[FLIP_BIT:[0-9]+]], [[HI0]], 0, 0, 0
 ; PPC32-DAG: lwz [[HI1:[0-9]+]], 28(1)
----------------
jsji wrote:
> This looks like a regression, rlwinm is using HI0, and lwz should be 3-4 cycle.
> But we may need to check the itineraries to see whether this is due to resource hazard.
I think schedule might be believing that the second lwz will stall for a number of cycles because of the resource conflict between it and the first lwz? So from the schedule's point of view we are a few cycles further on than we would expect.

================
Comment at: llvm/test/CodeGen/PowerPC/fp128-bitcast-after-operation.ll:73
 ; PPC32-DAG: xoris [[HI0]], [[HI0]], 32768
+; PPC32-DAG: lwz [[LO1:[0-9]+]], 20(1)
 ; PPC32-DAG: xoris [[LO0]], [[LO0]], 32768
----------------
jsji wrote:
> This looks like unexpected? we are moving something across BARRIER now?
Yeah, the BARRIER is a CHECK-NOT. The only difference in code was that the xoris and lwz have switched places, so the CHECK-DAG's no longer match as they did in the past.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D76909/new/

https://reviews.llvm.org/D76909