[PATCH] D94928: [llvm-mca] Add support for in-order CPUs

Wed Jan 20 11:22:30 PST 2021

asavonic added inline comments.

================
Comment at: llvm/test/tools/llvm-mca/AArch64/Cortex/A55-all-views.s:116-117
+# CHECK-NEXT: [0,1]     D=eeeER   .    .    .   ldr	w5, [x3]
+# CHECK-NEXT: [0,2]     .D===eeeeER    .    .   madd	w0, w5, w4, w0
+# CHECK-NEXT: [0,3]     .   DeeeER.    .    .   add	x3, x3, x13
+# CHECK-NEXT: [0,4]     .    DeeeER    .    .   subs	x1, x1, #1
----------------
andreadb wrote:
> asavonic wrote:
> > andreadb wrote:
> > > Why are these two executing out of order?
> > Madd and add are issued in the same cycle, subs is issued next.
> > However, they should not retire out-of-order. Some instructions can
> > retire out-of-order, but not these.
> > 
> > I have to look into this. Probably an RCU is actually needed for the
> > in-order pipeline.
> > 
> In theory, younger instructions should not be allowed to reach the write-back stage before older instructions because that would lead to out-of-order execution. 
> In this case I was expecting a compulsory stall to artificially delay the issue of the `add` so that it can write-back in-order w.r.t. the madd.
> What are those cases where it is allowed to write-back instructions out of order? Shouldn't architectural commits always happen in-order?
> In theory, younger instructions should not be allowed to reach the write-back stage before older instructions because that would lead to out-of-order execution. 
> In this case I was expecting a compulsory stall to artificially delay the issue of the `add` so that it can write-back in-order w.r.t. the madd.

I wonder how this works for instructions with early termination (sdiv, udiv). 
@dmgreen, can you please comment on this?
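
To make the suggested stall concrete, here is a rough sketch of what I understand by it. This is not llvm-mca code: `schedule_in_order` and its parameters are hypothetical names. It assumes a dual-issue pipeline, ignores register dependencies and resource pressure, and takes each latency from the number of `e` cycles in the timeline above (madd 4, add 3, subs 3).

```python
def schedule_in_order(latencies, issue_width=2):
    """Toy model: in-order issue with a compulsory stall so that
    write-back never happens before an older instruction's write-back.

    latencies: cycles from issue to write-back, in program order.
    Returns a list of (issue_cycle, writeback_cycle) pairs.
    Assumption: no data dependencies or resource limits are modeled.
    """
    schedule = []
    cycle = 0               # earliest cycle the next instruction may issue
    issued_this_cycle = 0   # instructions already issued in `cycle`
    last_writeback = 0      # write-back cycle of the previous instruction

    for latency in latencies:
        # Issue group is full: move to the next cycle.
        if issued_this_cycle == issue_width:
            cycle += 1
            issued_this_cycle = 0

        # Compulsory stall: delay issue until write-back would not
        # precede the older instruction's write-back.
        issue = max(cycle, last_writeback - latency)
        if issue > cycle:
            cycle = issue
            issued_this_cycle = 0

        issued_this_cycle += 1
        last_writeback = issue + latency
        schedule.append((issue, last_writeback))

    return schedule


# madd, add, subs with the latencies read off the timeline above.
print(schedule_in_order([4, 3, 3]))  # [(0, 4), (1, 4), (1, 4)]
```

In this model the `add` is held back by one cycle so that it writes back no earlier than the `madd`, rather than issuing alongside it. Whether same-cycle write-back should be allowed, and how early termination fits in, is left open here.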

> What are those cases where it is allowed to write-back instructions out of order? Shouldn't architectural commits always happen in-order?

From the Cortex-A55 optimization manual, section 3.5.1, "Instructions with out-of-order completion":
> While the Cortex-A55 core only issues instructions in-order, due to the number of cycles required to complete more complex floating-point and NEON instructions, out-of-order retire is allowed on the instructions described in this section. The nature of the Cortex-A55 microarchitecture is such that NEON and floating-point instructions of the same type have the same timing characteristics.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94928/new/

https://reviews.llvm.org/D94928