[PATCH] D103955: [MCA] Use LSU for the in-order pipeline

Wed Jul 7 02:51:32 PDT 2021

dmgreen added inline comments.

================
Comment at: llvm/test/tools/llvm-mca/AArch64/Cortex/A55-load-store-noalias.s:95-96
+
+# CHECK:      [0,0]     DeeeE.    ..   str	x1, [x10]
+# CHECK-NEXT: [0,1]     .DeeeE    ..   str	x1, [x10]
+# CHECK-NEXT: [0,2]     .DeeE.    ..   ldr	x2, [x10]
----------------
andreadb wrote:
> andreadb wrote:
> > dmgreen wrote:
> > > I think I would expect most CPU's to work like this, whether the addresses alias or not :)
> > You mean the store sequence. Of course.
> > 
> > My concern was related to instructions that appear to commit out of order like the load and the nop after it.
> > We have flag RetireOOO for cases where we want to allow it.
> If instead you are concerned about whether this patch might end up delaying the second store, then don't worry. That's not how flag -noalias should work: it only affects interactions between loads and stores. It is about whether a younger load is allowed to pass an older store. It should not affect pairs of adjacent stores.
Sorry, I was hoping to look into the schedule over the weekend to see what is going on, but I haven't had the chance to look into the right part of it yet.

I believe there are 2 different optimizations that can happen here:
 - Do two stores to the same address incur some penalty?
 - Do loads from the same address as an earlier store incur a penalty?
The answer to the first sounds to me like it should almost always be no, and the second requires store->load forwarding, which I believe is very common in most CPUs of sufficient complexity.
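
To make the two cases concrete, here is a toy Python sketch that classifies a pair of accesses to the same address by which question it raises. It is purely illustrative: the function names are made up, and it is not how llvm-mca's LSUnit is implemented. The second helper only restates the quoted description of -noalias (it concerns a younger load passing an older store, not adjacent stores).

```python
from typing import Optional


def same_address_question(older: str, younger: str) -> Optional[str]:
    """Map an (older, younger) pair of same-address accesses onto the
    question it raises. Kinds are "load" or "store" (hypothetical labels)."""
    if older == "store" and younger == "store":
        return "store-store penalty"
    if older == "store" and younger == "load":
        return "store->load forwarding"
    return None  # load-load and load-store pairs are not discussed here


def noalias_is_relevant(older: str, younger: str) -> bool:
    """Per the quoted description, -noalias only concerns a younger load
    passing an older store; it should not affect pairs of adjacent stores."""
    return older == "store" and younger == "load"


# The three instructions from the test above: str, str, ldr (all to [x10]).
ops = ["store", "store", "load"]
pairs = list(zip(ops, ops[1:]))

for older, younger in pairs:
    print(older, younger, same_address_question(older, younger),
          noalias_is_relevant(older, younger))
```

For the str/str/ldr sequence in the test, the first adjacent pair falls under the first question and the second pair under the second, which is the only one where -noalias should matter.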

It comes down to what the latency of a store means. I was under the impression that it didn't mean anything in normal llvm scheduling, but it appears to have some effect on the latency from a store to the end of the block (I think). In llvm-mca, does it mean the latency of the write into L1 cache?
The Cortex-A55 optimization guide specifies the latency of stores as 1, and that would probably be a better value to use in the A55 schedule model. I've put together a patch to do that in D105541.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103955/new/

https://reviews.llvm.org/D103955