[PATCH] D112201: [CortexA55][SchedModels] Complete Cortex-A55 scheduler model

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Dec 17 01:23:05 PST 2021


dmgreen added a comment.

In D112201#3155985 <https://reviews.llvm.org/D112201#3155985>, @kpdev42 wrote:

> I’ve attached test results for LLVM test suite (MultiSource and MicroBenchmarks suites) which show difference between complete and incomplete Cortex-A55 scheduler model.
> I’d also mention that we’ve got small (~1%) improvements in GeekBench SGEMM and AES-XTS tests.
>
> F20726645: microbm-a55.png <https://reviews.llvm.org/F20726645>
>
> F20726656: multisource.png <https://reviews.llvm.org/F20726656>

OK, thanks. I presume this was run on a Cortex-A55, and that the noise is low enough to make the results meaningful?

We wrote a few different updates to the Cortex-A55 schedule prior to making it the default under cpu=generic. We had already written patches a lot like this (not exactly this, but like the NEON part of this patch; this patch is trying to do too much at once and needs to be split up). The problem is that the A55 is notoriously difficult to schedule for, and a lot of the patches we tried ended up making the performance worse, not better. We run a set of benchmarks on an RTL simulator to get deterministic results. They are perhaps not the best benchmarks, but the measurements are very accurate, and with this patch they show the same result: things don't look better.

(We also had a few other reasons for keeping the higher latencies. The A510 sometimes has higher latencies but higher throughputs, and because this schedule is used for cpu=generic, keeping them allows it to produce better code in more cases. Plus, changing it affects many tests now that it is the default. I was at least hoping to give it some time before we changed everything again.)

I think there is value in having more accurate scheduling, even if the performance results we have are not perfect. I would suggest trying to split this patch up a bit though, so that we can check that each part is correct. At least the LDP and NEON parts are logically separate.



================
Comment at: llvm/test/tools/llvm-mca/AArch64/Cortex/A55-neon-instructions.s:6
+  add	v31.8b, v31.8b, v31.8b
+  sub	v0.2d, v0.2d, v0.2d
+  fadd	v0.4s, v0.4s, v0.4s
----------------
Why has this file been rewritten?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D112201/new/

https://reviews.llvm.org/D112201


