[llvm-dev] [RFC][AArch64] Make -mcpu=generic schedule for an in-order core

Mon Oct 4 05:10:01 PDT 2021

Hi Renato,

>> The largest part of that is enabling in-order scheduling using the Cortex-A55 schedule model. This is similar to the Arm backend change from eecb353d0e25ba, which made -mcpu=generic perform in-order scheduling using the Cortex-A8 scheduling model.
>>
> I think this makes sense because the A55 scheduling model is more likely to benefit the chips produced nowadays than the A8's.

Just to be explicit, eecb353d0e25ba was for the ARM backend (AArch32), whereas this RFC is for AArch64. But I agree the ARM backend could benefit from an update too.

> Thinking out loud, what do people think of creating an additional "ooo" target? So, "generic" is the same as "in-order", but the "ooo" (or "unordered", whatever) would pick a base OOO target, like A57, A72, etc.

Sounds interesting!

________________________________
From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Renato Golin via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 04 October 2021 10:08
To: David Green <David.Green at arm.com>
Cc: llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] [RFC][AArch64] Make -mcpu=generic schedule for an in-order core

On Mon, 4 Oct 2021 at 08:43, David Green via llvm-dev <llvm-dev at lists.llvm.org> wrote:
Hello folks,

We would like to start pushing -mcpu=generic for AArch64 towards enabling a set of features that is believed to be beneficial in general - features that improve performance for some CPUs without hurting it on any others. A blend of the performance options hopefully beneficial to all CPUs.

Hi David,

This is the usual LLVM definition of "generic", so working on that goal is always good.

The largest part of that is enabling in-order scheduling using the Cortex-A55 schedule model. This is similar to the Arm backend change from eecb353d0e25ba, which made -mcpu=generic perform in-order scheduling using the Cortex-A8 scheduling model.

I think this makes sense because the A55 scheduling model is more likely to benefit the chips produced nowadays than the A8's.

When specifying an Apple target, clang will set "-target-cpu apple-a7" on the command line, so Apple targets should not be affected by this change when running from clang. This also doesn't enable more runtime unrolling like -mcpu=cortex-a55 does; it only changes the schedule used.

Thinking out loud, what do people think of creating an additional "ooo" target? So, "generic" is the same as "in-order", but the "ooo" (or "unordered", whatever) would pick a base OOO target, like A57, A72, etc.

A few years ago, when I was doing benchmarks for OpenBLAS changes on Arm, I realised doing that was beneficial to most targets, often only beaten by specifying the correct target.

cheers,
--renato