[llvm] [AArch64] Initial sched model for Neoverse V3, V3AE (PR #163932)

Tue Oct 21 06:59:33 PDT 2025

================
@@ -0,0 +1,2781 @@
+//=- AArch64SchedNeoverseV3.td - NeoverseV3 Scheduling Defs --*- tablegen -*-=//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines the scheduling model for the Arm Neoverse V3 processors.
+// All information is taken from the V3 Software Optimization guide:
+//
+// https://developer.arm.com/documentation/109678/300/?lang=en
+//
+//===----------------------------------------------------------------------===//
+
+def NeoverseV3Model : SchedMachineModel {
+  let IssueWidth            =   8; // Expect best value to be slightly higher than V2
+  let MicroOpBufferSize     = 320; // Entries in micro-op re-order buffer.
+  let LoadLatency           =   4; // Optimistic load latency.
+  let MispredictPenalty     =  10; // Extra cycles for mispredicted branch.  NOTE: Copied from N2.
+  let LoopMicroOpBufferSize =  16; // NOTE: Copied from Cortex-A57.
+  let CompleteModel         =   1;
+
+  list<Predicate> UnsupportedFeatures = !listconcat(SMEUnsupported.F,
+                                                    [HasSVE2p1, HasSVEB16B16,
+                                                     HasCPA, HasCSSC]);
+}
+
+//===----------------------------------------------------------------------===//
+// Define each kind of processor resource and number available on Neoverse V3.
+// Instructions are first fetched and then decoded into internal macro-ops
+// (MOPs). From there, the MOPs proceed through register renaming and dispatch
+// stages. A MOP can be split into two micro-ops further down the pipeline
+// after the decode stage. Once dispatched, micro-ops wait for their operands
+// and issue out-of-order to one of twenty-one issue pipelines. Each issue
+// pipeline can accept one micro-op per cycle.
+
+let SchedModel = NeoverseV3Model in {
+
+// Define the (21) issue ports.
+def V3UnitB   : ProcResource<3>;  // Branch 0/1/2
+def V3UnitS0  : ProcResource<1>;  // Integer single-cycle 0
+def V3UnitS1  : ProcResource<1>;  // Integer single-cycle 1
+def V3UnitS2  : ProcResource<1>;  // Integer single-cycle 2
+def V3UnitS3  : ProcResource<1>;  // Integer single-cycle 3
+def V3UnitS4  : ProcResource<1>;  // Integer single-cycle 4
+def V3UnitS5  : ProcResource<1>;  // Integer single-cycle 5
+def V3UnitM0  : ProcResource<1>;  // Integer single/multicycle 0
+def V3UnitM1  : ProcResource<1>;  // Integer single/multicycle 1
+def V3UnitV0  : ProcResource<1>;  // FP/ASIMD 0
+def V3UnitV1  : ProcResource<1>;  // FP/ASIMD 1
+def V3UnitV2  : ProcResource<1>;  // FP/ASIMD 2
+def V3UnitV3  : ProcResource<1>;  // FP/ASIMD 3
+def V3UnitLS0 : ProcResource<1>;  // Load/Store 0
+def V3UnitL12 : ProcResource<2>;  // Load 1/2
+def V3UnitST1 : ProcResource<1>;  // Store 1
+def V3UnitD   : ProcResource<2>;  // Store data 0/1
+def V3UnitFlg : ProcResource<8>;  // Flags
----------------
Asher8118 wrote:

`def V3UnitFlg : ProcResource<8>;  // Flags` In V3 this appears to be used in the same way as `def V2UnitFlg : ProcResource<3>;  // Flags` in the V2 scheduling model. The unit does not appear in the SWOG; it exists to model the behaviour of flag-setting instructions. What is the reason for giving it `ProcResource<8>` here?
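
To make the question concrete, here is a minimal standalone sketch of how a flag-setting write typically consumes such a unit. The `Sketch*` names and the unit count of 4 are hypothetical and are not taken from this patch, and the two classes are simplified stand-ins for the real ones in `TargetSchedule.td`, included only so the fragment parses on its own.

```tablegen
// Simplified stand-ins (assumption: not the real LLVM definitions).
class ProcResource<int num> {
  int NumUnits = num;
}
class SchedWriteRes<list<ProcResource> resources> {
  list<ProcResource> ProcResources = resources;
  int Latency = 1;
}

// Hypothetical integer pipeline group and flag resource.
def SketchUnitF   : ProcResource<4>;  // pipes able to run flag-setting ops
def SketchUnitFlg : ProcResource<8>;  // Flags, same count as in this patch

// A flag-setting write occupies one pipe and one flag unit for a cycle.
def SketchWrite_1c_1F_1Flg : SchedWriteRes<[SketchUnitF, SketchUnitFlg]> {
  let Latency = 1;
}
```

If the V3 writes follow this pattern, the number of `Flg` units caps how many flag-setting micro-ops can issue per cycle. With 8 units and `IssueWidth = 8`, that cap would not be expected to ever bind, which is why the choice of 8 stands out next to V2's 3.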

https://github.com/llvm/llvm-project/pull/163932