[llvm] [AArch64] Neoverse V1 scheduling info (PR #126707)
Julien Villette via llvm-commits
llvm-commits at lists.llvm.org
Wed Feb 12 01:57:15 PST 2025
================
@@ -525,33 +640,48 @@ def V1Rd_BFMLA : SchedReadAdvance<2, [V1Wr_BFMLA]>;
def V1Wr_CRC : SchedWriteRes<[V1UnitM0]> { let Latency = 2; }
def V1Rd_CRC : SchedReadAdvance<1, [V1Wr_CRC]>;
-def V1Wr_ZDOTB : SchedWriteRes<[V1UnitV01]> { let Latency = 3; }
+def V1Wr_ZDOTB : SchedWriteRes<[V1UnitSVE01]> { let Latency = 3;
+ let NumMicroOps = 2;
+ let ReleaseAtCycles = [2]; }
def V1Rd_ZDOTB : SchedReadAdvance<2, [V1Wr_ZDOTB]>;
-def V1Wr_ZUDOTB : SchedWriteRes<[V1UnitV]> { let Latency = 3; }
+def V1Wr_ZUDOTB : SchedWriteRes<[V1UnitV]> { let Latency = 3; let ReleaseAtCycles = [2]; }
def V1Rd_ZUDOTB : SchedReadAdvance<2, [V1Wr_ZUDOTB]>;
-def V1Wr_ZDOTH : SchedWriteRes<[V1UnitV0]> { let Latency = 4; }
+def V1Wr_ZDOTH : SchedWriteRes<[V1UnitSVE0]> { let Latency = 4;
+ let NumMicroOps = 2;
+ let ReleaseAtCycles = [2]; }
def V1Rd_ZDOTH : SchedReadAdvance<3, [V1Wr_ZDOTH]>;
def V1Wr_ZMMA : SchedWriteRes<[V1UnitV01]> { let Latency = 3; }
def V1Rd_ZMMA : SchedReadAdvance<2, [V1Wr_ZMMA]>;
-let Latency = 5, NumMicroOps = 2 in
-def V1Wr_ZMAD : SchedWriteRes<[V1UnitV0, V1UnitV0]>;
+def V1Wr_ZMABHS : SchedWriteRes<[V1UnitSVE0]> { let Latency = 4;
+ let NumMicroOps = 2;
+ let ReleaseAtCycles = [2]; }
+def V1Rd_ZMABHS : SchedReadAdvance<2, [V1Wr_ZMABHS]>;
+
+def V1Wr_ZMAD : SchedWriteRes<[V1UnitSVE0]> { let Latency = 5;
+ let NumMicroOps = 2;
+ let ReleaseAtCycles = [4]; }
def V1Rd_ZMAD : SchedReadAdvance<3, [V1Wr_ZMAD]>;
-def V1Wr_ZFCMA : SchedWriteRes<[V1UnitV01]> { let Latency = 5; }
+def V1Wr_ZFCMA : SchedWriteRes<[V1UnitSVE01,V1UnitSVE01]> { let Latency = 5;
+ let NumMicroOps = 2; }
def V1Rd_ZFCMA : SchedReadAdvance<3, [V1Wr_ZFCMA]>;
-def V1Wr_ZFMA : SchedWriteRes<[V1UnitV01]> { let Latency = 4; }
+def V1Wr_ZFMA : SchedWriteRes<[V1UnitSVE01,V1UnitSVE01]> { let Latency = 4;
+ let NumMicroOps = 2; }
----------------
jvillette38 wrote:
This is to account for the following constraints, described in chapter 4.16 of the Neoverse V1 Software Optimization Guide (SOG):
> Maximum issue bandwidth is sustained using one of the following combinations:
> • 2 SVE Uops.
> • 4 ASIMD Uops.
> • 1 SVE Uop on V0 and 2 ASIMD Uops on VX13.
> • 1 SVE Uop on V1 and 2 ASIMD Uops on V02.
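
To make the four combinations above concrete, here is a minimal sketch of a per-cycle issue check. It is only an illustration, not how the scheduling model in this PR is implemented. The pipe pairing is an assumption inferred from the quoted list (an SVE uop on V0 is assumed to also occupy V2, and one on V1 to also occupy V3); the names `SVE_OCCUPIES`, `ASIMD_PIPES` and `fits_in_one_cycle` are hypothetical.

```python
# ASSUMPTION (inferred from the quoted list, not stated in the thread):
# an SVE uop issued on V0 also occupies V2, and one issued on V1 also
# occupies V3. ASIMD uops occupy exactly the pipe they are issued on.
SVE_OCCUPIES = {"V0": {"V0", "V2"}, "V1": {"V1", "V3"}}
ASIMD_PIPES = {"V0", "V1", "V2", "V3"}


def fits_in_one_cycle(uops):
    """Return True if every (kind, pipe) uop can issue in the same cycle."""
    busy = set()
    for kind, pipe in uops:
        if kind == "SVE":
            if pipe not in SVE_OCCUPIES:
                return False  # SVE uops only start on V0 or V1 in this model
            needed = SVE_OCCUPIES[pipe]
        elif kind == "ASIMD":
            if pipe not in ASIMD_PIPES:
                return False
            needed = {pipe}
        else:
            raise ValueError(f"unknown uop kind: {kind}")
        if busy & needed:
            return False  # a required pipe is already taken this cycle
        busy |= needed
    return True


# The four combinations from the quoted list all fit in one cycle.
print(fits_in_one_cycle([("SVE", "V0"), ("SVE", "V1")]))
print(fits_in_one_cycle([("ASIMD", p) for p in ("V0", "V1", "V2", "V3")]))
print(fits_in_one_cycle([("SVE", "V0"), ("ASIMD", "V1"), ("ASIMD", "V3")]))
print(fits_in_one_cycle([("SVE", "V1"), ("ASIMD", "V0"), ("ASIMD", "V2")]))
```

Under that assumption, an SVE uop on V0 together with an ASIMD uop on V2 is rejected, which is the kind of conflict the extra resource cycles in the diff are meant to capture.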
https://github.com/llvm/llvm-project/pull/126707