[PATCH] D151894: [AArch64] Neoverse V2 scheduling model

Dave Green via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Jun 7 03:47:25 PDT 2023


dmgreen added inline comments.


================
Comment at: llvm/lib/Target/AArch64/AArch64SchedNeoverseV2.td:1026-1027
+// SDIV, UDIV
+def : SchedAlias<WriteID32,  V2Write_12cyc_1M0>;
+def : SchedAlias<WriteID64,  V2Write_20cyc_1M0>;
+
----------------
rjj wrote:
> dmgreen wrote:
> > 12 and 20 are worst-case times. Would a value more in the middle of the range be better?
> Sure, so maybe 8 and 12 respectively? Do you have a better suggestion? What about the throughput, 1/8 and 1/12?
It is the `let ResourceCycles = [12]` that defines the throughput. If the instruction occupies the V2UnitM0 pipeline for multiple cycles, then other operations that use the same pipeline (for example other divs) will not be able to issue during those cycles.
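
As a rough sketch of what that means, assuming the write resource is defined along the usual SchedWriteRes lines (the exact definition in the patch may differ, the values just mirror the 12-cycle case above, and the fragment relies on V2UnitM0 being defined elsewhere in the model):
```
// Sketch only: 32-bit divide on the M0 pipeline.
def V2Write_12cyc_1M0 : SchedWriteRes<[V2UnitM0]> {
  let Latency        = 12;   // cycles until the result can be consumed
  let ResourceCycles = [12]; // cycles V2UnitM0 stays busy: at most one div per 12 cycles
}
```
Lowering only Latency to 8 would leave the throughput at 1/12. ResourceCycles would need to change as well if the pipeline is meant to free up sooner.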


================
Comment at: llvm/lib/Target/AArch64/AArch64SchedNeoverseV2.td:1035
+// Multiply long
+// NOTE: SOG p. 16, n. 2: How to specify late-forwarding between similar ops?
+def : InstRW<[V2Write_Mul], (instregex "^M(ADD|SUB)[WX]rrr$")>;
----------------
rjj wrote:
> dmgreen wrote:
> > It is usually done with read advances.
> Thanks, I'll have a look. If you have any pointers to examples where read advances were used to model forwarding of instructions like `madd` and such, that would be greatly appreciated!
The A53/A55 scheduling models have tried to model that on a number of operations (but it hasn't always worked very well). Others use a line like this, although it won't distinguish between IM32 and IM64 (the optimization guide is never very clear about what "similar µOPs" means):
```
def : ReadAdvance<ReadIMA,     2, [WriteIM32, WriteIM64]>;
```
If the InstRW regex overrides the default scheduling info, it may need to add the Reads as well, in the same way as the Falkor model does.
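
A minimal sketch of that Falkor-style approach, with the caveat that V2Rd_IMA is a made-up name and the advance of 1 cycle is only a placeholder, not a value taken from the SOG. It also assumes, as far as I understand it, that the list in an InstRW stands in for the instruction's default reads and writes:
```
// Hypothetical read for the accumulator operand: a value produced by
// V2Write_Mul is treated as available 1 cycle early (placeholder value)
// when it feeds the accumulator of a dependent madd/msub.
def V2Rd_IMA : SchedReadAdvance<1, [V2Write_Mul]>;

// Restate the reads alongside the write: Rn and Rm keep ReadIM,
// and Ra gets the forwarding read defined above.
def : InstRW<[V2Write_Mul, ReadIM, ReadIM, V2Rd_IMA],
             (instregex "^M(ADD|SUB)[WX]rrr$")>;
```
Because the advance is tied to V2Write_Mul only, the accumulator sees the shorter latency just for results coming from another multiply, which is one possible reading of "similar µOPs".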

It is probably fine to leave it as a NOTE for the moment, although it would be nice to model eventually.


================
Comment at: llvm/test/tools/llvm-mca/AArch64/Neoverse/V2-neon-instructions.s:399
+fsub v0.2s, v0.2s, v0.2s
+ld1 { v0.16b }, [x0]
+ld1 { v0.2d, v1.2d, v2.2d }, [x0], #48
----------------
rjj wrote:
> dmgreen wrote:
> > Add more ldr tests perhaps.
> I added a few more for H-form LDRs, but if you're referring to the FP loads they should be here already (you can grep for `ldr\s[hwxq]`).
Oh I see. I was expecting them to be in the neon tests, not the basic ones. Sounds good.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D151894/new/

https://reviews.llvm.org/D151894


