<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/122124">122124</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[AArch64] Correct scheduling information for flag manipulation instructions in Neoverse-V2
</td>
</tr>
<tr>
<th>Labels</th>
<td>
backend:AArch64
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
Rin18
</td>
</tr>
</table>
<pre>
Some instructions have incorrect scheduling information when compared to the Neoverse-V2 Software Optimisation Guide (SWOG, https://developer.arm.com/documentation/109898/latest/):
| Instruction Group | AArch64 Instructions | Exec Latency | Exec Throughput | Utilised Pipelines |
| ------------------------------ | ------------------------- | ------------ | --------------- | ------------------ |
| Flag manipulation instructions | SETF8, SETF16, RMIF, CFINV | 1 | 1 | F |
For example:
```
rmif
cfinv
setf8 w1
setf16 w1
```
Running `llvm-mca -mtriple=aarch64 -mcpu=neoverse-v2 -instruction-tables` on the above instructions gives the following output:
```
Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)
[1]    [2]    [3]    [4]    [5]    [6]    Instructions:
 1      1     0.17                  U     rmif #0, #0, #0
 1      1     0.06                  U     cfinv
 1      1     0.17                  U     setf8 w1
 1      1     0.17                  U     setf16 w1
Resources:
[0.0] - V2UnitB
[0.1] - V2UnitB
[1.0] - V2UnitD
[1.1] - V2UnitD
[2] - V2UnitL2
[3.0] - V2UnitL01
[3.1] - V2UnitL01
[4] - V2UnitM0
[5] - V2UnitM1
[6] - V2UnitS0
[7] - V2UnitS1
[8] - V2UnitS2
[9] - V2UnitS3
[10] - V2UnitV0
[11] - V2UnitV1
[12] - V2UnitV2
[13] - V2UnitV3
Resource pressure per iteration:
[0.0]  [0.1]  [1.0]  [1.1]  [2]    [3.0]  [3.1]  [4]    [5]    [6]    [7]    [8]    [9]    [10]   [11]   [12]   [13]
 -      -      -      -      -      -      -     0.50   0.50   0.50   0.50   0.50   0.50    -      -      -      -
Resource pressure by instruction:
[0.0]  [0.1]  [1.0]  [1.1]  [2]    [3.0]  [3.1]  [4]    [5]    [6]    [7]    [8]    [9]    [10]   [11]   [12]   [13]   Instructions:
 -      -      -      -      -      -      -     0.17   0.17   0.17   0.17   0.17   0.17    -      -      -      -     rmif #0, #0, #0
 -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -      -     cfinv
 -      -      -      -      -      -      -     0.17   0.17   0.17   0.17   0.17   0.17    -      -      -      -     setf8 w1
 -      -      -      -      -      -      -     0.17   0.17   0.17   0.17   0.17   0.17    -      -      -      -     setf16 w1
```
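The output shows that `rmif`, `setf8` and `setf16` are modelled with latency 1 and throughput 6 (RThroughput 0.17) on pipeline I (M0, M1, S0-S3), while `cfinv` is listed with RThroughput 0.06 and no resource usage at all. The SWOG specifies latency 1, throughput 1 and pipeline F for all four instructions, so the Neoverse-V2 scheduling model should be fixed to match: https://github.com/llvm/llvm-project/blob/f37bee1d929a90dd3dbb67a4a9d0a52400a8a78f/llvm/lib/Target/AArch64/AArch64SchedNeoverseV2.td#L1139-L1140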
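
To make the throughput comparison explicit, here is a minimal sketch that converts the RThroughput column printed by `llvm-mca` into instructions per cycle and compares it with the SWOG value. The RThroughput figures are copied from the output above and the expected throughput of 1 from the SWOG table. The helper names (`implied_throughput`, `mismatches`) are hypothetical and not part of any LLVM tool. Because `llvm-mca` prints RThroughput rounded to two decimals, the implied values are only approximate.

```python
# RThroughput values copied from the llvm-mca output above (two-decimal rounding).
reported_rthroughput = {
    "rmif": 0.17,
    "cfinv": 0.06,
    "setf8": 0.17,
    "setf16": 0.17,
}

# Exec Throughput from the SWOG table above: 1 instruction per cycle on pipeline F.
SWOG_THROUGHPUT = 1


def implied_throughput(rthroughput):
    """Approximate instructions per cycle implied by a reciprocal throughput."""
    return round(1 / rthroughput)


def mismatches(reported, expected):
    """Map each mnemonic whose implied throughput differs from `expected`."""
    return {
        name: implied_throughput(rt)
        for name, rt in reported.items()
        if implied_throughput(rt) != expected
    }


if __name__ == "__main__":
    for name, ipc in mismatches(reported_rthroughput, SWOG_THROUGHPUT).items():
        print(f"{name}: model implies ~{ipc}/cycle, SWOG says {SWOG_THROUGHPUT}/cycle")
```

All four instructions are flagged by this check: the three that use pipeline I come out at roughly 6 per cycle, and `cfinv`, which consumes no resources in the current model, comes out even higher.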
</pre>