<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/99395>99395</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[MCA] Inaccuracy in small snippet
</td>
</tr>
<tr>
<th>Labels</th>
<td>
tools:llvm-mca
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
boomanaiden154
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
boomanaiden154
</td>
</tr>
</table>
<pre>
Take the small snippet:
```asm
incq %r15
addq $0x4, %r13
cmpq $0x3f, %r15
```
Running this through MCA on `skylake`/`skylake-avx512` produces the following:
```
Iterations: 100
Instructions: 300
Total Cycles: 104
Total uOps: 300
Dispatch Width: 6
uOps Per Cycle: 2.88
IPC: 2.88
Block RThroughput: 0.8
Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)
[1] [2] [3] [4] [5] [6] Instructions:
1 1 0.25 incq %r15
1 1 0.25 addq $4, %r13
1 1 0.25 cmpq $63, %r15
Resources:
[0] - SKXDivider
[1] - SKXFPDivider
[2] - SKXPort0
[3] - SKXPort1
[4] - SKXPort2
[5] - SKXPort3
[6] - SKXPort4
[7] - SKXPort5
[8] - SKXPort6
[9] - SKXPort7
Resource pressure per iteration:
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9]
- - 0.75 0.75 - - - 0.75 0.75 -
Resource pressure by instruction:
[0] [1] [2] [3] [4] [5] [6] [7] [8] [9] Instructions:
- - 0.24 0.25 - - - 0.26 0.25 - incq %r15
- - 0.25 0.25 - - - 0.25 0.25 - addq $4, %r13
- - 0.26 0.25 - - - 0.24 0.25 - cmpq $63, %r15
```
However, running this within `llvm-exegesis` (`llvm-exegesis -snippets-file=/tmp/test.s --mode=latency`) produces the following:
```
---
mode: latency
key:
instructions:
- 'INC64r R15 R15'
- 'ADD64ri8 R13 R13 i_0x4'
- 'CMP64ri8 R15 i_0x3f'
config: ''
register_initial_values:
- 'R15=0x123456'
- 'R13=0x123456'
cpu_name: skylake-avx512
llvm_triple: x86_64-grtev4-linux-gnu
min_instructions: 10000
measurements:
- { key: latency, value: 0.4234, per_snippet_value: 1.26995, validation_counters: {} }
error: ''
info: ''
assembled_snippet: 4157415549BF563412000000000049BD563412000000000049FFC74983C5044983FF3F49FFC74983C5044983FF3F49FFC74983C5044983FF3F49FFC74983C5044983FF3F415D415FC3
...
```
The reciprocal throughput predicted by `llvm-mca` (Block RThroughput: 0.8 cycles/iteration) is almost 40% lower than the measured value (about 1.27 cycles/iteration). uiCA seems to agree with the measurement, predicting a reciprocal throughput of 1.25 cycles/iteration.
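
As a rough check of the size of the gap, here is a minimal sketch that recomputes it from the numbers quoted above. It assumes the comparison is between `llvm-mca`'s `Block RThroughput` and `llvm-exegesis`'s `per_snippet_value`; the helper `relative_shortfall` and the constant names are hypothetical, introduced only for illustration.

```python
# Values copied from the two tool outputs quoted in this report.
MCA_BLOCK_RTHROUGHPUT = 0.8      # llvm-mca "Block RThroughput" (cycles/iteration)
MCA_TOTAL_CYCLES = 104           # llvm-mca "Total Cycles"
MCA_ITERATIONS = 100             # llvm-mca "Iterations"
EXEGESIS_PER_SNIPPET = 1.26995   # llvm-exegesis "per_snippet_value" (cycles/iteration)


def relative_shortfall(predicted: float, measured: float) -> float:
    """Return the fraction by which `predicted` falls below `measured`."""
    return (measured - predicted) / measured


# Assumption: the gap is measured against Block RThroughput.
gap_rthroughput = relative_shortfall(MCA_BLOCK_RTHROUGHPUT, EXEGESIS_PER_SNIPPET)

# Alternative view: simulated cycles per iteration (Total Cycles / Iterations).
mca_cycles_per_iteration = MCA_TOTAL_CYCLES / MCA_ITERATIONS
gap_simulated = relative_shortfall(mca_cycles_per_iteration, EXEGESIS_PER_SNIPPET)

print(f"Block RThroughput vs. measured: {gap_rthroughput:.1%} lower")
print(f"Simulated cycles/iteration vs. measured: {gap_simulated:.1%} lower")
```

Measured against `Block RThroughput`, the shortfall comes out at roughly 37%. Measured against the simulated 1.04 cycles per iteration, it is closer to 18%, so the size of the discrepancy depends on which `llvm-mca` figure is taken as the prediction.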
</pre>