<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href="https://github.com/llvm/llvm-project/issues/143005">143005</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[RISCV] 507.cactuBSSN_r regression after bidirectional scheduling/register pressure tracking
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
sihuan
</td>
</tr>
</table>
<pre>
I initially noticed that commit 7b09d7b (20.1.5) showed about a 50% increase in dynamic instruction count on 507.cactuBSSN_r compared to commit c9a6e993.
Further testing shows that this regression was introduced by commit 9122c52 (#115445).
Below are some of the test results:
Compilation flags used:
```
CFLAGS := -std=c99 -march=rv64gc -O3 -DSPEC -DSPEC_CPU -DNDEBUG -DSPEC_AUTO_SUPPRESS_OPENMP -Iinclude -DCCODE -DSPEC_LP64
CXXFLAGS := -march=rv64gc -O3 -DSPEC -DSPEC_CPU -DNDEBUG -DSPEC_AUTO_SUPPRESS_OPENMP -Iinclude -DCCODE -DSPEC_LP64
FFLAGS := -march=rv64gc -O3 -Iinclude
LDFLAGS := -march=rv64gc -O3
```
On commit 9122c52:
```
perf stat ./9122c5235ec8 spec_test.par 1>/dev/null
Performance counter stats for './9122c5235ec8 spec_test.par':
27237.56 msec task-clock # 0.991 CPUs utilized
3034 context-switches # 0.111 K/sec
4 cpu-migrations # 0.000 K/sec
25671 page-faults # 0.942 K/sec
50278526154 cycles # 1.846 GHz
39011759196 instructions # 0.78 insn per cycle
350744284 branches # 12.877 M/sec
6958541 branch-misses # 1.98% of all branches
27.494980251 seconds time elapsed
26.937902000 seconds user
0.275487000 seconds sys
```
On commit 5bbe63e (previous commit):
```
perf stat ./5bbe63ec9122 spec_test.par 1>/dev/null
Performance counter stats for './5bbe63ec9122 spec_test.par':
23636.29 msec task-clock # 0.995 CPUs utilized
2099 context-switches # 0.089 K/sec
2 cpu-migrations # 0.000 K/sec
25670 page-faults # 0.001 M/sec
43618927076 cycles # 1.845 GHz
28844201052 instructions # 0.66 insn per cycle
348492599 branches # 14.744 M/sec
8688534 branch-misses # 2.49% of all branches
23.758763429 seconds time elapsed
23.395399000 seconds user
0.227527000 seconds sys
```
It is worth noting that, although the instruction count increased by 35% (from about 28.8 billion to 39.0 billion), the elapsed time regressed by only 16% (from 23.76 s to 27.49 s).
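As a sanity check on these percentages, here is a minimal Python sketch that recomputes the relative changes from the counter values in the two perf stat runs above. The dictionary keys and the helper name pct_change are illustrative only and are not part of perf or any other tool.
```python
# Counter values copied from the two perf stat runs above.
after = {  # commit 9122c52
    "instructions": 39011759196,
    "cycles": 50278526154,
    "elapsed_s": 27.494980251,
}
before = {  # commit 5bbe63e (previous commit)
    "instructions": 28844201052,
    "cycles": 43618927076,
    "elapsed_s": 23.758763429,
}


def pct_change(old, new):
    """Relative change from old to new, in percent (hypothetical helper)."""
    return (new - old) / old * 100.0


# Percentage change of each counter between the two builds.
changes = {name: pct_change(before[name], after[name]) for name in before}

# Instructions per cycle for each build, matching the "insn per cycle" column.
ipc_before = before["instructions"] / before["cycles"]
ipc_after = after["instructions"] / after["cycles"]

for name, delta in changes.items():
    print(f"{name}: {delta:+.1f}%")
print(f"IPC: {ipc_before:.2f} to {ipc_after:.2f}")
```
The cycle count grows by roughly the same proportion as the elapsed time, while instructions per cycle rise from 0.66 to 0.78, which is consistent with the extra instructions being only partly reflected in the runtime.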
</pre>