<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/129750>129750</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
Missed optimization: eager spills mess up hot path
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
travisdowns
</td>
</tr>
</table>
<pre>
Consider the following function:
```
[[noreturn]] [[gnu::cold]]
void cold_function(const int& x, const int& y);

int hot_function(int x, int y) {
    if (x < y) [[unlikely]] {
        cold_function(x, y);
    }
    return x + y;
}
```
With clang++ at -O3, this generates the following code:
```
hot_function(int, int):
        push rax
        mov dword ptr [rsp + 4], edi
        mov dword ptr [rsp], esi
        cmp edi, esi
        jl .LBB0_2
        add esi, edi
        mov eax, esi
        pop rcx
        ret
.LBB0_2:
        lea rdi, [rsp + 4]
        mov rsi, rsp
        call cold_function(int const&, int const&)@PLT
```
However, both the spilling of the in-register arguments and the alignment of the stack frame (`push rax`) could be deferred to the cold branch instead:
```
hot_function(int, int):
        cmp edi, esi
        jl .LBB0_2
        add esi, edi
        mov eax, esi
        ret
.LBB0_2:
        push rax
        mov dword ptr [rsp + 4], edi
        mov dword ptr [rsp], esi
        lea rdi, [rsp + 4]
        mov rsi, rsp
        call cold_function(int const&, int const&)@PLT
```
This cuts the hot path almost in half and avoids an expensive store-forwarding stall: `pop rcx` reads the qword at `[rsp]`, which was written immediately before in two dword halves during the spill, and this causes an expensive stall (roughly 10 cycles) on all modern big cores I'm aware of.
https://godbolt.org/z/nTvnj4r1r
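As a possible source-level workaround until the compiler defers the spill itself, the reference-taking call can be routed through a cold, non-inlined helper that receives the values by copy. This is only a sketch under assumptions: `cold_function_by_value` and `ColdPathTaken` are hypothetical names that do not appear in the code above, and `cold_function` is given a throwing body here purely so the example is self-contained. I have not verified the generated code for this variant; the two dword stores should move into the helper, but the `push rax` alignment may well remain in `hot_function`, so check the output on a compiler explorer before relying on it.
```
// Hypothetical payload so the sketch is self-contained; the real
// cold_function is only declared in the report, never defined.
struct ColdPathTaken { int x, y; };

[[noreturn]] [[gnu::cold]]
void cold_function(const int& x, const int& y) {
    throw ColdPathTaken{x, y};
}

// Hypothetical helper (an assumption, not part of the report): it takes the
// values by copy, so the stack slots needed to form the references belong to
// this function's frame instead of hot_function's.
[[noreturn]] [[gnu::cold]] [[gnu::noinline]]
void cold_function_by_value(int x, int y) {
    cold_function(x, y);
}

int hot_function(int x, int y) {
    if (x < y) [[unlikely]] {
        // Arguments stay in registers on the way into the helper.
        cold_function_by_value(x, y);
    }
    return x + y;
}
```
Observable behavior is unchanged: `hot_function(3, 2)` returns 5, and any call with `x < y` still ends up in `cold_function` with the same two values.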
</pre>