<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/129750">129750</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Missed optimization: eager spills mess up hot path
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            new issue
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          travisdowns
      </td>
    </tr>
</table>

<pre>
Consider the following function:

```
[[noreturn]] [[gnu::cold]]
void cold_function(const int& x, const int& y);

int hot_function(int x, int y) {
    if (x < y) [[unlikely]] {
        cold_function(x, y);
    }
    return x + y;
}
```

In clang++ this generates the following code at -O3:

```
hot_function(int, int):
  push rax
  mov dword ptr [rsp + 4], edi
  mov dword ptr [rsp], esi
  cmp edi, esi
  jl .LBB0_2
  add esi, edi
  mov eax, esi
  pop rcx
  ret
.LBB0_2:
  lea rdi, [rsp + 4]
  mov rsi, rsp
  call cold_function(int const&, int const&)@PLT
```

However, both the spilling of the in-register variables and the alignment of the stack frame (`push rax`) could be deferred to the cold branch instead:

```
hot_function(int, int):
  cmp edi, esi
  jl .LBB0_2
  add esi, edi
  mov eax, esi
  ret
.LBB0_2:
  push rax
  mov dword ptr [rsp + 4], edi
  mov dword ptr [rsp], esi
  lea rdi, [rsp + 4]
  mov rsi, rsp
  call cold_function(int const&, int const&)@PLT
```

This cuts the hot path almost in half and avoids an expensive store-forwarding stall: `pop rcx` reads the qword at `[rsp]`, which was written immediately before in two dword halves during the spill, and this causes an expensive (roughly 10 cycles) stall on all modern big cores I'm aware of.

https://godbolt.org/z/nTvnj4r1r
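
For illustration, the same deferral can be approximated at the source level by copying the arguments into branch-local variables, so that only the copies have their address taken. This is a sketch of a possible workaround, not something verified across clang versions: whether it actually moves the spills and the `push` into the cold block depends on the optimizer. The body of `cold_function` and the names `x_copy` and `y_copy` are assumptions added so the example compiles and links on its own; the original only declares `cold_function`.

```cpp
#include <stdexcept>

// Stand-in definition (an assumption): the report only declares this
// function. It throws so that it never returns, as [[noreturn]] requires.
[[noreturn]] [[gnu::cold]]
void cold_function(const int& x, const int& y) {
    (void)x;
    (void)y;
    throw std::runtime_error("cold path taken");
}

int hot_function(int x, int y) {
    if (x < y) [[unlikely]] {
        // Hypothetical workaround: take the address of branch-local copies
        // only, so x and y themselves never escape and need no stack slot
        // on the hot path.
        int x_copy = x;
        int y_copy = y;
        cold_function(x_copy, y_copy);
    }
    return x + y;
}
```

The observable behavior is unchanged: `hot_function(3, 2)` still returns `5`, and `hot_function(1, 2)` still ends up in `cold_function`. The only difference is that the stores to memory are now written inside the unlikely branch, which gives the compiler the opportunity (but no obligation) to keep them off the hot path.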
</pre>