<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/57638>57638</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[Clang] stack frame is way too large in coroutine at low optimization levels
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
jacobsa
</td>
</tr>
</table>
<pre>
Our internal build system just switched to opaque pointers by removing `-Xclang=-no-opaque-pointers` from our build arguments. When this happened I noticed a regression: **at the default optimization level (`-O0`), clang makes coroutine resume function stack frames much larger than necessary**.
Here is a simple program with a function `ArrayOnCoroutineFrame` that creates a large local array that must go on the coroutine frame because it may need to survive a suspension:
```c++
#include <array>
#include <coroutine>
#include <optional>

struct MyTask {
  struct promise_type {
    MyTask get_return_object() { return {std::coroutine_handle<promise_type>::from_promise(*this)}; }
    std::suspend_always initial_suspend() { return {}; }
    void unhandled_exception();
    void return_void() {}

    auto await_transform(MyTask task) {
      struct Awaiter {
        bool await_ready() { return false; }
        std::coroutine_handle<promise_type> await_suspend(std::coroutine_handle<promise_type> h) {
          caller.resume_when_done = h;
          return std::coroutine_handle<promise_type>::from_promise(callee);
        }
        void await_resume() {
          std::coroutine_handle<promise_type>::from_promise(callee).destroy();
        }
        promise_type& caller;
        promise_type& callee;
      };
      return Awaiter{*this, task.handle.promise()};
    }

    auto final_suspend() noexcept {
      struct Awaiter {
        bool await_ready() noexcept { return false; }
        std::coroutine_handle<promise_type> await_suspend(std::coroutine_handle<promise_type> h) noexcept {
          return to_resume;
        }
        void await_resume() noexcept;
        std::coroutine_handle<promise_type> to_resume;
      };
      return Awaiter{resume_when_done};
    }

    // The coroutine to resume when we're done.
    std::coroutine_handle<promise_type> resume_when_done;
  };

  // A handle for the coroutine that returned this task.
  std::coroutine_handle<promise_type> handle;
};

MyTask DoSomething();

MyTask ArrayOnCoroutineFrame() {
  std::array<std::optional<int>, 10'000> vals;
  for (auto& val : vals) {
    (void)val;
    co_await DoSomething();
  }
}
```
When [compiled with](https://godbolt.org/z/9819jWE9h) `-std=c++20 -Xclang=-no-opaque-pointers`, clang correctly observes that `ArrayOnCoroutineFrame.resume` needs only a small stack size, since the array is on the coroutine frame:
```asm
ArrayOnCoroutineFrame() [clone .resume]: # @ArrayOnCoroutineFrame() [clone .resume]
push rbp
mov rbp, rsp
sub rsp, 304
mov qword ptr [rbp - 168], rdi # 8-byte Spill
mov qword ptr [rbp - 8], rdi
[...]
```
But when you [compile with](https://godbolt.org/z/756a3d43f) just `-std=c++20`, the resume function gets a huge stack frame (`sub rsp, 80368`), even though the array still appears to be constructed on the coroutine frame (the constructor is called with `rdi + 80` as `this`):
```asm
ArrayOnCoroutineFrame() [clone .resume]: # @ArrayOnCoroutineFrame() [clone .resume]
push rbp
mov rbp, rsp
sub rsp, 80368
mov qword ptr [rbp - 80240], rdi # 8-byte Spill
mov qword ptr [rbp - 8], rdi
mov rax, rdi
add rax, 80081
mov qword ptr [rbp - 80232], rax # 8-byte Spill
mov rax, rdi
add rax, 80
mov qword ptr [rbp - 80224], rax # 8-byte Spill
[...]
mov rdi, qword ptr [rbp - 80224] # 8-byte Reload
call std::array<std::optional<int>, 10000ul>::array() [base object constructor]
[...]
```
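As far as I can tell, clang has reserved roughly enough stack space for a full extra copy of the array (10,000 elements at 8 bytes each is 80,000 bytes, close to the 80,368 bytes subtracted from `rsp`), even though the array itself lives on the coroutine frame?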
This is not a bug so much as a missed optimization. I don't know if it's reasonable to expect this optimization to be done at the default optimization level, but it was done before opaque pointers shipped so it's sort of a regression. Is it easy to make it work again?
</pre>