<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href="https://github.com/llvm/llvm-project/issues/65018">65018</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [bug] clang incorrectly uses coroutine frame for scratch space after suspending
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          jacobsa
      </td>
    </tr>
</table>

<pre>
    Here is a simple library whose coroutine awaits each element of an array of 32 integers through a foreign `await_suspend` method that uses symmetric transfer of control. It simulates a more complex construct in my codebase:

```c++
#include <coroutine>

// A simple awaiter type with an await_suspend method that can't be
// inlined.
struct Awaiter {
  const int& x;

  bool await_ready() { return false; }
  std::coroutine_handle<> await_suspend(const std::coroutine_handle<> h);
  void await_resume() {}
};

struct MyTask {
  // A lazy promise with an await_transform method that supports awaiting
  // integer references using the Awaiter struct above.
  struct promise_type {
    MyTask get_return_object() { return {}; }
    std::suspend_always initial_suspend() { return {}; }
    std::suspend_always final_suspend() noexcept { return {}; }
    void unhandled_exception();

    auto await_transform(const int& x) { return Awaiter{x}; }
  };
};

// A global array of integers.
int g_array[32];

// A coroutine that awaits each integer in the global array.
MyTask FooBar() {
  for (const int& x : g_array) {
    co_await x;
  }
}
```

Clang at trunk (currently `a738bdf35eaa`, using `-std=c++20 -O1 -fno-exceptions`) miscompiles this, using the coroutine frame for scratch space after `await_suspend` returns:

```asm
FooBar() [clone .resume]: # @FooBar() [clone .resume]
        push    r15
        push    r14
        push    r12
        push    rbx
        push    rax
        
        ; Throughout the function, rbx contains the address of the coroutine
        ; frame.
        mov     rbx, rdi
[...]

.LBB4_2:
        lea     r14, [rbx + 40]
 
        ; In this basic block, r15 contains offset 32 in the coroutine frame,
        ; which clang uses for scratch space. In the larger example in my real
        ; codebase it reuses this for multiple purposes, but in this small
        ; function it uses it only once (see below).
        lea     r15, [rbx + 32]
[...]

        ; When the coroutine suspends, clang dumps the handle returned by
        ; await_suspend into the scratch space.
        call    Awaiter::await_suspend(std::__n4861::coroutine_handle<void>)@PLT
        mov     qword ptr [rbx + 32], rax
        
        ; Afterward it loads that handle back from scratch space and uses the
        ; resume function in the first word of the coroutine frame it refers
        ; to as the target of an indirect jump, for symmetric transfer of
        ; control.
        mov     rdi, r15
        call    std::__n4861::coroutine_handle<void>::address() const
        mov     rdi, rax
        add     rsp, 8
        pop     rbx
        pop     r12
        pop     r14
        pop     r15
        jmp     qword ptr [rax]                 # TAILCALL
[...]
```

([Compiler Explorer](https://godbolt.org/z/64oE6b3n6))

As discussed in #56301, **it is not safe to use the coroutine frame for scratch space after the coroutine has suspended**. Doing so introduces a data race: by the time the suspending thread writes to the scratch space, the coroutine may already have been resumed or destroyed by another thread. In my codebase this causes segfaults and use-after-free reports.
</pre>