<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/148380>148380</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            [clang] Missed optimization regression: coro.destroy not devirtualized even when possible
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            clang
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          eyalz800
      </td>
    </tr>
</table>

<pre>
    In the example below, Clang 18 generates good code, while Clang 19 through trunk fails to do so.
```cpp
#include <coroutine>

struct coro
{
    struct promise_type
    {
        constexpr coro get_return_object() { return coro{}; }
        constexpr auto initial_suspend() noexcept { return std::suspend_never{}; }
        constexpr auto final_suspend() noexcept { return std::suspend_never{}; }
        auto unhandled_exception() {}
        constexpr auto return_void() {}
    };

    constexpr auto await_ready() { return false; }
    constexpr auto await_suspend(auto handle) { handle.destroy(); }
    constexpr auto await_resume() {}
};

coro f1() noexcept;
coro f2() noexcept
{
    co_await f1();
}
```

Clang 18:
```asm
f2():
        jmp     f1()@PLT
```

In Clang 19 and above:
```asm
f2():
        push    rbx
        mov     edi, 24
        call    operator new(unsigned long)@PLT
        mov     rbx, rax
        lea     rax, [rip + f2() (.resume)]
        mov     qword ptr [rbx], rax
        lea     rax, [rip + f2() (.destroy)]
        mov     qword ptr [rbx + 8], rax
        call    f1()@PLT
        mov     rdi, rbx
        mov     byte ptr [rbx + 17], 0
        pop     rbx
        jmp     qword ptr [rdi + 8]

f2() (.resume):
        mov     esi, 24
        jmp     operator delete(void*, unsigned long)@PLT

f2() (.destroy):
        mov     esi, 24
        jmp     operator delete(void*, unsigned long)@PLT
```

It seems to be related to the compiler not realizing that the indirect `jmp qword ptr [rdi + 8]` (the `coro.destroy` call) can be devirtualized into a direct call, which results in failing to inline it and optimize the code.
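
To illustrate what devirtualizing means here, below is a rough hand-written model of the pattern in the Clang 19 output, using plain function pointers instead of real coroutines. The `frame` struct, `frame_destroy`, `callee`, `indirect`, and `direct` are hypothetical names chosen for this sketch; only the slot layout (resume pointer at offset 0, destroy pointer at offset 8) is taken from the assembly above. It is not LLVM's actual lowering.

```cpp
// Hypothetical stand-in for the coroutine frame seen in the assembly:
// resume pointer at offset 0, destroy pointer at offset 8.
struct frame
{
    void (*resume)(frame*);
    void (*destroy)(frame*);
};

int destroy_calls = 0; // counts how often the destroy function ran

// Stand-in for `f2() (.destroy)`: just frees the frame.
void frame_destroy(frame* f)
{
    ++destroy_calls;
    delete f;
}

// Stand-in for the call to f1(); it never sees the frame pointer.
void callee() {}

// Roughly the Clang 19+ shape: store the pointer, then call through it.
void indirect()
{
    frame* f = new frame{frame_destroy, frame_destroy};
    callee();
    f->destroy(f); // indirect call, like `jmp qword ptr [rdi + 8]`
}

// The devirtualized shape: the stored pointer is known, so call it directly.
void direct()
{
    frame* f = new frame{frame_destroy, frame_destroy};
    callee();
    frame_destroy(f); // direct call, which can then be inlined
}
```

Both functions behave the same, but only in the second form is the call target visible at the call site, which is presumably what would let the optimizer inline the destroy function, remove the allocation, and get back to the Clang 18 output.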

Please see the godbolt link:
https://godbolt.org/z/MnWrKjxEn
</pre>