[llvm-bugs] [Bug 43508] New: Coroutine symmetric transfer tail call optimization not working on AArch64

via llvm-bugs llvm-bugs at lists.llvm.org
Mon Sep 30 09:17:22 PDT 2019


            Bug ID: 43508
           Summary: Coroutine symmetric transfer tail call optimization
                    not working on AArch64
           Product: clang
           Version: 9.0
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: C++2a
          Assignee: unassignedclangbugs at nondot.org
          Reporter: bartde at microsoft.com
                CC: blitzrakete at gmail.com, erik.pilkington at gmail.com,
                    llvm-bugs at lists.llvm.org, richard-llvm at metafoo.co.uk

The following code:

  task<void> sync_async() { co_return; }

  task<void> do_async() {
    for (int i = 0; i < 1024 * 1024; i++)
      co_await sync_async();
  }

causes a stack overflow when built with clang 9 on AArch64 without optimization
(-O0). The task<T> implementation uses symmetric transfer for the final awaiter:

  template <typename Promise>
  coroutine_handle_t await_suspend(
      std::experimental::coroutine_handle<Promise> h) const noexcept {
    return h.promise().m_waiter;
  }

and for its operator co_await implementation:

  coroutine_handle_t await_suspend(coroutine_handle_t h) const {
    m_coro.promise().m_waiter = h;
    return m_coro;
  }

I think the task<T> provided by cppcoro should behave the same way, so it can
be used for the repro.

The stack overflow doesn't repro on x86/x64, or at higher optimization levels
on AArch64. Comparing the -O0 builds for both targets, the x86 version emits a
tail call by means of a jmp:

   b68d2:   e8 49 bc f7 ff           callq  32520
   b68d7:   48 89 c1                 mov    %rax,%rcx
   b68da:   48 8b 00                 mov    (%rax),%rax
   b68dd:   48 89 cf                 mov    %rcx,%rdi
   b68e0:   48 81 c4 a0 00 00 00     add    $0xa0,%rsp
   b68e7:   5d                       pop    %rbp
   b68e8:   ff e0                    jmpq   *%rax

while the AArch64 version emits:

   a8be0:   97fe1fbf    bl    30adc
   a8be4:   f9400008    ldr   x8, [x0]
   a8be8:   d63f0100    blr   x8
   a8bec:   a9497bfd    ldp   x29, x30, [sp, #144]
   a8bf0:   910283ff    add   sp, sp, #0xa0
   a8bf4:   d65f03c0    ret

which seems to perform a regular call using blr (followed by the epilogue and
ret) rather than a tail call using br.
