[llvm] [Coroutines] Drop dead instructions more aggressively in addMustTailToCoroResumes() (PR #85271)

via llvm-commits llvm-commits at lists.llvm.org
Tue Mar 19 08:53:32 PDT 2024


zmodem wrote:

My test case does come from a C++ program. Here is a reasonably small reproducer:

```
$ cat /tmp/b.cc
#include <coroutine>
struct Task {
  struct promise_type {
    Task get_return_object() { return Task(std::coroutine_handle<promise_type>::from_promise(*this)); }
    std::suspend_always initial_suspend() { return {}; }
    struct FinalAwaiter {
      bool await_ready() noexcept { return false; }
      std::coroutine_handle<> await_suspend(std::coroutine_handle<promise_type> handle) {
        return handle.promise().continuation_;
      }
      void await_resume() noexcept {}
    };
    FinalAwaiter final_suspend() noexcept { return {}; }
    void return_void() {}
    void unhandled_exception() { }

    std::coroutine_handle<> continuation_;
  };

  explicit Task(std::coroutine_handle<promise_type> handle) : handle_(handle) {}

  std::coroutine_handle<promise_type> handle_;
  int x = 42;
};

void g();
extern "C" Task foo() {
  g();
  co_return;
}

$ build/bin/clang++ -c -fno-exceptions /tmp/b.cc -std=c++20 -O1 -mllvm -print-after=coro-split -mllvm -filter-print-funcs=foo.resume -o /dev/null -w
; *** IR Dump After CoroSplitPass on (foo.resume) ***
; Function Attrs: mustprogress nounwind uwtable
define internal fastcc void @foo.resume(ptr nocapture noundef nonnull align 8 dereferenceable(32) %0) #0 {
entry.resume:
  %__promise.reload.addr = getelementptr inbounds i8, ptr %0, i64 16
  call void @_Z1gv() #2
  store ptr null, ptr %0, align 8
  %retval.sroa.0.0.copyload.i.i = load ptr, ptr %__promise.reload.addr, align 8, !tbaa !5
  %1 = call ptr @llvm.coro.subfn.addr(ptr %retval.sroa.0.0.copyload.i.i, i8 0)
  call fastcc void %1(ptr %retval.sroa.0.0.copyload.i.i) #2
  ret void
}
```

Note the missing `musttail` on the final `call fastcc void %1(...)`.

(In this case, later passes will make that a tail call anyway, at least on X86, but that wasn't the case in the original program which overflowed the stack instead.)

(The trick to the reproducer is that the coroutine has a return value that requires a couple of instructions to put together, which blocks `simplifyTerminatorLeadingToRet()`'s search for the return. At `-O0` those instructions come after the `llvm.coro.end` call, so they don't block the search for the return, but at `-O1` they come before it.)
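To make the blocked search concrete, here is a toy model of it. This is only a sketch under assumptions: `Kind`, `reachesRet`, `kDirect` and `kBlocked` are hypothetical names invented for illustration, and none of this is the actual LLVM code or its data structures. It only captures the idea that a forward scan from the call gives up when it meets an instruction it does not understand, unless dead instructions are skipped (or dropped) first.

```cpp
#include <cstddef>
#include <vector>

// Toy instruction kinds. These are NOT LLVM classes, just labels for the sketch.
enum class Kind { Call, Dead, Ret };

// Hypothetical helper: scan forward from the instruction after `call` and
// report whether the scan reaches a return.
// With skipDead == false, any instruction other than Ret stops the scan.
// With skipDead == true, instructions marked Dead are stepped over.
bool reachesRet(const std::vector<Kind> &block, std::size_t call, bool skipDead) {
  for (std::size_t i = call + 1; i < block.size(); ++i) {
    if (block[i] == Kind::Ret)
      return true;
    if (block[i] == Kind::Dead && skipDead)
      continue;
    return false;
  }
  return false; // Ran off the end of the block without seeing a return.
}

// Assumed shape of the -O0 case: nothing sits between the call and the return.
const std::vector<Kind> kDirect = {Kind::Call, Kind::Ret};

// Assumed shape of the -O1 case: leftover return-value instructions sit
// between the call and the return.
const std::vector<Kind> kBlocked = {Kind::Call, Kind::Dead, Kind::Dead, Kind::Ret};
```

In this model, `reachesRet(kDirect, 0, false)` succeeds, `reachesRet(kBlocked, 0, false)` fails, and `reachesRet(kBlocked, 0, true)` succeeds again, which is roughly the difference between the `-O0` and `-O1` orderings described above.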

https://github.com/llvm/llvm-project/pull/85271

