[clang] b32aa72 - Recommit [C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty

Chuanqi Xu via cfe-commits cfe-commits at lists.llvm.org
Mon Aug 28 02:14:00 PDT 2023


Author: Chuanqi Xu
Date: 2023-08-28T17:07:30+08:00
New Revision: b32aa72afc1d6a094fde6ca04d8a1124af34a2ad

URL: https://github.com/llvm/llvm-project/commit/b32aa72afc1d6a094fde6ca04d8a1124af34a2ad
DIFF: https://github.com/llvm/llvm-project/commit/b32aa72afc1d6a094fde6ca04d8a1124af34a2ad.diff

LOG: Recommit [C++20] [Coroutines] Mark await_suspend as noinline if the awaiter is not empty

The original patch is incorrect since it marks too many calls to be
noinline. It shows that it is bad to do analysis in the frontend again.
This patch tries to mark the await_suspend function as noinlne only.

---

Close https://github.com/llvm/llvm-project/issues/56301
Close https://github.com/llvm/llvm-project/issues/64151
Close https://github.com/llvm/llvm-project/issues/65018

See the summary and the discussion of
https://reviews.llvm.org/D157070
to get the full context.

As @rjmccall pointed out, the key point of the root cause is that
currently we didn't implement the semantics for '@llvm.coro.save'
well ("after the await-ready returns false, the coroutine is considered
to be suspended ") well.
Since the semantics implies that we (the compiler) shouldn't write
the spills into the coroutine frame in the await_suspend. But now it is
possible due to some combinations of the optimizations so the semantics are
broken. And the inlining is the root optimization of such optimizations.
So in this patch, we tried to add the `noinline` attribute to the
await_suspend function.

This looks slightly problematic since the users are able to call the
await_suspend function standalone. This is limited by the
implementation. On the one hand, we don't want the workaround solution
(See the proposed solution later) to be too complex. On the other hand,
it is rare to call await_suspend standalone. Also it is not semantically
incorrect to do so since the inlining is not part of the C++ standard.

Also as an optimization, we don't add the `noinline` attribute to
the await_suspend function if the awaiter is an empty class. This should be
correct since the programmers can't access the local variables in
await_suspend if the awaiter is empty. I think this is necessary for
the performance since it is pretty common.

The long term solution is:

    call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle,
                                  ptr @awaitSuspendFn)

Then it is much easier to perform the safety analysis in the middle
end. If it is safe to inline the call to awaitSuspend, we can replace it
in the CoroEarly pass. Otherwise we could replace it in the CoroSplit
pass.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D157833

Added: 
    clang/test/CodeGenCoroutines/coro-awaiter-noinline-suspend.cpp
    clang/test/CodeGenCoroutines/pr56301.cpp
    clang/test/CodeGenCoroutines/pr65018.cpp

Modified: 
    clang/docs/ReleaseNotes.rst
    clang/lib/CodeGen/CGCoroutine.cpp
    clang/lib/Sema/SemaCoroutine.cpp

Removed: 
    


################################################################################
diff  --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 767861df912998..6dc3c1c5fbcef8 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -174,6 +174,13 @@ Bug Fixes in This Version
   ``abs`` builtins.
   (`#45129 <https://github.com/llvm/llvm-project/issues/45129>`_,
   `#45794 <https://github.com/llvm/llvm-project/issues/45794>`_)
+- Fixed an issue where accesses to the local variables of a coroutine during
+  ``await_suspend`` could be misoptimized, including accesses to the awaiter
+  object itself.
+  (`#56301 <https://github.com/llvm/llvm-project/issues/56301>`_)
+  The current solution may bring performance regressions if the awaiters have
+  non-static data members. See
+  `#64945 <https://github.com/llvm/llvm-project/issues/64945>`_ for details.
 - Clang now prints unnamed members in diagnostic messages instead of giving an
   empty ''. Fixes
   (`#63759 <https://github.com/llvm/llvm-project/issues/63759>`_)

diff  --git a/clang/lib/CodeGen/CGCoroutine.cpp b/clang/lib/CodeGen/CGCoroutine.cpp
index cf260570510176..47c077be4138a6 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -201,6 +201,7 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   CGF.CurCoro.InSuspendBlock = true;
   auto *SuspendRet = CGF.EmitScalarExpr(S.getSuspendExpr());
   CGF.CurCoro.InSuspendBlock = false;
+
   if (SuspendRet != nullptr && SuspendRet->getType()->isIntegerTy(1)) {
     // Veto suspension if requested by bool returning await_suspend.
     BasicBlock *RealSuspendBlock =

diff  --git a/clang/lib/Sema/SemaCoroutine.cpp b/clang/lib/Sema/SemaCoroutine.cpp
index c7d88f7784c187..0b2987376b8b3b 100644
--- a/clang/lib/Sema/SemaCoroutine.cpp
+++ b/clang/lib/Sema/SemaCoroutine.cpp
@@ -360,6 +360,63 @@ static Expr *maybeTailCall(Sema &S, QualType RetType, Expr *E,
                                 JustAddress);
 }
 
+/// The await_suspend call performed by co_await is essentially asynchronous
+/// to the execution of the coroutine. Inlining it normally into an unsplit
+/// coroutine can cause miscompilation because the coroutine CFG misrepresents
+/// the true control flow of the program: things that happen in the
+/// await_suspend are not guaranteed to happen prior to the resumption of the
+/// coroutine, and things that happen after the resumption of the coroutine
+/// (including its exit and the potential deallocation of the coroutine frame)
+/// are not guaranteed to happen only after the end of await_suspend.
+///
+/// See https://github.com/llvm/llvm-project/issues/56301 and
+/// https://reviews.llvm.org/D157070 for the example and the full discussion.
+///
+/// The short-term solution to this problem is to mark the call as uninlinable.
+/// But we don't want to do this if the call is known to be trivial, which is
+/// very common.
+///
+/// The long-term solution may introduce patterns like:
+///
+///  call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle,
+///                                ptr @awaitSuspendFn)
+///
+/// Then it is much easier to perform the safety analysis in the middle end.
+/// If it is safe to inline the call to awaitSuspend, we can replace it in the
+/// CoroEarly pass. Otherwise we could replace it in the CoroSplit pass.
+static void tryMarkAwaitSuspendNoInline(Sema &S, OpaqueValueExpr *Awaiter,
+                                        CallExpr *AwaitSuspend) {
+  // The method here to extract the awaiter decl is not precise.
+  // This is intentional. Since it is hard to perform the analysis in the
+  // frontend due to the complexity of C++'s type systems.
+  // And we prefer to perform such analysis in the middle end since it is
+  // easier to implement and more powerful.
+  CXXRecordDecl *AwaiterDecl =
+      Awaiter->getType().getNonReferenceType()->getAsCXXRecordDecl();
+
+  if (AwaiterDecl && AwaiterDecl->field_empty())
+    return;
+
+  FunctionDecl *FD = AwaitSuspend->getDirectCallee();
+
+  assert(FD);
+
+  // If the `await_suspend()` function is marked as `always_inline` explicitly,
+  // we should give the user the right to control the codegen.
+  if (FD->hasAttr<NoInlineAttr>() || FD->hasAttr<AlwaysInlineAttr>())
+    return;
+
+  // This is problematic if the user calls the await_suspend standalone. But on
+  // the on hand, it is not incorrect semantically since inlining is not part
+  // of the standard. On the other hand, it is relatively rare to call
+  // the await_suspend function standalone.
+  //
+  // And given we've already had the long-term plan, the current workaround
+  // looks relatively tolerant.
+  FD->addAttr(
+      NoInlineAttr::CreateImplicit(S.getASTContext(), FD->getLocation()));
+}
+
 /// Build calls to await_ready, await_suspend, and await_resume for a co_await
 /// expression.
 /// The generated AST tries to clean up temporary objects as early as
@@ -431,6 +488,10 @@ static ReadySuspendResumeResult buildCoawaitCalls(Sema &S, VarDecl *CoroPromise,
     //     type Z.
     QualType RetType = AwaitSuspend->getCallReturnType(S.Context);
 
+    // We need to mark await_suspend as noinline temporarily. See the comment
+    // of tryMarkAwaitSuspendNoInline for details.
+    tryMarkAwaitSuspendNoInline(S, Operand, AwaitSuspend);
+
     // Support for coroutine_handle returning await_suspend.
     if (Expr *TailCallSuspend =
             maybeTailCall(S, RetType, AwaitSuspend, Loc))

diff  --git a/clang/test/CodeGenCoroutines/coro-awaiter-noinline-suspend.cpp b/clang/test/CodeGenCoroutines/coro-awaiter-noinline-suspend.cpp
new file mode 100644
index 00000000000000..f95286faf46ec8
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-awaiter-noinline-suspend.cpp
@@ -0,0 +1,168 @@
+// Tests that we can mark await-suspend as noinline correctly.
+//
+// RUN: %clang_cc1 -std=c++20 -triple x86_64-unknown-linux-gnu -emit-llvm -o - %s \
+// RUN:     -O1 -disable-llvm-passes | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+struct Task {
+  struct promise_type {
+    struct FinalAwaiter {
+      bool await_ready() const noexcept { return false; }
+      template <typename PromiseType>
+      std::coroutine_handle<> await_suspend(std::coroutine_handle<PromiseType> h) noexcept {
+        return h.promise().continuation;
+      }
+      void await_resume() noexcept {}
+    };
+
+    Task get_return_object() noexcept {
+      return std::coroutine_handle<promise_type>::from_promise(*this);
+    }
+
+    std::suspend_always initial_suspend() noexcept { return {}; }
+    FinalAwaiter final_suspend() noexcept { return {}; }
+    void unhandled_exception() noexcept {}
+    void return_void() noexcept {}
+
+    std::coroutine_handle<> continuation;
+  };
+
+  Task(std::coroutine_handle<promise_type> handle);
+  ~Task();
+
+private:
+  std::coroutine_handle<promise_type> handle;
+};
+
+struct StatefulAwaiter {
+    int value;
+    bool await_ready() const noexcept { return false; }
+    template <typename PromiseType>
+    void await_suspend(std::coroutine_handle<PromiseType> h) noexcept {}
+    void await_resume() noexcept {}
+};
+
+typedef std::suspend_always NoStateAwaiter;
+using AnotherStatefulAwaiter = StatefulAwaiter;
+
+template <class T>
+struct TemplatedAwaiter {
+    T value;
+    bool await_ready() const noexcept { return false; }
+    template <typename PromiseType>
+    void await_suspend(std::coroutine_handle<PromiseType> h) noexcept {}
+    void await_resume() noexcept {}
+};
+
+
+class Awaitable {};
+StatefulAwaiter operator co_await(Awaitable) {
+  return StatefulAwaiter{};
+}
+
+StatefulAwaiter GlobalAwaiter;
+class Awaitable2 {};
+StatefulAwaiter& operator co_await(Awaitable2) {
+  return GlobalAwaiter;
+}
+
+struct AlwaysInlineStatefulAwaiter {
+    void* value;
+    bool await_ready() const noexcept { return false; }
+
+    template <typename PromiseType>
+    __attribute__((always_inline))
+    void await_suspend(std::coroutine_handle<PromiseType> h) noexcept {}
+
+    void await_resume() noexcept {}
+};
+
+Task testing() {
+    co_await std::suspend_always{};
+    co_await StatefulAwaiter{};
+    co_await AnotherStatefulAwaiter{};
+    
+    // Test lvalue case.
+    StatefulAwaiter awaiter;
+    co_await awaiter;
+
+    // The explicit call to await_suspend is not considered suspended.
+    awaiter.await_suspend(std::coroutine_handle<void>::from_address(nullptr));
+
+    co_await TemplatedAwaiter<int>{};
+    TemplatedAwaiter<int> TemplatedAwaiterInstace;
+    co_await TemplatedAwaiterInstace;
+
+    co_await Awaitable{};
+    co_await Awaitable2{};
+
+    co_await AlwaysInlineStatefulAwaiter{};
+}
+
+struct AwaitTransformTask {
+  struct promise_type {
+    struct FinalAwaiter {
+      bool await_ready() const noexcept { return false; }
+      template <typename PromiseType>
+      std::coroutine_handle<> await_suspend(std::coroutine_handle<PromiseType> h) noexcept {
+        return h.promise().continuation;
+      }
+      void await_resume() noexcept {}
+    };
+
+    AwaitTransformTask get_return_object() noexcept {
+      return std::coroutine_handle<promise_type>::from_promise(*this);
+    }
+
+    std::suspend_always initial_suspend() noexcept { return {}; }
+    FinalAwaiter final_suspend() noexcept { return {}; }
+    void unhandled_exception() noexcept {}
+    void return_void() noexcept {}
+
+    template <typename Awaitable>
+    auto await_transform(Awaitable &&awaitable) {
+      return awaitable;
+    }
+
+    std::coroutine_handle<> continuation;
+  };
+
+  AwaitTransformTask(std::coroutine_handle<promise_type> handle);
+  ~AwaitTransformTask();
+
+private:
+  std::coroutine_handle<promise_type> handle;
+};
+
+struct awaitableWithGetAwaiter {
+  bool await_ready() const noexcept { return false; }
+  template <typename PromiseType>
+  void await_suspend(std::coroutine_handle<PromiseType> h) noexcept {}
+  void await_resume() noexcept {}
+};
+
+AwaitTransformTask testingWithAwaitTransform() {
+  co_await awaitableWithGetAwaiter{};
+}
+
+// CHECK: define{{.*}}@_ZNSt14suspend_always13await_suspendESt16coroutine_handleIvE{{.*}}#[[NORMAL_ATTR:[0-9]+]]
+
+// CHECK: define{{.*}}@_ZN15StatefulAwaiter13await_suspendIN4Task12promise_typeEEEvSt16coroutine_handleIT_E{{.*}}#[[NOINLINE_ATTR:[0-9]+]]
+
+// CHECK: define{{.*}}@_ZN15StatefulAwaiter13await_suspendIvEEvSt16coroutine_handleIT_E{{.*}}#[[NORMAL_ATTR]]
+
+// CHECK: define{{.*}}@_ZN16TemplatedAwaiterIiE13await_suspendIN4Task12promise_typeEEEvSt16coroutine_handleIT_E{{.*}}#[[NOINLINE_ATTR]]
+
+// CHECK: define{{.*}}@_ZN27AlwaysInlineStatefulAwaiter13await_suspendIN4Task12promise_typeEEEvSt16coroutine_handleIT_E{{.*}}#[[ALWAYS_INLINE_ATTR:[0-9]+]]
+
+// CHECK: define{{.*}}@_ZN4Task12promise_type12FinalAwaiter13await_suspendIS0_EESt16coroutine_handleIvES3_IT_E{{.*}}#[[NORMAL_ATTR]]
+
+// CHECK: define{{.*}}@_ZN23awaitableWithGetAwaiter13await_suspendIN18AwaitTransformTask12promise_typeEEEvSt16coroutine_handleIT_E{{.*}}#[[NORMAL_ATTR]]
+
+// CHECK: define{{.*}}@_ZN18AwaitTransformTask12promise_type12FinalAwaiter13await_suspendIS0_EESt16coroutine_handleIvES3_IT_E{{.*}}#[[NORMAL_ATTR]]
+
+// CHECK-NOT: attributes #[[NORMAL_ATTR]] = noinline
+// CHECK: attributes #[[NOINLINE_ATTR]] = {{.*}}noinline
+// CHECK-NOT: attributes #[[ALWAYS_INLINE_ATTR]] = {{.*}}noinline
+// CHECK: attributes #[[ALWAYS_INLINE_ATTR]] = {{.*}}alwaysinline

diff  --git a/clang/test/CodeGenCoroutines/pr56301.cpp b/clang/test/CodeGenCoroutines/pr56301.cpp
new file mode 100644
index 00000000000000..cd851c0b815db9
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/pr56301.cpp
@@ -0,0 +1,85 @@
+// An end-to-end test to make sure things get processed correctly.
+// RUN: %clang_cc1 -std=c++20 -triple x86_64-unknown-linux-gnu -emit-llvm -o - %s -O3 | \
+// RUN:     FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+struct SomeAwaitable {
+  // Resume the supplied handle once the awaitable becomes ready,
+  // returning a handle that should be resumed now for the sake of symmetric transfer.
+  // If the awaitable is already ready, return an empty handle without doing anything.
+  //
+  // Defined in another translation unit. Note that this may contain
+  // code that synchronizees with another thread.
+  std::coroutine_handle<> Register(std::coroutine_handle<>);
+};
+
+// Defined in another translation unit.
+void DidntSuspend();
+
+struct Awaiter {
+  SomeAwaitable&& awaitable;
+  bool suspended;
+
+  bool await_ready() { return false; }
+
+  std::coroutine_handle<> await_suspend(const std::coroutine_handle<> h) {
+    // Assume we will suspend unless proven otherwise below. We must do
+    // this *before* calling Register, since we may be destroyed by another
+    // thread asynchronously as soon as we have registered.
+    suspended = true;
+
+    // Attempt to hand off responsibility for resuming/destroying the coroutine.
+    const auto to_resume = awaitable.Register(h);
+
+    if (!to_resume) {
+      // The awaitable is already ready. In this case we know that Register didn't
+      // hand off responsibility for the coroutine. So record the fact that we didn't
+      // actually suspend, and tell the compiler to resume us inline.
+      suspended = false;
+      return h;
+    }
+
+    // Resume whatever Register wants us to resume.
+    return to_resume;
+  }
+
+  void await_resume() {
+    // If we didn't suspend, make note of that fact.
+    if (!suspended) {
+      DidntSuspend();
+    }
+  }
+};
+
+struct MyTask{
+  struct promise_type {
+    MyTask get_return_object() { return {}; }
+    std::suspend_never initial_suspend() { return {}; }
+    std::suspend_always final_suspend() noexcept { return {}; }
+    void unhandled_exception();
+
+    Awaiter await_transform(SomeAwaitable&& awaitable) {
+      return Awaiter{static_cast<SomeAwaitable&&>(awaitable)};
+    }
+  };
+};
+
+MyTask FooBar() {
+  co_await SomeAwaitable();
+}
+
+// CHECK-LABEL: @_Z6FooBarv
+// CHECK: %[[to_resume:.*]] = {{.*}}call ptr @_ZN13SomeAwaitable8RegisterESt16coroutine_handleIvE
+// CHECK-NEXT: %[[to_bool:.*]] = icmp eq ptr %[[to_resume]], null
+// CHECK-NEXT: br i1 %[[to_bool]], label %[[then:.*]], label %[[else:.*]]
+
+// CHECK: [[then]]:
+// We only access the coroutine frame conditionally as the sources did.
+// CHECK:   store i8 0,
+// CHECK-NEXT: br label %[[else]]
+
+// CHECK: [[else]]:
+// No more access to the coroutine frame until suspended.
+// CHECK-NOT: store
+// CHECK: }

diff  --git a/clang/test/CodeGenCoroutines/pr65018.cpp b/clang/test/CodeGenCoroutines/pr65018.cpp
new file mode 100644
index 00000000000000..03d123e1e7e2f7
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/pr65018.cpp
@@ -0,0 +1,43 @@
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -std=c++20 \
+// RUN:      -O1 -emit-llvm %s -o - | FileCheck %s
+
+#include "Inputs/coroutine.h"
+
+// A simple awaiter type with an await_suspend method that can't be
+// inlined.
+struct Awaiter {
+  const int& x;
+
+  bool await_ready() { return false; }
+  std::coroutine_handle<> await_suspend(const std::coroutine_handle<> h);
+  void await_resume() {}
+};
+
+struct MyTask {
+  // A lazy promise with an await_transform method that supports awaiting
+  // integer references using the Awaiter struct above.
+  struct promise_type {
+    MyTask get_return_object() { return {}; }
+    std::suspend_always initial_suspend() { return {}; }
+    std::suspend_always final_suspend() noexcept { return {}; }
+    void unhandled_exception();
+
+    auto await_transform(const int& x) { return Awaiter{x}; }
+  };
+};
+
+// A global array of integers.
+int g_array[32];
+
+// A coroutine that awaits each integer in the global array.
+MyTask FooBar() {
+  for (const int& x : g_array) {
+    co_await x;
+  }
+}
+
+// CHECK: %[[RET:.+]] = {{.*}}call{{.*}}@_ZN7Awaiter13await_suspendESt16coroutine_handleIvE
+// CHECK: %[[RESUME_ADDR:.+]] = load ptr, ptr %[[RET]],
+// CHECK: musttail call fastcc void %[[RESUME_ADDR]]({{.*}}%[[RET]]
+// CHECK: ret
+


        


More information about the cfe-commits mailing list