[libcxx-commits] [clang] [libcxx] Elide suspension points via [[clang::coro_await_suspend_destroy]] (PR #152623)
via libcxx-commits
libcxx-commits at lists.llvm.org
Fri Aug 8 00:12:32 PDT 2025
https://github.com/snarkmaster updated https://github.com/llvm/llvm-project/pull/152623
From 9fc3169ea5f1aea2a88b2616b7c9c4f2949139be Mon Sep 17 00:00:00 2001
From: Alexey <snarkmaster at gmail.com>
Date: Thu, 7 Aug 2025 12:10:07 -0700
Subject: [PATCH 1/3] Elide suspension points via
[[clang::coro_await_suspend_destroy]]
Start by reading the detailed user-facing docs in `AttrDocs.td`.
My immediate motivation was that I noticed that short-circuiting coroutines
failed to optimize well. Interact with the demo program here:
https://godbolt.org/z/E3YK5c45a
If Clang on Compiler Explorer supported [[clang::coro_await_suspend_destroy]],
the assembly for `simple_coro` would be drastically shorter, and would not
contain a call to `operator new`.
Here are a few high-level thoughts that don't belong in the docs:
- This has `lit` tests, but what gives me real confidence in its correctness
is the integration test in `coro_await_suspend_destroy_test.cpp`. This
caught all the interesting bugs that I had in earlier revs, and covers
equivalence to the standard code path in far more scenarios.
- I considered a variety of other designs. Here are some key design points:
* I considered optimizing unmodified `await_suspend()` methods, as long as
they unconditionally end with an `h.destroy()` call on the current
handle, or an exception. However, this would (a) force dynamic dispatch
for `destroy` -- bloating IR & reducing optimization opportunities, (b)
require far more complex, delicate, and fragile analysis, (c) retain more
of the frame setup, so that e.g. `h.done()` works properly. The current
solution shortcuts all these concerns.
* I want to pass `Promise&`, rather than `std::coroutine_handle`, to
`await_suspend_destroy` -- this is safer, simpler, and more efficient.
Short-circuiting coroutines should not touch the handle. This decision
forces the attribute to go on the class. Resolving a method attribute
would have required looking up overloads for both types, and choosing
one, which is costly and a bad UX to boot.
* `AttrDocs.td` tells portable code to provide a stub `await_suspend()`.
This portability / compatibility solution avoids dire issues that would
arise if users relied on `__has_cpp_attribute` and the declaration and
definition happened to use different toolchains. In particular, it will
even be safe for a future compiler release to killswitch this attribute
by removing its implementation and setting its version to 0.
```
let Spellings = [Clang<"coro_await_suspend_destroy", /*allowInC*/ 0,
/*Version*/ 0>];
```
- In the docs, I mention the `HasCoroSuspend` path in `CoroEarly.cpp` as
a further optimization opportunity. But I'm sure there are
higher-leverage ways of making these non-suspending coros compile better; I
just don't know the coro optimization pipeline well enough to flag them.
- IIUC the only interaction of this with `coro_only_destroy_when_complete`
would be that the compiler expends fewer cycles.
- I ran some benchmarks on [folly::result](
https://github.com/facebook/folly/blob/main/folly/result/docs/result.md).
Heap allocs are definitely elided, the compiled code looks like a function,
not a coroutine, but there's still an optimization gap. On the plus side,
this results in a 4x speedup (!) in optimized ASAN builds (numbers not
shown for brevity).
```
// Simple result coroutine that adds 1 to the input
result<int> result_coro(result<int>&& r) {
co_return co_await std::move(r) + 1;
}
// Non-coroutine equivalent using value_or_throw()
result<int> catching_result_func(result<int>&& r) {
return result_catch_all([&]() -> result<int> {
if (r.has_value()) {
return r.value_or_throw() + 1;
}
return std::move(r).non_value();
});
}
// Not QUITE equivalent to the coro -- lacks the exception boundary
result<int> non_catching_result_func(result<int>&& r) {
if (r.has_value()) {
return r.value_or_throw() + 1;
}
return std::move(r).non_value();
}
============================================================================
[...]lly/result/test/result_coro_bench.cpp relative time/iter iters/s
============================================================================
result_coro_success 13.61ns 73.49M
non_catching_result_func_success 3.39ns 295.00M
catching_result_func_success 4.41ns 226.88M
result_coro_error 19.55ns 51.16M
non_catching_result_func_error 9.15ns 109.26M
catching_result_func_error 10.19ns 98.10M
============================================================================
[...]lly/result/test/result_coro_bench.cpp relative time/iter iters/s
============================================================================
result_coro_success 10.59ns 94.39M
non_catching_result_func_success 3.39ns 295.00M
catching_result_func_success 4.07ns 245.81M
result_coro_error 13.66ns 73.18M
non_catching_result_func_error 9.00ns 111.11M
catching_result_func_error 10.04ns 99.63M
```
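To make the `Promise&` design point above concrete, here is a rough plain-C++
sketch (not taken from the patch) of the control flow that a short-circuiting
`co_await` boils down to. `toy_promise`, `toy_awaiter`, and
`desugared_simple_coro` are hypothetical names chosen for illustration, and the
sketch leaves out the promise's exception boundary and the real frame cleanup.
```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical stand-in for a coroutine promise; not part of the patch.
struct toy_promise {
  std::optional<int> result;  // stays empty unless the body "co_return"s
};

// Uses the awaiter method names from this patch, minus coroutine machinery.
struct toy_awaiter {
  std::optional<int> opt_;
  bool await_ready() const noexcept { return opt_.has_value(); }
  int await_resume() { return std::move(opt_).value(); }
  void await_suspend_destroy(toy_promise&) noexcept {
    // Nothing to do: the promise's result already defaults to "empty".
  }
};

// Hand-written analogue of `co_return co_await r + 4;`.
std::optional<int> desugared_simple_coro(std::optional<int> r) {
  toy_promise promise;
  toy_awaiter awaiter{std::move(r)};
  if (!awaiter.await_ready()) {
    awaiter.await_suspend_destroy(promise);  // gets Promise&, never a handle
    return promise.result;                   // short-circuit: never resumed
  }
  promise.result = awaiter.await_resume() + 4;
  return promise.result;
}
```
Because the awaiter only ever sees the promise, nothing in this path needs a
`std::coroutine_handle`, which is the property the attribute relies on.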
Demo program from the Compiler Explorer link above:
```cpp
#include <coroutine>
#include <optional>
// Read this LATER -- this implementation detail isn't required to understand
// the value of [[clang::coro_await_suspend_destroy]].
//
// `optional_wrapper` exists since `get_return_object()` can't return
// `std::optional` directly. C++ coroutines have a fundamental timing mismatch
// between when the return object is created and when the value is available:
//
// 1) Early (coroutine startup): `get_return_object()` is called and must return
// something immediately.
// 2) Later (when `co_return` executes): `return_value(T)` is called with the
// actual value.
// 3) Issue: If `get_return_object()` returns the storage, it's empty when
// returned, and writing to it later cannot affect the already-returned copy.
template <typename T>
struct optional_wrapper {
std::optional<T> storage_;
std::optional<T>*& pointer_;
optional_wrapper(std::optional<T>*& p) : pointer_(p) {
pointer_ = &storage_;
}
operator std::optional<T>() { return std::move(storage_); }
~optional_wrapper() {}
};
// Make `std::optional` a coroutine
template <typename T, typename... Args>
struct std::coroutine_traits<std::optional<T>, Args...> {
struct promise_type {
std::optional<T>* storagePtr_ = nullptr;
promise_type() = default;
::optional_wrapper<T> get_return_object() {
return ::optional_wrapper<T>(storagePtr_);
}
std::suspend_never initial_suspend() const noexcept { return {}; }
std::suspend_never final_suspend() const noexcept { return {}; }
void return_value(T&& value) { *storagePtr_ = std::move(value); }
void unhandled_exception() {
// Leave storage_ empty to represent error
}
};
};
template <typename T>
struct [[clang::coro_await_suspend_destroy]] optional_awaitable {
std::optional<T> opt_;
bool await_ready() const noexcept { return opt_.has_value(); }
T await_resume() { return std::move(opt_).value(); }
// Adding `noexcept` here makes the early IR much smaller, but the
// optimizer is able to discard the cruft for simpler cases.
void await_suspend_destroy(auto& promise) noexcept {
// Assume the return object defaults to "empty"
}
void await_suspend(auto handle) {
await_suspend_destroy(handle.promise());
handle.destroy();
}
};
template <typename T>
optional_awaitable<T> operator co_await(std::optional<T> opt) {
return {std::move(opt)};
}
// Non-coroutine baseline -- matches the logic of `simple_coro`.
std::optional<int> simple_func(const std::optional<int>& r) {
try {
if (r.has_value()) {
return r.value() + 1;
}
} catch (...) {}
return std::nullopt; // return empty on empty input or error
}
// Without `coro_await_suspend_destroy`, allocates its frame on-heap.
std::optional<int> simple_coro(const std::optional<int>& r) {
co_return co_await std::move(r) + 4;
}
// Without `co_await`, this optimizes much like `simple_func`.
// Bugs:
// - Doesn't short-circuit when `r` is empty, but throws
// - Lacks an exception boundary
std::optional<int> wrong_simple_coro(const std::optional<int>& r) {
co_return r.value() + 2;
}
int main() {
return
simple_func(std::optional<int>{32}).value() +
simple_coro(std::optional<int>{8}).value() +
wrong_simple_coro(std::optional<int>{16}).value();
}
```
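The timing mismatch described in the `optional_wrapper` comments can be
reproduced without any coroutine, which may make the pointer indirection easier
to follow. This is a minimal sketch under assumptions: `toy_promise`,
`toy_wrapper`, `run_toy_coro`, and `run_broken` are hypothetical names, and the
three numbered steps are called by hand in the order the comments above
describe.
```cpp
#include <cassert>
#include <optional>
#include <utility>

// Hypothetical miniature of the promise: it only remembers where to write.
struct toy_promise {
  std::optional<int>* storagePtr_ = nullptr;
  void return_value(int v) { *storagePtr_ = v; }
};

// Mirrors `optional_wrapper`: owns the storage and registers its address.
struct toy_wrapper {
  std::optional<int> storage_;
  explicit toy_wrapper(std::optional<int>*& p) { p = &storage_; }
  toy_wrapper(const toy_wrapper&) = delete;  // the address must stay stable
  operator std::optional<int>() { return std::move(storage_); }
};

std::optional<int> run_toy_coro(int value) {
  toy_promise promise;
  toy_wrapper ret(promise.storagePtr_);  // 1) early: "get_return_object()"
  promise.return_value(value);           // 2) later: "co_return value"
  return ret;                            // 3) converted to optional only now
}

// Contrast: hand out a copy of the storage early; the later write is lost.
std::optional<int> run_broken(int value) {
  std::optional<int> storage;
  std::optional<int> returned = storage;  // copy taken at "startup"
  storage = value;                        // only the original sees this
  return returned;
}
```
With these definitions, `run_toy_coro(7)` yields an engaged optional while
`run_broken(7)` comes back empty, which is the failure mode the wrapper exists
to avoid.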
Test Plan:
For the all-important E2E test, I used this terrible cargo-culted script to run
the new end-to-end test with the new compiler. (Yes, I realize I should only
need 10% of those `-D` settings for a successful build.)
To make sure the test covered what I meant it to do:
- I also added an `#error` in the "no attribute" branch to make sure the
compiler indeed supports the attribute.
- I ran it with a compiler not supporting the attribute, and that also
passed.
- I also tried `return 1;` from `main()` and saw the logs of the 7 successful
tests running.
```sh
#!/bin/bash -uex
set -o pipefail
LLVMBASE=/path/to/source/of/llvm-project
SYSCLANG=/path/to/original/bin/clang
# NB Can add `--debug-output` to debug cmake...
# Bootstrap clang -- Use `RelWithDebInfo` or the next phase is too slow!
mkdir -p bootstrap
cd bootstrap
cmake "$LLVMBASE/llvm" \
-G Ninja \
-DBUILD_SHARED_LIBS=true \
-DCMAKE_ASM_COMPILER="$SYSCLANG" \
-DCMAKE_ASM_COMPILER_ID=Clang \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_CXX_COMPILER="$SYSCLANG"++ \
-DCMAKE_C_COMPILER="$SYSCLANG" \
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-redhat-linux-gnu \
-DLLVM_HOST_TRIPLE=x86_64-redhat-linux-gnu \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DLLVM_ENABLE_BINDINGS=OFF \
-DLLVM_ENABLE_LLD=ON \
-DLLVM_ENABLE_PROJECTS="clang;lld" \
-DLLVM_OPTIMIZED_TABLEGEN=true \
-DLLVM_FORCE_ENABLE_STATS=ON \
-DLLVM_ENABLE_DUMP=ON \
-DCLANG_DEFAULT_PIE_ON_LINUX=OFF
ninja clang lld
ninja check-clang-codegencoroutines # Includes the new IR regression tests
cd ..
NEWCLANG="$PWD"/bootstrap/bin/clang
NEWLLD="$PWD"/bootstrap/bin/lld
# LIBCXX_INCLUDE_BENCHMARKS=OFF because google-benchmark bugs out
cmake "$LLVMBASE/runtimes" \
-G Ninja \
-DLLVM_DEFAULT_TARGET_TRIPLE=x86_64-redhat-linux-gnu \
-DLLVM_HOST_TRIPLE=x86_64-redhat-linux-gnu \
-DBUILD_SHARED_LIBS=true \
-DCMAKE_ASM_COMPILER="$NEWCLANG" \
-DCMAKE_ASM_COMPILER_ID=Clang \
-DCMAKE_C_COMPILER="$NEWCLANG" \
-DCMAKE_CXX_COMPILER="$NEWCLANG"++ \
-DLLVM_FORCE_ENABLE_STATS=ON \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DLLVM_ENABLE_LLD=ON \
-DLIBCXX_INCLUDE_TESTS=ON \
-DLIBCXX_INCLUDE_BENCHMARKS=OFF \
-DLLVM_INCLUDE_TESTS=ON \
-DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind" \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_EXPORT_COMPILE_COMMANDS=ON
ninja cxx-test-depends
LIBCXXBUILD=$PWD
cd "$LLVMBASE"
libcxx/utils/libcxx-lit "$LIBCXXBUILD" -v \
libcxx/test/std/language.support/support.coroutines/end.to.end/coro_await_suspend_destroy.pass.cpp
```
---
clang/docs/ReleaseNotes.rst | 6 +
clang/include/clang/Basic/Attr.td | 8 +
clang/include/clang/Basic/AttrDocs.td | 87 ++++
.../clang/Basic/DiagnosticSemaKinds.td | 3 +
clang/lib/CodeGen/CGCoroutine.cpp | 232 +++++++---
clang/lib/Sema/SemaCoroutine.cpp | 102 ++++-
.../coro-await-suspend-destroy-errors.cpp | 55 +++
.../coro-await-suspend-destroy.cpp | 129 ++++++
...a-attribute-supported-attributes-list.test | 1 +
.../coro_await_suspend_destroy.pass.cpp | 409 ++++++++++++++++++
10 files changed, 942 insertions(+), 90 deletions(-)
create mode 100644 clang/test/CodeGenCoroutines/coro-await-suspend-destroy-errors.cpp
create mode 100644 clang/test/CodeGenCoroutines/coro-await-suspend-destroy.cpp
create mode 100644 libcxx/test/std/language.support/support.coroutines/end.to.end/coro_await_suspend_destroy.pass.cpp
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 0e9fcaa5fac6a..41c412730b033 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -136,6 +136,12 @@ Removed Compiler Flags
Attribute Changes in Clang
--------------------------
+- Introduced a new attribute ``[[clang::coro_await_suspend_destroy]]``. When
+ applied to a coroutine awaiter class, it causes suspensions into this awaiter
+ to use a new ``await_suspend_destroy(Promise&)`` method instead of the standard
+ ``await_suspend(std::coroutine_handle<...>)``. The coroutine is then destroyed.
+ This improves code speed & size for "short-circuiting" coroutines.
+
Improvements to Clang's diagnostics
-----------------------------------
- Added a separate diagnostic group ``-Wfunction-effect-redeclarations``, for the more pedantic
diff --git a/clang/include/clang/Basic/Attr.td b/clang/include/clang/Basic/Attr.td
index 30efb9f39e4f4..341848be00e7d 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -1352,6 +1352,14 @@ def CoroAwaitElidableArgument : InheritableAttr {
let SimpleHandler = 1;
}
+def CoroAwaitSuspendDestroy: InheritableAttr {
+ let Spellings = [Clang<"coro_await_suspend_destroy">];
+ let Subjects = SubjectList<[CXXRecord]>;
+ let LangOpts = [CPlusPlus];
+ let Documentation = [CoroAwaitSuspendDestroyDoc];
+ let SimpleHandler = 1;
+}
+
// OSObject-based attributes.
def OSConsumed : InheritableParamAttr {
let Spellings = [Clang<"os_consumed">];
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index 2b095ab975202..d2224d86b3900 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -9270,6 +9270,93 @@ Example:
}];
}
+def CoroAwaitSuspendDestroyDoc : Documentation {
+ let Category = DocCatDecl;
+ let Content = [{
+
+The ``[[clang::coro_await_suspend_destroy]]`` attribute may be applied to a C++
+coroutine awaiter type. When this attribute is present, the awaiter must
+implement ``void await_suspend_destroy(Promise&)``. If ``await_ready()``
+returns ``false`` at a suspension point, ``await_suspend_destroy`` will be
+called directly, bypassing the ``await_suspend(std::coroutine_handle<...>)``
+method. The coroutine being suspended will then be immediately destroyed.
+
+Logically, the new behavior is equivalent to this standard code:
+
+.. code-block:: c++
+
+ void await_suspend_destroy(YourPromise&) { ... }
+ void await_suspend(auto handle) {
+ await_suspend_destroy(handle.promise());
+ handle.destroy();
+ }
+
+This enables ``await_suspend_destroy()`` usage in portable awaiters -- just add a
+stub ``await_suspend()`` as above. Without ``coro_await_suspend_destroy``
+support, the awaiter will behave nearly identically, with the only difference
+being heap allocation instead of stack allocation for the coroutine frame.
+
+This attribute exists to optimize short-circuiting coroutines—coroutines whose
+suspend points are either (i) trivial (like ``std::suspend_never``), or (ii)
+short-circuiting (like a ``co_await`` that can be expressed in regular control
+flow as):
+
+.. code-block:: c++
+
+ T val;
+ if (awaiter.await_ready()) {
+ val = awaiter.await_resume();
+ } else {
+ awaiter.await_suspend();
+ return /* value representing the "execution short-circuited" outcome */;
+ }
+
+The benefits of this attribute are:
+ - **Avoid heap allocations for coro frames**: Allocating short-circuiting
+ coros on the stack makes code more predictable under memory pressure.
+ Without this attribute, LLVM cannot elide heap allocation even when all
+ awaiters are short-circuiting.
+ - **Performance**: Significantly faster execution and smaller code size.
+ - **Build time**: Faster compilation due to less IR being generated.
+
+Marking your ``await_suspend_destroy`` method as ``noexcept`` can sometimes
+further improve optimization.
+
+Here is a toy example of a portable short-circuiting awaiter:
+
+.. code-block:: c++
+
+ template <typename T>
+ struct [[clang::coro_await_suspend_destroy]] optional_awaitable {
+ std::optional<T> opt_;
+ bool await_ready() const noexcept { return opt_.has_value(); }
+ T await_resume() { return std::move(opt_).value(); }
+ void await_suspend_destroy(auto& promise) {
+ // Assume the return object of the outer coro defaults to "empty".
+ }
+ // Fallback for when `coro_await_suspend_destroy` is unavailable.
+ void await_suspend(auto handle) {
+ await_suspend_destroy(handle.promise());
+ handle.destroy();
+ }
+ };
+
+If all suspension points use (i) trivial or (ii) short-circuiting awaiters,
+then the coroutine optimizes more like a plain function, with 2 caveats:
+ - **Behavior:** The coroutine promise provides an implicit exception boundary
+ (as if wrapping the function in ``try { ... } catch (...) { unhandled_exception(); }``).
+ This exception handling behavior is usually desirable in robust,
+ return-value-oriented programs that need short-circuiting coroutines.
+ Otherwise, the promise can always re-throw.
+ - **Speed:** As of 2025, there is still an optimization gap between a
+ realistic short-circuiting coro, and the equivalent (but much more verbose)
+ function. For a guesstimate, expect 4-5ns per call on x86. One idea for
+ improvement is to also elide trivial suspends like ``std::suspend_never``, in
+ order to hit the ``HasCoroSuspend`` path in ``CoroEarly.cpp``.
+
+}];
+}
+
def CountedByDocs : Documentation {
let Category = DocCatField;
let Content = [{
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 116341f4b66d5..58e7dd7db86d1 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -12504,6 +12504,9 @@ def note_coroutine_promise_call_implicitly_required : Note<
def err_await_suspend_invalid_return_type : Error<
"return type of 'await_suspend' is required to be 'void' or 'bool' (have %0)"
>;
+def err_await_suspend_destroy_invalid_return_type : Error<
+ "return type of 'await_suspend_destroy' is required to be 'void' (have %0)"
+>;
def note_await_ready_no_bool_conversion : Note<
"return type of 'await_ready' is required to be contextually convertible to 'bool'"
>;
diff --git a/clang/lib/CodeGen/CGCoroutine.cpp b/clang/lib/CodeGen/CGCoroutine.cpp
index 827385f9c1a1f..d74bef592aa9c 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -174,6 +174,66 @@ static bool StmtCanThrow(const Stmt *S) {
return false;
}
+// Check if this suspend should be calling `await_suspend_destroy`
+static bool useCoroAwaitSuspendDestroy(const CoroutineSuspendExpr &S) {
+ // This can only be an `await_suspend_destroy` suspend expression if it
+ // returns void -- `buildCoawaitCalls` in `SemaCoroutine.cpp` asserts this.
+ // Moreover, when `await_suspend` returns a handle, the outermost method call
+ // is `.address()` -- making it harder to get the actual class or method.
+ if (S.getSuspendReturnType() !=
+ CoroutineSuspendExpr::SuspendReturnType::SuspendVoid) {
+ return false;
+ }
+
+ // `CGCoroutine.cpp` & `SemaCoroutine.cpp` must agree on whether this suspend
+ // expression uses `[[clang::coro_await_suspend_destroy]]`.
+ //
+ // Any mismatch is a serious bug -- we would either double-free, or fail to
+ // destroy the promise type. For this reason, we make our decision based on
+ // the method name, and fatal outside of the happy path -- including on
+ // failure to find a method name.
+ //
+ // As a debug-only check we also try to detect the `AwaiterClass`. This is
+ // secondary, because detection of the awaiter type can be silently broken by
+ // small `buildCoawaitCalls` AST changes.
+ StringRef SuspendMethodName; // Primary
+ CXXRecordDecl *AwaiterClass = nullptr; // Debug-only, best-effort
+ if (auto *SuspendCall =
+ dyn_cast<CallExpr>(S.getSuspendExpr()->IgnoreImplicit())) {
+ if (auto *SuspendMember = dyn_cast<MemberExpr>(SuspendCall->getCallee())) {
+ if (auto *BaseExpr = SuspendMember->getBase()) {
+ // `IgnoreImplicitAsWritten` is critical since `await_suspend...` can be
+ // invoked on the base of the actual awaiter, and the base need not have
+ // the attribute. In such cases, the AST will show the true awaiter
+ // being upcast to the base.
+ AwaiterClass = BaseExpr->IgnoreImplicitAsWritten()
+ ->getType()
+ ->getAsCXXRecordDecl();
+ }
+ if (auto *SuspendMethod =
+ dyn_cast<CXXMethodDecl>(SuspendMember->getMemberDecl())) {
+ SuspendMethodName = SuspendMethod->getName();
+ }
+ }
+ }
+ if (SuspendMethodName == "await_suspend_destroy") {
+ assert(!AwaiterClass ||
+ AwaiterClass->hasAttr<CoroAwaitSuspendDestroyAttr>());
+ return true;
+ } else if (SuspendMethodName == "await_suspend") {
+ assert(!AwaiterClass ||
+ !AwaiterClass->hasAttr<CoroAwaitSuspendDestroyAttr>());
+ return false;
+ } else {
+ llvm::report_fatal_error(
+ "Wrong method in [[clang::coro_await_suspend_destroy]] check: "
+ "expected 'await_suspend' or 'await_suspend_destroy', but got '" +
+ SuspendMethodName + "'");
+ }
+
+ return false;
+}
+
// Emit suspend expression which roughly looks like:
//
// auto && x = CommonExpr();
@@ -220,6 +280,25 @@ namespace {
RValue RV;
};
}
+
+// The simplified `await_suspend_destroy` path avoids suspend intrinsics.
+static void emitAwaitSuspendDestroy(CodeGenFunction &CGF, CGCoroData &Coro,
+ llvm::Function *SuspendWrapper,
+ llvm::Value *Awaiter, llvm::Value *Frame,
+ bool AwaitSuspendCanThrow) {
+ SmallVector<llvm::Value *, 2> DirectCallArgs;
+ DirectCallArgs.push_back(Awaiter);
+ DirectCallArgs.push_back(Frame);
+
+ if (AwaitSuspendCanThrow) {
+ CGF.EmitCallOrInvoke(SuspendWrapper, DirectCallArgs);
+ } else {
+ CGF.EmitNounwindRuntimeCall(SuspendWrapper, DirectCallArgs);
+ }
+
+ CGF.EmitBranchThroughCleanup(Coro.CleanupJD);
+}
+
static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Coro,
CoroutineSuspendExpr const &S,
AwaitKind Kind, AggValueSlot aggSlot,
@@ -234,7 +313,6 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
auto Prefix = buildSuspendPrefixStr(Coro, Kind);
BasicBlock *ReadyBlock = CGF.createBasicBlock(Prefix + Twine(".ready"));
BasicBlock *SuspendBlock = CGF.createBasicBlock(Prefix + Twine(".suspend"));
- BasicBlock *CleanupBlock = CGF.createBasicBlock(Prefix + Twine(".cleanup"));
// If expression is ready, no need to suspend.
CGF.EmitBranchOnBoolExpr(S.getReadyExpr(), ReadyBlock, SuspendBlock, 0);
@@ -243,95 +321,105 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
CGF.EmitBlock(SuspendBlock);
auto &Builder = CGF.Builder;
- llvm::Function *CoroSave = CGF.CGM.getIntrinsic(llvm::Intrinsic::coro_save);
- auto *NullPtr = llvm::ConstantPointerNull::get(CGF.CGM.Int8PtrTy);
- auto *SaveCall = Builder.CreateCall(CoroSave, {NullPtr});
auto SuspendWrapper = CodeGenFunction(CGF.CGM).generateAwaitSuspendWrapper(
CGF.CurFn->getName(), Prefix, S);
- CGF.CurCoro.InSuspendBlock = true;
-
assert(CGF.CurCoro.Data && CGF.CurCoro.Data->CoroBegin &&
"expected to be called in coroutine context");
- SmallVector<llvm::Value *, 3> SuspendIntrinsicCallArgs;
- SuspendIntrinsicCallArgs.push_back(
- CGF.getOrCreateOpaqueLValueMapping(S.getOpaqueValue()).getPointer(CGF));
-
- SuspendIntrinsicCallArgs.push_back(CGF.CurCoro.Data->CoroBegin);
- SuspendIntrinsicCallArgs.push_back(SuspendWrapper);
-
- const auto SuspendReturnType = S.getSuspendReturnType();
- llvm::Intrinsic::ID AwaitSuspendIID;
-
- switch (SuspendReturnType) {
- case CoroutineSuspendExpr::SuspendReturnType::SuspendVoid:
- AwaitSuspendIID = llvm::Intrinsic::coro_await_suspend_void;
- break;
- case CoroutineSuspendExpr::SuspendReturnType::SuspendBool:
- AwaitSuspendIID = llvm::Intrinsic::coro_await_suspend_bool;
- break;
- case CoroutineSuspendExpr::SuspendReturnType::SuspendHandle:
- AwaitSuspendIID = llvm::Intrinsic::coro_await_suspend_handle;
- break;
- }
-
- llvm::Function *AwaitSuspendIntrinsic = CGF.CGM.getIntrinsic(AwaitSuspendIID);
-
// SuspendHandle might throw since it also resumes the returned handle.
+ const auto SuspendReturnType = S.getSuspendReturnType();
const bool AwaitSuspendCanThrow =
SuspendReturnType ==
CoroutineSuspendExpr::SuspendReturnType::SuspendHandle ||
StmtCanThrow(S.getSuspendExpr());
- llvm::CallBase *SuspendRet = nullptr;
- // FIXME: add call attributes?
- if (AwaitSuspendCanThrow)
- SuspendRet =
- CGF.EmitCallOrInvoke(AwaitSuspendIntrinsic, SuspendIntrinsicCallArgs);
- else
- SuspendRet = CGF.EmitNounwindRuntimeCall(AwaitSuspendIntrinsic,
- SuspendIntrinsicCallArgs);
+ llvm::Value *Awaiter =
+ CGF.getOrCreateOpaqueLValueMapping(S.getOpaqueValue()).getPointer(CGF);
+ llvm::Value *Frame = CGF.CurCoro.Data->CoroBegin;
- assert(SuspendRet);
- CGF.CurCoro.InSuspendBlock = false;
+ if (useCoroAwaitSuspendDestroy(S)) { // Call `await_suspend_destroy` & cleanup
+ emitAwaitSuspendDestroy(CGF, Coro, SuspendWrapper, Awaiter, Frame,
+ AwaitSuspendCanThrow);
+ } else { // Normal suspend path -- can actually suspend, uses intrinsics
+ CGF.CurCoro.InSuspendBlock = true;
- switch (SuspendReturnType) {
- case CoroutineSuspendExpr::SuspendReturnType::SuspendVoid:
- assert(SuspendRet->getType()->isVoidTy());
- break;
- case CoroutineSuspendExpr::SuspendReturnType::SuspendBool: {
- assert(SuspendRet->getType()->isIntegerTy());
-
- // Veto suspension if requested by bool returning await_suspend.
- BasicBlock *RealSuspendBlock =
- CGF.createBasicBlock(Prefix + Twine(".suspend.bool"));
- CGF.Builder.CreateCondBr(SuspendRet, RealSuspendBlock, ReadyBlock);
- CGF.EmitBlock(RealSuspendBlock);
- break;
- }
- case CoroutineSuspendExpr::SuspendReturnType::SuspendHandle: {
- assert(SuspendRet->getType()->isVoidTy());
- break;
- }
- }
+ SmallVector<llvm::Value *, 3> SuspendIntrinsicCallArgs;
+ SuspendIntrinsicCallArgs.push_back(Awaiter);
+ SuspendIntrinsicCallArgs.push_back(Frame);
+ SuspendIntrinsicCallArgs.push_back(SuspendWrapper);
+ BasicBlock *CleanupBlock = CGF.createBasicBlock(Prefix + Twine(".cleanup"));
- // Emit the suspend point.
- const bool IsFinalSuspend = (Kind == AwaitKind::Final);
- llvm::Function *CoroSuspend =
- CGF.CGM.getIntrinsic(llvm::Intrinsic::coro_suspend);
- auto *SuspendResult = Builder.CreateCall(
- CoroSuspend, {SaveCall, Builder.getInt1(IsFinalSuspend)});
+ llvm::Function *CoroSave = CGF.CGM.getIntrinsic(llvm::Intrinsic::coro_save);
+ auto *NullPtr = llvm::ConstantPointerNull::get(CGF.CGM.Int8PtrTy);
+ auto *SaveCall = Builder.CreateCall(CoroSave, {NullPtr});
- // Create a switch capturing three possible continuations.
- auto *Switch = Builder.CreateSwitch(SuspendResult, Coro.SuspendBB, 2);
- Switch->addCase(Builder.getInt8(0), ReadyBlock);
- Switch->addCase(Builder.getInt8(1), CleanupBlock);
+ llvm::Intrinsic::ID AwaitSuspendIID;
- // Emit cleanup for this suspend point.
- CGF.EmitBlock(CleanupBlock);
- CGF.EmitBranchThroughCleanup(Coro.CleanupJD);
+ switch (SuspendReturnType) {
+ case CoroutineSuspendExpr::SuspendReturnType::SuspendVoid:
+ AwaitSuspendIID = llvm::Intrinsic::coro_await_suspend_void;
+ break;
+ case CoroutineSuspendExpr::SuspendReturnType::SuspendBool:
+ AwaitSuspendIID = llvm::Intrinsic::coro_await_suspend_bool;
+ break;
+ case CoroutineSuspendExpr::SuspendReturnType::SuspendHandle:
+ AwaitSuspendIID = llvm::Intrinsic::coro_await_suspend_handle;
+ break;
+ }
+
+ llvm::Function *AwaitSuspendIntrinsic =
+ CGF.CGM.getIntrinsic(AwaitSuspendIID);
+
+ llvm::CallBase *SuspendRet = nullptr;
+ // FIXME: add call attributes?
+ if (AwaitSuspendCanThrow)
+ SuspendRet =
+ CGF.EmitCallOrInvoke(AwaitSuspendIntrinsic, SuspendIntrinsicCallArgs);
+ else
+ SuspendRet = CGF.EmitNounwindRuntimeCall(AwaitSuspendIntrinsic,
+ SuspendIntrinsicCallArgs);
+
+ assert(SuspendRet);
+ CGF.CurCoro.InSuspendBlock = false;
+
+ switch (SuspendReturnType) {
+ case CoroutineSuspendExpr::SuspendReturnType::SuspendVoid:
+ assert(SuspendRet->getType()->isVoidTy());
+ break;
+ case CoroutineSuspendExpr::SuspendReturnType::SuspendBool: {
+ assert(SuspendRet->getType()->isIntegerTy());
+
+ // Veto suspension if requested by bool returning await_suspend.
+ BasicBlock *RealSuspendBlock =
+ CGF.createBasicBlock(Prefix + Twine(".suspend.bool"));
+ CGF.Builder.CreateCondBr(SuspendRet, RealSuspendBlock, ReadyBlock);
+ CGF.EmitBlock(RealSuspendBlock);
+ break;
+ }
+ case CoroutineSuspendExpr::SuspendReturnType::SuspendHandle: {
+ assert(SuspendRet->getType()->isVoidTy());
+ break;
+ }
+ }
+
+ // Emit the suspend point.
+ const bool IsFinalSuspend = (Kind == AwaitKind::Final);
+ llvm::Function *CoroSuspend =
+ CGF.CGM.getIntrinsic(llvm::Intrinsic::coro_suspend);
+ auto *SuspendResult = Builder.CreateCall(
+ CoroSuspend, {SaveCall, Builder.getInt1(IsFinalSuspend)});
+
+ // Create a switch capturing three possible continuations.
+ auto *Switch = Builder.CreateSwitch(SuspendResult, Coro.SuspendBB, 2);
+ Switch->addCase(Builder.getInt8(0), ReadyBlock);
+ Switch->addCase(Builder.getInt8(1), CleanupBlock);
+
+ // Emit cleanup for this suspend point.
+ CGF.EmitBlock(CleanupBlock);
+ CGF.EmitBranchThroughCleanup(Coro.CleanupJD);
+ }
// Emit await_resume expression.
CGF.EmitBlock(ReadyBlock);
diff --git a/clang/lib/Sema/SemaCoroutine.cpp b/clang/lib/Sema/SemaCoroutine.cpp
index d193a33f22393..83fe7219c9997 100644
--- a/clang/lib/Sema/SemaCoroutine.cpp
+++ b/clang/lib/Sema/SemaCoroutine.cpp
@@ -289,6 +289,45 @@ static ExprResult buildCoroutineHandle(Sema &S, QualType PromiseType,
return S.BuildCallExpr(nullptr, FromAddr.get(), Loc, FramePtr, Loc);
}
+// To support [[clang::coro_await_suspend_destroy]], this builds
+// *static_cast<Promise*>(
+// __builtin_coro_promise(handle, alignof(Promise), false))
+static ExprResult buildPromiseRef(Sema &S, QualType PromiseType,
+ SourceLocation Loc) {
+ uint64_t Align =
+ S.Context.getTypeAlign(PromiseType) / S.Context.getCharWidth();
+
+ // Build the call to __builtin_coro_promise()
+ SmallVector<Expr *, 3> Args = {
+ S.BuildBuiltinCallExpr(Loc, Builtin::BI__builtin_coro_frame, {}),
+ S.ActOnIntegerConstant(Loc, Align).get(), // alignof(Promise)
+ S.ActOnCXXBoolLiteral(Loc, tok::kw_false).get()}; // false
+ ExprResult CoroPromiseCall =
+ S.BuildBuiltinCallExpr(Loc, Builtin::BI__builtin_coro_promise, Args);
+
+ if (CoroPromiseCall.isInvalid())
+ return ExprError();
+
+ // Cast to Promise*
+ ExprResult CastExpr = S.ImpCastExprToType(
+ CoroPromiseCall.get(), S.Context.getPointerType(PromiseType), CK_BitCast);
+ if (CastExpr.isInvalid())
+ return ExprError();
+
+ // Dereference to get Promise&
+ return S.CreateBuiltinUnaryOp(Loc, UO_Deref, CastExpr.get());
+}
+
+static bool hasCoroAwaitSuspendDestroyAttr(Expr *Awaiter) {
+ QualType AwaiterType = Awaiter->getType();
+ if (auto *RD = AwaiterType->getAsCXXRecordDecl()) {
+ if (RD->hasAttr<CoroAwaitSuspendDestroyAttr>()) {
+ return true;
+ }
+ }
+ return false;
+}
+
struct ReadySuspendResumeResult {
enum AwaitCallType { ACT_Ready, ACT_Suspend, ACT_Resume };
Expr *Results[3];
@@ -399,15 +438,30 @@ static ReadySuspendResumeResult buildCoawaitCalls(Sema &S, VarDecl *CoroPromise,
Calls.Results[ACT::ACT_Ready] = S.MaybeCreateExprWithCleanups(Conv.get());
}
- ExprResult CoroHandleRes =
- buildCoroutineHandle(S, CoroPromise->getType(), Loc);
- if (CoroHandleRes.isInvalid()) {
- Calls.IsInvalid = true;
- return Calls;
+ // For awaiters with `[[clang::coro_await_suspend_destroy]]`, we call
+ // `void await_suspend_destroy(Promise&)` & promptly destroy the coro.
+ CallExpr *AwaitSuspend = nullptr;
+ bool UseAwaitSuspendDestroy = hasCoroAwaitSuspendDestroyAttr(Operand);
+ if (UseAwaitSuspendDestroy) {
+ ExprResult PromiseRefRes = buildPromiseRef(S, CoroPromise->getType(), Loc);
+ if (PromiseRefRes.isInvalid()) {
+ Calls.IsInvalid = true;
+ return Calls;
+ }
+ Expr *PromiseRef = PromiseRefRes.get();
+ AwaitSuspend = cast_or_null<CallExpr>(
+ BuildSubExpr(ACT::ACT_Suspend, "await_suspend_destroy", PromiseRef));
+ } else { // The standard `await_suspend(std::coroutine_handle<...>)`
+ ExprResult CoroHandleRes =
+ buildCoroutineHandle(S, CoroPromise->getType(), Loc);
+ if (CoroHandleRes.isInvalid()) {
+ Calls.IsInvalid = true;
+ return Calls;
+ }
+ Expr *CoroHandle = CoroHandleRes.get();
+ AwaitSuspend = cast_or_null<CallExpr>(
+ BuildSubExpr(ACT::ACT_Suspend, "await_suspend", CoroHandle));
}
- Expr *CoroHandle = CoroHandleRes.get();
- CallExpr *AwaitSuspend = cast_or_null<CallExpr>(
- BuildSubExpr(ACT::ACT_Suspend, "await_suspend", CoroHandle));
if (!AwaitSuspend)
return Calls;
if (!AwaitSuspend->getType()->isDependentType()) {
@@ -417,25 +471,37 @@ static ReadySuspendResumeResult buildCoawaitCalls(Sema &S, VarDecl *CoroPromise,
// type Z.
QualType RetType = AwaitSuspend->getCallReturnType(S.Context);
- // Support for coroutine_handle returning await_suspend.
- if (Expr *TailCallSuspend =
- maybeTailCall(S, RetType, AwaitSuspend, Loc))
+ auto EmitAwaitSuspendDiag = [&](unsigned int DiagCode) {
+ S.Diag(AwaitSuspend->getCalleeDecl()->getLocation(), DiagCode) << RetType;
+ S.Diag(Loc, diag::note_coroutine_promise_call_implicitly_required)
+ << AwaitSuspend->getDirectCallee();
+ Calls.IsInvalid = true;
+ };
+
+ // `await_suspend_destroy` must return `void` -- and `CGCoroutine.cpp`
+ // critically depends on this in `hasCoroAwaitSuspendDestroyAttr`.
+ if (UseAwaitSuspendDestroy) {
+ if (RetType->isVoidType()) {
+ Calls.Results[ACT::ACT_Suspend] =
+ S.MaybeCreateExprWithCleanups(AwaitSuspend);
+ } else {
+ EmitAwaitSuspendDiag(
+ diag::err_await_suspend_destroy_invalid_return_type);
+ }
+ // Support for coroutine_handle returning await_suspend.
+ } else if (Expr *TailCallSuspend =
+ maybeTailCall(S, RetType, AwaitSuspend, Loc)) {
// Note that we don't wrap the expression with ExprWithCleanups here
// because that might interfere with tailcall contract (e.g. inserting
// clean up instructions in-between tailcall and return). Instead
// ExprWithCleanups is wrapped within maybeTailCall() prior to the resume
// call.
Calls.Results[ACT::ACT_Suspend] = TailCallSuspend;
- else {
+ } else {
// non-class prvalues always have cv-unqualified types
if (RetType->isReferenceType() ||
(!RetType->isBooleanType() && !RetType->isVoidType())) {
- S.Diag(AwaitSuspend->getCalleeDecl()->getLocation(),
- diag::err_await_suspend_invalid_return_type)
- << RetType;
- S.Diag(Loc, diag::note_coroutine_promise_call_implicitly_required)
- << AwaitSuspend->getDirectCallee();
- Calls.IsInvalid = true;
+ EmitAwaitSuspendDiag(diag::err_await_suspend_invalid_return_type);
} else
Calls.Results[ACT::ACT_Suspend] =
S.MaybeCreateExprWithCleanups(AwaitSuspend);
diff --git a/clang/test/CodeGenCoroutines/coro-await-suspend-destroy-errors.cpp b/clang/test/CodeGenCoroutines/coro-await-suspend-destroy-errors.cpp
new file mode 100644
index 0000000000000..6a082c15f2581
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-await-suspend-destroy-errors.cpp
@@ -0,0 +1,55 @@
+// RUN: %clang_cc1 -std=c++20 -verify %s
+
+#include "Inputs/coroutine.h"
+
+// Coroutine type with `std::suspend_never` for initial/final suspend
+struct Task {
+ struct promise_type {
+ Task get_return_object() { return {}; }
+ std::suspend_never initial_suspend() { return {}; }
+ std::suspend_never final_suspend() noexcept { return {}; }
+ void return_void() {}
+ void unhandled_exception() {}
+ };
+};
+
+struct [[clang::coro_await_suspend_destroy]] WrongReturnTypeAwaitable {
+ bool await_ready() { return false; }
+ bool await_suspend_destroy(auto& promise) { return true; } // expected-error {{return type of 'await_suspend_destroy' is required to be 'void' (have 'bool')}}
+ void await_suspend(auto handle) {
+ await_suspend_destroy(handle.promise());
+ handle.destroy();
+ }
+ void await_resume() {}
+};
+
+Task test_invalid_destroying_await() {
+ co_await WrongReturnTypeAwaitable{}; // expected-note {{call to 'await_suspend_destroy<Task::promise_type>' implicitly required by coroutine function here}}
+}
+
+struct [[clang::coro_await_suspend_destroy]] MissingMethodAwaitable {
+ bool await_ready() { return false; }
+ // Missing await_suspend_destroy method
+ void await_suspend(auto handle) {
+ handle.destroy();
+ }
+ void await_resume() {}
+};
+
+Task test_missing_method() {
+ co_await MissingMethodAwaitable{}; // expected-error {{no member named 'await_suspend_destroy' in 'MissingMethodAwaitable'}}
+}
+
+struct [[clang::coro_await_suspend_destroy]] WrongParameterTypeAwaitable {
+ bool await_ready() { return false; }
+ void await_suspend_destroy(int x) {} // expected-note {{passing argument to parameter 'x' here}}
+ void await_suspend(auto handle) {
+ await_suspend_destroy(handle.promise());
+ handle.destroy();
+ }
+ void await_resume() {}
+};
+
+Task test_wrong_parameter_type() {
+ co_await WrongParameterTypeAwaitable{}; // expected-error {{no viable conversion from 'std::coroutine_traits<Task>::promise_type' (aka 'Task::promise_type') to 'int'}}
+}
diff --git a/clang/test/CodeGenCoroutines/coro-await-suspend-destroy.cpp b/clang/test/CodeGenCoroutines/coro-await-suspend-destroy.cpp
new file mode 100644
index 0000000000000..fa1dbf475e56c
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-await-suspend-destroy.cpp
@@ -0,0 +1,129 @@
+// RUN: %clang_cc1 -std=c++20 -triple x86_64-unknown-linux-gnu -emit-llvm -o - %s \
+// RUN: -disable-llvm-passes | FileCheck %s --check-prefix=CHECK-INITIAL
+// RUN: %clang_cc1 -std=c++20 -triple x86_64-unknown-linux-gnu -emit-llvm -o - %s \
+// RUN: -O2 | FileCheck %s --check-prefix=CHECK-OPTIMIZED
+
+#include "Inputs/coroutine.h"
+
+// Awaitable with `coro_await_suspend_destroy` attribute
+struct [[clang::coro_await_suspend_destroy]] DestroyingAwaitable {
+ bool await_ready() { return false; }
+ void await_suspend_destroy(auto& promise) {}
+ void await_suspend(auto handle) {
+ await_suspend_destroy(handle.promise());
+ handle.destroy();
+ }
+ void await_resume() {}
+};
+
+// Awaitable without `coro_await_suspend_destroy` (normal behavior)
+struct NormalAwaitable {
+ bool await_ready() { return false; }
+ void await_suspend(std::coroutine_handle<> h) {}
+ void await_resume() {}
+};
+
+// Coroutine type with `std::suspend_never` for initial/final suspend
+struct Task {
+ struct promise_type {
+ Task get_return_object() { return {}; }
+ std::suspend_never initial_suspend() { return {}; }
+ std::suspend_never final_suspend() noexcept { return {}; }
+ void return_void() {}
+ void unhandled_exception() {}
+ };
+};
+
+// Single co_await with coro_await_suspend_destroy.
+// Should result in no allocation after optimization.
+Task test_single_destroying_await() {
+ co_await DestroyingAwaitable{};
+}
+
+// CHECK-INITIAL-LABEL: define{{.*}} void @_Z28test_single_destroying_awaitv
+// CHECK-INITIAL: call{{.*}} @llvm.coro.alloc
+// CHECK-INITIAL: call{{.*}} @llvm.coro.begin
+
+// CHECK-OPTIMIZED-LABEL: define{{.*}} void @_Z28test_single_destroying_awaitv
+// CHECK-OPTIMIZED-NOT: call{{.*}} @llvm.coro.alloc
+// CHECK-OPTIMIZED-NOT: call{{.*}} malloc
+// CHECK-OPTIMIZED-NOT: call{{.*}} @_Znwm
+
+// Test multiple `co_await`s, all with `coro_await_suspend_destroy`.
+// This should also result in no allocation after optimization.
+Task test_multiple_destroying_awaits(bool condition) {
+ co_await DestroyingAwaitable{};
+ co_await DestroyingAwaitable{};
+ if (condition) {
+ co_await DestroyingAwaitable{};
+ }
+}
+
+// CHECK-INITIAL-LABEL: define{{.*}} void @_Z31test_multiple_destroying_awaitsb
+// CHECK-INITIAL: call{{.*}} @llvm.coro.alloc
+// CHECK-INITIAL: call{{.*}} @llvm.coro.begin
+
+// CHECK-OPTIMIZED-LABEL: define{{.*}} void @_Z31test_multiple_destroying_awaitsb
+// CHECK-OPTIMIZED-NOT: call{{.*}} @llvm.coro.alloc
+// CHECK-OPTIMIZED-NOT: call{{.*}} malloc
+// CHECK-OPTIMIZED-NOT: call{{.*}} @_Znwm
+
+// Mixed awaits - some with `coro_await_suspend_destroy`, some without.
+// We should still see allocation because not all awaits destroy the coroutine.
+Task test_mixed_awaits() {
+ co_await NormalAwaitable{}; // Must precede "destroy" to be reachable
+ co_await DestroyingAwaitable{};
+}
+
+// CHECK-INITIAL-LABEL: define{{.*}} void @_Z17test_mixed_awaitsv
+// CHECK-INITIAL: call{{.*}} @llvm.coro.alloc
+// CHECK-INITIAL: call{{.*}} @llvm.coro.begin
+
+// CHECK-OPTIMIZED-LABEL: define{{.*}} void @_Z17test_mixed_awaitsv
+// CHECK-OPTIMIZED: call{{.*}} @_Znwm
+
+
+// Check that attribute detection affects control flow.
+Task test_attribute_detection() {
+ co_await DestroyingAwaitable{};
+ // Unreachable in OPTIMIZED, so those builds don't see an allocation.
+ co_await NormalAwaitable{};
+}
+
+// Check that we skip the normal suspend intrinsic and go directly to cleanup.
+//
+// CHECK-INITIAL-LABEL: define{{.*}} void @_Z24test_attribute_detectionv
+// CHECK-INITIAL: call{{.*}} @_Z24test_attribute_detectionv.__await_suspend_wrapper__await
+// CHECK-INITIAL-NEXT: br label %cleanup5
+// CHECK-INITIAL-NOT: call{{.*}} @llvm.coro.suspend
+// CHECK-INITIAL: call{{.*}} @_Z24test_attribute_detectionv.__await_suspend_wrapper__await
+// CHECK-INITIAL: call{{.*}} @llvm.coro.suspend
+// CHECK-INITIAL: call{{.*}} @_Z24test_attribute_detectionv.__await_suspend_wrapper__final
+
+// Since `co_await DestroyingAwaitable{}` gets converted into an unconditional
+// branch, the `co_await NormalAwaitable{}` is unreachable in optimized builds.
+//
+// CHECK-OPTIMIZED-NOT: call{{.*}} @llvm.coro.alloc
+// CHECK-OPTIMIZED-NOT: call{{.*}} malloc
+// CHECK-OPTIMIZED-NOT: call{{.*}} @_Znwm
+
+// Template awaitable with `coro_await_suspend_destroy` attribute
+template<typename T>
+struct [[clang::coro_await_suspend_destroy]] TemplateDestroyingAwaitable {
+ bool await_ready() { return false; }
+ void await_suspend_destroy(auto& promise) {}
+ void await_suspend(auto handle) {
+ await_suspend_destroy(handle.promise());
+ handle.destroy();
+ }
+ void await_resume() {}
+};
+
+Task test_template_destroying_await() {
+ co_await TemplateDestroyingAwaitable<int>{};
+}
+
+// CHECK-OPTIMIZED-LABEL: define{{.*}} void @_Z30test_template_destroying_awaitv
+// CHECK-OPTIMIZED-NOT: call{{.*}} @llvm.coro.alloc
+// CHECK-OPTIMIZED-NOT: call{{.*}} malloc
+// CHECK-OPTIMIZED-NOT: call{{.*}} @_Znwm
diff --git a/clang/test/Misc/pragma-attribute-supported-attributes-list.test b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
index 05693538252aa..43327744ffc8a 100644
--- a/clang/test/Misc/pragma-attribute-supported-attributes-list.test
+++ b/clang/test/Misc/pragma-attribute-supported-attributes-list.test
@@ -62,6 +62,7 @@
// CHECK-NEXT: Convergent (SubjectMatchRule_function)
// CHECK-NEXT: CoroAwaitElidable (SubjectMatchRule_record)
// CHECK-NEXT: CoroAwaitElidableArgument (SubjectMatchRule_variable_is_parameter)
+// CHECK-NEXT: CoroAwaitSuspendDestroy (SubjectMatchRule_record)
// CHECK-NEXT: CoroDisableLifetimeBound (SubjectMatchRule_function)
// CHECK-NEXT: CoroLifetimeBound (SubjectMatchRule_record)
// CHECK-NEXT: CoroOnlyDestroyWhenComplete (SubjectMatchRule_record)
diff --git a/libcxx/test/std/language.support/support.coroutines/end.to.end/coro_await_suspend_destroy.pass.cpp b/libcxx/test/std/language.support/support.coroutines/end.to.end/coro_await_suspend_destroy.pass.cpp
new file mode 100644
index 0000000000000..1b48b1523bf12
--- /dev/null
+++ b/libcxx/test/std/language.support/support.coroutines/end.to.end/coro_await_suspend_destroy.pass.cpp
@@ -0,0 +1,409 @@
+//===-- Integration test for `clang::coro_await_suspend_destroy` ----------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+// Test for the `coro_await_suspend_destroy` attribute and
+// `await_suspend_destroy` method.
+//
+// Per `AttrDocs.td`, using `coro_await_suspend_destroy` with
+// `await_suspend_destroy` should be equivalent to providing a stub
+// `await_suspend` that calls `await_suspend_destroy` and then destroys the
+// coroutine handle.
+//
+// This test logs control flow in a variety of scenarios (controlled by
+// `test_toggles`), and checks that the execution traces are identical for
+// awaiters with/without the attribute. We currently test all combinations of
+// error injection points to ensure behavioral equivalence.
+//
+// In contrast to Clang `lit` tests, this makes it easy to verify non-divergence
+// of functional behavior of the entire coroutine across many scenarios,
+// including exception handling, early returns, and mixed usage with legacy
+// awaitables.
+//
+//===----------------------------------------------------------------------===//
+
+// UNSUPPORTED: c++03, c++11, c++14, c++17
+
+#if __has_cpp_attribute(clang::coro_await_suspend_destroy)
+# define ATTR_CORO_AWAIT_SUSPEND_DESTROY [[clang::coro_await_suspend_destroy]]
+#else
+# define ATTR_CORO_AWAIT_SUSPEND_DESTROY
+#endif
+
+#include <cassert>
+#include <coroutine>
+#include <exception>
+#include <iostream>
+#include <memory>
+#include <optional>
+#include <string>
+
+struct my_err : std::exception {};
+
+enum test_toggles {
+ throw_in_convert_optional_wrapper = 0,
+ throw_in_return_value,
+ throw_in_await_resume,
+ throw_in_await_suspend_destroy,
+ dynamic_short_circuit, // Does not apply to `..._shortcircuits_to_empty` tests
+ largest = dynamic_short_circuit // for array in `test_driver`
+};
+
+enum test_event {
+ unset = 0,
+ // Besides events, we also log various integers between 1 and 9999 that
+ // disambiguate different awaiters, or represent different return values.
+ convert_optional_wrapper = 10000,
+ destroy_return_object,
+ destroy_promise,
+ get_return_object,
+ initial_suspend,
+ final_suspend,
+ return_value,
+ throw_return_value,
+ unhandled_exception,
+ await_ready,
+ await_resume,
+ destroy_optional_awaitable,
+ throw_await_resume,
+ await_suspend_destroy,
+ throw_await_suspend_destroy,
+ await_suspend,
+ coro_catch,
+ throw_convert_optional_wrapper,
+};
+
+struct test_driver {
+ static constexpr int max_events = 1000;
+
+ bool toggles_[test_toggles::largest + 1] = {};
+ int events_[max_events] = {};
+ int cur_event_ = 0;
+
+ bool toggles(test_toggles toggle) const { return toggles_[toggle]; }
+ void log(auto&&... events) {
+ for (auto event : {static_cast<int>(events)...}) {
+ assert(cur_event_ < max_events);
+ events_[cur_event_++] = event;
+ }
+ }
+};
+
+// `optional_wrapper` exists since `get_return_object()` can't return
+// `std::optional` directly. C++ coroutines have a fundamental timing mismatch
+// between when the return object is created and when the value is available:
+//
+// 1) Early (coroutine startup): `get_return_object()` is called and must return
+// something immediately.
+// 2) Later (when `co_return` executes): `return_value(T)` is called with the
+// actual value.
+// 3) Issue: If `get_return_object()` returns the storage, it's empty when
+// returned, and writing to it later cannot affect the already-returned copy.
+template <typename T>
+struct optional_wrapper {
+ test_driver& driver_;
+ std::optional<T> storage_;
+ std::optional<T>*& pointer_;
+ optional_wrapper(test_driver& driver, std::optional<T>*& p) : driver_(driver), pointer_(p) { pointer_ = &storage_; }
+ operator std::optional<T>() {
+ if (driver_.toggles(test_toggles::throw_in_convert_optional_wrapper)) {
+ driver_.log(test_event::throw_convert_optional_wrapper);
+ throw my_err();
+ }
+ driver_.log(test_event::convert_optional_wrapper);
+ return std::move(storage_);
+ }
+ ~optional_wrapper() { driver_.log(test_event::destroy_return_object); }
+};
+
+// Make `std::optional` a coroutine
+template <typename T, typename... Args>
+struct std::coroutine_traits<std::optional<T>, test_driver&, Args...> {
+ struct promise_type {
+ std::optional<T>* storagePtr_ = nullptr;
+ test_driver& driver_;
+
+ promise_type(test_driver& driver, auto&&...) : driver_(driver) {}
+ ~promise_type() { driver_.log(test_event::destroy_promise); }
+ optional_wrapper<T> get_return_object() {
+ driver_.log(test_event::get_return_object);
+ return optional_wrapper<T>(driver_, storagePtr_);
+ }
+ std::suspend_never initial_suspend() const noexcept {
+ driver_.log(test_event::initial_suspend);
+ return {};
+ }
+ std::suspend_never final_suspend() const noexcept {
+ driver_.log(test_event::final_suspend);
+ return {};
+ }
+ void return_value(T value) {
+ driver_.log(test_event::return_value, value);
+ if (driver_.toggles(test_toggles::throw_in_return_value)) {
+ driver_.log(test_event::throw_return_value);
+ throw my_err();
+ }
+ *storagePtr_ = std::move(value);
+ }
+ void unhandled_exception() {
+ // Leave `*storagePtr_` empty to represent error
+ driver_.log(test_event::unhandled_exception);
+ }
+ };
+};
+
+template <typename T, bool HasAttr>
+struct base_optional_awaitable {
+ test_driver& driver_;
+ int id_;
+ std::optional<T> opt_;
+
+ ~base_optional_awaitable() { driver_.log(test_event::destroy_optional_awaitable, id_); }
+
+ bool await_ready() const noexcept {
+ driver_.log(test_event::await_ready, id_);
+ return opt_.has_value();
+ }
+ T await_resume() {
+ if (driver_.toggles(test_toggles::throw_in_await_resume)) {
+ driver_.log(test_event::throw_await_resume, id_);
+ throw my_err();
+ }
+ driver_.log(test_event::await_resume, id_);
+ return std::move(opt_).value();
+ }
+ void await_suspend_destroy(auto& promise) {
+#if __has_cpp_attribute(clang::coro_await_suspend_destroy)
+ if constexpr (HasAttr) {
+ // This is just here so that old & new events compare exactly equal.
+ driver_.log(test_event::await_suspend);
+ }
+#endif
+ assert(promise.storagePtr_);
+ if (driver_.toggles(test_toggles::throw_in_await_suspend_destroy)) {
+ driver_.log(test_event::throw_await_suspend_destroy, id_);
+ throw my_err();
+ }
+ driver_.log(test_event::await_suspend_destroy, id_);
+ }
+ void await_suspend(auto handle) {
+ driver_.log(test_event::await_suspend);
+ await_suspend_destroy(handle.promise());
+ handle.destroy();
+ }
+};
+
+template <typename T>
+struct old_optional_awaitable : base_optional_awaitable<T, false> {};
+
+template <typename T>
+struct ATTR_CORO_AWAIT_SUSPEND_DESTROY new_optional_awaitable : base_optional_awaitable<T, true> {};
+
+void enumerate_toggles(auto lambda) {
+ // Generate all combinations of toggle values
+ for (int mask = 0; mask <= (1 << (test_toggles::largest + 1)) - 1; ++mask) {
+ test_driver driver;
+ for (int i = 0; i <= test_toggles::largest; ++i) {
+ driver.toggles_[i] = (mask & (1 << i)) != 0;
+ }
+ lambda(driver);
+ }
+}
+
+template <typename T>
+void check_coro_with_driver_for(auto coro_fn) {
+ enumerate_toggles([&](const test_driver& driver) {
+ auto old_driver = driver;
+ std::optional<T> old_res;
+ bool old_threw = false;
+ try {
+ old_res = coro_fn.template operator()<old_optional_awaitable<T>, T>(old_driver);
+ } catch (const my_err&) {
+ old_threw = true;
+ }
+ auto new_driver = driver;
+ std::optional<T> new_res;
+ bool new_threw = false;
+ try {
+ new_res = coro_fn.template operator()<new_optional_awaitable<T>, T>(new_driver);
+ } catch (const my_err&) {
+ new_threw = true;
+ }
+
+ // Print toggle values for debugging
+ std::string toggle_info = "Toggles: ";
+ for (int i = 0; i <= test_toggles::largest; ++i) {
+ if (driver.toggles_[i]) {
+ toggle_info += std::to_string(i) + " ";
+ }
+ }
+ toggle_info += "\n";
+ std::cerr << toggle_info.c_str() << std::endl;
+
+ assert(old_threw == new_threw);
+ assert(old_res == new_res);
+
+ // Compare events arrays directly using cur_event_ and indices
+ assert(old_driver.cur_event_ == new_driver.cur_event_);
+ for (int i = 0; i < old_driver.cur_event_; ++i) {
+ assert(old_driver.events_[i] == new_driver.events_[i]);
+ }
+ });
+}
+
+// Move-only, non-nullable type that quacks like int but stores a
+// heap-allocated int. Used to exercise the machinery with a nontrivial type.
+class heap_int {
+private:
+ std::unique_ptr<int> ptr_;
+
+public:
+ explicit heap_int(int value) : ptr_(std::make_unique<int>(value)) {}
+
+ heap_int operator+(const heap_int& other) const { return heap_int(*ptr_ + *other.ptr_); }
+
+ bool operator==(const heap_int& other) const { return *ptr_ == *other.ptr_; }
+
+ /*implicit*/ operator int() const { return *ptr_; }
+};
+
+void check_coro_with_driver(auto coro_fn) {
+ check_coro_with_driver_for<int>(coro_fn);
+ check_coro_with_driver_for<heap_int>(coro_fn);
+}
+
+template <typename Awaitable, typename T>
+std::optional<T> coro_shortcircuits_to_empty(test_driver& driver) {
+ T n = co_await Awaitable{driver, 1, std::optional<T>{11}};
+ co_await Awaitable{driver, 2, std::optional<T>{}}; // return early!
+ co_return n + co_await Awaitable{driver, 3, std::optional<T>{22}};
+}
+
+void test_coro_shortcircuits_to_empty() {
+ std::cerr << "test_coro_shortcircuits_to_empty" << std::endl;
+ check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
+ return coro_shortcircuits_to_empty<Awaitable, T>(driver);
+ });
+}
+
+template <typename Awaitable, typename T>
+std::optional<T> coro_simple_await(test_driver& driver) {
+ co_return co_await Awaitable{driver, 1, std::optional<T>{11}} +
+ co_await Awaitable{driver, 2, driver.toggles(dynamic_short_circuit) ? std::optional<T>{} : std::optional<T>{22}};
+}
+
+void test_coro_simple_await() {
+ std::cerr << "test_coro_simple_await" << std::endl;
+ check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
+ return coro_simple_await<Awaitable, T>(driver);
+ });
+}
+
+// The next pair of tests checks that adding a `try-catch` in the coroutine
+// doesn't affect control flow when `await_suspend_destroy` awaiters are in use.
+
+template <typename Awaitable, typename T>
+std::optional<T> coro_catching_shortcircuits_to_empty(test_driver& driver) {
+ try {
+ T n = co_await Awaitable{driver, 1, std::optional<T>{11}};
+ co_await Awaitable{driver, 2, std::optional<T>{}}; // return early!
+ co_return n + co_await Awaitable{driver, 3, std::optional<T>{22}};
+ } catch (...) {
+ driver.log(test_event::coro_catch);
+ throw;
+ }
+}
+
+void test_coro_catching_shortcircuits_to_empty() {
+ std::cerr << "test_coro_catching_shortcircuits_to_empty" << std::endl;
+ check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
+ return coro_catching_shortcircuits_to_empty<Awaitable, T>(driver);
+ });
+}
+
+template <typename Awaitable, typename T>
+std::optional<T> coro_catching_simple_await(test_driver& driver) {
+ try {
+ co_return co_await Awaitable{driver, 1, std::optional<T>{11}} +
+ co_await Awaitable{
+ driver, 2, driver.toggles(dynamic_short_circuit) ? std::optional<T>{} : std::optional<T>{22}};
+ } catch (...) {
+ driver.log(test_event::coro_catch);
+ throw;
+ }
+}
+
+void test_coro_catching_simple_await() {
+ std::cerr << "test_coro_catching_simple_await" << std::endl;
+ check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
+ return coro_catching_simple_await<Awaitable, T>(driver);
+ });
+}
+
+// The next pair of tests shows that the `await_suspend_destroy` code path works
+// correctly, even if it's mixed in a coroutine with legacy awaitables.
+
+template <typename Awaitable, typename T>
+std::optional<T> noneliding_coro_shortcircuits_to_empty(test_driver& driver) {
+ T n = co_await Awaitable{driver, 1, std::optional<T>{11}};
+ T n2 = co_await old_optional_awaitable<T>{driver, 2, std::optional<T>{22}};
+ co_await Awaitable{driver, 3, std::optional<T>{}}; // return early!
+ co_return n + n2 + co_await Awaitable{driver, 4, std::optional<T>{44}};
+}
+
+void test_noneliding_coro_shortcircuits_to_empty() {
+ std::cerr << "test_noneliding_coro_shortcircuits_to_empty" << std::endl;
+ check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
+ return noneliding_coro_shortcircuits_to_empty<Awaitable, T>(driver);
+ });
+}
+
+template <typename Awaitable, typename T>
+std::optional<T> noneliding_coro_simple_await(test_driver& driver) {
+ co_return co_await Awaitable{driver, 1, std::optional<T>{11}} +
+ co_await Awaitable{driver, 2, driver.toggles(dynamic_short_circuit) ? std::optional<T>{} : std::optional<T>{22}} +
+ co_await old_optional_awaitable<T>{driver, 3, std::optional<T>{33}};
+}
+
+void test_noneliding_coro_simple_await() {
+ std::cerr << "test_noneliding_coro_simple_await" << std::endl;
+ check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
+ return noneliding_coro_simple_await<Awaitable, T>(driver);
+ });
+}
+
+// Test nested coroutines (coroutines that await other coroutines)
+
+template <typename Awaitable, typename T>
+std::optional<T> inner_coro(test_driver& driver, int base_id) {
+ co_return co_await Awaitable{driver, base_id, std::optional<T>{100}} +
+ co_await Awaitable{
+ driver, base_id + 1, driver.toggles(dynamic_short_circuit) ? std::optional<T>{} : std::optional<T>{200}};
+}
+
+template <typename Awaitable, typename T>
+std::optional<T> outer_coro(test_driver& driver) {
+ T result1 = co_await Awaitable{driver, 1, inner_coro<Awaitable, T>(driver, 10)};
+ T result2 = co_await Awaitable{driver, 2, inner_coro<Awaitable, T>(driver, 20)};
+ co_return result1 + result2;
+}
+
+void test_nested_coroutines() {
+ std::cerr << "test_nested_coroutines" << std::endl;
+ check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
+ return outer_coro<Awaitable, T>(driver);
+ });
+}
+
+int main(int, char**) {
+ test_coro_shortcircuits_to_empty();
+ test_coro_simple_await();
+ test_coro_catching_shortcircuits_to_empty();
+ test_coro_catching_simple_await();
+ test_noneliding_coro_shortcircuits_to_empty();
+ test_noneliding_coro_simple_await();
+ test_nested_coroutines();
+ return 0;
+}
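Note on the integration test above: `enumerate_toggles` visits every
combination of the `test_toggles` values by counting a bitmask from 0 up to
2^(largest + 1) - 1 and copying each bit into one toggle. A minimal standalone
sketch of that enumeration follows. The names `kNumToggles`, `ToggleSet`,
`for_each_toggle_set`, and `all_toggle_sets` are hypothetical and do not appear
in the patch; the count of five toggles is taken from the enum in the test.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for `test_toggles::largest + 1` in the patch.
constexpr std::size_t kNumToggles = 5;

using ToggleSet = std::array<bool, kNumToggles>;

// Calls `fn` once per combination of toggle values, in increasing mask order.
// Bit `i` of the mask decides whether toggle `i` is switched on.
template <typename Fn>
void for_each_toggle_set(Fn fn) {
  for (unsigned mask = 0; mask < (1u << kNumToggles); ++mask) {
    ToggleSet toggles{};
    for (std::size_t i = 0; i < kNumToggles; ++i)
      toggles[i] = (mask & (1u << i)) != 0;
    fn(toggles);
  }
}

// Collects every combination into a vector, which makes the total easy to
// inspect (2^kNumToggles entries).
inline std::vector<ToggleSet> all_toggle_sets() {
  std::vector<ToggleSet> out;
  for_each_toggle_set([&](const ToggleSet& t) { out.push_back(t); });
  return out;
}
```

In the test itself, each combination is run twice (once with the old awaiter,
once with the attributed one) and the two event traces are compared, so the
number of coroutine invocations per scenario is twice the number of
combinations.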
>From eb5557ab0eb43ff216441603d1c47615869d0bbe Mon Sep 17 00:00:00 2001
From: lesha <lesha at meta.com>
Date: Thu, 7 Aug 2025 23:38:21 -0700
Subject: [PATCH 2/3] Fix CI
---
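Reviewer-style note on the approach in this patch: the `THROW` macro and the
`#ifndef TEST_HAS_NO_EXCEPTIONS` guards let the same test body compile with and
without exception support. The sketch below shows the pattern in isolation. It
is an illustration under assumptions, not code from the patch: the names
`SKETCH_NO_EXCEPTIONS`, `SKETCH_THROW`, `sketch_err`, and
`run_with_injected_failure` are hypothetical, and unlike the patch's `THROW`
the macro here leaves the trailing semicolon to the call site.

```cpp
#include <cassert>
#include <exception>

// Hypothetical stand-in for the test-suite macro; define it to simulate a
// build with exceptions disabled.
// #define SKETCH_NO_EXCEPTIONS

#ifndef SKETCH_NO_EXCEPTIONS
#  define SKETCH_THROW(ex) throw ex
#else
#  define SKETCH_THROW(ex) ((void)0)
#endif

struct sketch_err : std::exception {};

// Returns true if the injected failure propagated as an exception and was
// caught here. With exceptions disabled, the injection point is a no-op and
// the function always returns false.
inline bool run_with_injected_failure(bool inject) {
#ifndef SKETCH_NO_EXCEPTIONS
  try {
#endif
    if (inject)
      SKETCH_THROW(sketch_err());
#ifndef SKETCH_NO_EXCEPTIONS
  } catch (const sketch_err&) {
    return true;
  }
#endif
  return false;
}
```

One consequence worth keeping in mind: when exceptions are disabled, the
`throw_in_*` toggles still log their "throw" events but no longer change
control flow, so those configurations compare old and new traces without
exercising the unwinding paths.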
clang/include/clang/Basic/AttrDocs.td | 32 ++++++-------
.../coro_await_suspend_destroy.pass.cpp | 48 +++++++++++++++++--
2 files changed, 60 insertions(+), 20 deletions(-)
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index d2224d86b3900..e45f692740193 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -9312,12 +9312,12 @@ flow as):
}
The benefits of this attribute are:
- - **Avoid heap allocations for coro frames**: Allocating short-circuiting
- coros on the stack makes code more predictable under memory pressure.
- Without this attribute, LLVM cannot elide heap allocation even when all
- awaiters are short-circuiting.
- - **Performance**: Significantly faster execution and smaller code size.
- - **Build time**: Faster compilation due to less IR being generated.
+- **Avoid heap allocations for coro frames**: Allocating short-circuiting
+ coros on the stack makes code more predictable under memory pressure.
+ Without this attribute, LLVM cannot elide heap allocation even when all
+ awaiters are short-circuiting.
+- **Performance**: Significantly faster execution and smaller code size.
+- **Build time**: Faster compilation due to less IR being generated.
Marking your ``await_suspend_destroy`` method as ``noexcept`` can sometimes
further improve optimization.
@@ -9343,16 +9343,16 @@ Here is a toy example of a portable short-circuiting awaiter:
If all suspension points use (i) trivial or (ii) short-circuiting awaiters,
then the coroutine optimizes more like a plain function, with 2 caveats:
- - **Behavior:** The coroutine promise provides an implicit exception boundary
- (as if wrapping the function in ``try {} catch { unhandled_exception(); }``).
- This exception handling behavior is usually desirable in robust,
- return-value-oriented programs that need short-circuiting coroutines.
- Otherwise, the promise can always re-throw.
- - **Speed:** As of 2025, there is still an optimization gap between a
- realistic short-circuiting coro, and the equivalent (but much more verbose)
- function. For a guesstimate, expect 4-5ns per call on x86. One idea for
- improvement is to also elide trivial suspends like `std::suspend_never`, in
- order to hit the `HasCoroSuspend` path in `CoroEarly.cpp`.
+- **Behavior:** The coroutine promise provides an implicit exception boundary
+ (as if wrapping the function in ``try {} catch { unhandled_exception(); }``).
+ This exception handling behavior is usually desirable in robust,
+ return-value-oriented programs that need short-circuiting coroutines.
+ Otherwise, the promise can always re-throw.
+- **Speed:** As of 2025, there is still an optimization gap between a
+ realistic short-circuiting coro, and the equivalent (but much more verbose)
+ function. For a guesstimate, expect 4-5ns per call on x86. One idea for
+ improvement is to also elide trivial suspends like ``std::suspend_never``, in
+ order to hit the ``HasCoroSuspend`` path in ``CoroEarly.cpp``.
}];
}
diff --git a/libcxx/test/std/language.support/support.coroutines/end.to.end/coro_await_suspend_destroy.pass.cpp b/libcxx/test/std/language.support/support.coroutines/end.to.end/coro_await_suspend_destroy.pass.cpp
index 1b48b1523bf12..9da8ba530edf3 100644
--- a/libcxx/test/std/language.support/support.coroutines/end.to.end/coro_await_suspend_destroy.pass.cpp
+++ b/libcxx/test/std/language.support/support.coroutines/end.to.end/coro_await_suspend_destroy.pass.cpp
@@ -40,6 +40,14 @@
#include <optional>
#include <string>
+#define DEBUG_LOG 0 // Logs break no-localization CI, set to 1 if needed
+
+#ifndef TEST_HAS_NO_EXCEPTIONS
+# define THROW(_ex) throw _ex;
+#else
+# define THROW(_ex)
+#endif
+
struct my_err : std::exception {};
enum test_toggles {
@@ -110,7 +118,7 @@ struct optional_wrapper {
operator std::optional<T>() {
if (driver_.toggles(test_toggles::throw_in_convert_optional_wrapper)) {
driver_.log(test_event::throw_convert_optional_wrapper);
- throw my_err();
+ THROW(my_err());
}
driver_.log(test_event::convert_optional_wrapper);
return std::move(storage_);
@@ -143,7 +151,7 @@ struct std::coroutine_traits<std::optional<T>, test_driver&, Args...> {
driver_.log(test_event::return_value, value);
if (driver_.toggles(test_toggles::throw_in_return_value)) {
driver_.log(test_event::throw_return_value);
- throw my_err();
+ THROW(my_err());
}
*storagePtr_ = std::move(value);
}
@@ -169,7 +177,7 @@ struct base_optional_awaitable {
T await_resume() {
if (driver_.toggles(test_toggles::throw_in_await_resume)) {
driver_.log(test_event::throw_await_resume, id_);
- throw my_err();
+ THROW(my_err());
}
driver_.log(test_event::await_resume, id_);
return std::move(opt_).value();
@@ -184,7 +192,7 @@ struct base_optional_awaitable {
assert(promise.storagePtr_);
if (driver_.toggles(test_toggles::throw_in_await_suspend_destroy)) {
driver_.log(test_event::throw_await_suspend_destroy, id_);
- throw my_err();
+ THROW(my_err());
}
driver_.log(test_event::await_suspend_destroy, id_);
}
@@ -218,20 +226,29 @@ void check_coro_with_driver_for(auto coro_fn) {
auto old_driver = driver;
std::optional<T> old_res;
bool old_threw = false;
+#ifndef TEST_HAS_NO_EXCEPTIONS
try {
+#endif
old_res = coro_fn.template operator()<old_optional_awaitable<T>, T>(old_driver);
+#ifndef TEST_HAS_NO_EXCEPTIONS
} catch (const my_err&) {
old_threw = true;
}
+#endif
auto new_driver = driver;
std::optional<T> new_res;
bool new_threw = false;
+#ifndef TEST_HAS_NO_EXCEPTIONS
try {
+#endif
new_res = coro_fn.template operator()<new_optional_awaitable<T>, T>(new_driver);
+#ifndef TEST_HAS_NO_EXCEPTIONS
} catch (const my_err&) {
new_threw = true;
}
+#endif
+#if DEBUG_LOG
// Print toggle values for debugging
std::string toggle_info = "Toggles: ";
for (int i = 0; i <= test_toggles::largest; ++i) {
@@ -241,6 +258,7 @@ void check_coro_with_driver_for(auto coro_fn) {
}
toggle_info += "\n";
std::cerr << toggle_info.c_str() << std::endl;
+#endif
assert(old_threw == new_threw);
assert(old_res == new_res);
@@ -282,7 +300,9 @@ std::optional<T> coro_shortcircuits_to_empty(test_driver& driver) {
}
void test_coro_shortcircuits_to_empty() {
+#if DEBUG_LOG
std::cerr << "test_coro_shortcircuits_to_empty" << std::endl;
+#endif
check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
return coro_shortcircuits_to_empty<Awaitable, T>(driver);
});
@@ -295,7 +315,9 @@ std::optional<T> coro_simple_await(test_driver& driver) {
}
void test_coro_simple_await() {
+#if DEBUG_LOG
std::cerr << "test_coro_simple_await" << std::endl;
+#endif
check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
return coro_simple_await<Awaitable, T>(driver);
});
@@ -306,18 +328,24 @@ void test_coro_simple_await() {
template <typename Awaitable, typename T>
std::optional<T> coro_catching_shortcircuits_to_empty(test_driver& driver) {
+#ifndef TEST_HAS_NO_EXCEPTIONS
try {
+#endif
T n = co_await Awaitable{driver, 1, std::optional<T>{11}};
co_await Awaitable{driver, 2, std::optional<T>{}}; // return early!
co_return n + co_await Awaitable{driver, 3, std::optional<T>{22}};
+#ifndef TEST_HAS_NO_EXCEPTIONS
} catch (...) {
driver.log(test_event::coro_catch);
throw;
}
+#endif
}
void test_coro_catching_shortcircuits_to_empty() {
+#if DEBUG_LOG
std::cerr << "test_coro_catching_shortcircuits_to_empty" << std::endl;
+#endif
check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
return coro_catching_shortcircuits_to_empty<Awaitable, T>(driver);
});
@@ -325,18 +353,24 @@ void test_coro_catching_shortcircuits_to_empty() {
template <typename Awaitable, typename T>
std::optional<T> coro_catching_simple_await(test_driver& driver) {
+#ifndef TEST_HAS_NO_EXCEPTIONS
try {
+#endif
co_return co_await Awaitable{driver, 1, std::optional<T>{11}} +
co_await Awaitable{
driver, 2, driver.toggles(dynamic_short_circuit) ? std::optional<T>{} : std::optional<T>{22}};
+#ifndef TEST_HAS_NO_EXCEPTIONS
} catch (...) {
driver.log(test_event::coro_catch);
throw;
}
+#endif
}
void test_coro_catching_simple_await() {
+#if DEBUG_LOG
std::cerr << "test_coro_catching_simple_await" << std::endl;
+#endif
check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
return coro_catching_simple_await<Awaitable, T>(driver);
});
@@ -354,7 +388,9 @@ std::optional<T> noneliding_coro_shortcircuits_to_empty(test_driver& driver) {
}
void test_noneliding_coro_shortcircuits_to_empty() {
+#if DEBUG_LOG
std::cerr << "test_noneliding_coro_shortcircuits_to_empty" << std::endl;
+#endif
check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
return noneliding_coro_shortcircuits_to_empty<Awaitable, T>(driver);
});
@@ -368,7 +404,9 @@ std::optional<T> noneliding_coro_simple_await(test_driver& driver) {
}
void test_noneliding_coro_simple_await() {
+#if DEBUG_LOG
std::cerr << "test_noneliding_coro_simple_await" << std::endl;
+#endif
check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
return noneliding_coro_simple_await<Awaitable, T>(driver);
});
@@ -391,7 +429,9 @@ std::optional<T> outer_coro(test_driver& driver) {
}
void test_nested_coroutines() {
+#if DEBUG_LOG
std::cerr << "test_nested_coroutines" << std::endl;
+#endif
check_coro_with_driver([]<typename Awaitable, typename T>(test_driver& driver) {
return outer_coro<Awaitable, T>(driver);
});
>From 5d6a06d27ba913bc49286a8bdea97446ba37fae2 Mon Sep 17 00:00:00 2001
From: lesha <lesha at meta.com>
Date: Fri, 8 Aug 2025 00:12:00 -0700
Subject: [PATCH 3/3] Improve doc formatting
---
clang/include/clang/Basic/AttrDocs.td | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/clang/include/clang/Basic/AttrDocs.td b/clang/include/clang/Basic/AttrDocs.td
index e45f692740193..a80b8e97efee2 100644
--- a/clang/include/clang/Basic/AttrDocs.td
+++ b/clang/include/clang/Basic/AttrDocs.td
@@ -9312,11 +9312,14 @@ flow as):
}
The benefits of this attribute are:
+
- **Avoid heap allocations for coro frames**: Allocating short-circuiting
coros on the stack makes code more predictable under memory pressure.
Without this attribute, LLVM cannot elide heap allocation even when all
awaiters are short-circuiting.
+
- **Performance**: Significantly faster execution and smaller code size.
+
- **Build time**: Faster compilation due to less IR being generated.
Marking your ``await_suspend_destroy`` method as ``noexcept`` can sometimes
@@ -9343,11 +9346,13 @@ Here is a toy example of a portable short-circuiting awaiter:
If all suspension points use (i) trivial or (ii) short-circuiting awaiters,
then the coroutine optimizes more like a plain function, with 2 caveats:
+
- **Behavior:** The coroutine promise provides an implicit exception boundary
(as if wrapping the function in ``try {} catch { unhandled_exception(); }``).
This exception handling behavior is usually desirable in robust,
return-value-oriented programs that need short-circuiting coroutines.
Otherwise, the promise can always re-throw.
+
- **Speed:** As of 2025, there is still an optimization gap between a
realistic short-circuiting coro, and the equivalent (but much more verbose)
function. For a guesstimate, expect 4-5ns per call on x86. One idea for