[clang] [docs][coroutines] Revamp "Debugging C++ coroutines" (PR #142651)
Chuanqi Xu via cfe-commits
cfe-commits at lists.llvm.org
Wed Jun 25 02:06:45 PDT 2025
================
@@ -8,470 +8,966 @@ Debugging C++ Coroutines
Introduction
============
-For performance and other architectural reasons, the C++ Coroutines feature in
-the Clang compiler is implemented in two parts of the compiler. Semantic
-analysis is performed in Clang, and Coroutine construction and optimization
-takes place in the LLVM middle-end.
+Coroutines in C++ were introduced in C++20, and the user experience for
+debugging them can still be challenging. This document shows you how to debug
+coroutines efficiently and how to work around existing shortcomings in
+debuggers and compilers.
+
+Coroutines are generally used either as generators or for asynchronous
+programming. In this document, we will discuss both use cases. Even if you are
+using coroutines for asynchronous programming, you should still read the
+generators section, as it will introduce foundational debugging techniques also
+applicable to the debugging of asynchronous programming.
+
+Both compilers (clang, gcc, ...) and debuggers (lldb, gdb, ...) are
+still improving their support for coroutines. As such, we recommend using the
+latest available version of your toolchain.
+
+This document focuses on clang and lldb. The screenshots show
+[lldb-dap](https://marketplace.visualstudio.com/items?itemName=llvm-vs-code-extensions.lldb-dap)
+in combination with VS Code. The same techniques can also be used in other
+IDEs.
+
+Debugging clang-compiled binaries with gdb is possible, but requires more
+scripting. This guide comes with a basic GDB script for coroutine debugging.
+
+This guide will first showcase the more polished, bleeding-edge experience, but
+will also show you how to debug coroutines with older toolchains. In general,
+the older your toolchain, the deeper you will have to dive into the
+implementation details of coroutines (such as their ABI). The further down in
+this document you go, the more low-level and technical the content becomes. If
+you are on an up-to-date toolchain, you will hopefully be able to stop reading
+earlier.
+
+Debugging generators
+====================
+
+The first major use case for coroutines in C++ is generators, i.e., functions
+which can produce values via ``co_yield``. Values are produced lazily,
+on-demand. For that purpose, every time a new value is requested the coroutine
+gets resumed. As soon as it reaches a ``co_yield`` and thereby returns the
+requested value, the coroutine is suspended again.
+
+This logic is encapsulated in a ``generator`` type similar to this one:
-However, this design forces us to generate insufficient debugging information.
-Typically, the compiler generates debug information in the Clang frontend, as
-debug information is highly language specific. However, this is not possible
-for Coroutine frames because the frames are constructed in the LLVM middle-end.
-
-To mitigate this problem, the LLVM middle end attempts to generate some debug
-information, which is unfortunately incomplete, since much of the language
-specific information is missing in the middle end.
+.. code-block:: c++
-This document describes how to use this debug information to better debug
-coroutines.
+  // generator.hpp
+  #include <coroutine>
-Terminology
-===========
+  // `generator` is a stripped down, minimal generator type.
+  template<typename T>
+  struct generator {
+    struct promise_type {
+      T current_value{};
-Due to the recent nature of C++20 Coroutines, the terminology used to describe
-the concepts of Coroutines is not settled. This section defines a common,
-understandable terminology to be used consistently throughout this document.
+      auto get_return_object() {
+        return std::coroutine_handle<promise_type>::from_promise(*this);
+      }
+      auto initial_suspend() { return std::suspend_always(); }
+      auto final_suspend() noexcept { return std::suspend_always(); }
+      void return_void() {}
+      void unhandled_exception() { __builtin_unreachable(); }
+      auto yield_value(T v) {
+        current_value = v;
+        return std::suspend_always();
+      }
+    };
-coroutine type
---------------
+    generator(std::coroutine_handle<promise_type> h) : hdl(h) { hdl.resume(); }
+    ~generator() { hdl.destroy(); }
-A `coroutine function` is any function that contains any of the Coroutine
-Keywords `co_await`, `co_yield`, or `co_return`. A `coroutine type` is a
-possible return type of one of these `coroutine functions`. `Task` and
-`Generator` are commonly referred to coroutine types.
+    generator<T>& operator++() { hdl.resume(); return *this; } // resume the coroutine
+    T operator*() const { return hdl.promise().current_value; }
-coroutine
----------
+  private:
+    std::coroutine_handle<promise_type> hdl;
+  };
-By technical definition, a `coroutine` is a suspendable function. However,
-programmers typically use `coroutine` to refer to an individual instance.
-For example:
+We can then use this ``generator`` class to print the Fibonacci sequence:
.. code-block:: c++
-  std::vector<Task> Coros; // Task is a coroutine type.
-  for (int i = 0; i < 3; i++)
-    Coros.push_back(CoroTask()); // CoroTask is a coroutine function, which
-                                 // would return a coroutine type 'Task'.
+  #include "generator.hpp"
+  #include <iostream>
-In practice, we typically say "`Coros` contains 3 coroutines" in the above
-example, though this is not strictly correct. More technically, this should
-say "`Coros` contains 3 coroutine instances" or "Coros contains 3 coroutine
-objects."
+  generator<int> fibonacci() {
+    co_yield 0;
+    int prev = 0;
+    co_yield 1;
+    int current = 1;
+    while (true) {
+      int next = current + prev;
+      co_yield next;
+      prev = current;
+      current = next;
+    }
+  }
-In this document, we follow the common practice of using `coroutine` to refer
-to an individual `coroutine instance`, since the terms `coroutine instance` and
-`coroutine object` aren't sufficiently defined in this case.
+  template<typename T>
+  void print10Elements(generator<T>& gen) {
+    for (unsigned i = 0; i < 10; ++i) {
+      std::cerr << *gen << "\n";
+      ++gen;
+    }
+  }
-coroutine frame
----------------
+  int main() {
+    std::cerr << "Fibonacci sequence - here we go\n";
+    generator<int> fib = fibonacci();
+    for (unsigned i = 0; i < 5; ++i) {
+      ++fib;
+    }
+    print10Elements(fib);
+  }
-The C++ Standard uses `coroutine state` to describe the allocated storage. In
-the compiler, we use `coroutine frame` to describe the generated data structure
-that contains the necessary information.
+To compile this code, use ``clang++ --std=c++23 generator-example.cpp -g``.
-The structure of coroutine frames
-=================================
+Breakpoints inside the generators
+---------------------------------
-The structure of coroutine frames is defined as:
+We can set breakpoints inside coroutines just as we set them in regular
+functions. For VS Code, that means clicking next to the line number in the editor.
+In the ``lldb`` CLI or in ``gdb``, you can use ``b`` to set a breakpoint.
-.. code-block:: c++
+Inspecting variables in a coroutine
+-----------------------------------
-  struct {
-    void (*__r)(); // function pointer to the `resume` function
-    void (*__d)(); // function pointer to the `destroy` function
-    promise_type; // the corresponding `promise_type`
-    ... // Any other needed information
-  }
+If you hit a breakpoint inside the ``fibonacci`` function, you should be able
+to inspect all local variables (``prev``, ``current``, ``next``) just like in
+a regular function.
-In the debugger, the function's name is obtainable from the address of the
-function. And the name of `resume` function is equal to the name of the
-coroutine function. So the name of the coroutine is obtainable once the
-address of the coroutine is known.
+.. image:: ./coro-generator-variables.png
-Print promise_type
-==================
+Note the two additional variables ``__promise`` and ``__coro_frame``. Those
+show the internal state of the coroutine. They are not relevant for our
+generator example, but will be relevant for asynchronous programming described
+in the next section.
-Every coroutine has a `promise_type`, which defines the behavior
-for the corresponding coroutine. In other words, if two coroutines have the
-same `promise_type`, they should behave in the same way.
-To print a `promise_type` in a debugger when stopped at a breakpoint inside a
-coroutine, printing the `promise_type` can be done by:
+Stepping out of a coroutine
+---------------------------
-.. parsed-literal::
+When single-stepping, you will notice that the debugger will leave the
+``fibonacci`` function as soon as you hit a ``co_yield`` statement. You might
+find yourself inside some standard library code. After stepping out of the
+library code, you will be back in the ``main`` function.
- print __promise
+Stepping into a coroutine
+-------------------------
-It is also possible to print the `promise_type` of a coroutine from the address
-of the coroutine frame. For example, if the address of a coroutine frame is
-0x416eb0, and the type of the `promise_type` is `task::promise_type`, printing
-the `promise_type` can be done by:
+If you stop at ``++fib`` and try to step into the generator, you will first
+find yourself inside ``operator++``. Stepping into the ``hdl.resume()`` call will
+not work by default.
-.. parsed-literal::
+This is because lldb does not step into functions from the standard library by
+default. To make this work, you first need to run ``settings set
+target.process.thread.step-avoid-regexp ""``. You can do so from the "Debug
+Console" towards the bottom of the screen. With that setting change, you can
+step through ``coroutine_handle::resume`` and into your generator.
----------------
ChuanqiXu9 wrote:
Agreed.
https://github.com/llvm/llvm-project/pull/142651