[Lldb-commits] [PATCH] D132815: [LLDB] Do not dereference promise pointer in `coroutine_handle` pretty printer

Mon Sep 5 08:02:51 PDT 2022

labath added inline comments.

================
Comment at: lldb/source/Plugins/Language/CPlusPlus/Coroutines.cpp:246-248
+  DataExtractor data(&promise_addr, sizeof(promise_addr),
+                     process_sp->GetByteOrder(),
+                     process_sp->GetAddressByteSize());
----------------
avogelsgesang wrote:
> labath wrote:
> > avogelsgesang wrote:
> > > labath wrote:
> > > > Have you checked there won't be a use-after-free problem here, given that this data extractor will refer to the stack object?
> > > > 
> > > > To create persistent data, you need to use the DataBufferSP constructor, but I'm wondering if we couldn't fix this by creating the (non-pointer) object using the `CreateValueObjectFromAddress` function, as above, but then actually use valobj->AddressOf as the synthetic child.
> > > > 
> > > > I am also somewhat surprised that we need to use the GetAddressOf trick here, as this seems to indicate that the coroutine contains (in the proper C "subobject" kind of way) the promise object. That's not necessarily wrong, but it makes me think we may be "breaking the cycle" at the wrong place.
> > > Thanks for taking a look!
> > > 
> > > > To create persistent data, you need to use the DataBufferSP constructor
> > > 
> > > good point, I will keep this in mind as a fallback in case we don't decide to follow any of the other directions you hinted at.
> > > 
> > > > wondering if we couldn't fix this by creating the (non-pointer) object using the CreateValueObjectFromAddress function, as above, but then actually use valobj->AddressOf as the synthetic child
> > > 
> > > Good idea! I will give it a try
> > > 
> > > 
> > > > [...] as this seems to indicate that the coroutine contains (in the proper C "subobject" kind of way) the promise object. That's not necessarily wrong, but it makes me think we may be "breaking the cycle" at the wrong place.
> > > 
> > > The physical layout of this is:
> > > ```
> > > // in the standard library
> > > template<typename promise_type>
> > > struct exception_handle<promise_type> {
> > >     __coro_frame<promise_type>* __hdl; // <--- this is the pointer we read with `GetCoroFramePtrFromHandle`
> > > };
> > > 
> > > // compiler-generated coroutine frame. Generated ad-hoc per coroutine
> > > struct __coro_frame<promise_type> {
> > >      // The ABI guaranteees that hose two pointers are always the first two pointers in the struct.
> > >      void (*resume)(void*); // function pointer for type erasure
> > >      void (*destroy)(void*); // function pointer for type erasure
> > >      // Next comes our promise type. This is under the control of the program author
> > >      promise_type promise;
> > >      // Next comes any compiler-generated, internal state which gets persisted across suspension points. 
> > >      // The functions pointed to by `resume`/`destroy` know how to interpret this part of the coroutine frame.
> > >      int __suspension_point_id;
> > >      double __some_internal_state;
> > >      std::string __some_other_internal_state;
> > >      ....
> > > };
> > > ```
> > > 
> > > The programmer (i.e., most likely the user of this pretty-printer), wrote only the "promise" explicitly in his code. Everything else is compiler-generated. As such, the lldb-user will usually look for the "promise" first, and I would like to make it easy to find it, by exposing it as a top-level children of the `exception_handle` instead of hiding it inside a sub-child.
> > > As such, the lldb-user will usually look for the "promise" first, and I would like to make it easy to find it, by exposing it as a top-level children of the `exception_handle` instead of hiding it inside a sub-child.
> > 
> > That makes sense. And I think that's a good argument for automatically "dereferencing" that object (i.e., status quo). That said, it's not fully clear to me how do we end up looping here. I take it the promise object contains a (compiler-generated ?) pointer to another __coro_frame object? What would happen if we turned *that* into a pointer?
> > I take it the promise object contains a [...] pointer to another __coro_frame object?
> 
> yes
> 
> > (compiler-generated ?)
> 
> no
> 
> We end up looping if the user explicitly puts an `coroutine_handle` inside `promise_type`. You can find an example of this in https://clang.llvm.org/docs/DebuggingCoroutines.html#get-the-asynchronous-stack, in particular
> 
> ```
> struct promise_type {
>     // [...] shortened for readability
> 
>     std::coroutine_handle<> continuation = std::noop_coroutine();
>     int result = 0;
> };
> ```
> 
> In asynchronous programming, it is common to "chain continuations" by putting a `coroutine_handle` into the `promise_type`. This coroutine_handle inside the promise_type gives my asynchronous coroutine the answer to the question "where should I continue next, after finishing the asychronous operation modelled by my own coroutine?".
> 
> In normal situations (i.e. in the absence of bugs inside the debugged program), those coroutine_handle's should not form cycles. But I guess there could also be other use cases for coroutines where coroutine_handle cycles might be useful...
> 
> > What would happen if we turned *that* into a pointer?
> 
> The `promise_type` is a user-written struct, so I guess that I have little leverage whether it contains a `coroutine_handle` by value or by pointer? And turning an object which the user wrote as "by value" into a "by pointer" representation in the debugger would be pretty surprising...
Ok, thanks for your patience, I think I finally understand what's going on. Given all of the above, the approach in this patch (not dereferencing the promise type) makes sense to me, since the `coroutine_handle`type has pointer-like semantics. However, do take my opinion with a grain of salt, since I probably not going to be using this code.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132815/new/

https://reviews.llvm.org/D132815