[PATCH] D134009: [Coroutines] Introduce `always_complete_coroutine` attribute to guide optimization (1/2)

Tue Sep 20 03:46:41 PDT 2022

avogelsgesang added inline comments.

================
Comment at: llvm/docs/Coroutines.rst:1732
+
+The ``always_complete_coroutine`` attribute is a function attribute indicates
+that the coroutine will always complete. This is supported by switch-resumed
----------------
ChuanqiXu wrote:
> avogelsgesang wrote:
> > I am a bit surprised that we introduce a new IR-attribute for this.
> > 
> > Can't we reuse the existing branching structure of the suspension points?
> > I could imagine that we could use the `switch` after each suspend to encode whether the coroutine can be destroyed at that suspension point. If we point the `switch` for `i8 1`, i.e. the cleanup control flow edge, to an unreachable basic block, the coroutine splitting should not optimize out the corresponding code from the `destroy` function already, doesn't it?
> > 
> > I imagine something like
> > 
> > ```
> >   // For this suspension point, we want to optimize under the assumption that the
> >   // coroutine will not be destroyed while suspended.
> >   %0 = call i8 @llvm.coro.suspend(token none, i1 false)
> >   // Hence we let the "cleanup" result (`i8 1`) branch to an "unreachable" block
> >   switch i8 %0, label %suspend [i8 0, label %continue
> >                                 i8 1, label %unreachable_bb]
> > continue:
> >   // After the resumption point, execution continues here...
> >   // Let's assume we suspend a 2nd time. 
> >   %1 = call i8 @llvm.coro.suspend(token none, i1 false)
> >   // At this suspension point, we allow the coroutine to be destructed.
> >   switch i8 %1, label %suspend [i8 0, label %continue2
> >                                 i8 1, label %cleanup]
> > continue2:
> >   // Some other stuff happens here...
> > cleanup:
> >   %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
> >   call void @free(i8* %mem)
> >   br label %suspend
> > suspend:
> >   %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
> >   ret i8* %hdl
> > unreachable_bb:
> >   unreachable
> > }
> > ```
> > 
> > To prove the viability of this alternative approach, I copied the LLVM code for https://github.com/llvm/llvm-project/issues/56980 from https://godbolt.org/z/P84MPzq4q and manually replaced the `await.cleanup`, `await2.cleanup` ..., `await7.cleanup` basic blocks by "unreachable". You can find the changed LLVM code in https://godbolt.org/z/n93TTj8vo. As you can see, the generated `Foo() [clone .destroy]` function does not contain any code for the `GlobalSetter` destructors
> > 
> > I think re-using the `switch` control flow is preferable over introducing a new attribute because:
> > 1. it allows for more fine-grained control: we can control for each suspension point independently whether the coroutine can be destroyed while suspended at this suspension point
> > 2. it reuses an existing concept instead of introducing a new attribute, thereby keeping LLVM's code simpler
> Yeah, it is a good idea if we only want to reduce the destroy function. However, if we look at the following explanation for the control flow information, we can't infer such information any more by the trick. And the implementation of the patch is not complex indeed, if you look at the code actually, you'll find the meaningful change 5 lines of code in CoroSplit.cpp.
> 
> Also I feel like the attribute is a general property, it may be possible for other languages to use it. So I guess it might not be a big deal to introduce it.
> However, if we look at the following explanation for the control flow information, we can't infer such information any more by the trick.

Why not? From my understanding, given the `unreachable`, we should also be able to "envision the ``ret`` in ``bb{i} (i != n)`` as branch instruction to the next block ``bb{i+1}`` with a unknown function call".

Yes, currently the control flow analysis does not yet treat the `unreachable` in a way such that it uses this additional control flow information, yet. But from what I can tell, also the `always_complete_coroutine` does not use the additional control flow, either. And I don't see a technical reason why inferring the additional control flow information from the `always_complete_coroutine` would be harder than inferring it from the `unreachable`. Or am I missing something?

> Also I feel like the attribute is a general property, it may be possible for other languages to use it

Not sure I can follow. Afaict, the `unreachable` could just as well be used by other language frontends?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134009/new/

https://reviews.llvm.org/D134009