[PATCH] D134009: [Coroutines] Introduce `always_complete_coroutine` attribute to guide optimization (1/2)

Mon Sep 19 06:51:31 PDT 2022

avogelsgesang added inline comments.

================
Comment at: llvm/docs/Coroutines.rst:1732
+
+The ``always_complete_coroutine`` attribute is a function attribute indicates
+that the coroutine will always complete. This is supported by switch-resumed
----------------
I am a bit surprised that we introduce a new IR-attribute for this.

Can't we reuse the existing branching structure of the suspension points?
I could imagine that we could use the `switch` after each suspend to encode whether the coroutine can be destroyed at that suspension point. If we point the `switch` for `i8 1`, i.e. the cleanup control flow edge, to an unreachable basic block, the coroutine splitting should not optimize out the corresponding code from the `destroy` function already, doesn't it?

I imagine something like

```
  // For this suspension point, we want to optimize under the assumption that the
  // coroutine will not be destroyed while suspended.
  %0 = call i8 @llvm.coro.suspend(token none, i1 false)
  // Hence we let the "cleanup" result (`i8 1`) branch to an "unreachable" block
  switch i8 %0, label %suspend [i8 0, label %continue
                                i8 1, label %unreachable_bb]
continue:
  // After the resumption point, execution continues here...
  // Let's assume we suspend a 2nd time. 
  %1 = call i8 @llvm.coro.suspend(token none, i1 false)
  // At this suspension point, we allow the coroutine to be destructed.
  switch i8 %1, label %suspend [i8 0, label %continue2
                                i8 1, label %cleanup]
continue2:
  // Some other stuff happens here...
cleanup:
  %mem = call i8* @llvm.coro.free(token %id, i8* %hdl)
  call void @free(i8* %mem)
  br label %suspend
suspend:
  %unused = call i1 @llvm.coro.end(i8* %hdl, i1 false)
  ret i8* %hdl
unreachable_bb:
  unreachable
}
```

To prove the viability of this alternative approach, I copied the LLVM code for https://github.com/llvm/llvm-project/issues/56980 from https://godbolt.org/z/P84MPzq4q and manually replaced the `await.cleanup`, `await2.cleanup` ..., `await7.cleanup` basic blocks by "unreachable". You can find the changed LLVM code in https://godbolt.org/z/n93TTj8vo. As you can see, the generated `Foo() [clone .destroy]` function does not contain any code for the `GlobalSetter` destructors

I think re-using the `switch` control flow is preferable over introducing a new attribute because:
1. it allows for more fine-grained control: we can control for each suspension point independently whether the coroutine can be destroyed while suspended at this suspension point
2. it reuses an existing concept instead of introducing a new attribute, thereby keeping LLVM's code simpler

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134009/new/

https://reviews.llvm.org/D134009