[PATCH] D103593: [Coroutine] Sink lifetime markers after switch of suspend blocks to avoid disturbing must tail calls

Mon Jun 14 19:30:40 PDT 2021

ChuanqiXu added inline comments.

================
Comment at: llvm/lib/Transforms/Coroutines/CoroFrame.cpp:2343
            "should have split coro.suspend into its own block");
-    DomSet.insert(SuspendBlock->getSingleSuccessor());
+    BasicBlock *SwitchBlock = SuspendBlock->getSingleSuccessor();
+    for (auto *BB : llvm::successors(SwitchBlock))
----------------
rjmccall wrote:
> ChuanqiXu wrote:
> > rjmccall wrote:
> > > ChuanqiXu wrote:
> > > > rjmccall wrote:
> > > > > Can we assert that this is a switch block somehow?  This is very dangerous if the code pattern isn't what we expect.
> > > > The terminator of `SuspendBlock->getSingleSuccessor()` isn't necessarily  a `SwitchInst` actually. Precious name is misunderstanding, my bad. I add comments to clarify it.
> > > Do we ever not emit a block that does this kind of branch/switch?  What if there's only one suspend point?  It's extremely important that this not accidentally walk into the user-provided parts of the function, so if there's any way for us to actually assert that the successor block has the form we expect (nothing but a switch or branch on the state value), that would make me feel a lot better.
> > > Do we ever not emit a block that does this kind of branch/switch?
> > 
> > We always emit a switch in the frontend. But if there are identical destination basic block in the switch instruction, it may be converted into a conditional branch instruction. And if all the destination blocks are the same (happens in the test case), it would be converted to an unconditional branch instruction.
> > 
> >  > What if there's only one suspend point?
> > 
> > From my point of view, it should be OK.
> > 
> > >  It's extremely important that this not accidentally walk into the user-provided parts of the function, so if there's any way for us to actually assert that the successor block has the form we expect (nothing but a switch or branch on the state value), that would make me feel a lot better.
> > 
> > Agreed. However, it seems hard to code the pattern in a one assert statement. So I tried to add codes to check the pattern. The codes look a little bit ugly to me. But I find it isn't easy to make it in a simple way. One alternative was to add function to verify the IR in the beginning of coro split pass.
> > 
> > By the way, there are other side effects for these guard codes. It breaks some tests whose pattern are unusual for IR produced by the frontend. But from the documentation (https://llvm.org/docs/Coroutines.html#llvm-coro-suspend-retcon-intrinsic), I think these tests are legal. After all, LLVM IR are independent language. So if we decide to continue in this way. We need to modify the document about coroutine API to make it consistent.
> > We always emit a switch in the frontend. But if there are identical destination basic block in the switch instruction, it may be converted into a conditional branch instruction. And if all the destination blocks are the same (happens in the test case), it would be converted to an unconditional branch instruction.
> 
> Well, two blocks separated by an unconditional branch can be merged, so I don't this we can rely on seeing this pattern, even if it were true that the frontend guaranteed to emit it.
> 
> Stepping back, are lifetime intrinsics that point into the coroutine frame really at all useful after splitting?  Should we just erase lifetime intrinsics for allocas that we've moved into the frame?  (Lifetimes for allocas that we haven't moved into the frame should be fine to leave in, since we'll only do this when the lifetimes don't overlap a suspend.)
> Stepping back, are lifetime intrinsics that point into the coroutine frame really at all useful after splitting? 

In my mind, if we did coroutine elision and SROA for the frame, the lifetime intrinsics may be useful. I didn't write test cases to prove it. But I guess it may be possible.

>From my point of view, there may be two solutions:
1. Remove these check-codes simply. Keep it as the original patch.
2. Try to remove the lifetime markers when we convert tail call to musttail call at the end of Coro-split pass.

I think the first may be better. Because the root cause of the bug is that we didn't put lifetime markers properly which breaks the symmetric transfer. So we need to put the lifetime markers **properly**. And there are two questions followed:
1. Do we put lifetime markers in the right place? Or do the lifetime markers would make the program wrong (crash)?
2. Can we prove that the lifetime markers inserted wouldn't break the symmetric transfer?

And my answer for the first question is yes. Since the condition to sink lifetimes didn't change and I think it is correct.
For the second question, which may be your main concern, my answer is yes if other passes didn't do any surprising transformations. Given the codes generated in the frontend, we can know the switch (or branch) statement  should be in the same BB of the corresponding coro.suspend. Then after splitting around `coro.suspend`, the switch statement  should be in the single successor block of the coro.suspend. So it should be fine in my mind.

For the second solution, removing lifetimes when we generating musttail call. It looks a little bit weird for me since it didn't remove all the lifetime markers as you said nor prove the removed lifetime markers wouldn't be used in  future transformation.

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103593/new/

https://reviews.llvm.org/D103593