[clang] [llvm] [Coroutines] Mark parameter allocas with coro.outside.frame metadata (PR #127653)

Mon Feb 24 05:58:58 PST 2025

zmodem wrote:

> > > My instinct reaction is, it may block some optimizations to optimize the alloca out.
> > 
> > 
> > The intrinsic gets dropped by CoroSplit, so after that it's not a problem. Before that, I'm not sure much interesting optimization can happen to allocas anyway? Especially since it's not known if they will be part of the coro frame or not at that point.
> 
> As far as I know, a lot of allocas related optimizations/transformation happens before CoroSplit. e.g., mem2reg. And this is the reason why we put CoroSplit in the middle of passes.

You're right. I hadn't realized that we run mem2reg before CoroSplit and then spill values back to the frame as needed.

I suppose mem2reg is especially relevant since it operates on allocas. If the transform could remove an alloca, we should not let the intrinsic block it. I've updated the PR to handle that.

For other optimizations that analyze uses of the alloca, I've marked the intrinsic `memory(none)` and the argument `nocapture`.

I think this way the intrinsic will not block important optimizations.

> > > > In fact, my change broke a test which seems to be doing exactly that. Just because it's old doesn't mean it was wrong. So the patch didn't feel solid.
> > 
> > 
> > > I do feel it is wrong since it accesses the frame after it ends. It makes no sense.
> > 
> > 
> > (To make it easier to follow, we're talking about [this access](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/test/Transforms/Coroutines/coro-debug.ll#L75).)
> > Looking closer, I think that code might be fine. The code after `coro.end` will go at the end of the ramp function. At that point the coro frame is still alive. And since that alloca is also [accessed in the resume function](https://github.com/llvm/llvm-project/blob/llvmorg-20.1.0-rc2/llvm/test/Transforms/Coroutines/coro-debug.ll#L67) (after a suspend), it _needs_ to be in the frame.
> 
> On the one hand, I still suspect to the validness of the old test. On the other hand, I think we should improve our algorithm to identify the area after coro.end as special area when calculating whether or not to put an alloca to the frame.

I think until someone points out exactly what's wrong about the old test, we have to assume that it's valid.

> The short term solution may consider the case the coroutine won't suspend specially.

Can you explain this more?

https://github.com/llvm/llvm-project/pull/127653