[PATCH] D97915: [Coroutines] Handle overaligned frame allocation

Wed Jul 7 20:32:32 PDT 2021

ChuanqiXu added a comment.

In D97915#2862419 <https://reviews.llvm.org/D97915#2862419>, @ychen wrote:

>> It looks not hard to implement. And we don't need to refactor the CodeGen part a lot. In this way, I think the main effort to support `::operator new(size_t, align_t)` would be in the Sema part and the works remained in CodeGen part would be little. It wouldn't touch the middle end part neither.
>
> I agree that something like this is simpler implementation-wise. Although in spirit, it is similar to D100739 <https://reviews.llvm.org/D100739> which we decided not to pursue. What is the middle-end? is it LLVM?
>
>>> It shifts the semantic lowering from Clang to LLVM but does not perform less work.
>>
>> I think it would be simpler.
>
> I agree it would be simpler.
>
>> At least, we don't need to emit `getReturnStmtOnAllocFailure` twice and
>
> `getReturnStmtOnAllocFailure` is called twice but the code is emitted once.
>
>> we don't need to touch `CallCoroDelete`  neither.
>
> If we don't touch `CallCoroDelete`, then we cannot do the equivalent work in LLVM since there is no concept of deallocation function concept there.
>
>> And we don't organize the basic blocks in the CodeGenCoroutineBody.
>
> That's true. It is delayed until CoroSplit.
>
>> And we could emit simpler AlignupTo (Although it could be simplified further, I believe).
>
> D102147 <https://reviews.llvm.org/D102147> has a version of it.
>
>> And the extra work we need to do is to compare the alignment requirement for the coroutine frame with the second argument of `llvm.coro.create.frame` to see if we need to over align coroutine frame.
>> If yes, we need to lower the `llvm.coro.create.frame` to compute the true address for the coroutine frame and store the raw frame address.
>> If no, we could return `%allocated` simply.
>
> Yes, it is similar to CoroFrame.cpp changes in D102147 <https://reviews.llvm.org/D102147>.

> I agree that something like this is simpler implementation-wise. Although in spirit, it is similar to D100739 <https://reviews.llvm.org/D100739> which we decided not to pursue. What is the middle-end? is it LLVM?

It is frustrating to break the decision made before. But I don't find the conclusion in D100739 <https://reviews.llvm.org/D100739> that we shouldn't complete this in middle end.

> `getReturnStmtOnAllocFailure` is called twice but the code is emitted once.

I don't think so. `getReturnStmtOnAllocFailure` is called twice and emitted twice. The code emitted by the frontend should be:

  AllocBB:
    ; ...
    getReturnStmtOnAllocFailure()

  AlignedAllocBB:
    ; ...
    getReturnStmtOnAllocFailure()

I understand that one of the branch would be pruned in the middle end. But in the frontend, they are emitted twice. And it's redundant in both compiler codes and generated codes.

> If we don't touch CallCoroDelete, then we cannot do the equivalent work in LLVM since there is no concept of deallocation function concept there.

Sorry for confusing. I means that we could change semantics for `llvm.coro.free` to return the address of raw frame  and touch the Sema part to pass `llvm.coro.raw.frame.size` as the second argument to deallocator.
It should be easy and clean to implement and we don't need to touch `CallCoroDelete ` either.

>> And we don't organize the basic blocks in the CodeGenCoroutineBody.
>
> That's true. It is delayed until CoroSplit.

I don't think it is delayed. I think the task to organize the BBs would be eliminated if we move the main work in the middle end. 
I think if we move the work to select to over align or not, we could use the same IR structure. I think it is simpler, cleaner and more natural.

>> And we could emit simpler AlignupTo (Although it could be simplified further, I believe).
>
> D102147 <https://reviews.llvm.org/D102147> has a version of it.
>
>> And the extra work we need to do is to compare the alignment requirement for the coroutine frame with the second argument of `llvm.coro.create.frame` to see if we need to over align coroutine frame.
>> If yes, we need to lower the `llvm.coro.create.frame` to compute the true address for the coroutine frame and store the raw frame address.
>> If no, we could return `%allocated` simply.
>
> Yes, it is similar to CoroFrame.cpp changes in D102147 <https://reviews.llvm.org/D102147>.

I guess you refer to the wrong revision. Since I don't find related things in D102147 <https://reviews.llvm.org/D102147>.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97915/new/

https://reviews.llvm.org/D97915