[PATCH] D97915: [Coroutines] Handle overaligned frame allocation

Wed May 12 23:39:50 PDT 2021

ChuanqiXu added inline comments.

================
Comment at: clang/include/clang/AST/StmtCXX.h:356-359
     Expr *Allocate = nullptr;
     Expr *Deallocate = nullptr;
+    Expr *AlignedAllocate = nullptr;
+    Expr *AlignedDeallocate = nullptr;
----------------
ychen wrote:
> ChuanqiXu wrote:
> > ychen wrote:
> > > ChuanqiXu wrote:
> > > > Can't we merge these?
> > > I'm not sure about the "merge" here. Could you be more explicit?
> > Sorry. I mean if we can merge `Allocate` with `AlignedAllocate` and merge `Deallocate` with `AlignedDeallocate`. Since from the implementation, it looks like the value of `Allocate` and `AlignedAllocate ` (so as `Deallocate` and `AlignedDeallocate`) are the same.
> Oh, this is to set the path for D102147 where `Allocate` and `AlignedAllocate` could be different. If I do this in D102147, it will also touch the `CGCoroutine.cpp` which I'm trying to avoid` since it is intended to be a Sema only patch.
Yeah, this is the key different point between us. I think that `D102147` could and should to touch the CodeGen part.

================
Comment at: clang/lib/CodeGen/CGCoroutine.cpp:528-547
     CGF.EmitBlock(FreeBB);
     CGF.EmitStmt(Deallocate);
-
-    auto *AfterFreeBB = CGF.createBasicBlock("after.coro.free");
-    CGF.EmitBlock(AfterFreeBB);
+    CGF.Builder.CreateBr(AfterFreeBB);

     // We should have captured coro.free from the emission of deallocate.
     auto *CoroFree = CGF.CurCoro.Data->LastCoroFree;
+    CGF.CurCoro.Data->LastCoroFreeUsedForDealloc = true;
----------------
ychen wrote:
> ychen wrote:
> > ChuanqiXu wrote:
> > > ychen wrote:
> > > > ChuanqiXu wrote:
> > > > > It looks like it would emit a `deallocate` first, and emit an `alignedDeallocate`, which is very odd. Although I can find that the second `deallocate` wouldn't be emitted due to the check `LastCoroFreeUsedForDealloc`, it is still very odd to me. If the second `deallocate` wouldn't come up all the way, what's the reason we need to write `emit(deallocate)` twice?
> > > > Agree that `LastCoroFreeUsedForDealloc` is a bit confusing. It makes sure deallocation and aligned deallocation share one `coro.free`. Otherwise, AFAIK, there would be two `coro.free` get codegen'd.
> > > > ```
> > > > %mem = llvm.coro.free()
> > > > br i1 <overalign> , label <aligend-dealloc>, label <dealloc>
> > > > 
> > > > aligend-dealloc:
> > > >     use %mem
> > > > 
> > > > dealloc:
> > > >     use %mem
> > > > ```
> > > > 
> > > > 
> > > > > what's the reason we need to write emit(deallocate) twice?
> > > > John wrote a code snippet here: https://reviews.llvm.org/D100739#2717582. I think it would be helpful to look at the changed tests below to see the patterns.
> > > > 
> > > > Basically, for allocation, it looks like below; for deallocation, it would be similar.
> > > > ```
> > > > void *rawFrame =nullptr;
> > > > ...
> > > > if (llvm.coro.alloc()) {
> > > >   size_t size = llvm.coro.size(), align = llvm.coro.align();
> > > >   if (align > NEW_ALIGN) {
> > > > #if <an allocation function without std::align_val_t argument is selected by Sema>
> > > >     size += align - NEW_ALIGN + sizeof(void*);
> > > >     frame = operator new(size);
> > > >     rawFrame = frame;
> > > >     frame = (frame + align - 1) & ~(align - 1);
> > > > #else
> > > >     // If an aligned allocation function is selected.
> > > >     frame = operator new(size, align);
> > > > #endif
> > > >   } else {
> > > >     frame = operator new(size);
> > > >   }
> > > > }
> > > > ```
> > > > The true branch of the #if directive is equivalent to "coro.alloc.align" block (and "coro.alloc.align2" if `get_return_object_on_allocation_failure` is defined), the false branch is equivalent to "coro.alloc" block.
> > > > The above pattern handles both aligned/normal allocation/deallocation so it is independent of D102147.
> > > Thanks. I get the reason why I am thinking the code isn't natural. Since I think `::operator new(size_t, align_val_t)` shouldn't come up in this patch which should be available after D102147 applies. Here you said this patch is independent with D102147, I believe this patch could work without D102147. But it contains the codes which would work only if we applies the successor patch, so I think it is dependent on D102147.
> > > 
> > > The ideally relationship for me is to merge `D102145 ` into this one (Otherwise it is weird for me that `D102145` only introduces some intrinsics which wouldn't be used actually). Then this patch should handle the alignment for variables in coroutine frame without introducing `::new(size_t, align_val_t)`. Then the final patch could do the job that searching and generating code for `::new(size_t, align_val_t)`.
> > > 
> > > Maybe it is a little bit hard to rebase again and again. But I think it is better.
> > I think I know where the confusion comes from. `AlignedDeallocate` is not guaranteed to be an aligned allocator. In this patch in `SemaCoroutine.cpp`, it is set to `Deallocate` in which case we always dynamically adjust frame alignment. Once D102147 is landed. `AlignedDeallocate` may or may not be an aligned allocator.
> > 
> > > The ideally relationship for me is to merge D102145  into this one (Otherwise it is weird for me that D102145 only introduces some intrinsics which wouldn't be used actually). Then this patch should handle the alignment for variables in coroutine frame without introducing ::new(size_t, align_val_t). Then the final patch could do the job that searching and generating code for ::new(size_t, align_val_t).
> > 
> > I was worried about the size of the patch if this is merged with D102145 but if that is preferred by more than one reviewer, I'll go ahead and do that. D102145 is pretty self-contained in that it does not contain clients of the added intrinsics but the introduced test should cover the expected intrinsic lowering.
> > 
> Naming is hard. I had a hard time figuring out a better name. `AlignedDeallocate`/`AlignedAllocate` is intended to refer to allocator/deallocator used for handling overaligned frame. Not that they are referring to allocator/deallocator with std::align_val_t argument. 
I think it is better for me to merge `D102145` into this one to understand this patch. For example, the test cases in `D102145` looks weird to me since it doesn't do over alignment at all like we discussed in that thread. Maybe my understanding is not right, but I think it isn't pretty self-contained. I am OK to wait for opinions from other reviewers.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97915/new/

https://reviews.llvm.org/D97915