[PATCH] D97915: [Coroutines] Handle overaligned frame allocation

Wed May 12 23:19:52 PDT 2021

ychen added inline comments.

================
Comment at: clang/lib/CodeGen/CGCoroutine.cpp:528-547
     CGF.EmitBlock(FreeBB);
     CGF.EmitStmt(Deallocate);
-
-    auto *AfterFreeBB = CGF.createBasicBlock("after.coro.free");
-    CGF.EmitBlock(AfterFreeBB);
+    CGF.Builder.CreateBr(AfterFreeBB);

     // We should have captured coro.free from the emission of deallocate.
     auto *CoroFree = CGF.CurCoro.Data->LastCoroFree;
+    CGF.CurCoro.Data->LastCoroFreeUsedForDealloc = true;
----------------
ChuanqiXu wrote:
> ychen wrote:
> > ChuanqiXu wrote:
> > > It looks like it would emit a `deallocate` first, and emit an `alignedDeallocate`, which is very odd. Although I can find that the second `deallocate` wouldn't be emitted due to the check `LastCoroFreeUsedForDealloc`, it is still very odd to me. If the second `deallocate` wouldn't come up all the way, what's the reason we need to write `emit(deallocate)` twice?
> > Agree that `LastCoroFreeUsedForDealloc` is a bit confusing. It makes sure deallocation and aligned deallocation share one `coro.free`. Otherwise, AFAIK, there would be two `coro.free` get codegen'd.
> > ```
> > %mem = llvm.coro.free()
> > br i1 <overalign> , label <aligend-dealloc>, label <dealloc>
> > 
> > aligend-dealloc:
> >     use %mem
> > 
> > dealloc:
> >     use %mem
> > ```
> > 
> > 
> > > what's the reason we need to write emit(deallocate) twice?
> > John wrote a code snippet here: https://reviews.llvm.org/D100739#2717582. I think it would be helpful to look at the changed tests below to see the patterns.
> > 
> > Basically, for allocation, it looks like below; for deallocation, it would be similar.
> > ```
> > void *rawFrame =nullptr;
> > ...
> > if (llvm.coro.alloc()) {
> >   size_t size = llvm.coro.size(), align = llvm.coro.align();
> >   if (align > NEW_ALIGN) {
> > #if <an allocation function without std::align_val_t argument is selected by Sema>
> >     size += align - NEW_ALIGN + sizeof(void*);
> >     frame = operator new(size);
> >     rawFrame = frame;
> >     frame = (frame + align - 1) & ~(align - 1);
> > #else
> >     // If an aligned allocation function is selected.
> >     frame = operator new(size, align);
> > #endif
> >   } else {
> >     frame = operator new(size);
> >   }
> > }
> > ```
> > The true branch of the #if directive is equivalent to "coro.alloc.align" block (and "coro.alloc.align2" if `get_return_object_on_allocation_failure` is defined), the false branch is equivalent to "coro.alloc" block.
> > The above pattern handles both aligned/normal allocation/deallocation so it is independent of D102147.
> Thanks. I get the reason why I am thinking the code isn't natural. Since I think `::operator new(size_t, align_val_t)` shouldn't come up in this patch which should be available after D102147 applies. Here you said this patch is independent with D102147, I believe this patch could work without D102147. But it contains the codes which would work only if we applies the successor patch, so I think it is dependent on D102147.
> 
> The ideally relationship for me is to merge `D102145 ` into this one (Otherwise it is weird for me that `D102145` only introduces some intrinsics which wouldn't be used actually). Then this patch should handle the alignment for variables in coroutine frame without introducing `::new(size_t, align_val_t)`. Then the final patch could do the job that searching and generating code for `::new(size_t, align_val_t)`.
> 
> Maybe it is a little bit hard to rebase again and again. But I think it is better.
I think I know where the confusion comes from. `AlignedDeallocate` is not guaranteed to be an aligned allocator. In this patch in `SemaCoroutine.cpp`, it is set to `Deallocate` in which case we always dynamically adjust frame alignment. Once D102147 is landed. `AlignedDeallocate` may or may not be an aligned allocator.

> The ideally relationship for me is to merge D102145  into this one (Otherwise it is weird for me that D102145 only introduces some intrinsics which wouldn't be used actually). Then this patch should handle the alignment for variables in coroutine frame without introducing ::new(size_t, align_val_t). Then the final patch could do the job that searching and generating code for ::new(size_t, align_val_t).

I was worried about the size of the patch if this is merged with D102145 but if that is preferred by more than one reviewer, I'll go ahead and do that. D102145 is pretty self-contained in that it does not contain clients of the added intrinsics but the introduced test should cover the expected intrinsic lowering.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97915/new/

https://reviews.llvm.org/D97915