[PATCH] D97915: [Coroutines] Handle overaligned frame allocation

Tue Jun 22 02:44:45 PDT 2021

ChuanqiXu added a comment.

In D97915#2832446 <https://reviews.llvm.org/D97915#2832446>, @ychen wrote:

> In D97915#2829581 <https://reviews.llvm.org/D97915#2829581>, @ChuanqiXu wrote:
>
>> A remained question.
>>
>> - what's the semantics if user specified their allocation/deallocation functions?
>>
>> Previously, we discussed for the ::operator new and ::operator delete. But what would we do for user specified allocation/deallocation functions?
>> It looks like we would treat them just like `::operator new`. And it makes sense at the first glance. But the problem comes from that we judge whether
>> or not to over allocate a frame by this condition:
>>
>>   coro.align > align of new
>>
>> But if the user uses their new, it makes no sense to use the `align of new` as the condition. On the other hand, if user specified their new function and the 
>> alignment requirement for their promise type, would it be better really that the compiler do the extra transformation?
>>
>> May be we need to discuss with other guys who are more familiar with the C++ standard to answer this.
>
> I think @rjmccall could answer these. My understanding is that user-specified allocation/deallocation has the same semantics as their standard library's counterparts. `align of new` (aka __STDCPP_DEFAULT_NEW_ALIGNMENT__) should apply to both.

Yeah, I just curious about the question and not sure about the answer yet. I agree with that it should be safe if we treat user-specified allocation/deallocation as ::operator new. Maybe I am a little bit of pedantry. I just not sure if the developer would be satisfied when they find we allocate padding space they didn't want when they offered a new/delete method. (Maybe I am too anxious).

---

Another problem I find in this patch is that the introduction for `raw frame` makes the frame confusing. For example, the following codes is aim to calculate the size for the over allocated  frame:

  %[[SIZE:.+]] = call i64 @llvm.coro.size.i64()
  %[[NEWSIZE:.+]] = add i64 %[[SIZE]], %[[PAD]]
  %[[MEM2:.+]] = call noalias nonnull i8* @_Znwm(i64 %[[NEWSIZE]])

It makes sense only if `llvm.coro.size` stands for the size of 'true frame' (I am not sure what's the meaning for raw frame now. Let me use 'true frame' temporarily). But the document now says that '@llvm.coro.size' returns the size of the frame. It's confusing
I am not require to fix it by rename 'llvm.coro.size' or rephrase the document simply. I am thinking about the the semantics of coroutine intrinsics after we introduce the concept of 'raw frame'.

================
Comment at: clang/lib/CodeGen/CGCoroutine.cpp:424
+// dynamically adjust the frame start address at runtime.
+void OverAllocateFrame(CodeGenFunction &CGF, llvm::CallInst *CI, bool IsAlloc) {
+  unsigned CoroSizeIdx = IsAlloc ? 0 : 1;
----------------
So if this would be called for deallocate. the function name is confusing. I think it may be better to rename it as something like 'GeSizeOFtOverAlignedFrame' (The name suggested looks not good too).
By the way, now I am wondering why wouldn't we use llvm.coro.size directly? And make the middle end to handle it. How do you think about it?

================
Comment at: clang/lib/CodeGen/CGCoroutine.cpp:719
+  auto *AlignedUpAddr =
+      EmitBuiltinAlignTo(AlignedAllocateCall, CoroAlign, S.getAllocate(), true);
+  // rawFrame = frame
----------------
How do your think about to replace EmitBuiltinAlignTo inplace?

================
Comment at: clang/lib/CodeGen/CGCoroutine.cpp:452-461
+  llvm::Function *RawFramePtrOffsetIntrin = CGF.CGM.getIntrinsic(
+      llvm::Intrinsic::coro_raw_frame_ptr_offset, CGF.Int32Ty);
+  auto *RawFramePtrOffset = CGF.Builder.CreateCall(RawFramePtrOffsetIntrin);
+  auto *FramePtrAddrStart =
+      CGF.Builder.CreateInBoundsGEP(CoroFree, {RawFramePtrOffset});
+  auto *FramePtrAddr = CGF.Builder.CreatePointerCast(
+      FramePtrAddrStart, CGF.Int8PtrTy->getPointerTo());
----------------
ychen wrote:
> ChuanqiXu wrote:
> > We allocate overaligned-frame like:
> > ```
> > | --- for align --- | --- true frame --- |
> > ```
> > And here we set the argument to the address of true frame. Then I wonder how about the memory for the `for align` part. Would we still free them correctly? Maybe I missed something.
> > Would we still free them correctly? 
> Yes, that's the tricky part. Using `f0` of `coro-alloc.cpp` as an example, `llvm.coro.raw.frame.ptr.addr` is called at alloc time to save the raw memory pointer to the coroutine frame.  Later at dealloc time, `llvm.coro.raw.frame.ptr.addr` is called again to load the raw memory pointer back and free it.
To make it clear, what's the definition for 'raw ptr'? From the context, I think it means the true frame in above diagram from the context.

So this confuses me:
> llvm.coro.raw.frame.ptr.addr is called again to load the raw memory pointer back and free it.

If the raw memory means the true frame, it may not be right. Since the part for 'for-align' wouldn't be freed.

================
Comment at: clang/lib/CodeGen/CGCoroutine.cpp:466-472
+  if (Dealloc->getNumArgOperands() > 1) {
+    // Size may only be the second argument of allocator call.
+    if (auto *CoroSize =
+            dyn_cast<llvm::IntrinsicInst>(Dealloc->getArgOperand(1)))
+      if (CoroSize->getIntrinsicID() == llvm::Intrinsic::coro_size)
+        OverAllocateFrame(CGF, Dealloc, /*IsAlloc*/ false);
+  }
----------------
ychen wrote:
> ChuanqiXu wrote:
> > We don't need to handle this condition in this patch.
> This handling is for `sized delete` (`void T::operator delete  ( void* ptr, std::size_t sz );`) instead of `aligned delete`. `sized delete` needs the same `size` that is used for `new`. Please check the `f3` test in `coro-alloc.cpp` (The test was missing the CHECK lines for this, I've added it.).
hmm, I understand it a bit more now.

================
Comment at: clang/test/CodeGenCoroutines/coro-alloc.cpp:106
+  // CHECK-NEXT: %[[ADDR2:.+]] = bitcast i8* %[[ADDR]] to i8**
+  // CHECK-NEXT: %[[MEM:.+]] = load i8*, i8** %[[ADDR2]], align 8
+  // CHECK-NEXT: call void @_ZdlPv(i8* %[[MEM]])
----------------
It defines variable 'MEM' again in conflicting with the line at 89. Does it matter?

================
Comment at: llvm/lib/Transforms/Coroutines/CoroFrame.cpp:1133-1137
   // Create an entry for every spilled value.
   for (auto &S : FrameData.Spills) {
     FieldIDType Id = B.addField(S.first->getType(), None);
     FrameData.setFieldIndex(S.first, Id);
   }
----------------
ychen wrote:
> ChuanqiXu wrote:
> > Why we move this snippet to the front? Although it is not defined, the layout for the current frame would be:
> > ```
> > | resume func addr | destroy func addr | promise | other things needed |
> > ```
> > Move this to the front may break this.
> The intent is to structure the code better, no intention to change the frame layout here. My understanding is that `promise` already has a fixed offset ahead of this. `FrameData::Allocas` is ordered but there is no defined semantics. There seem no tests failing due to reordered frame layout. However, I might be wrong. Could you describe how it changes the layout?
Sorry, I made a mistake. This move should be OK. My bad.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97915/new/

https://reviews.llvm.org/D97915