[PATCH] D97915: [Coroutines] Handle overaligned frame allocation

Thu Mar 4 14:59:57 PST 2021

rjmccall added inline comments.

================
Comment at: clang/lib/CodeGen/CGBuiltin.cpp:4450
+    Value *Extra = Builder.CreateSelect(Cmp, Diff, Zero);
+    return RValue::get(Builder.CreateAdd(SizeCall, Extra));
   }
----------------
ychen wrote:
> ychen wrote:
> > rjmccall wrote:
> > > Okay, so you're implicitly increasing the coroutine size to allow you to round up to get an aligned frame.  But how do you round back down to get the actual pointer that you need to delete?  This just doesn't work.
> > > 
> > > You really ought to just be using the aligned `operator new` instead when the required alignment is too high.  If that means checking the alignment "dynamically" before calling `operator new` / `operator delete`, so be it.  In practice, it will not be dynamic because lowering will replace the `coro.align` call with a constant, at which point the branch will be foldable.
> > > 
> > > I don't know what to suggest if the aligned `operator new` isn't reliably available on the target OS.  You could outline a function to pick the best allocator/deallocator, I suppose.
> > Thanks for the review!
> > 
> > > Okay, so you're implicitly increasing the coroutine size to allow you to round up to get an aligned frame. But how do you round back down to get the actual pointer that you need to delete? This just doesn't work.
> > 
> > Hmm, you're right that I missed the `delete` part, thanks. The adjusted amount is constant, I could just dial it down in `coro.free`, right?
> > 
> > >  You really ought to just be using the aligned operator new instead when the required alignment is too high. If that means checking the alignment "dynamically" before calling operator new / operator delete, so be it. In practice, it will not be dynamic because lowering will replace the coro.align call with a constant, at which point the branch will be foldable.
> >  
> > That's my intuition at first. But spec is written in a way suggesting (IMHO) that the aligned version should not be used? What if the user specify their own allocator, then which one they should use?
> Sorry, I meant the adjusted amount is coro.align - std::max_align_t,  I could subtract it in `coro.free` . I think it should work?
No, because the adjustment you have to do in `coro.alloc` isn't just an addition, it's an addition plus a mask, which isn't reversible.  Suppose the frame needs to be 32-byte-aligned, but the allocator only promises 8-byte alignment.  The problem is that when you go to free a frame pointer, and you see that it's 32-byte-aligned (which, again, it always will be), the pointer you got from the allocator might be the frame pointer minus any of 8, 16, or 24 — or it might be exactly the same.  The only way to reverse that is to store some sort of cookie, either the amount to subtract or even just the original pointer.

Now, if you could change the entire coroutine ABI, you could make the frame handle that you pass around be the unadjusted pointer and then just repeat the adjustment every time you enter the coroutine.  But that doesn't work because the ABI relies on things like the promise being at a reliable offset from the frame handle.

I think the best solution would be to figure out a way to use an aligned allocator, which at worst does this in a more systematic way and at best can actually just satisfy your requirement directly without any overhead.  If you can't do that, adding an offset to the frame would be best; if you can't do that, doing it as a cookie is okay.

> That's my intuition at first. But spec is written in a way suggesting (IMHO) that the aligned version should not be used? What if the user specify their own allocator, then which one they should use?

It seems like a spec bug that this doesn't use aligned allocators even when they're available.  If there's an aligned allocator available, I think this should essentially do dynamically what it would normally do statically, i.e.:

```
void *allocation = alignment > __STDCPP_DEFAULT_NEW_ALIGNMENT__ ? operator new(size, align_val_t(alignment)) : operator new(size);
```

This would ODR-use both allocation functions, of course.

Maybe it's right to do this cookie thing if we can't rely on an aligned allocator, like if the promise class provides only an `operator new(size_t)`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97915/new/

https://reviews.llvm.org/D97915