[PATCH] D101980: [RFC] [Coroutines] Put the contents of byval argument to the frame

Mon May 10 23:24:33 PDT 2021

rjmccall added a comment.

In D101980#2743798 <https://reviews.llvm.org/D101980#2743798>, @ChuanqiXu wrote:

> In D101980#2743665 <https://reviews.llvm.org/D101980#2743665>, @rjmccall wrote:
>
>> How is the situation with `byval` different from the situation with indirect arguments?  It sounds like we generally need to perform this extra copy into the coroutine frame once it's allocated, and yes, that's something we need to emit in the frontend because it might be non-trivial.
>>
>> Or is the problem that LLVM might be helpfully eliminating the copy when the source is a `byval` parameter?  But that's also something it could be doing with indirect arguments, so again, treating `byval` specially seems suspect.
>
> Sorry. May I ask what does indirect argument mean? I tried to copy all byval parameters since it looks like if the frontend needs to pass parameters by value for structs, it would emit `byval` attributes.

An indirect argument is one that's implicitly passed by reference.  For example, most x86-64 ABIs will pass a struct like `struct { char x[1024]; };` as a pointer to a temporary instead of on the stack.  Normally, functions that take such arguments just bind the parameter variable directly to the pointer that's passed in; if that's a problem for coroutines, you're going to need to force a copy there as well.

Indirect arguments are also used (on pretty much every target except Windows) for non-trivially-copyable class types.  That gets tricky because unfortunately you can't just emit a correct move-construction during coroutine lowering; we'll have to do it in the frontend, if we aren't already.

That brings up a related problem: there are LLVM optimizations that assume things about parameters during a function.  For example, the pointers passed as indirect arguments are generally assumed not to alias anything, which means that we can generally eliminate redundant things like, say, copies out of them that are perfectly happy to just use the original memory.  That's not a problem for copies emitted during coroutine lowering, since we'll immediately split the function and make the coroutine non-redundant from LLVM's perspective, but I am worried that we might trigger these optimizations on copies we emit earlier, and that could mess things up.

For example, consider code like this:

  struct A {
    char buffer[32];
    A(const A &);
  };

  A::A(const A &other) {
    memcpy(buffer, other.buffer, sizeof(buffer));
  }

  my_coroutine foo(A a) {
    ...
  }

If we emit the copy of `a` in the frontend — which I think we have no choice but to do — it'll generate IR something like this:

  define ... @foo(%struct.A* %0) {
    %a = alloca %struct.A
    call void @llvm.memcpy(%a, %0, ...)
    ...
  }

If the LLVM optimizer sees this, it very well might just remove the memcpy and use %0 everywhere instead of %a, and then we'll miscompile the coroutine.  I'm not sure of the best way to fix this; maybe we just have to disable those optimizations on unlowered coroutines.

I'm asking these questions, not to insist that we fix them all at once, but to make sure that we're setting ourselves up to fix them all and not just doing something short-term that ends up needing to be completely replaced with a more comprehensive fix.  The central design questions to me are (1) do we do this during coro splitting or in the frontend and (2) to the extent that we don't do it solely during coro splitting, how do we protect against miscompilation?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101980/new/

https://reviews.llvm.org/D101980