[PATCH] D68845: Don't emit unwanted constructor calls in co_return statements

Wed Oct 30 13:06:49 PDT 2019

aaronpuchert added a comment.

Not answering inline because comments are getting longer.

> I still think that it makes sense to do implicit-move on any control-flow construct that "exits" the current function. Right now that's `return` and `throw`, and they both do implicit-move (albeit inconsistently with plenty of implementation divergence documented in P1155 <http://wg21.link/p1155r2>). C++2a adds `co_return` as another way to "exit." I think it would be un-graceful and inconsistent to permit `return p` but require `co_return std::move(p)`.

Even with `return` it's not obvious where to draw the line: basically we still require that nothing non-trivial happens between the `ReturnStmt` and the `DeclRefExpr`, so for example `return f(x)` won't be turned into `return f(std::move(x))`, and `return a + b` won't be turned into `return std::move(a) + std::move(b)`. It's clear why this can't always happen: if an lvalue is referenced more than once, we might move it more than once. But one could argue that we should still move all local variables that only occur once. I work in a code base with a lot of code like

  SomeBuilder b;
  b.doStuff();
  return std::move(b).getResult();

calling a `getResult() &&` method at the end. It would be nice if I could omit the move. But I see why that's hard.

So basically we have the cases where implicit move is more or less necessary (because explicit moving would prevent NRVO), and we have the cases where it would be possible. And the difficulty is to find the right cutoff. If I understand this correctly, the current rules basically say that if we directly return a `DeclRefExpr`, possibly with implicit conversions, that will be implicitly moved, if possible. (Although it's a bit weird that we silently fall back in the case of ambiguity.)

With `co_return` the implicit move is never “necessary”, because NRVO can't happen, so we could just never do it. You're right that could be considered an inconsistency, but it wouldn't be hard to remember: "for `co_return`, always move."

> You've noticed that to codegen `t.return_value(x)` requires not just overload resolution, but actually figuring out whether `task::return_value` is a function at all —

Function templates definitely make sense to me, I also added test cases for that.

> it might be a static data member of function-pointer type, or something even wilder <https://quuxplusone.github.io/blog/2019/09/18/cppcon-whiteboard-puzzle/>.

It could also be an object of class type with an `operator()`. I'm actually not sure if the standard intentionally allows all these variants.

(Unrelated, but in that blog post you write “for backward compatibility, C++17 had to make `noexcept` function pointers implicitly convertible to non-`noexcept` function pointers.” I don't see this as a matter of backward compatibility: just like you can implicitly cast a reference/pointer to `const`, you should be able to cast `noexcept` away from pointers/references.)

> But eliminating implicit-move won't make that complexity go away, will it? Doing implicit-move just means doing the required overload resolution twice instead of once. That should be easy, compared to the rest of the mess.

Initially I thought so, and I wrote some very elegant (yet unfortunately wrong) code. I called `buildPromiseCall`, which returns an `ExprResult`, and if that was invalid I tried again without the xvalue conversion. Unfortunately, if that overload resolution fails, I get an error message //even if I discard the resulting expression//, and compilation fails. (I would also get warnings, if there are any.) So I looked into the call stack if I could disentangle the overload resolution from the diagnostics, but that wasn't so easy. That's why this is so much code.

At first I thought this was weird and there should be functionality for this in `Sema`, but I think there isn't. I believe that's because there is basically no precedent to “do overload resolution, and if that fails, silently do something else instead”. Normally, if overload resolution fails, that's just an immediate error, no need to wait for anything else. So basically I need to simulate `Sema::BuildCallExpr`, but without throwing errors. Fortunately many code paths aren't needed, so it's not terribly complicated, but I'm still missing at least two cases (function pointers and function objects).

For `return` this is much simpler, because we just do an initialization, and there are ways to “try” initialization without immediately throwing an error. For `co_return` I need to try (parameter) initialization **and** try to resolve the callee.

> That said, I would be happy to start a thread among you, me, David Stone, and the EWG mailing list to consider removing the words "or `co_return`" from class.copy.elision/3 <http://eel.is/c++draft/class.copy.elision#3.1>. Say the word.

I'm somewhat undecided. Things would perhaps also be simpler if we require somewhere that `return_value` is a member function or member function template, thus excluding the case of a function pointer or function object. I don't really see a use case for that, because the `promise_type` is already an object itself, so why create another object within that? Right now the standard reads as if we're doing textual replacement and I think that's what opens the door to all kind of weird stuff.

================
Comment at: clang/lib/Sema/SemaCoroutine.cpp:896
+      if (S.CanPerformCopyInitialization(Entity, &AsRvalue))
+        return true;
+    } else if (auto *FTD = dyn_cast<FunctionTemplateDecl>(D)) {
----------------
Quuxplusone wrote:
> aaronpuchert wrote:
> > Overlad resolution can actually still fail if there is a viable candidate, namely when there are multiple candidates and none is better than all others. It's a bit weird though to fall back to lvalue parameter then as if nothing happened.
> That is an interesting point! I had not considered it during [P1155](https://wg21.link/p1155r2). I imagine that it might make implementation of [P1155](https://wg21.link/p1155r2)'s new logic more difficult.
> 
> GCC 8 (but not trunk) implements the behavior I would expect to see for derived-to-base conversions: https://godbolt.org/z/fj_lNw
> 
> C++ always treats "an embarrassment of riches" equivalently to a "famine"; overload resolution //can// fail due to ambiguity just as easily as it can fail due to no candidates at all. I agree it's "a bit weird," but it would be vastly weirder if C++ did anything //different// from its usual behavior in this case.
> 
> I'm still amenable to the idea that `co_return` should simply not do the copy-elision or implicit-move optimizations at all. I wish I knew some use-cases for `co_return`, so that we could see if the optimization is useful to anyone.
> I agree it's "a bit weird," but it would be vastly weirder if C++ did anything different from its usual behavior in this case.

I would find it more natural to throw an error, i.e. not do the fallback, in the case of ambiguity. So the fallback should in my opinion only happen when there are no viable overload candidates, not when there are too many.

I see your construction as follows: we add both operations that take an lvalue and those that take an rvalue to a bigger “overload set”, and order those that take rvalues as higher/better than those that don't. One could say that we do overload resolution on a `T&&` argument where we allow binding a `T&&` to a `T&`, but this is less preferable in the overload ordering.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D68845/new/

https://reviews.llvm.org/D68845