[PATCH] D99517: Implemented [[clang::musttail]] attribute for guaranteed tail calls.

Fri Apr 2 14:11:54 PDT 2021

rsmith added a comment.

Thanks, I think this is looking really good.

@rjmccall, no explicit need to review; I just wanted to make sure you'd seen this and had a chance to express any concerns before we go ahead.

================
Comment at: clang/include/clang/Basic/AttrDocs.td:452
+optimizations are disabled. This guarantees that the call will not cause
+unbounded stack growth if it is part of a recursive cycle in the call graph.
+
----------------
One thing I'd add:

> If the callee is a virtual function that is implemented by a thunk, there is no guarantee in general that the thunk tail-calls the implementation of the virtual function, so such a call in a recursive cycle can still result in unbounded stack growth.

================
Comment at: clang/lib/CodeGen/CGCall.cpp:5318-5319
+      EHCleanupScope *Cleanup = dyn_cast<EHCleanupScope>(&*it);
+      assert(Cleanup && Cleanup->getCleanup()->isCallLifetimeEnd() &&
+             "found unexpected cleanup generating musttail exit");
+    }
----------------
Given the potential for mismatch between the JumpDiagnostics checks and this one, especially as new more exotic kinds of cleanup are added, I wonder if we should use an `ErrorUnsupported` here instead of an `assert`.

I strongly suspect we can still reach the problematic case here for a tail call in a statement expression. I don't think it's feasible to check for all the ways that an arbitrary expression context can have pending cleanups, which we'd need in order to produce precise `Sema` diagnostics for that, so either we handle that here or we blanket reject all `musttail` returns in statement expressions. I think either approach is probably acceptable.

================
Comment at: clang/lib/CodeGen/CGCall.cpp:5315-5317
+    // TODO(haberman): insert checks/assertions to verify that this early exit
+    // is safe. We tried to verify this in Sema but we should double-check
+    // here.
----------------
haberman wrote:
> rsmith wrote:
> > haberman wrote:
> > > aaron.ballman wrote:
> > > > Are you planning to handle this TODO in the patch? If not, can you switch to a FIXME without a name associated with it?
> > > I am interested in feedback on the best way to proceed here.
> > > 
> > >   - Is my assessment correct that we should have an assertion that validates this?
> > >   - Is such an assertion reasonably feasible to implement?
> > >   - Is it ok to defer with FIXME, or should I try to fix it in this patch?
> > > 
> > > I've changed it to a FIXME for now.
> > Yes, I think we should validate this by an assertion if we can. We can check this by walking the cleanup scope stack (walk from `CurrentCleanupScopeDepth` to `EHScopeStack::stable_end()`) and making sure that there is no "problematic" enclosing cleanup scope. Here, "problematic" would mean any scope other than an `EHCleanupScope` containing only `CallLifetimeEnd` cleanups.
> > 
> > Looking at the kinds of cleanups that we might encounter here, I think there may be a few more things that Sema needs to check in order to not get in the way of exception handling. In particular, I think we should reject if the callee is potentially-throwing and the musttail call is inside a try block or a function that's either noexcept or has a dynamic exception specification.
> > 
> > Oh, also, we should disallow musttail calls inside statement expressions, in order to defend against cleanups that exist transiently within an expression.
> I'm having trouble implementing the check because there doesn't appear to be any discriminator in `EHScopeStack::Cleanup` that will let you test if it is a `CallLifetimeEnd`. (The actual code just does virtual dispatch through `EHScopeStack::Cleanup::Emit()`.
> 
> I temporarily implemented this by adding an extra virtual function to act as discriminator. The check fires if a VLA is in scope:
> 
> ```
> int Func14(int x) {
>   int vla[x];
>   [[clang::musttail]] return Bar(x);
> }
> ```
> 
> Do we need to forbid VLAs or do I need to refine the check?
> 
> It appears that `JumpDiagnostics.cpp` is already diagnosing statement expressions and `try`. However I could not get testing to work. I tried adding a test with `try`  but even with `-fexceptions` I am getting:
> 
> ```
> cannot use 'try' with exceptions disabled
> ```
> Do we need to forbid VLAs or do I need to refine the check?

Assuming that LLVM supports musttail calls from functions where a dynamic alloca is in scope, I think we should allow VLAs. The `musttail` documentation doesn't mention this, so I think its OK, and I can't think of a good reason why you wouldn't be able to `musttail` call due to a variably-sized frame.

Perhaps a good model would be to add a virtual function to permit asking a cleanup whether it's optional / skippable.

> I could not get testing to work.

You need `-fcxx-exceptions` to use `try`. At the `-cc1` level, we have essentially-orthogonal settings for "it's valid for exceptions to unwind through this code" (`-fexceptions`) and "C++ exception handling syntax is permitted" (`-fcxx-exceptions`), and you usually need to enable both for CodeGen tests involving exceptions.

================
Comment at: clang/lib/Sema/SemaStmt.cpp:583-586
+  if (!Ex) {
+    assert(hasUncompilableErrorOccurred() && "expected previous error");
+    return false;
+  }
----------------
I would have thought this assert would fire for `void f() { [[clang::musttail]] return; }`. If so, we should reject this case with a diagnostic.

================
Comment at: clang/lib/Sema/SemaStmt.cpp:588
+
+  const CallExpr *CE = dyn_cast<CallExpr>(Ex->IgnoreUnlessSpelledInSource());
+
----------------
`IgnoreUnlessSpelledInSource` is a syntactic check that's only really intended for tooling use cases; I think we want something a bit more semantic here, so `IgnoreImplicitAsWritten` would be more appropriate.

I think it would be reasonable to also skip "parentheses" here (which we treat as also including things like C's `_Generic`). Would `Ex->IgnoreImplicitAsWritten()->IgnoreParens()` work?

If we're going to skip elidable copy construction of the result here (which I think we should), should we also reflect that in the AST? Perhaps we should strip the return value down to being just the call expression? I'm thinking in particular of things like building in C++14 or before with `-fno-elide-constructors`, where code generation for a by-value return of a class object will synthesize a local temporary to hold the result, with a final destination copy emitted after the call. (Testcase: `struct A { A(const A&); }; A f(); A g() { [[clang::musttail]] return f(); }` with `-fno-elide-constructors`.)

================
Comment at: clang/lib/Sema/SemaStmt.cpp:596
+  if (!CE->getCalleeDecl()) {
+    assert(hasUncompilableErrorOccurred() && "expected previous error");
+    return false;
----------------
A call expression doesn't necessarily have a known callee declaration. I would expect this assert to fire on a case like:
```
void f() {
  void (*p)() = f;
  [[clang::musttail]] return p();
}
```
We should reject this with a diagnostic.

================
Comment at: clang/lib/Sema/SemaStmt.cpp:619-620
+    if (dyn_cast<CXXConstructorDecl>(CMD) || dyn_cast<CXXDestructorDecl>(CMD)) {
+      Diag(St->getBeginLoc(), diag::err_musttail_structors_forbidden) << &MTA;
+      Diag(CMD->getBeginLoc(), diag::note_musttail_structors_forbidden);
+      return false;
----------------
Please pass in a flag here so the diagnostic can `%select` and produce a more specific description of the problem.

================
Comment at: clang/lib/Sema/SemaStmt.cpp:631
+    // Caller is an Obj-C block decl: ^(void) { /* ... */ }
+    assert(dyn_cast<BlockDecl>(CurContext) && "unexpected decl context");
+    Diag(St->getBeginLoc(), diag::err_musttail_from_block_forbidden) << &MTA;
----------------
There are a couple of other contexts that can include a return statement: the caller could also be an `ObjCMethodDecl` (an Objective-C method) or a `CapturedDecl` (the body of a `#pragma omp` parallel region). I'd probably use a specific diagnostic ("cannot be used from a block" / "cannot be used from an Objective-C function") for the block and ObjCMethod case, and a nonsepcific-but-correct "cannot be used from this context" for anything else.

================
Comment at: clang/test/CodeGen/attr-musttail.cpp:72
+
+// CHECK: musttail call void @_Z11ReturnsVoidv()
+
----------------
haberman wrote:
> rsmith wrote:
> > For completeness, can we also get a `CHECK-NEXT: ret void` here too?
> I turned on LLVM verification via `opt` so I think this should get verified by the IR verifier. Is that sufficient?
We don't like Clang's tests depending on `opt` in general, but I think in this case it's an acceptable crutch until we fix Clang to run the verifier on its IR output again (as discussed offline, it looks like we lost that as part of the transition to the new pass manager). Please add a FIXME to remove the call to `opt` once that bug is fixed. Other than that, I'm fine with this approach.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99517/new/

https://reviews.llvm.org/D99517