[clang] [llvm] Implement llvm.coro.await.suspend intrinsic (PR #79712)

via cfe-commits cfe-commits at lists.llvm.org
Sat Jan 27 14:47:53 PST 2024


https://github.com/fpasserby created https://github.com/llvm/llvm-project/pull/79712

Implement `llvm.coro.await.suspend` intrinsics, to deal with performance regression after prohibiting `.await_suspend` inlining, as suggested in #64945.
Actually, there are three new intrinsics, which directly correspond to each of three forms of `await_suspend`:
```
void llvm.coro.await.suspend(ptr %awaiter, ptr %frame, ptr @helperFunction)
i1 llvm.coro.await.suspend.bool(ptr %awaiter, ptr %frame, ptr @helperFunction)
ptr llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @helperFunction)
```
There are three different versions instead of one, because in `bool` case it's result is used for resuming via a branch, and in `coroutine_handle` case exceptions from `await_suspend` are handled in the coroutine, and exceptions from the subsequent `.resume()` are propagated to the caller. 

Await-suspend block is simplified down to intrinsic calls only, for example for the symmetric transfer case:
```
%id = call token @llvm.coro.save(ptr null)
%handle = call ptr @llvm.coro.await.suspend.handle(ptr %awaiter, ptr %frame, ptr @helperFunction)
call void @llvm.coro.resume(%handle)
%result = call i8 @llvm.coro.suspend(token %id, i1 false)
switch i8 %result, ...
```
All the await-suspend logic is moved out into a helper function, generated for each suspension point.
The signature of the function is `<type> helperFunction(ptr %awaiter, ptr %frame)` where `<type>` is one of  `void` `i1` or `ptr`, depending on the return type of `await_suspend`. 
The intrinsic calls are lowered and helper functions bodies are inlined during `CoroSplit` pass, right after the split.

Because I'm new to LLVM, I'm not sure if the helper function generation, calls to them and lowering are implemented in the right way, especially with regard to various metadata and attributes, i. e. for TBAA. All things, that seemed questionable, are marked with `FIXME` comments.

There is another detail:  in case of symmetric transfer raw pointer to the frame of coroutine, that should be resumed, is returned from the helper function and a direct call to `@llvm.coro.resume` is generated. C++ standard demands, that `.resume()` method is evaluated. Not sure how important is this, because code has been generated in the same way before, sans helper function.


>From 858e207bc33466ed85ecf6a7e8718c2e6153f4fe Mon Sep 17 00:00:00 2001
From: fpasserby <125797601+fpasserby at users.noreply.github.com>
Date: Sun, 28 Jan 2024 00:02:15 +0100
Subject: [PATCH] Implement llvm.coro.await.suspend intrinsic

---
 clang/include/clang/AST/ExprCXX.h             |  28 ++-
 clang/lib/CodeGen/CGCoroutine.cpp             | 143 +++++++++++++--
 clang/lib/CodeGen/CodeGenFunction.h           |   4 +
 clang/lib/Sema/SemaCoroutine.cpp              | 160 +++++++++--------
 clang/test/AST/coroutine-locals-cleanup.cpp   |  10 +-
 .../CodeGenCoroutines/coro-always-inline.cpp  |   2 +-
 clang/test/CodeGenCoroutines/coro-await.cpp   |  79 ++++++--
 .../coro-awaiter-noinline-suspend.cpp         | 168 ------------------
 clang/test/CodeGenCoroutines/coro-dwarf.cpp   |  12 ++
 .../coro-function-try-block.cpp               |   2 +-
 .../coro-simplify-suspend-point.cpp           |  67 +++++++
 .../coro-symmetric-transfer-01.cpp            |   8 +-
 .../coro-symmetric-transfer-02.cpp            |  12 +-
 clang/test/CodeGenCoroutines/pr59181.cpp      |   9 +-
 clang/test/SemaCXX/co_await-ast.cpp           |   7 +-
 llvm/include/llvm/IR/Intrinsics.td            |  12 ++
 llvm/lib/IR/Verifier.cpp                      |   3 +
 llvm/lib/Transforms/Coroutines/CoroInstr.h    |  30 ++++
 llvm/lib/Transforms/Coroutines/CoroSplit.cpp  | 167 ++++++++++++++---
 llvm/lib/Transforms/Coroutines/Coroutines.cpp |   3 +
 20 files changed, 590 insertions(+), 336 deletions(-)
 delete mode 100644 clang/test/CodeGenCoroutines/coro-awaiter-noinline-suspend.cpp
 create mode 100644 clang/test/CodeGenCoroutines/coro-simplify-suspend-point.cpp

diff --git a/clang/include/clang/AST/ExprCXX.h b/clang/include/clang/AST/ExprCXX.h
index a0e467b35778c5c..57a505036def409 100644
--- a/clang/include/clang/AST/ExprCXX.h
+++ b/clang/include/clang/AST/ExprCXX.h
@@ -5035,15 +5035,19 @@ class CoroutineSuspendExpr : public Expr {
   enum SubExpr { Operand, Common, Ready, Suspend, Resume, Count };
 
   Stmt *SubExprs[SubExpr::Count];
+  bool IsSuspendNoThrow = false;
   OpaqueValueExpr *OpaqueValue = nullptr;
+  OpaqueValueExpr *OpaqueFramePtr = nullptr;
 
 public:
   CoroutineSuspendExpr(StmtClass SC, SourceLocation KeywordLoc, Expr *Operand,
                        Expr *Common, Expr *Ready, Expr *Suspend, Expr *Resume,
-                       OpaqueValueExpr *OpaqueValue)
+                       bool IsSuspendNoThrow, OpaqueValueExpr *OpaqueValue,
+                       OpaqueValueExpr *OpaqueFramePtr)
       : Expr(SC, Resume->getType(), Resume->getValueKind(),
              Resume->getObjectKind()),
-        KeywordLoc(KeywordLoc), OpaqueValue(OpaqueValue) {
+        KeywordLoc(KeywordLoc), IsSuspendNoThrow(IsSuspendNoThrow),
+        OpaqueValue(OpaqueValue), OpaqueFramePtr(OpaqueFramePtr) {
     SubExprs[SubExpr::Operand] = Operand;
     SubExprs[SubExpr::Common] = Common;
     SubExprs[SubExpr::Ready] = Ready;
@@ -5080,6 +5084,9 @@ class CoroutineSuspendExpr : public Expr {
   /// getOpaqueValue - Return the opaque value placeholder.
   OpaqueValueExpr *getOpaqueValue() const { return OpaqueValue; }
 
+  /// getOpaqueFramePtr - Return coroutine frame pointer placeholder.
+  OpaqueValueExpr *getOpaqueFramePtr() const { return OpaqueFramePtr; }
+
   Expr *getReadyExpr() const {
     return static_cast<Expr*>(SubExprs[SubExpr::Ready]);
   }
@@ -5097,6 +5104,8 @@ class CoroutineSuspendExpr : public Expr {
     return static_cast<Expr *>(SubExprs[SubExpr::Operand]);
   }
 
+  bool isSuspendNoThrow() const { return IsSuspendNoThrow; }
+
   SourceLocation getKeywordLoc() const { return KeywordLoc; }
 
   SourceLocation getBeginLoc() const LLVM_READONLY { return KeywordLoc; }
@@ -5125,10 +5134,12 @@ class CoawaitExpr : public CoroutineSuspendExpr {
 
 public:
   CoawaitExpr(SourceLocation CoawaitLoc, Expr *Operand, Expr *Common,
-              Expr *Ready, Expr *Suspend, Expr *Resume,
-              OpaqueValueExpr *OpaqueValue, bool IsImplicit = false)
+              Expr *Ready, Expr *Suspend, Expr *Resume, bool IsSuspendNoThrow,
+              OpaqueValueExpr *OpaqueValue, OpaqueValueExpr *OpaqueFramePtr,
+              bool IsImplicit = false)
       : CoroutineSuspendExpr(CoawaitExprClass, CoawaitLoc, Operand, Common,
-                             Ready, Suspend, Resume, OpaqueValue) {
+                             Ready, Suspend, Resume, IsSuspendNoThrow,
+                             OpaqueValue, OpaqueFramePtr) {
     CoawaitBits.IsImplicit = IsImplicit;
   }
 
@@ -5206,10 +5217,11 @@ class CoyieldExpr : public CoroutineSuspendExpr {
 
 public:
   CoyieldExpr(SourceLocation CoyieldLoc, Expr *Operand, Expr *Common,
-              Expr *Ready, Expr *Suspend, Expr *Resume,
-              OpaqueValueExpr *OpaqueValue)
+              Expr *Ready, Expr *Suspend, Expr *Resume, bool IsSuspendNoThrow,
+              OpaqueValueExpr *OpaqueValue, OpaqueValueExpr *OpaqueFramePtr)
       : CoroutineSuspendExpr(CoyieldExprClass, CoyieldLoc, Operand, Common,
-                             Ready, Suspend, Resume, OpaqueValue) {}
+                             Ready, Suspend, Resume, IsSuspendNoThrow,
+                             OpaqueValue, OpaqueFramePtr) {}
   CoyieldExpr(SourceLocation CoyieldLoc, QualType Ty, Expr *Operand,
               Expr *Common)
       : CoroutineSuspendExpr(CoyieldExprClass, CoyieldLoc, Ty, Operand,
diff --git a/clang/lib/CodeGen/CGCoroutine.cpp b/clang/lib/CodeGen/CGCoroutine.cpp
index 888d30bfb3e1d6a..1c768df41d43c09 100644
--- a/clang/lib/CodeGen/CGCoroutine.cpp
+++ b/clang/lib/CodeGen/CGCoroutine.cpp
@@ -212,9 +212,10 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
                                     bool ignoreResult, bool forLValue) {
   auto *E = S.getCommonExpr();
 
-  auto Binder =
+  auto CommonBinder =
       CodeGenFunction::OpaqueValueMappingData::bind(CGF, S.getOpaqueValue(), E);
-  auto UnbindOnExit = llvm::make_scope_exit([&] { Binder.unbind(CGF); });
+  auto UnbindCommonOnExit =
+      llvm::make_scope_exit([&] { CommonBinder.unbind(CGF); });
 
   auto Prefix = buildSuspendPrefixStr(Coro, Kind);
   BasicBlock *ReadyBlock = CGF.createBasicBlock(Prefix + Twine(".ready"));
@@ -232,16 +233,57 @@ static LValueOrRValue emitSuspendExpression(CodeGenFunction &CGF, CGCoroData &Co
   auto *NullPtr = llvm::ConstantPointerNull::get(CGF.CGM.Int8PtrTy);
   auto *SaveCall = Builder.CreateCall(CoroSave, {NullPtr});
 
-  CGF.CurCoro.InSuspendBlock = true;
-  auto *SuspendRet = CGF.EmitScalarExpr(S.getSuspendExpr());
-  CGF.CurCoro.InSuspendBlock = false;
+  auto SuspendHelper = CodeGenFunction(CGF.CGM).generateAwaitSuspendHelper(
+      CGF.CurFn->getName(), Prefix, S);
 
-  if (SuspendRet != nullptr && SuspendRet->getType()->isIntegerTy(1)) {
-    // Veto suspension if requested by bool returning await_suspend.
-    BasicBlock *RealSuspendBlock =
-        CGF.createBasicBlock(Prefix + Twine(".suspend.bool"));
-    CGF.Builder.CreateCondBr(SuspendRet, RealSuspendBlock, ReadyBlock);
-    CGF.EmitBlock(RealSuspendBlock);
+  llvm::CallBase *SuspendRet = nullptr;
+
+  {
+    CGF.CurCoro.InSuspendBlock = true;
+
+    auto FramePtrBinder = CodeGenFunction::OpaqueValueMappingData::bind(
+        CGF, S.getOpaqueFramePtr(), S.getOpaqueFramePtr()->getSourceExpr());
+    auto UnbindFramePtrOnExit =
+        llvm::make_scope_exit([&] { FramePtrBinder.unbind(CGF); });
+
+    SmallVector<llvm::Value *, 3> SuspendHelperCallArgs;
+    SuspendHelperCallArgs.push_back(
+        CGF.getOrCreateOpaqueLValueMapping(S.getOpaqueValue()).getPointer(CGF));
+    SuspendHelperCallArgs.push_back(
+        CGF.getOrCreateOpaqueRValueMapping(S.getOpaqueFramePtr())
+            .getScalarVal());
+    SuspendHelperCallArgs.push_back(SuspendHelper);
+
+    auto IID = llvm::Intrinsic::coro_await_suspend;
+    if (S.getSuspendExpr()->getType()->isBooleanType())
+      IID = llvm::Intrinsic::coro_await_suspend_bool;
+    else if (S.getSuspendExpr()->getType()->isVoidPointerType())
+      IID = llvm::Intrinsic::coro_await_suspend_handle;
+
+    llvm::Function *AwaitSuspendIntrinsic = CGF.CGM.getIntrinsic(IID);
+    // FIXME: add call attributes?
+    if (S.isSuspendNoThrow()) {
+      SuspendRet = CGF.EmitNounwindRuntimeCall(AwaitSuspendIntrinsic,
+                                               SuspendHelperCallArgs);
+    } else {
+      SuspendRet =
+          CGF.EmitCallOrInvoke(AwaitSuspendIntrinsic, SuspendHelperCallArgs);
+    }
+
+    CGF.CurCoro.InSuspendBlock = false;
+  }
+
+  if (SuspendRet != nullptr) {
+    if (SuspendRet->getType()->isIntegerTy(1)) {
+      // Veto suspension if requested by bool returning await_suspend.
+      BasicBlock *RealSuspendBlock =
+          CGF.createBasicBlock(Prefix + Twine(".suspend.bool"));
+      CGF.Builder.CreateCondBr(SuspendRet, RealSuspendBlock, ReadyBlock);
+      CGF.EmitBlock(RealSuspendBlock);
+    } else if (SuspendRet->getType()->isPointerTy()) {
+      auto ResumeIntrinsic = CGF.CGM.getIntrinsic(llvm::Intrinsic::coro_resume);
+      Builder.CreateCall(ResumeIntrinsic, SuspendRet);
+    }
   }
 
   // Emit the suspend point.
@@ -338,6 +380,85 @@ static QualType getCoroutineSuspendExprReturnType(const ASTContext &Ctx,
 }
 #endif
 
+llvm::Function *
+CodeGenFunction::generateAwaitSuspendHelper(Twine const &CoroName,
+                                            Twine const &SuspendPointName,
+                                            CoroutineSuspendExpr const &S) {
+  std::string FuncName = "__await_suspend_helper_";
+  FuncName += CoroName.str();
+  FuncName += SuspendPointName.str();
+
+  ASTContext &C = getContext();
+
+  FunctionArgList args;
+
+  ImplicitParamDecl AwaiterDecl(C, C.VoidPtrTy, ImplicitParamKind::Other);
+  ImplicitParamDecl FrameDecl(C, C.VoidPtrTy, ImplicitParamKind::Other);
+  QualType ReturnTy = S.getSuspendExpr()->getType();
+
+  if (ReturnTy->isBooleanType()) {
+    ReturnTy = C.BoolTy;
+  } else if (ReturnTy->isVoidPointerType()) {
+    ReturnTy = C.VoidPtrTy;
+  } else {
+    ReturnTy = C.VoidTy;
+  }
+
+  args.push_back(&AwaiterDecl);
+  args.push_back(&FrameDecl);
+
+  const CGFunctionInfo &FI =
+      CGM.getTypes().arrangeBuiltinFunctionDeclaration(ReturnTy, args);
+
+  llvm::FunctionType *LTy = CGM.getTypes().GetFunctionType(FI);
+
+  llvm::Function *Fn = llvm::Function::Create(
+      LTy, llvm::GlobalValue::PrivateLinkage, FuncName, &CGM.getModule());
+
+  Fn->addParamAttr(0, llvm::Attribute::AttrKind::NonNull);
+  Fn->addParamAttr(0, llvm::Attribute::AttrKind::NoUndef);
+
+  Fn->addParamAttr(1, llvm::Attribute::AttrKind::NoUndef);
+
+  Fn->setMustProgress();
+  Fn->addFnAttr(llvm::Attribute::AttrKind::AlwaysInline);
+
+  if (S.isSuspendNoThrow()) {
+    Fn->addFnAttr(llvm::Attribute::AttrKind::NoUnwind);
+  }
+
+  StartFunction(GlobalDecl(), ReturnTy, Fn, FI, args);
+
+  llvm::Value *AwaiterPtr = Builder.CreateLoad(GetAddrOfLocalVar(&AwaiterDecl));
+  auto AwaiterLValue =
+      MakeNaturalAlignAddrLValue(AwaiterPtr, AwaiterDecl.getType());
+
+  // FIXME: mark as aliasing with awaiter?
+  // FIXME: TBAA?
+  // FIXME: emit in a better way (maybe egenerate AST in SemaCoroutine)?
+  auto FramePtrRValue =
+      RValue::get(Builder.CreateLoad(GetAddrOfLocalVar(&FrameDecl)));
+
+  auto AwaiterBinder = CodeGenFunction::OpaqueValueMappingData::bind(
+      *this, S.getOpaqueValue(), AwaiterLValue);
+  auto FramePtrBinder = CodeGenFunction::OpaqueValueMappingData::bind(
+      *this, S.getOpaqueFramePtr(), FramePtrRValue);
+
+  auto *SuspendRet = EmitScalarExpr(S.getSuspendExpr());
+
+  auto UnbindCommonOnExit =
+      llvm::make_scope_exit([&] { AwaiterBinder.unbind(*this); });
+  auto UnbindFramePtrOnExit =
+      llvm::make_scope_exit([&] { FramePtrBinder.unbind(*this); });
+  if (SuspendRet != nullptr) {
+    Fn->addRetAttr(llvm::Attribute::AttrKind::NoUndef);
+    Builder.CreateStore(SuspendRet, ReturnValue);
+  }
+
+  FinishFunction();
+  return Fn;
+}
+
 LValue
 CodeGenFunction::EmitCoawaitLValue(const CoawaitExpr *E) {
   assert(getCoroutineSuspendExprReturnType(getContext(), E)->isReferenceType() &&
diff --git a/clang/lib/CodeGen/CodeGenFunction.h b/clang/lib/CodeGen/CodeGenFunction.h
index 143ad64e8816b12..811dbd9aed44e46 100644
--- a/clang/lib/CodeGen/CodeGenFunction.h
+++ b/clang/lib/CodeGen/CodeGenFunction.h
@@ -350,6 +350,10 @@ class CodeGenFunction : public CodeGenTypeCache {
     return isCoroutine() && CurCoro.InSuspendBlock;
   }
 
+  llvm::Function *generateAwaitSuspendHelper(Twine const &CoroName,
+                                             Twine const &SuspendPointName,
+                                             CoroutineSuspendExpr const &S);
+
   /// CurGD - The GlobalDecl for the current function being compiled.
   GlobalDecl CurGD;
 
diff --git a/clang/lib/Sema/SemaCoroutine.cpp b/clang/lib/Sema/SemaCoroutine.cpp
index a969b9383563b22..a46744205d0f298 100644
--- a/clang/lib/Sema/SemaCoroutine.cpp
+++ b/clang/lib/Sema/SemaCoroutine.cpp
@@ -31,6 +31,8 @@
 using namespace clang;
 using namespace sema;
 
+static bool isNoThrow(Sema &S, const Stmt *E);
+
 static LookupResult lookupMember(Sema &S, const char *Name, CXXRecordDecl *RD,
                                  SourceLocation Loc, bool &Res) {
   DeclarationName DN = S.PP.getIdentifierInfo(Name);
@@ -266,7 +268,7 @@ static ExprResult buildOperatorCoawaitCall(Sema &SemaRef, Scope *S,
 }
 
 static ExprResult buildCoroutineHandle(Sema &S, QualType PromiseType,
-                                       SourceLocation Loc) {
+                                       Expr *FramePtr, SourceLocation Loc) {
   QualType CoroHandleType = lookupCoroutineHandleType(S, PromiseType, Loc);
   if (CoroHandleType.isNull())
     return ExprError();
@@ -280,9 +282,6 @@ static ExprResult buildCoroutineHandle(Sema &S, QualType PromiseType,
     return ExprError();
   }
 
-  Expr *FramePtr =
-      S.BuildBuiltinCallExpr(Loc, Builtin::BI__builtin_coro_frame, {});
-
   CXXScopeSpec SS;
   ExprResult FromAddr =
       S.BuildDeclarationNameExpr(SS, Found, /*NeedsADL=*/false);
@@ -296,6 +295,8 @@ struct ReadySuspendResumeResult {
   enum AwaitCallType { ACT_Ready, ACT_Suspend, ACT_Resume };
   Expr *Results[3];
   OpaqueValueExpr *OpaqueValue;
+  OpaqueValueExpr *OpaqueFramePtr;
+  bool IsSuspendNoThrow;
   bool IsInvalid;
 };
 
@@ -380,66 +381,7 @@ static Expr *maybeTailCall(Sema &S, QualType RetType, Expr *E,
   // __builtin_coro_resume so that the cleanup code are not inserted in-between
   // the resume call and return instruction, which would interfere with the
   // musttail call contract.
-  JustAddress = S.MaybeCreateExprWithCleanups(JustAddress);
-  return S.BuildBuiltinCallExpr(Loc, Builtin::BI__builtin_coro_resume,
-                                JustAddress);
-}
-
-/// The await_suspend call performed by co_await is essentially asynchronous
-/// to the execution of the coroutine. Inlining it normally into an unsplit
-/// coroutine can cause miscompilation because the coroutine CFG misrepresents
-/// the true control flow of the program: things that happen in the
-/// await_suspend are not guaranteed to happen prior to the resumption of the
-/// coroutine, and things that happen after the resumption of the coroutine
-/// (including its exit and the potential deallocation of the coroutine frame)
-/// are not guaranteed to happen only after the end of await_suspend.
-///
-/// See https://github.com/llvm/llvm-project/issues/56301 and
-/// https://reviews.llvm.org/D157070 for the example and the full discussion.
-///
-/// The short-term solution to this problem is to mark the call as uninlinable.
-/// But we don't want to do this if the call is known to be trivial, which is
-/// very common.
-///
-/// The long-term solution may introduce patterns like:
-///
-///  call @llvm.coro.await_suspend(ptr %awaiter, ptr %handle,
-///                                ptr @awaitSuspendFn)
-///
-/// Then it is much easier to perform the safety analysis in the middle end.
-/// If it is safe to inline the call to awaitSuspend, we can replace it in the
-/// CoroEarly pass. Otherwise we could replace it in the CoroSplit pass.
-static void tryMarkAwaitSuspendNoInline(Sema &S, OpaqueValueExpr *Awaiter,
-                                        CallExpr *AwaitSuspend) {
-  // The method here to extract the awaiter decl is not precise.
-  // This is intentional. Since it is hard to perform the analysis in the
-  // frontend due to the complexity of C++'s type systems.
-  // And we prefer to perform such analysis in the middle end since it is
-  // easier to implement and more powerful.
-  CXXRecordDecl *AwaiterDecl =
-      Awaiter->getType().getNonReferenceType()->getAsCXXRecordDecl();
-
-  if (AwaiterDecl && AwaiterDecl->field_empty())
-    return;
-
-  FunctionDecl *FD = AwaitSuspend->getDirectCallee();
-
-  assert(FD);
-
-  // If the `await_suspend()` function is marked as `always_inline` explicitly,
-  // we should give the user the right to control the codegen.
-  if (FD->hasAttr<NoInlineAttr>() || FD->hasAttr<AlwaysInlineAttr>())
-    return;
-
-  // This is problematic if the user calls the await_suspend standalone. But on
-  // the on hand, it is not incorrect semantically since inlining is not part
-  // of the standard. On the other hand, it is relatively rare to call
-  // the await_suspend function standalone.
-  //
-  // And given we've already had the long-term plan, the current workaround
-  // looks relatively tolerant.
-  FD->addAttr(
-      NoInlineAttr::CreateImplicit(S.getASTContext(), FD->getLocation()));
+  return S.MaybeCreateExprWithCleanups(JustAddress);
 }
 
 /// Build calls to await_ready, await_suspend, and await_resume for a co_await
@@ -461,7 +403,11 @@ static ReadySuspendResumeResult buildCoawaitCalls(Sema &S, VarDecl *CoroPromise,
 
   // Assume valid until we see otherwise.
   // Further operations are responsible for setting IsInalid to true.
-  ReadySuspendResumeResult Calls = {{}, Operand, /*IsInvalid=*/false};
+  ReadySuspendResumeResult Calls = {{},
+                                    Operand,
+                                    /*OpaqueFramePtr=*/nullptr,
+                                    /*IsSuspendNoThrow=*/false,
+                                    /*IsInvalid=*/false};
 
   using ACT = ReadySuspendResumeResult::AwaitCallType;
 
@@ -495,8 +441,17 @@ static ReadySuspendResumeResult buildCoawaitCalls(Sema &S, VarDecl *CoroPromise,
       Calls.Results[ACT::ACT_Ready] = S.MaybeCreateExprWithCleanups(Conv.get());
   }
 
+  Expr *GetFramePtr =
+      S.BuildBuiltinCallExpr(Loc, Builtin::BI__builtin_coro_frame, {});
+
+  OpaqueValueExpr *FramePtr = new (S.Context)
+      OpaqueValueExpr(Loc, GetFramePtr->getType(), VK_PRValue,
+                      GetFramePtr->getObjectKind(), GetFramePtr);
+
+  Calls.OpaqueFramePtr = FramePtr;
+
   ExprResult CoroHandleRes =
-      buildCoroutineHandle(S, CoroPromise->getType(), Loc);
+      buildCoroutineHandle(S, CoroPromise->getType(), FramePtr, Loc);
   if (CoroHandleRes.isInvalid()) {
     Calls.IsInvalid = true;
     return Calls;
@@ -513,10 +468,6 @@ static ReadySuspendResumeResult buildCoawaitCalls(Sema &S, VarDecl *CoroPromise,
     //     type Z.
     QualType RetType = AwaitSuspend->getCallReturnType(S.Context);
 
-    // We need to mark await_suspend as noinline temporarily. See the comment
-    // of tryMarkAwaitSuspendNoInline for details.
-    tryMarkAwaitSuspendNoInline(S, Operand, AwaitSuspend);
-
     // Support for coroutine_handle returning await_suspend.
     if (Expr *TailCallSuspend =
             maybeTailCall(S, RetType, AwaitSuspend, Loc))
@@ -542,6 +493,10 @@ static ReadySuspendResumeResult buildCoawaitCalls(Sema &S, VarDecl *CoroPromise,
     }
   }
 
+  if (Calls.Results[ACT::ACT_Suspend]) {
+    Calls.IsSuspendNoThrow = isNoThrow(S, Calls.Results[ACT::ACT_Suspend]);
+  }
+
   BuildSubExpr(ACT::ACT_Resume, "await_resume", std::nullopt);
 
   // Make sure the awaiter object gets a chance to be cleaned up.
@@ -694,6 +649,59 @@ static FunctionScopeInfo *checkCoroutineContext(Sema &S, SourceLocation Loc,
   return ScopeInfo;
 }
 
+/// Recursively check \p E and all its children to see if any call target
+/// (including constructor call) is declared noexcept. Also any value returned
+/// from the call has a noexcept destructor.
+static bool isNoThrow(Sema &S, const Stmt *E) {
+  auto isDeclNoexcept = [&](const Decl *D, bool IsDtor = false) -> bool {
+    // In the case of dtor, the call to dtor is implicit and hence we should
+    // pass nullptr to canCalleeThrow.
+    if (Sema::canCalleeThrow(S, IsDtor ? nullptr : cast<Expr>(E), D)) {
+      return false;
+    }
+    return true;
+  };
+
+  if (auto *CE = dyn_cast<CXXConstructExpr>(E)) {
+    CXXConstructorDecl *Ctor = CE->getConstructor();
+    if (!isDeclNoexcept(Ctor)) {
+      return false;
+    }
+    // Check the corresponding destructor of the constructor.
+    if (!isDeclNoexcept(Ctor->getParent()->getDestructor(), /*IsDtor=*/true)) {
+      return false;
+    }
+  } else if (auto *CE = dyn_cast<CallExpr>(E)) {
+    if (CE->isTypeDependent())
+      return false;
+
+    if (!isDeclNoexcept(CE->getCalleeDecl())) {
+      return false;
+    }
+
+    QualType ReturnType = CE->getCallReturnType(S.getASTContext());
+    // Check the destructor of the call return type, if any.
+    if (ReturnType.isDestructedType() ==
+        QualType::DestructionKind::DK_cxx_destructor) {
+      const auto *T =
+          cast<RecordType>(ReturnType.getCanonicalType().getTypePtr());
+      if (!isDeclNoexcept(cast<CXXRecordDecl>(T->getDecl())->getDestructor(),
+                          /*IsDtor=*/true)) {
+        return false;
+      }
+    }
+  }
+  for (const auto *Child : E->children()) {
+    if (!Child)
+      continue;
+    if (!isNoThrow(S, Child)) {
+      return false;
+    }
+  }
+
+  return true;
+}
+
 /// Recursively check \p E and all its children to see if any call target
 /// (including constructor call) is declared noexcept. Also any value returned
 /// from the call has a noexcept destructor.
@@ -992,9 +1000,9 @@ ExprResult Sema::BuildResolvedCoawaitExpr(SourceLocation Loc, Expr *Operand,
   if (RSS.IsInvalid)
     return ExprError();
 
-  Expr *Res = new (Context)
-      CoawaitExpr(Loc, Operand, Awaiter, RSS.Results[0], RSS.Results[1],
-                  RSS.Results[2], RSS.OpaqueValue, IsImplicit);
+  Expr *Res = new (Context) CoawaitExpr(
+      Loc, Operand, Awaiter, RSS.Results[0], RSS.Results[1], RSS.Results[2],
+      RSS.IsSuspendNoThrow, RSS.OpaqueValue, RSS.OpaqueFramePtr, IsImplicit);
 
   return Res;
 }
@@ -1050,9 +1058,9 @@ ExprResult Sema::BuildCoyieldExpr(SourceLocation Loc, Expr *E) {
   if (RSS.IsInvalid)
     return ExprError();
 
-  Expr *Res =
-      new (Context) CoyieldExpr(Loc, Operand, E, RSS.Results[0], RSS.Results[1],
-                                RSS.Results[2], RSS.OpaqueValue);
+  Expr *Res = new (Context) CoyieldExpr(
+      Loc, Operand, E, RSS.Results[0], RSS.Results[1], RSS.Results[2],
+      RSS.IsSuspendNoThrow, RSS.OpaqueValue, RSS.OpaqueFramePtr);
 
   return Res;
 }
diff --git a/clang/test/AST/coroutine-locals-cleanup.cpp b/clang/test/AST/coroutine-locals-cleanup.cpp
index ce106b8e230a10e..6264df01fa2acb2 100644
--- a/clang/test/AST/coroutine-locals-cleanup.cpp
+++ b/clang/test/AST/coroutine-locals-cleanup.cpp
@@ -90,10 +90,7 @@ Task bar() {
 // CHECK:                 ExprWithCleanups {{.*}} 'bool'
 // CHECK-NEXT:              CXXMemberCallExpr {{.*}} 'bool'
 // CHECK-NEXT:                MemberExpr {{.*}} .await_ready
-// CHECK:                 CallExpr {{.*}} 'void'
-// CHECK-NEXT:              ImplicitCastExpr {{.*}} 'void (*)(void *)'
-// CHECK-NEXT:                DeclRefExpr {{.*}} '__builtin_coro_resume' 'void (void *)'
-// CHECK-NEXT:              ExprWithCleanups {{.*}} 'void *'
+// CHECK:                 ExprWithCleanups {{.*}} 'void *'
 
 // CHECK:           CaseStmt
 // CHECK:             ExprWithCleanups {{.*}} 'void'
@@ -103,7 +100,4 @@ Task bar() {
 // CHECK:                 ExprWithCleanups {{.*}} 'bool'
 // CHECK-NEXT:              CXXMemberCallExpr {{.*}} 'bool'
 // CHECK-NEXT:                MemberExpr {{.*}} .await_ready
-// CHECK:                 CallExpr {{.*}} 'void'
-// CHECK-NEXT:              ImplicitCastExpr {{.*}} 'void (*)(void *)'
-// CHECK-NEXT:                DeclRefExpr {{.*}} '__builtin_coro_resume' 'void (void *)'
-// CHECK-NEXT:              ExprWithCleanups {{.*}} 'void *'
+// CHECK:                 ExprWithCleanups {{.*}} 'void *'
diff --git a/clang/test/CodeGenCoroutines/coro-always-inline.cpp b/clang/test/CodeGenCoroutines/coro-always-inline.cpp
index 6e13a62fbd98659..d4f67a73f517268 100644
--- a/clang/test/CodeGenCoroutines/coro-always-inline.cpp
+++ b/clang/test/CodeGenCoroutines/coro-always-inline.cpp
@@ -34,7 +34,7 @@ struct coroutine_traits {
 // CHECK-LABEL: @_Z3foov
 // CHECK-LABEL: entry:
 // CHECK: %ref.tmp.reload.addr = getelementptr
-// CHECK: %ref.tmp4.reload.addr = getelementptr
+// CHECK: %ref.tmp3.reload.addr = getelementptr
 void foo() { co_return; }
 
 // Check that bar is not inlined even it's marked as always_inline.
diff --git a/clang/test/CodeGenCoroutines/coro-await.cpp b/clang/test/CodeGenCoroutines/coro-await.cpp
index dc5a765ccb83d78..c19a8428b153282 100644
--- a/clang/test/CodeGenCoroutines/coro-await.cpp
+++ b/clang/test/CodeGenCoroutines/coro-await.cpp
@@ -71,16 +71,13 @@ extern "C" void f0() {
   // CHECK: [[SUSPEND_BB]]:
   // CHECK: %[[SUSPEND_ID:.+]] = call token @llvm.coro.save(
   // ---------------------------
-  // Build the coroutine handle and pass it to await_suspend
+  // Call coro.await.suspend
   // ---------------------------
-  // CHECK: call ptr @_ZNSt16coroutine_handleINSt16coroutine_traitsIJvEE12promise_typeEE12from_addressEPv(ptr %[[FRAME]])
-  //   ... many lines of code to coerce coroutine_handle into an ptr scalar
-  // CHECK: %[[CH:.+]] = load ptr, ptr %{{.+}}
-  // CHECK: call void @_ZN14suspend_always13await_suspendESt16coroutine_handleIvE(ptr {{[^,]*}} %[[AWAITABLE]], ptr %[[CH]])
+  // CHECK-NEXT: call void @llvm.coro.await.suspend(ptr %[[AWAITABLE]], ptr %[[FRAME]], ptr @__await_suspend_helper_f0await)
   // -------------------------
   // Generate a suspend point:
   // -------------------------
-  // CHECK: %[[OUTCOME:.+]] = call i8 @llvm.coro.suspend(token %[[SUSPEND_ID]], i1 false)
+  // CHECK-NEXT: %[[OUTCOME:.+]] = call i8 @llvm.coro.suspend(token %[[SUSPEND_ID]], i1 false)
   // CHECK: switch i8 %[[OUTCOME]], label %[[RET_BB:.+]] [
   // CHECK:   i8 0, label %[[READY_BB]]
   // CHECK:   i8 1, label %[[CLEANUP_BB:.+]]
@@ -101,6 +98,17 @@ extern "C" void f0() {
   // CHECK-NEXT: call zeroext i1 @_ZN10final_susp11await_readyEv(ptr
   // CHECK: %[[FINALSP_ID:.+]] = call token @llvm.coro.save(
   // CHECK: call i8 @llvm.coro.suspend(token %[[FINALSP_ID]], i1 true)
+
+  // await suspend helper
+  // CHECK: define{{.*}} @__await_suspend_helper_f0await(ptr {{[^,]*}} %[[AWAITABLE_ARG:.+]], ptr {{[^,]*}} %[[FRAME_ARG:.+]])
+  // CHECK: store ptr %[[AWAITABLE_ARG]], ptr %[[AWAITABLE_TMP:.+]],
+  // CHECK: store ptr %[[FRAME_ARG]], ptr %[[FRAME_TMP:.+]],
+  // CHECK: %[[AWAITABLE:.+]] = load ptr, ptr %[[AWAITABLE_TMP]]
+  // CHECK: %[[FRAME:.+]] = load ptr, ptr %[[FRAME_TMP]]
+  // CHECK: call ptr @_ZNSt16coroutine_handleINSt16coroutine_traitsIJvEE12promise_typeEE12from_addressEPv(ptr %[[FRAME]])
+  //   ... many lines of code to coerce coroutine_handle into an ptr scalar
+  // CHECK: %[[CH:.+]] = load ptr, ptr %{{.+}}
+  // CHECK: call void @_ZN14suspend_always13await_suspendESt16coroutine_handleIvE(ptr {{[^,]*}} %[[AWAITABLE]], ptr %[[CH]])
 }
 
 struct suspend_maybe {
@@ -131,7 +139,7 @@ extern "C" void f1(int) {
 
   // See if we need to suspend:
   // --------------------------
-  // CHECK: %[[READY:.+]] = call zeroext i1 @_ZN13suspend_maybe11await_readyEv(ptr {{[^,]*}} %[[AWAITABLE]])
+  // CHECK: %[[READY:.+]] = call zeroext i1 @_ZN13suspend_maybe11await_readyEv(ptr {{[^,]*}} %[[AWAITABLE:.+]])
   // CHECK: br i1 %[[READY]], label %[[READY_BB:.+]], label %[[SUSPEND_BB:.+]]
 
   // If we are suspending:
@@ -139,12 +147,9 @@ extern "C" void f1(int) {
   // CHECK: [[SUSPEND_BB]]:
   // CHECK: %[[SUSPEND_ID:.+]] = call token @llvm.coro.save(
   // ---------------------------
-  // Build the coroutine handle and pass it to await_suspend
+  // Call coro.await.suspend
   // ---------------------------
-  // CHECK: call ptr @_ZNSt16coroutine_handleINSt16coroutine_traitsIJviEE12promise_typeEE12from_addressEPv(ptr %[[FRAME]])
-  //   ... many lines of code to coerce coroutine_handle into an ptr scalar
-  // CHECK: %[[CH:.+]] = load ptr, ptr %{{.+}}
-  // CHECK: %[[YES:.+]] = call zeroext i1 @_ZN13suspend_maybe13await_suspendESt16coroutine_handleIvE(ptr {{[^,]*}} %[[AWAITABLE]], ptr %[[CH]])
+  // CHECK-NEXT: %[[YES:.+]] = call i1 @llvm.coro.await.suspend.bool(ptr %[[AWAITABLE]], ptr %[[FRAME]], ptr @__await_suspend_helper_f1yield)
   // -------------------------------------------
   // See if await_suspend decided not to suspend
   // -------------------------------------------
@@ -155,6 +160,18 @@ extern "C" void f1(int) {
 
   // CHECK: [[READY_BB]]:
   // CHECK:     call void @_ZN13suspend_maybe12await_resumeEv(ptr {{[^,]*}} %[[AWAITABLE]])
+
+  // Await suspend helper
+  // CHECK: define {{.*}} i1 @__await_suspend_helper_f1yield(ptr {{[^,]*}} %[[AWAITABLE_ARG:.+]], ptr {{[^,]*}} %[[FRAME_ARG:.+]])
+  // CHECK: store ptr %[[AWAITABLE_ARG]], ptr %[[AWAITABLE_TMP:.+]],
+  // CHECK: store ptr %[[FRAME_ARG]], ptr %[[FRAME_TMP:.+]],
+  // CHECK: %[[AWAITABLE:.+]] = load ptr, ptr %[[AWAITABLE_TMP]]
+  // CHECK: %[[FRAME:.+]] = load ptr, ptr %[[FRAME_TMP]]
+  // CHECK: call ptr @_ZNSt16coroutine_handleINSt16coroutine_traitsIJviEE12promise_typeEE12from_addressEPv(ptr %[[FRAME]])
+  //   ... many lines of code to coerce coroutine_handle into an ptr scalar
+  // CHECK: %[[CH:.+]] = load ptr, ptr %{{.+}}
+  // CHECK: %[[YES:.+]] = call zeroext i1 @_ZN13suspend_maybe13await_suspendESt16coroutine_handleIvE(ptr {{[^,]*}} %[[AWAITABLE]], ptr %[[CH]]) 
+  // CHECK-NEXT: ret i1 %[[YES]]
 }
 
 struct ComplexAwaiter {
@@ -340,11 +357,39 @@ struct TailCallAwait {
 
 // CHECK-LABEL: @TestTailcall(
 extern "C" void TestTailcall() {
+  // CHECK: %[[PROMISE:.+]] = alloca %"struct.std::coroutine_traits<void>::promise_type"
+  // CHECK: %[[FRAME:.+]] = call ptr @llvm.coro.begin(
   co_await TailCallAwait{};
+  // CHECK: %[[READY:.+]] = call zeroext i1 @_ZN13TailCallAwait11await_readyEv(ptr {{[^,]*}} %[[AWAITABLE:.+]])
+  // CHECK: br i1 %[[READY]], label %[[READY_BB:.+]], label %[[SUSPEND_BB:.+]]
 
-  // CHECK: %[[RESULT:.+]] = call ptr @_ZN13TailCallAwait13await_suspendESt16coroutine_handleIvE(ptr
-  // CHECK: %[[COERCE:.+]] = getelementptr inbounds %"struct.std::coroutine_handle", ptr %[[TMP:.+]], i32 0, i32 0
-  // CHECK: store ptr %[[RESULT]], ptr %[[COERCE]]
-  // CHECK: %[[ADDR:.+]] = call ptr @_ZNSt16coroutine_handleIvE7addressEv(ptr {{[^,]*}} %[[TMP]])
-  // CHECK: call void @llvm.coro.resume(ptr %[[ADDR]])
+  // If we are suspending:
+  // ---------------------
+  // CHECK: [[SUSPEND_BB]]:
+  // CHECK: %[[SUSPEND_ID:.+]] = call token @llvm.coro.save(
+  // ---------------------------
+  // Call coro.await.suspend
+  // ---------------------------
+  // CHECK-NEXT: %[[RESUMED:.+]] = call ptr @llvm.coro.await.suspend.handle(ptr %[[AWAITABLE]], ptr %[[FRAME]], ptr @__await_suspend_helper_TestTailcallawait)
+  // CHECK-NEXT: call void @llvm.coro.resume(ptr %[[RESUMED]])
+  // CHECK-NEXT: %[[OUTCOME:.+]] = call i8 @llvm.coro.suspend(token %[[SUSPEND_ID]], i1 false)
+  // CHECK-NEXT: switch i8 %[[OUTCOME]], label %[[RET_BB:.+]] [
+  // CHECK-NEXT:   i8 0, label %[[READY_BB]]
+  // CHECK-NEXT:   i8 1, label %{{.+}}
+  // CHECK-NEXT: ]
+
+  // Await suspend helper
+  // CHECK: define {{.*}} ptr @__await_suspend_helper_TestTailcallawait(ptr {{[^,]*}} %[[AWAITABLE_ARG:.+]], ptr {{[^,]*}} %[[FRAME_ARG:.+]])
+  // CHECK: store ptr %[[AWAITABLE_ARG]], ptr %[[AWAITABLE_TMP:.+]],
+  // CHECK: store ptr %[[FRAME_ARG]], ptr %[[FRAME_TMP:.+]],
+  // CHECK: %[[AWAITABLE:.+]] = load ptr, ptr %[[AWAITABLE_TMP]]
+  // CHECK: %[[FRAME:.+]] = load ptr, ptr %[[FRAME_TMP]]
+  // CHECK: call ptr  @_ZNSt16coroutine_handleINSt16coroutine_traitsIJvEE12promise_typeEE12from_addressEPv(ptr %[[FRAME]])
+  //   ... many lines of code to coerce coroutine_handle into an ptr scalar
+  // CHECK: %[[CH:.+]] = load ptr, ptr %{{.+}}
+  // CHECK-NEXT: %[[RESULT:.+]] = call ptr @_ZN13TailCallAwait13await_suspendESt16coroutine_handleIvE(ptr {{[^,]*}} %[[AWAITABLE]], ptr %[[CH]]) 
+  // CHECK-NEXT: %[[COERCE:.+]] = getelementptr inbounds %"struct.std::coroutine_handle", ptr %[[TMP:.+]], i32 0, i32 0
+  // CHECK-NEXT: store ptr %[[RESULT]], ptr %[[COERCE]]
+  // CHECK-NEXT: %[[ADDR:.+]] = call ptr @_ZNSt16coroutine_handleIvE7addressEv(ptr {{[^,]*}} %[[TMP]])
+  // CHECK-NEXT: ret ptr %[[ADDR]]
 }
diff --git a/clang/test/CodeGenCoroutines/coro-awaiter-noinline-suspend.cpp b/clang/test/CodeGenCoroutines/coro-awaiter-noinline-suspend.cpp
deleted file mode 100644
index f95286faf46ec8d..000000000000000
--- a/clang/test/CodeGenCoroutines/coro-awaiter-noinline-suspend.cpp
+++ /dev/null
@@ -1,168 +0,0 @@
-// Tests that we can mark await-suspend as noinline correctly.
-//
-// RUN: %clang_cc1 -std=c++20 -triple x86_64-unknown-linux-gnu -emit-llvm -o - %s \
-// RUN:     -O1 -disable-llvm-passes | FileCheck %s
-
-#include "Inputs/coroutine.h"
-
-struct Task {
-  struct promise_type {
-    struct FinalAwaiter {
-      bool await_ready() const noexcept { return false; }
-      template <typename PromiseType>
-      std::coroutine_handle<> await_suspend(std::coroutine_handle<PromiseType> h) noexcept {
-        return h.promise().continuation;
-      }
-      void await_resume() noexcept {}
-    };
-
-    Task get_return_object() noexcept {
-      return std::coroutine_handle<promise_type>::from_promise(*this);
-    }
-
-    std::suspend_always initial_suspend() noexcept { return {}; }
-    FinalAwaiter final_suspend() noexcept { return {}; }
-    void unhandled_exception() noexcept {}
-    void return_void() noexcept {}
-
-    std::coroutine_handle<> continuation;
-  };
-
-  Task(std::coroutine_handle<promise_type> handle);
-  ~Task();
-
-private:
-  std::coroutine_handle<promise_type> handle;
-};
-
-struct StatefulAwaiter {
-    int value;
-    bool await_ready() const noexcept { return false; }
-    template <typename PromiseType>
-    void await_suspend(std::coroutine_handle<PromiseType> h) noexcept {}
-    void await_resume() noexcept {}
-};
-
-typedef std::suspend_always NoStateAwaiter;
-using AnotherStatefulAwaiter = StatefulAwaiter;
-
-template <class T>
-struct TemplatedAwaiter {
-    T value;
-    bool await_ready() const noexcept { return false; }
-    template <typename PromiseType>
-    void await_suspend(std::coroutine_handle<PromiseType> h) noexcept {}
-    void await_resume() noexcept {}
-};
-
-
-class Awaitable {};
-StatefulAwaiter operator co_await(Awaitable) {
-  return StatefulAwaiter{};
-}
-
-StatefulAwaiter GlobalAwaiter;
-class Awaitable2 {};
-StatefulAwaiter& operator co_await(Awaitable2) {
-  return GlobalAwaiter;
-}
-
-struct AlwaysInlineStatefulAwaiter {
-    void* value;
-    bool await_ready() const noexcept { return false; }
-
-    template <typename PromiseType>
-    __attribute__((always_inline))
-    void await_suspend(std::coroutine_handle<PromiseType> h) noexcept {}
-
-    void await_resume() noexcept {}
-};
-
-Task testing() {
-    co_await std::suspend_always{};
-    co_await StatefulAwaiter{};
-    co_await AnotherStatefulAwaiter{};
-    
-    // Test lvalue case.
-    StatefulAwaiter awaiter;
-    co_await awaiter;
-
-    // The explicit call to await_suspend is not considered suspended.
-    awaiter.await_suspend(std::coroutine_handle<void>::from_address(nullptr));
-
-    co_await TemplatedAwaiter<int>{};
-    TemplatedAwaiter<int> TemplatedAwaiterInstace;
-    co_await TemplatedAwaiterInstace;
-
-    co_await Awaitable{};
-    co_await Awaitable2{};
-
-    co_await AlwaysInlineStatefulAwaiter{};
-}
-
-struct AwaitTransformTask {
-  struct promise_type {
-    struct FinalAwaiter {
-      bool await_ready() const noexcept { return false; }
-      template <typename PromiseType>
-      std::coroutine_handle<> await_suspend(std::coroutine_handle<PromiseType> h) noexcept {
-        return h.promise().continuation;
-      }
-      void await_resume() noexcept {}
-    };
-
-    AwaitTransformTask get_return_object() noexcept {
-      return std::coroutine_handle<promise_type>::from_promise(*this);
-    }
-
-    std::suspend_always initial_suspend() noexcept { return {}; }
-    FinalAwaiter final_suspend() noexcept { return {}; }
-    void unhandled_exception() noexcept {}
-    void return_void() noexcept {}
-
-    template <typename Awaitable>
-    auto await_transform(Awaitable &&awaitable) {
-      return awaitable;
-    }
-
-    std::coroutine_handle<> continuation;
-  };
-
-  AwaitTransformTask(std::coroutine_handle<promise_type> handle);
-  ~AwaitTransformTask();
-
-private:
-  std::coroutine_handle<promise_type> handle;
-};
-
-struct awaitableWithGetAwaiter {
-  bool await_ready() const noexcept { return false; }
-  template <typename PromiseType>
-  void await_suspend(std::coroutine_handle<PromiseType> h) noexcept {}
-  void await_resume() noexcept {}
-};
-
-AwaitTransformTask testingWithAwaitTransform() {
-  co_await awaitableWithGetAwaiter{};
-}
-
-// CHECK: define{{.*}}@_ZNSt14suspend_always13await_suspendESt16coroutine_handleIvE{{.*}}#[[NORMAL_ATTR:[0-9]+]]
-
-// CHECK: define{{.*}}@_ZN15StatefulAwaiter13await_suspendIN4Task12promise_typeEEEvSt16coroutine_handleIT_E{{.*}}#[[NOINLINE_ATTR:[0-9]+]]
-
-// CHECK: define{{.*}}@_ZN15StatefulAwaiter13await_suspendIvEEvSt16coroutine_handleIT_E{{.*}}#[[NORMAL_ATTR]]
-
-// CHECK: define{{.*}}@_ZN16TemplatedAwaiterIiE13await_suspendIN4Task12promise_typeEEEvSt16coroutine_handleIT_E{{.*}}#[[NOINLINE_ATTR]]
-
-// CHECK: define{{.*}}@_ZN27AlwaysInlineStatefulAwaiter13await_suspendIN4Task12promise_typeEEEvSt16coroutine_handleIT_E{{.*}}#[[ALWAYS_INLINE_ATTR:[0-9]+]]
-
-// CHECK: define{{.*}}@_ZN4Task12promise_type12FinalAwaiter13await_suspendIS0_EESt16coroutine_handleIvES3_IT_E{{.*}}#[[NORMAL_ATTR]]
-
-// CHECK: define{{.*}}@_ZN23awaitableWithGetAwaiter13await_suspendIN18AwaitTransformTask12promise_typeEEEvSt16coroutine_handleIT_E{{.*}}#[[NORMAL_ATTR]]
-
-// CHECK: define{{.*}}@_ZN18AwaitTransformTask12promise_type12FinalAwaiter13await_suspendIS0_EESt16coroutine_handleIvES3_IT_E{{.*}}#[[NORMAL_ATTR]]
-
-// CHECK-NOT: attributes #[[NORMAL_ATTR]] = noinline
-// CHECK: attributes #[[NOINLINE_ATTR]] = {{.*}}noinline
-// CHECK-NOT: attributes #[[ALWAYS_INLINE_ATTR]] = {{.*}}noinline
-// CHECK: attributes #[[ALWAYS_INLINE_ATTR]] = {{.*}}alwaysinline
diff --git a/clang/test/CodeGenCoroutines/coro-dwarf.cpp b/clang/test/CodeGenCoroutines/coro-dwarf.cpp
index 7914babe5483a45..70cfe5bc8a25faa 100644
--- a/clang/test/CodeGenCoroutines/coro-dwarf.cpp
+++ b/clang/test/CodeGenCoroutines/coro-dwarf.cpp
@@ -70,3 +70,15 @@ void f_coro(int val, MoveOnly moParam, MoveAndCopy mcParam) {
 // CHECK: !{{[0-9]+}} = !DILocalVariable(name: "moParam", arg: 2, scope: ![[SP]], file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}})
 // CHECK: !{{[0-9]+}} = !DILocalVariable(name: "mcParam", arg: 3, scope: ![[SP]], file: !{{[0-9]+}}, line: {{[0-9]+}}, type: !{{[0-9]+}})
 // CHECK: !{{[0-9]+}} = !DILocalVariable(name: "__promise",
+
+// CHECK: !{{[0-9]+}} = distinct !DISubprogram(linkageName: "__await_suspend_helper__Z6f_coroi8MoveOnly11MoveAndCopyinit"
+// CHECK-NEXT: !{{[0-9]+}} = !DIFile
+// CHECK-NEXT: !{{[0-9]+}} = !DISubroutineType
+// CHECK-NEXT: !{{[0-9]+}} = !DILocalVariable(arg: 1,
+// CHECK-NEXT: !{{[0-9]+}} = !DILocation
+// CHECK-NEXT: !{{[0-9]+}} = !DILocalVariable(arg: 2,
+
+// CHECK: !{{[0-9]+}} = distinct !DISubprogram(linkageName: "__await_suspend_helper__Z6f_coroi8MoveOnly11MoveAndCopyfinal"
+// CHECK-NEXT: !{{[0-9]+}} = !DILocalVariable(arg: 1,
+// CHECK-NEXT: !{{[0-9]+}} = !DILocation
+// CHECK-NEXT: !{{[0-9]+}} = !DILocalVariable(arg: 2,
diff --git a/clang/test/CodeGenCoroutines/coro-function-try-block.cpp b/clang/test/CodeGenCoroutines/coro-function-try-block.cpp
index f609eb55b8e771f..023a95870361fd2 100644
--- a/clang/test/CodeGenCoroutines/coro-function-try-block.cpp
+++ b/clang/test/CodeGenCoroutines/coro-function-try-block.cpp
@@ -19,5 +19,5 @@ task f() try {
 }
 
 // CHECK-LABEL: define{{.*}} void @_Z1fv(
-// CHECK: call void @_ZNSt13suspend_never13await_suspendESt16coroutine_handleIvE(
+// CHECK: call void @llvm.coro.await.suspend(
 // CHECK: call void @_ZN4task12promise_type11return_voidEv(
diff --git a/clang/test/CodeGenCoroutines/coro-simplify-suspend-point.cpp b/clang/test/CodeGenCoroutines/coro-simplify-suspend-point.cpp
new file mode 100644
index 000000000000000..49fed1ed145dd31
--- /dev/null
+++ b/clang/test/CodeGenCoroutines/coro-simplify-suspend-point.cpp
@@ -0,0 +1,67 @@
+// Test that we can perform suspend point simplification
+// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -std=c++20 -O1 -emit-llvm %s -o - | FileCheck %s
+// RUN: %clang -std=c++20 -O1 -emit-llvm -c  %s -o %t && %clang -c %t
+
+#include "Inputs/coroutine.h"
+
+struct detached_task {
+  struct promise_type {
+    detached_task get_return_object() noexcept {
+      return detached_task{std::coroutine_handle<promise_type>::from_promise(*this)};
+    }
+
+    void return_void() noexcept {}
+
+    struct final_awaiter {
+      bool await_ready() noexcept { return false; }
+      std::coroutine_handle<> await_suspend(std::coroutine_handle<promise_type> h) noexcept {
+        h.destroy();
+        return {};
+      }
+      void await_resume() noexcept {}
+    };
+
+    void unhandled_exception() noexcept {}
+
+    final_awaiter final_suspend() noexcept { return {}; }
+
+    std::suspend_always initial_suspend() noexcept { return {}; }
+  };
+
+  ~detached_task() {
+    if (coro_) {
+      coro_.destroy();
+      coro_ = {};
+    }
+  }
+
+  void start() && {
+    auto tmp = coro_;
+    coro_ = {};
+    tmp.resume();
+  }
+
+  std::coroutine_handle<promise_type> coro_;
+};
+
+class SelfResumeAwaiter final
+{
+public:
+    bool await_ready() noexcept { return false; }
+    std::coroutine_handle<> await_suspend(std::coroutine_handle<> h) { 
+      return h; 
+    }
+    void await_resume() noexcept {}
+};
+
+// Check that there are only two calls in resume: external()
+// and coroutine destroy
+// CHECK-LABEL: define{{.*}}void @_Z3foov.resume
+// CHECK-NOT: {{.*}}call{{.*}}
+// CHECK: tail call{{.*}}void %{{[0-9+]}}(
+// CHECK-NOT: {{.*}}call{{.*}}
+// CHECK: define
+detached_task foo() {
+  co_await SelfResumeAwaiter{};
+  co_return;
+}
diff --git a/clang/test/CodeGenCoroutines/coro-symmetric-transfer-01.cpp b/clang/test/CodeGenCoroutines/coro-symmetric-transfer-01.cpp
index c0b9e9ee2c55818..f1870b8eceff99a 100644
--- a/clang/test/CodeGenCoroutines/coro-symmetric-transfer-01.cpp
+++ b/clang/test/CodeGenCoroutines/coro-symmetric-transfer-01.cpp
@@ -47,13 +47,9 @@ detached_task foo() {
   co_return;
 }
 
+// FIXME: is this test needed now?
 // check that the lifetime of the coroutine handle used to obtain the address is contained within single basic block, and hence does not live across suspension points.
 // CHECK-LABEL: final.suspend:
 // CHECK:         %{{.+}} = call token @llvm.coro.save(ptr null)
-// CHECK:         call void @llvm.lifetime.start.p0(i64 8, ptr %[[HDL:.+]])
-// CHECK:         %[[CALL:.+]] = call ptr @_ZN13detached_task12promise_type13final_awaiter13await_suspendESt16coroutine_handleIS0_E(
-// CHECK:         %[[HDL_CAST2:.+]] = getelementptr inbounds %"struct.std::coroutine_handle.0", ptr %[[HDL]], i32 0, i32 0
-// CHECK:         store ptr %[[CALL]], ptr %[[HDL_CAST2]], align 8
-// CHECK:         %[[HDL_TRANSFER:.+]] = call noundef ptr @_ZNKSt16coroutine_handleIvE7addressEv(ptr noundef {{.*}}%[[HDL]])
-// CHECK:         call void @llvm.lifetime.end.p0(i64 8, ptr %[[HDL]])
+// CHECK:         %[[HDL_TRANSFER:.+]] = call ptr @llvm.coro.await.suspend.handle
 // CHECK:         call void @llvm.coro.resume(ptr %[[HDL_TRANSFER]])
diff --git a/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp b/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp
index 890d55e41de9538..ca6cf74115a3b1e 100644
--- a/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp
+++ b/clang/test/CodeGenCoroutines/coro-symmetric-transfer-02.cpp
@@ -89,10 +89,8 @@ Task bar() {
 // CHECK:         br i1 %{{.+}}, label %[[CASE1_AWAIT_READY:.+]], label %[[CASE1_AWAIT_SUSPEND:.+]]
 // CHECK:       [[CASE1_AWAIT_SUSPEND]]:
 // CHECK-NEXT:    %{{.+}} = call token @llvm.coro.save(ptr null)
-// CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 8, ptr %[[TMP1:.+]])
-
-// CHECK:    call void @llvm.lifetime.end.p0(i64 8, ptr %[[TMP1]])
-// CHECK-NEXT:    call void @llvm.coro.resume
+// CHECK-NEXT:    %[[HANDLE1_PTR:.+]] = call ptr @llvm.coro.await.suspend.handle
+// CHECK-NEXT:    call void @llvm.coro.resume(ptr %[[HANDLE1_PTR]])
 // CHECK-NEXT:    %{{.+}} = call i8 @llvm.coro.suspend
 // CHECK-NEXT:    switch i8 %{{.+}}, label %coro.ret [
 // CHECK-NEXT:      i8 0, label %[[CASE1_AWAIT_READY]]
@@ -106,10 +104,8 @@ Task bar() {
 // CHECK:         br i1 %{{.+}}, label %[[CASE2_AWAIT_READY:.+]], label %[[CASE2_AWAIT_SUSPEND:.+]]
 // CHECK:       [[CASE2_AWAIT_SUSPEND]]:
 // CHECK-NEXT:    %{{.+}} = call token @llvm.coro.save(ptr null)
-// CHECK-NEXT:    call void @llvm.lifetime.start.p0(i64 8, ptr %[[TMP2:.+]])
-
-// CHECK:    call void @llvm.lifetime.end.p0(i64 8, ptr %[[TMP2]])
-// CHECK-NEXT:    call void @llvm.coro.resume
+// CHECK-NEXT:    %[[HANDLE2_PTR:.+]] = call ptr @llvm.coro.await.suspend.handle
+// CHECK-NEXT:    call void @llvm.coro.resume(ptr %[[HANDLE2_PTR]])
 // CHECK-NEXT:    %{{.+}} = call i8 @llvm.coro.suspend
 // CHECK-NEXT:    switch i8 %{{.+}}, label %coro.ret [
 // CHECK-NEXT:      i8 0, label %[[CASE2_AWAIT_READY]]
diff --git a/clang/test/CodeGenCoroutines/pr59181.cpp b/clang/test/CodeGenCoroutines/pr59181.cpp
index 80f4634db252146..ac732b6232e76e9 100644
--- a/clang/test/CodeGenCoroutines/pr59181.cpp
+++ b/clang/test/CodeGenCoroutines/pr59181.cpp
@@ -48,13 +48,14 @@ void foo() {
   bar(true);
 }
 
+// FIXME: the test doesn't seem to be relevant anymore,
+// because objects that require cleanup are no more emitted in the suspend block
 // CHECK: cleanup.cont:{{.*}}
 // CHECK-NEXT: load i8
 // CHECK-NEXT: trunc
 // CHECK-NEXT: store i1 false
-// CHECK-NEXT: call void @llvm.lifetime.start.p0(i64 8, ptr [[REF:%ref.tmp[0-9]+]])
 
 // CHECK: await.suspend:{{.*}}
-// CHECK-NOT: call void @llvm.lifetime.start.p0(i64 8, ptr [[REF]])
-// CHECK: call void @_ZZN4Task12promise_type15await_transformES_EN10Suspension13await_suspendESt16coroutine_handleIvE
-// CHECK-NEXT: call void @llvm.lifetime.end.p0(i64 8, ptr [[REF]])
+// CHECK-NOT: call void @llvm.lifetime
+// CHECK: call void @llvm.coro.await.suspend(
+// CHECK-NEXT: %{{[0-9]+}} = call i8 @llvm.coro.suspend(
diff --git a/clang/test/SemaCXX/co_await-ast.cpp b/clang/test/SemaCXX/co_await-ast.cpp
index 10cee21da0e87cd..ed3a988653eb29e 100644
--- a/clang/test/SemaCXX/co_await-ast.cpp
+++ b/clang/test/SemaCXX/co_await-ast.cpp
@@ -81,9 +81,10 @@ awaitable foo() {
 // CHECK:   |     |         `-CallExpr {{.*}} 'coroutine_handle<awaitable_frame>':'std::coroutine_handle<awaitable_frame>'
 // CHECK:   |     |           |-ImplicitCastExpr {{.*}} 'coroutine_handle<awaitable_frame> (*)(void *) noexcept' <FunctionToPointerDecay>
 // CHECK:   |     |           | `-DeclRefExpr {{.*}} 'coroutine_handle<awaitable_frame> (void *) noexcept' lvalue CXXMethod {{.*}} 'from_address' 'coroutine_handle<awaitable_frame> (void *) noexcept'
-// CHECK:   |     |           `-CallExpr {{.*}} 'void *'
-// CHECK:   |     |             `-ImplicitCastExpr {{.*}} 'void *(*)() noexcept' <FunctionToPointerDecay>
-// CHECK:   |     |               `-DeclRefExpr {{.*}} 'void *() noexcept' lvalue Function {{.*}} '__builtin_coro_frame' 'void *() noexcept'
+// CHECK:   |     |           `-OpaqueValueExpr {{.*}} 'void *'
+// CHECK:   |     |             `-CallExpr {{.*}} 'void *'
+// CHECK:   |     |               `-ImplicitCastExpr {{.*}} 'void *(*)() noexcept' <FunctionToPointerDecay>
+// CHECK:   |     |                 `-DeclRefExpr {{.*}} 'void *() noexcept' lvalue Function {{.*}} '__builtin_coro_frame' 'void *() noexcept'
 // CHECK:   |     `-CXXMemberCallExpr {{.*}} 'void'
 // CHECK:   |       `-MemberExpr {{.*}} '<bound member function type>' .await_resume {{.*}}
 // CHECK:   |         `-ImplicitCastExpr {{.*}} 'const awaitable_frame::result_t' lvalue <NoOp>
diff --git a/llvm/include/llvm/IR/Intrinsics.td b/llvm/include/llvm/IR/Intrinsics.td
index 3c19c7b063652c2..50a50c240b8ad3e 100644
--- a/llvm/include/llvm/IR/Intrinsics.td
+++ b/llvm/include/llvm/IR/Intrinsics.td
@@ -1687,6 +1687,18 @@ def int_coro_promise : Intrinsic<[llvm_ptr_ty],
                                  [llvm_ptr_ty, llvm_i32_ty, llvm_i1_ty],
                                  [IntrNoMem, NoCapture<ArgIndex<0>>]>;
 
+def int_coro_await_suspend : Intrinsic<[],
+                                       [llvm_ptr_ty, llvm_ptr_ty, llvm_ptr_ty],
+                                       [Throws]>;
+
+def int_coro_await_suspend_bool : Intrinsic<[llvm_i1_ty],
+                                            [llvm_ptr_ty, llvm_ptr_ty, llvm_ptr_ty],
+                                            [Throws]>;
+
+def int_coro_await_suspend_handle : Intrinsic<[llvm_ptr_ty],
+                                              [llvm_ptr_ty, llvm_ptr_ty, llvm_ptr_ty],
+                                              [Throws]>;
+
 // Coroutine Lowering Intrinsics. Used internally by coroutine passes.
 
 def int_coro_subfn_addr : DefaultAttrsIntrinsic<
diff --git a/llvm/lib/IR/Verifier.cpp b/llvm/lib/IR/Verifier.cpp
index 91cf91fbc788bd9..2ff0de88ccf0cc7 100644
--- a/llvm/lib/IR/Verifier.cpp
+++ b/llvm/lib/IR/Verifier.cpp
@@ -4947,6 +4947,9 @@ void Verifier::visitInstruction(Instruction &I) {
                 F->getIntrinsicID() == Intrinsic::seh_scope_end ||
                 F->getIntrinsicID() == Intrinsic::coro_resume ||
                 F->getIntrinsicID() == Intrinsic::coro_destroy ||
+                F->getIntrinsicID() == Intrinsic::coro_await_suspend ||
+                F->getIntrinsicID() == Intrinsic::coro_await_suspend_bool ||
+                F->getIntrinsicID() == Intrinsic::coro_await_suspend_handle ||
                 F->getIntrinsicID() ==
                     Intrinsic::experimental_patchpoint_void ||
                 F->getIntrinsicID() == Intrinsic::experimental_patchpoint_i64 ||
diff --git a/llvm/lib/Transforms/Coroutines/CoroInstr.h b/llvm/lib/Transforms/Coroutines/CoroInstr.h
index f01aa58eb899961..ce4fe5a4e4de4e6 100644
--- a/llvm/lib/Transforms/Coroutines/CoroInstr.h
+++ b/llvm/lib/Transforms/Coroutines/CoroInstr.h
@@ -78,6 +78,36 @@ class LLVM_LIBRARY_VISIBILITY CoroAllocInst : public IntrinsicInst {
   }
 };
 
+/// This represents the llvm.coro.await.suspend instruction.
+class LLVM_LIBRARY_VISIBILITY CoroAwaitSuspendInst : public CallBase {
+  enum { AwaiterArg, FrameArg, HelperArg };
+
+public:
+  Value *getAwaiter() const { return getArgOperand(AwaiterArg); }
+
+  Value *getFrame() const { return getArgOperand(FrameArg); }
+
+  Function *getHelperFunction() const {
+    return cast<Function>(getArgOperand(HelperArg));
+  }
+
+  // Methods to support type inquiry through isa, cast, and dyn_cast:
+  static bool classof(const CallBase *CB) {
+    if (const Function *CF = CB->getCalledFunction()) {
+      auto IID = CF->getIntrinsicID();
+      return IID == Intrinsic::coro_await_suspend ||
+             IID == Intrinsic::coro_await_suspend_bool ||
+             IID == Intrinsic::coro_await_suspend_handle;
+    }
+
+    return false;
+  }
+
+  static bool classof(const Value *V) {
+    return isa<CallBase>(V) && classof(cast<CallBase>(V));
+  }
+};
+
 /// This represents a common base class for llvm.coro.id instructions.
 class LLVM_LIBRARY_VISIBILITY AnyCoroIdInst : public IntrinsicInst {
 public:
diff --git a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
index 7758b52abc20466..e01db71a4608a88 100644
--- a/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroSplit.cpp
@@ -79,6 +79,73 @@ using namespace llvm;
 
 namespace {
 
+// Created on demand if the coro-early pass has work to do.
+class Lowerer : public coro::LowererBase {
+  IRBuilder<> Builder;
+  void lowerAwaitSuspend(CoroAwaitSuspendInst *CB);
+
+public:
+  Lowerer(Module &M) : LowererBase(M), Builder(Context) {}
+
+  void lowerAwaitSuspends(Function &F);
+};
+
+void Lowerer::lowerAwaitSuspend(CoroAwaitSuspendInst *CB) {
+  auto Helper = CB->getHelperFunction();
+  auto Awaiter = CB->getAwaiter();
+  auto FramePtr = CB->getFrame();
+
+  Builder.SetInsertPoint(CB);
+
+  CallBase *NewCall = nullptr;
+  if (auto Invoke = dyn_cast<InvokeInst>(CB)) {
+    auto HelperInvoke =
+        Builder.CreateInvoke(Helper, Invoke->getNormalDest(),
+                             Invoke->getUnwindDest(), {Awaiter, FramePtr});
+
+    HelperInvoke->setCallingConv(Invoke->getCallingConv());
+    std::copy(Invoke->bundle_op_info_begin(), Invoke->bundle_op_info_end(),
+              HelperInvoke->bundle_op_info_begin());
+    AttributeList NewAttributes =
+        Invoke->getAttributes().removeParamAttributes(Context, 2);
+    HelperInvoke->setAttributes(NewAttributes);
+    HelperInvoke->setDebugLoc(Invoke->getDebugLoc());
+    NewCall = HelperInvoke;
+  } else if (auto Call = dyn_cast<CallInst>(CB)) {
+    auto HelperCall = Builder.CreateCall(Helper, {Awaiter, FramePtr});
+
+    AttributeList NewAttributes =
+        Call->getAttributes().removeParamAttributes(Context, 2);
+    HelperCall->setAttributes(NewAttributes);
+    HelperCall->setDebugLoc(Call->getDebugLoc());
+    NewCall = HelperCall;
+  }
+
+  CB->replaceAllUsesWith(NewCall);
+  CB->eraseFromParent();
+
+  InlineFunctionInfo FnInfo;
+  auto InlineRes = InlineFunction(*NewCall, FnInfo);
+  assert(InlineRes.isSuccess() && "Expected inlining to succeed");
+  (void)InlineRes;
+}
+
+void Lowerer::lowerAwaitSuspends(Function &F) {
+  SmallVector<CoroAwaitSuspendInst *, 4> AwaitSuspends;
+
+  for (Instruction &I : llvm::make_early_inc_range(instructions(F))) {
+    auto *CB = dyn_cast<CallBase>(&I);
+    if (!CB)
+      continue;
+
+    if (auto *AWS = dyn_cast<CoroAwaitSuspendInst>(CB))
+      AwaitSuspends.push_back(AWS);
+  }
+
+  for (auto *AWS : AwaitSuspends)
+    lowerAwaitSuspend(AWS);
+}
+
 /// A little helper class for building
 class CoroCloner {
 public:
@@ -1513,6 +1580,11 @@ static void handleNoSuspendCoroutine(coro::Shape &Shape) {
 // the coroutine and if that is the case we cannot eliminate the suspend point.
 static bool hasCallsInBlockBetween(Instruction *From, Instruction *To) {
   for (Instruction *I = From; I != To; I = I->getNextNode()) {
+    // This one could resume the coroutine,
+    // but additional analysis before the check should ensure,
+    // that it can't happen
+    if (isa<CoroAwaitSuspendInst>(I))
+      continue;
     // Assume that no intrinsic can resume the coroutine.
     if (isa<IntrinsicInst>(I))
       continue;
@@ -1553,6 +1625,9 @@ static bool hasCallsInBlocksBetween(BasicBlock *SaveBB, BasicBlock *ResDesBB) {
 }
 
 static bool hasCallsBetween(Instruction *Save, Instruction *ResumeOrDestroy) {
+  if (Save == ResumeOrDestroy)
+    return false;
+
   auto *SaveBB = Save->getParent();
   auto *ResumeOrDestroyBB = ResumeOrDestroy->getParent();
 
@@ -1575,6 +1650,35 @@ static bool hasCallsBetween(Instruction *Save, Instruction *ResumeOrDestroy) {
   return false;
 }
 
+// Check if await-suspend helper is "simple".
+// The conditions are:
+// 1. There is exactly one ret ptr instruction
+// 2. The return result is exactly coroutine frame parameter, passed to helper
+// FIXME: perform more sophisiticated analysis?
+static bool isSimpleHelper(CoroAwaitSuspendInst *AWS) {
+  auto Helper = AWS->getHelperFunction();
+
+  SmallVector<ReturnInst *, 4> Rets;
+
+  for (auto &BB : *Helper) {
+    if (BB.empty())
+      continue;
+    auto terminator = BB.getTerminator();
+    if (!terminator)
+      continue;
+    if (auto Ret = dyn_cast<ReturnInst>(terminator))
+      Rets.push_back(cast<ReturnInst>(terminator));
+  }
+
+  // FIXME: get rid of magical constant
+  for (auto Ret : Rets)
+    if (Ret->getReturnValue() != Helper->getArg(1) ||
+        hasCallsBetween(Helper->getEntryBlock().getFirstNonPHI(), Ret))
+      return false;
+
+  return true;
+}
+
 // If a SuspendIntrin is preceded by Resume or Destroy, we can eliminate the
 // suspend point and replace it with nornal control flow.
 static bool simplifySuspendPoint(CoroSuspendInst *Suspend,
@@ -1598,9 +1702,18 @@ static bool simplifySuspendPoint(CoroSuspendInst *Suspend,
   if (!SubFn)
     return false;
 
-  // Does not refer to the current coroutine, we cannot do anything with it.
-  if (SubFn->getFrame() != CoroBegin)
-    return false;
+  auto Frame = SubFn->getFrame();
+
+  // Check that frame directly always refers to the current coroutine,
+  // either directly or via helper
+  if (Frame != CoroBegin) {
+    auto *AWS = dyn_cast<CoroAwaitSuspendInst>(Frame);
+    if (!AWS)
+      return false;
+
+    if (AWS->getFrame() != CoroBegin || !isSimpleHelper(AWS))
+      return false;
+  }
 
   // See if the transformation is safe. Specifically, see if there are any
   // calls in between Save and CallInstr. They can potenitally resume the
@@ -1678,12 +1791,16 @@ static void simplifySuspendPoints(coro::Shape &Shape) {
   }
 }
 
-static void splitSwitchCoroutine(Function &F, coro::Shape &Shape,
+static void splitSwitchCoroutine(Module &M, Function &F, coro::Shape &Shape,
                                  SmallVectorImpl<Function *> &Clones,
                                  TargetTransformInfo &TTI) {
   assert(Shape.ABI == coro::ABI::Switch);
 
   createResumeEntryBlock(F, Shape);
+
+  Lowerer lowerer(M);
+  lowerer.lowerAwaitSuspends(F);
+
   auto ResumeClone = createClone(F, ".resume", Shape,
                                  CoroCloner::Kind::SwitchResume);
   auto DestroyClone = createClone(F, ".destroy", Shape,
@@ -2003,31 +2120,31 @@ namespace {
 }
 
 static coro::Shape
-splitCoroutine(Function &F, SmallVectorImpl<Function *> &Clones,
+splitCoroutine(Module &M, Function &F, SmallVectorImpl<Function *> &Clones,
                TargetTransformInfo &TTI, bool OptimizeFrame,
                std::function<bool(Instruction &)> MaterializableCallback) {
-  PrettyStackTraceFunction prettyStackTrace(F);
+    PrettyStackTraceFunction prettyStackTrace(F);
 
-  // The suspend-crossing algorithm in buildCoroutineFrame get tripped
-  // up by uses in unreachable blocks, so remove them as a first pass.
-  removeUnreachableBlocks(F);
+    // The suspend-crossing algorithm in buildCoroutineFrame get tripped
+    // up by uses in unreachable blocks, so remove them as a first pass.
+    removeUnreachableBlocks(F);
 
-  coro::Shape Shape(F, OptimizeFrame);
-  if (!Shape.CoroBegin)
-    return Shape;
+    coro::Shape Shape(F, OptimizeFrame);
+    if (!Shape.CoroBegin)
+      return Shape;
 
-  simplifySuspendPoints(Shape);
-  buildCoroutineFrame(F, Shape, MaterializableCallback);
-  replaceFrameSizeAndAlignment(Shape);
+    simplifySuspendPoints(Shape);
+    buildCoroutineFrame(F, Shape, MaterializableCallback);
+    replaceFrameSizeAndAlignment(Shape);
 
-  // If there are no suspend points, no split required, just remove
-  // the allocation and deallocation blocks, they are not needed.
-  if (Shape.CoroSuspends.empty()) {
-    handleNoSuspendCoroutine(Shape);
-  } else {
-    switch (Shape.ABI) {
-    case coro::ABI::Switch:
-      splitSwitchCoroutine(F, Shape, Clones, TTI);
+    // If there are no suspend points, no split required, just remove
+    // the allocation and deallocation blocks, they are not needed.
+    if (Shape.CoroSuspends.empty()) {
+      handleNoSuspendCoroutine(Shape);
+    } else {
+      switch (Shape.ABI) {
+      case coro::ABI::Switch:
+      splitSwitchCoroutine(M, F, Shape, Clones, TTI);
       break;
     case coro::ABI::Async:
       splitAsyncCoroutine(F, Shape, Clones);
@@ -2037,7 +2154,7 @@ splitCoroutine(Function &F, SmallVectorImpl<Function *> &Clones,
       splitRetconCoroutine(F, Shape, Clones);
       break;
     }
-  }
+    }
 
   // Replace all the swifterror operations in the original function.
   // This invalidates SwiftErrorOps in the Shape.
@@ -2212,7 +2329,7 @@ PreservedAnalyses CoroSplitPass::run(LazyCallGraph::SCC &C,
     SmallVector<Function *, 4> Clones;
     auto &ORE = FAM.getResult<OptimizationRemarkEmitterAnalysis>(F);
     const coro::Shape Shape =
-        splitCoroutine(F, Clones, FAM.getResult<TargetIRAnalysis>(F),
+        splitCoroutine(M, F, Clones, FAM.getResult<TargetIRAnalysis>(F),
                        OptimizeFrame, MaterializableCallback);
     updateCallGraphAfterCoroutineSplit(*N, Shape, Clones, C, CG, AM, UR, FAM);
 
diff --git a/llvm/lib/Transforms/Coroutines/Coroutines.cpp b/llvm/lib/Transforms/Coroutines/Coroutines.cpp
index eef5543bae24ab9..8ecdf019a2c84a4 100644
--- a/llvm/lib/Transforms/Coroutines/Coroutines.cpp
+++ b/llvm/lib/Transforms/Coroutines/Coroutines.cpp
@@ -67,6 +67,9 @@ static const char *const CoroIntrinsics[] = {
     "llvm.coro.async.resume",
     "llvm.coro.async.size.replace",
     "llvm.coro.async.store_resume",
+    "llvm.coro.await.suspend",
+    "llvm.coro.await.suspend.bool",
+    "llvm.coro.await.suspend.handle",
     "llvm.coro.begin",
     "llvm.coro.destroy",
     "llvm.coro.done",



More information about the cfe-commits mailing list