[llvm] r368788 - Extend coroutines to support a "returned continuation" lowering.

John McCall via llvm-commits llvm-commits at lists.llvm.org
Tue Aug 13 20:53:17 PDT 2019


Author: rjmccall
Date: Tue Aug 13 20:53:17 2019
New Revision: 368788

URL: http://llvm.org/viewvc/llvm-project?rev=368788&view=rev
Log:
Extend coroutines to support a "returned continuation" lowering.

A quick contrast of this ABI with the currently-implemented ABI:

- Allocation is implicitly managed by the lowering passes, which is
  acceptable for frontends that are willing to assume that allocation
  cannot fail.  This assumption is necessary to implement dynamic
  allocas anyway.

- The lowering attempts to fit the coroutine frame into an opaque,
  statically-sized buffer before falling back on allocation; the same
  buffer must be provided to every resume point.  A buffer must be at
  least pointer-sized.

- The resume and destroy functions have been combined; the continuation
  function takes a parameter indicating whether it has succeeded.

- Conversely, every suspend point begins its own continuation function.

- The continuation function pointer is directly returned to the caller
  instead of being stored in the frame.  The continuation can therefore
  directly destroy the frame when exiting the coroutine instead of having
  to leave it in a defunct state.

- Other values can be returned directly to the caller instead of going
  through a promise allocation.  The frontend provides a "prototype"
  function declaration from which the type, calling convention, and
  attributes of the continuation functions are taken.

- On the caller side, the frontend can generate natural IR that directly
  uses the continuation functions as long as it prevents IPO with the
  coroutine until lowering has happened.  In combination with the point
  above, the frontend is almost totally in charge of the ABI of the
  coroutine.

- Unique-yield coroutines are given some special treatment.
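
For illustration, a frontend might emit IR along these lines for a
yield-once coroutine (a rough sketch; the prototype and allocator
names here are invented for the example):

    declare void @f.prototype(i8* %buffer, i1 %is_unwind)

    define {i8*, i32} @f(i8* %buffer, i32 %n) {
    entry:
      %id = call token @llvm.coro.id.retcon.once(i32 8, i32 8, i8* %buffer,
                 i8* bitcast (void (i8*, i1)* @f.prototype to i8*),
                 i8* bitcast (i8* (i32)* @myAlloc to i8*),
                 i8* bitcast (void (i8*)* @myFree to i8*))
      %hdl = call i8* @llvm.coro.begin(token %id, i8* null)
      ; suspend, yielding %n to the caller along with the continuation
      %unwind = call i1 (...) @llvm.coro.suspend.retcon(i32 %n)
      call i1 @llvm.coro.end(i8* %hdl, i1 false)
      unreachable
    }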

Added:
    llvm/trunk/test/Transforms/Coroutines/coro-retcon-once-value.ll
    llvm/trunk/test/Transforms/Coroutines/coro-retcon-once-value2.ll
    llvm/trunk/test/Transforms/Coroutines/coro-retcon-value.ll
    llvm/trunk/test/Transforms/Coroutines/coro-retcon.ll
Modified:
    llvm/trunk/docs/Coroutines.rst
    llvm/trunk/include/llvm/IR/Intrinsics.td
    llvm/trunk/lib/Transforms/Coroutines/CoroCleanup.cpp
    llvm/trunk/lib/Transforms/Coroutines/CoroEarly.cpp
    llvm/trunk/lib/Transforms/Coroutines/CoroFrame.cpp
    llvm/trunk/lib/Transforms/Coroutines/CoroInstr.h
    llvm/trunk/lib/Transforms/Coroutines/CoroInternal.h
    llvm/trunk/lib/Transforms/Coroutines/CoroSplit.cpp
    llvm/trunk/lib/Transforms/Coroutines/Coroutines.cpp
    llvm/trunk/test/Transforms/Coroutines/coro-debug.ll

Modified: llvm/trunk/docs/Coroutines.rst
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/Coroutines.rst?rev=368788&r1=368787&r2=368788&view=diff
==============================================================================
--- llvm/trunk/docs/Coroutines.rst (original)
+++ llvm/trunk/docs/Coroutines.rst Tue Aug 13 20:53:17 2019
@@ -41,32 +41,144 @@ then destroy it:
 In addition to the function stack frame which exists when a coroutine is 
 executing, there is an additional region of storage that contains objects that 
 keep the coroutine state when a coroutine is suspended. This region of storage
-is called **coroutine frame**. It is created when a coroutine is called and 
-destroyed when a coroutine runs to completion or destroyed by a call to 
-the `coro.destroy`_ intrinsic. 
-
-An LLVM coroutine is represented as an LLVM function that has calls to
-`coroutine intrinsics`_ defining the structure of the coroutine.
-After lowering, a coroutine is split into several
-functions that represent three different ways of how control can enter the 
-coroutine: 
-
-1. a ramp function, which represents an initial invocation of the coroutine that
-   creates the coroutine frame and executes the coroutine code until it 
-   encounters a suspend point or reaches the end of the function;
-
-2. a coroutine resume function that is invoked when the coroutine is resumed;
-
-3. a coroutine destroy function that is invoked when the coroutine is destroyed.
-
-.. note:: Splitting out resume and destroy functions are just one of the 
-   possible ways of lowering the coroutine. We chose it for initial 
-   implementation as it matches closely the mental model and results in 
-   reasonably nice code.
+is called the **coroutine frame**. It is created when a coroutine is called
+and destroyed when a coroutine either runs to completion or is destroyed
+while suspended.
+
+LLVM currently supports two styles of coroutine lowering. These styles
+support substantially different sets of features, have substantially
+different ABIs, and expect substantially different patterns of frontend
+code generation. However, the styles also have a great deal in common.
+
+In all cases, an LLVM coroutine is initially represented as an ordinary LLVM
+function that has calls to `coroutine intrinsics`_ defining the structure of
+the coroutine. The coroutine function is then, in the most general case,
+rewritten by the coroutine lowering passes to become the "ramp function",
+the initial entrypoint of the coroutine, which executes until a suspend point
+is first reached. The remainder of the original coroutine function is split
+out into some number of "resume functions". Any state which must persist
+across suspensions is stored in the coroutine frame. The resume functions
+must somehow be able to handle either a "normal" resumption, which continues
+the normal execution of the coroutine, or an "abnormal" resumption, which
+must unwind the coroutine without attempting to suspend it.
+
+Switched-Resume Lowering
+------------------------
+
+In LLVM's standard switched-resume lowering, signaled by the use of
+`llvm.coro.id`, the coroutine frame is stored as part of a "coroutine
+object" which represents a handle to a particular invocation of the
+coroutine.  All coroutine objects support a common ABI allowing certain
+features to be used without knowing anything about the coroutine's
+implementation:
+
+- A coroutine object can be queried with `llvm.coro.done` to see if it
+  has reached completion.
+
+- A coroutine object can be resumed normally with `llvm.coro.resume` if
+  it has not already reached completion.
+
+- A coroutine object can be destroyed with `llvm.coro.destroy`,
+  invalidating the coroutine object.  This must be done separately even
+  if the coroutine has reached completion normally.
+
+- "Promise" storage, which is known to have a certain size and alignment,
+  can be projected out of the coroutine object with `llvm.coro.promise`.
+  The coroutine implementation must have been compiled to define a promise
+  of the same size and alignment.
+
+In general, interacting with a coroutine object in any of these ways while
+it is running has undefined behavior.
+
+The coroutine function is split into three functions, representing three
+different ways that control can enter the coroutine:
+
+1. the ramp function that is initially invoked, which takes arbitrary
+   arguments and returns a pointer to the coroutine object;
+
+2. a coroutine resume function that is invoked when the coroutine is resumed,
+   which takes a pointer to the coroutine object and returns `void`;
+
+3. a coroutine destroy function that is invoked when the coroutine is
+   destroyed, which takes a pointer to the coroutine object and returns
+   `void`.
+
+Because the resume and destroy functions are shared across all suspend
+points, suspend points must store the index of the active suspend in
+the coroutine object, and the resume/destroy functions must switch over
+that index to get back to the correct point.  Hence the name of this
+lowering.
+
+Pointers to the resume and destroy functions are stored in the coroutine
+object at known offsets which are fixed for all coroutines.  A completed
+coroutine is represented with a null resume function.
+
+There is a somewhat complex protocol of intrinsics for allocating and
+deallocating the coroutine object.  It is complex in order to allow the
+allocation to be elided due to inlining.  This protocol is discussed
+in further detail below.
+
+The frontend may generate code to call the coroutine function directly;
+this will become a call to the ramp function and will return a pointer
+to the coroutine object.  The frontend should always resume or destroy
+the coroutine using the corresponding intrinsics.
+
+Returned-Continuation Lowering
+------------------------------
+
+In returned-continuation lowering, signaled by the use of
+`llvm.coro.id.retcon` or `llvm.coro.id.retcon.once`, some aspects of
+the ABI must be handled more explicitly by the frontend.
+
+In this lowering, every suspend point takes a list of "yielded values"
+which are returned back to the caller along with a function pointer,
+called the continuation function.  The coroutine is resumed by simply
+calling this continuation function pointer.  The original coroutine
+is divided into the ramp function and then an arbitrary number of
+these continuation functions, one for each suspend point.
+
+LLVM actually supports two closely-related returned-continuation
+lowerings:
+
+- In normal returned-continuation lowering, the coroutine may suspend
+  itself multiple times. This means that a continuation function
+  itself returns another continuation pointer, as well as a list of
+  yielded values.
+
+  The coroutine indicates that it has run to completion by returning
+  a null continuation pointer. Any yielded values will be `undef` and
+  should be ignored.
+
+- In yield-once returned-continuation lowering, the coroutine must
+  suspend itself exactly once (or throw an exception).  The ramp
+  function returns a continuation function pointer and yielded
+  values, but the continuation function simply returns `void`
+  when the coroutine has run to completion.
+
+The coroutine frame is maintained in a fixed-size buffer that is
+passed to the `coro.id` intrinsic, which guarantees a certain size
+and alignment statically. The same buffer must be passed to the
+continuation function(s). The coroutine will allocate memory if the
+buffer is insufficient, in which case it will need to store at
+least that pointer in the buffer; therefore the buffer must always
+be at least pointer-sized. How the coroutine uses the buffer may
+vary between suspend points.
+
+In addition to the buffer pointer, continuation functions take an
+argument indicating whether the coroutine is being resumed normally
+(zero) or abnormally (non-zero).
+
+LLVM is currently ineffective at statically eliminating allocations
+after fully inlining returned-continuation coroutines into a caller.
+This may be acceptable if LLVM's coroutine support is primarily being
+used for low-level lowering and inlining is expected to be applied
+earlier in the pipeline.
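+
+For example, a caller of a yield-once coroutine might look like the
+following (a sketch; ``@f`` and its signature are illustrative)::
+
+  declare {i8*, i32} @f(i8* %buffer, i32 %n)
+
+  define i32 @caller() {
+  entry:
+    %buffer = alloca [8 x i8], align 8
+    %ptr = getelementptr inbounds [8 x i8], [8 x i8]* %buffer, i32 0, i32 0
+    %pair = call {i8*, i32} @f(i8* %ptr, i32 42)
+    %cont = extractvalue {i8*, i32} %pair, 0
+    %value = extractvalue {i8*, i32} %pair, 1
+    ; resume normally (flag zero), passing the same buffer
+    %resume = bitcast i8* %cont to void (i8*, i1)*
+    call void %resume(i8* %ptr, i1 false)
+    ret i32 %value
+  }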
 
 Coroutines by Example
 =====================
 
+The examples below are all of switched-resume coroutines.
+
 Coroutine Representation
 ------------------------
 
@@ -554,6 +666,7 @@ and python iterator `__next__` would loo
     return *(int*)coro.promise(hdl, 4, false);
   }
 
+
 Intrinsics
 ==========
 
@@ -580,7 +693,7 @@ Overview:
 """""""""
 
 The '``llvm.coro.destroy``' intrinsic destroys a suspended
-coroutine.
+switched-resume coroutine.
 
 Arguments:
 """"""""""
@@ -607,7 +720,7 @@ frame. Destroying a coroutine that is no
 Overview:
 """""""""
 
-The '``llvm.coro.resume``' intrinsic resumes a suspended coroutine.
+The '``llvm.coro.resume``' intrinsic resumes a suspended switched-resume coroutine.
 
 Arguments:
 """"""""""
@@ -634,8 +747,8 @@ Resuming a coroutine that is not suspend
 Overview:
 """""""""
 
-The '``llvm.coro.done``' intrinsic checks whether a suspended coroutine is at 
-the final suspend point or not.
+The '``llvm.coro.done``' intrinsic checks whether a suspended
+switched-resume coroutine is at the final suspend point or not.
 
 Arguments:
 """"""""""
@@ -661,7 +774,7 @@ Overview:
 """""""""
 
 The '``llvm.coro.promise``' intrinsic obtains a pointer to a 
-`coroutine promise`_ given a coroutine handle and vice versa.
+`coroutine promise`_ given a switched-resume coroutine handle and vice versa.
 
 Arguments:
 """"""""""
@@ -739,7 +852,8 @@ Overview:
 """""""""
 
 The '``llvm.coro.size``' intrinsic returns the number of bytes
-required to store a `coroutine frame`_.
+required to store a `coroutine frame`_.  This is only supported for
+switched-resume coroutines.
 
 Arguments:
 """"""""""
@@ -772,7 +886,8 @@ The first argument is a token returned b
 identifying the coroutine.
 
 The second argument is a pointer to a block of memory where coroutine frame
-will be stored if it is allocated dynamically.
+will be stored if it is allocated dynamically.  This pointer is ignored
+for returned-continuation coroutines.
 
 Semantics:
 """"""""""
@@ -798,7 +913,8 @@ Overview:
 
 The '``llvm.coro.free``' intrinsic returns a pointer to a block of memory where 
 coroutine frame is stored or `null` if this instance of a coroutine did not use
-dynamically allocated memory for its coroutine frame.
+dynamically allocated memory for its coroutine frame.  This intrinsic is not
+supported for returned-continuation coroutines.
 
 Arguments:
 """"""""""
@@ -847,6 +963,7 @@ Overview:
 
 The '``llvm.coro.alloc``' intrinsic returns `true` if dynamic allocation is
 required to obtain a memory for the coroutine frame and `false` otherwise.
+This is not supported for returned-continuation coroutines.
 
 Arguments:
 """"""""""
@@ -944,7 +1061,8 @@ coroutine frame.
 Overview:
 """""""""
 
-The '``llvm.coro.id``' intrinsic returns a token identifying a coroutine.
+The '``llvm.coro.id``' intrinsic returns a token identifying a
+switched-resume coroutine.
 
 Arguments:
 """"""""""
@@ -975,6 +1093,79 @@ duplicated.
 
 A frontend should emit exactly one `coro.id` intrinsic per coroutine.
 
+.. _coro.id.retcon:
+
+'llvm.coro.id.retcon' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare token @llvm.coro.id.retcon(i32 <size>, i32 <align>, i8* <buffer>,
+                                     i8* <continuation prototype>,
+                                     i8* <alloc>, i8* <dealloc>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.id.retcon``' intrinsic returns a token identifying a
+multiple-suspend returned-continuation coroutine.
+
+The 'result-type sequence' of the coroutine is defined as follows:
+
+- if the return type of the coroutine function is ``void``, it is the
+  empty sequence;
+
+- if the return type of the coroutine function is a ``struct``, it is the
+  element types of that ``struct`` in order;
+
+- otherwise, it is just the return type of the coroutine function.
+
+The first element of the result-type sequence must be a pointer type;
+continuation functions will be coerced to this type.  The remaining
+elements of the sequence are the 'yield types', and any suspend in
+the coroutine must take arguments of these types.
+
+Arguments:
+""""""""""
+
+The first and second arguments are the expected size and alignment of
+the buffer provided as the third argument.  They must be constant.
+
+The fourth argument must be a reference to a global function, called
+the 'continuation prototype function'.  The type, calling convention,
+and attributes of any continuation functions will be taken from this
+declaration.  The return type of the prototype function must match the
+return type of the current function.  The first parameter type must be
+a pointer type.  The second parameter type must be an integer type;
+it will be used only as a boolean flag.
+
+The fifth argument must be a reference to a global function that will
+be used to allocate memory.  It may not fail, either by returning null
+or throwing an exception.  It must take an integer and return a pointer.
+
+The sixth argument must be a reference to a global function that will
+be used to deallocate memory.  It must take a pointer and return ``void``.
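+
+Putting these together, a call to this intrinsic might look like the
+following (a sketch; the prototype and allocator functions are
+illustrative)::
+
+  declare {i8*, i32} @g.prototype(i8* %buffer, i1 %is_unwind)
+  declare i8* @myAlloc(i32 %size)
+  declare void @myFree(i8* %ptr)
+
+  %id = call token @llvm.coro.id.retcon(i32 8, i32 8, i8* %buffer,
+             i8* bitcast ({i8*, i32} (i8*, i1)* @g.prototype to i8*),
+             i8* bitcast (i8* (i32)* @myAlloc to i8*),
+             i8* bitcast (void (i8*)* @myFree to i8*))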
+
+'llvm.coro.id.retcon.once' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare token @llvm.coro.id.retcon.once(i32 <size>, i32 <align>, i8* <buffer>,
+                                          i8* <prototype>,
+                                          i8* <alloc>, i8* <dealloc>)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.id.retcon.once``' intrinsic returns a token identifying a
+unique-suspend returned-continuation coroutine.
+
+Arguments:
+""""""""""
+
+As for ``llvm.coro.id.retcon``, except that the return type of the
+continuation prototype must be `void` instead of matching the
+coroutine's return type.
+
 .. _coro.end:
 
 'llvm.coro.end' Intrinsic
@@ -1008,6 +1199,17 @@ The purpose of this intrinsic is to allo
 other code that is only relevant during the initial invocation of the coroutine
 and should not be present in resume and destroy parts. 
 
+In returned-continuation lowering, ``llvm.coro.end`` fully destroys the
+coroutine frame.  If the second argument is `false`, it also returns from
+the coroutine with a null continuation pointer, and the next instruction
+will be unreachable.  If the second argument is `true`, it falls through
+so that the following logic can resume unwinding.  In a yield-once
+coroutine, reaching a non-unwind ``llvm.coro.end`` without having first
+reached a ``llvm.coro.suspend.retcon`` has undefined behavior.
+
+The remainder of this section describes the behavior under switched-resume
+lowering.
+
 This intrinsic is lowered when a coroutine is split into
 the start, resume and destroy parts. In the start part, it is a no-op,
 in resume and destroy parts, it is replaced with `ret void` instruction and
@@ -1078,11 +1280,11 @@ The following table summarizes the handl
 Overview:
 """""""""
 
-The '``llvm.coro.suspend``' marks the point where execution of the coroutine 
-need to get suspended and control returned back to the caller.
-Conditional branches consuming the result of this intrinsic lead to basic blocks
-where coroutine should proceed when suspended (-1), resumed (0) or destroyed 
-(1).
+The '``llvm.coro.suspend``' marks the point where execution of a
+switched-resume coroutine is suspended and control is returned back
+to the caller.  Conditional branches consuming the result of this
+intrinsic lead to basic blocks where the coroutine should proceed when
+suspended (-1), resumed (0) or destroyed (1).
 
 Arguments:
 """"""""""
@@ -1178,6 +1380,45 @@ to the coroutine:
     switch i8 %suspend1, label %suspend [i8 0, label %resume1
                                          i8 1, label %cleanup]
 
+.. _coro.suspend.retcon:
+
+'llvm.coro.suspend.retcon' Intrinsic
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+::
+
+  declare i1 @llvm.coro.suspend.retcon(...)
+
+Overview:
+"""""""""
+
+The '``llvm.coro.suspend.retcon``' intrinsic marks the point where
+execution of a returned-continuation coroutine is suspended and control
+is returned back to the caller.
+
+``llvm.coro.suspend.retcon`` does not support separate save points;
+they are not useful when the continuation function is not locally
+accessible.  That would be a more appropriate feature for a ``passcon``
+lowering that is not yet implemented.
+
+Arguments:
+""""""""""
+
+The types of the arguments must exactly match the yield types
+of the coroutine.  They will be turned into return values from the ramp
+and continuation functions, along with the next continuation function.
+
+Semantics:
+""""""""""
+
+The result of the intrinsic indicates whether the coroutine should resume
+abnormally (non-zero).
+
+In a normal coroutine, it is undefined behavior if the coroutine executes
+a call to ``llvm.coro.suspend.retcon`` after resuming abnormally.
+
+In a yield-once coroutine, it is undefined behavior if the coroutine
+executes a call to ``llvm.coro.suspend.retcon`` after resuming in any way.
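+
+For example, a suspend yielding a single ``i32`` might look like the
+following (a sketch; the branch targets are illustrative)::
+
+  %unwind = call i1 (...) @llvm.coro.suspend.retcon(i32 %value)
+  br i1 %unwind, label %cleanup, label %resume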
+
 .. _coro.param:
 
 'llvm.coro.param' Intrinsic

Modified: llvm/trunk/include/llvm/IR/Intrinsics.td
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/include/llvm/IR/Intrinsics.td?rev=368788&r1=368787&r2=368788&view=diff
==============================================================================
--- llvm/trunk/include/llvm/IR/Intrinsics.td (original)
+++ llvm/trunk/include/llvm/IR/Intrinsics.td Tue Aug 13 20:53:17 2019
@@ -959,6 +959,14 @@ def int_coro_id : Intrinsic<[llvm_token_
                              llvm_ptr_ty, llvm_ptr_ty],
                             [IntrArgMemOnly, IntrReadMem,
                              ReadNone<1>, ReadOnly<2>, NoCapture<2>]>;
+def int_coro_id_retcon : Intrinsic<[llvm_token_ty],
+    [llvm_i32_ty, llvm_i32_ty, llvm_ptr_ty,
+     llvm_ptr_ty, llvm_ptr_ty, llvm_ptr_ty],
+    []>;
+def int_coro_id_retcon_once : Intrinsic<[llvm_token_ty],
+    [llvm_i32_ty, llvm_i32_ty, llvm_ptr_ty,
+     llvm_ptr_ty, llvm_ptr_ty, llvm_ptr_ty],
+    []>;
 def int_coro_alloc : Intrinsic<[llvm_i1_ty], [llvm_token_ty], []>;
 def int_coro_begin : Intrinsic<[llvm_ptr_ty], [llvm_token_ty, llvm_ptr_ty],
                                [WriteOnly<1>]>;
@@ -974,6 +982,9 @@ def int_coro_size : Intrinsic<[llvm_anyi
 
 def int_coro_save : Intrinsic<[llvm_token_ty], [llvm_ptr_ty], []>;
 def int_coro_suspend : Intrinsic<[llvm_i8_ty], [llvm_token_ty, llvm_i1_ty], []>;
+def int_coro_suspend_retcon : Intrinsic<[llvm_i1_ty], [llvm_vararg_ty], []>;
+def int_coro_prepare_retcon : Intrinsic<[llvm_ptr_ty], [llvm_ptr_ty],
+                                        [IntrNoMem]>;
 
 def int_coro_param : Intrinsic<[llvm_i1_ty], [llvm_ptr_ty, llvm_ptr_ty],
                                [IntrNoMem, ReadNone<0>, ReadNone<1>]>;

Modified: llvm/trunk/lib/Transforms/Coroutines/CoroCleanup.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Coroutines/CoroCleanup.cpp?rev=368788&r1=368787&r2=368788&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Coroutines/CoroCleanup.cpp (original)
+++ llvm/trunk/lib/Transforms/Coroutines/CoroCleanup.cpp Tue Aug 13 20:53:17 2019
@@ -73,6 +73,8 @@ bool Lowerer::lowerRemainingCoroIntrinsi
         II->replaceAllUsesWith(ConstantInt::getTrue(Context));
         break;
       case Intrinsic::coro_id:
+      case Intrinsic::coro_id_retcon:
+      case Intrinsic::coro_id_retcon_once:
         II->replaceAllUsesWith(ConstantTokenNone::get(Context));
         break;
       case Intrinsic::coro_subfn_addr:
@@ -111,7 +113,8 @@ struct CoroCleanup : FunctionPass {
   bool doInitialization(Module &M) override {
     if (coro::declaresIntrinsics(M, {"llvm.coro.alloc", "llvm.coro.begin",
                                      "llvm.coro.subfn.addr", "llvm.coro.free",
-                                     "llvm.coro.id"}))
+                                     "llvm.coro.id", "llvm.coro.id.retcon",
+                                     "llvm.coro.id.retcon.once"}))
       L = llvm::make_unique<Lowerer>(M);
     return false;
   }

Modified: llvm/trunk/lib/Transforms/Coroutines/CoroEarly.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Coroutines/CoroEarly.cpp?rev=368788&r1=368787&r2=368788&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Coroutines/CoroEarly.cpp (original)
+++ llvm/trunk/lib/Transforms/Coroutines/CoroEarly.cpp Tue Aug 13 20:53:17 2019
@@ -91,13 +91,14 @@ void Lowerer::lowerCoroDone(IntrinsicIns
   Value *Operand = II->getArgOperand(0);
 
   // ResumeFnAddr is the first pointer sized element of the coroutine frame.
+  static_assert(coro::Shape::SwitchFieldIndex::Resume == 0,
+                "resume function not at offset zero");
   auto *FrameTy = Int8Ptr;
   PointerType *FramePtrTy = FrameTy->getPointerTo();
 
   Builder.SetInsertPoint(II);
   auto *BCI = Builder.CreateBitCast(Operand, FramePtrTy);
-  auto *Gep = Builder.CreateConstInBoundsGEP1_32(FrameTy, BCI, 0);
-  auto *Load = Builder.CreateLoad(FrameTy, Gep);
+  auto *Load = Builder.CreateLoad(BCI);
   auto *Cond = Builder.CreateICmpEQ(Load, NullPtr);
 
   II->replaceAllUsesWith(Cond);
@@ -189,6 +190,10 @@ bool Lowerer::lowerEarlyIntrinsics(Funct
           }
         }
         break;
+      case Intrinsic::coro_id_retcon:
+      case Intrinsic::coro_id_retcon_once:
+        F.addFnAttr(CORO_PRESPLIT_ATTR, PREPARED_FOR_SPLIT);
+        break;
       case Intrinsic::coro_resume:
         lowerResumeOrDestroy(CS, CoroSubFnInst::ResumeIndex);
         break;
@@ -231,10 +236,17 @@ struct CoroEarly : public FunctionPass {
   // This pass has work to do only if we find intrinsics we are going to lower
   // in the module.
   bool doInitialization(Module &M) override {
-    if (coro::declaresIntrinsics(
-            M, {"llvm.coro.id", "llvm.coro.destroy", "llvm.coro.done",
-                "llvm.coro.end", "llvm.coro.noop", "llvm.coro.free",
-                "llvm.coro.promise", "llvm.coro.resume", "llvm.coro.suspend"}))
+    if (coro::declaresIntrinsics(M, {"llvm.coro.id",
+                                     "llvm.coro.id.retcon",
+                                     "llvm.coro.id.retcon.once",
+                                     "llvm.coro.destroy",
+                                     "llvm.coro.done",
+                                     "llvm.coro.end",
+                                     "llvm.coro.noop", 
+                                     "llvm.coro.free",
+                                     "llvm.coro.promise",
+                                     "llvm.coro.resume",
+                                     "llvm.coro.suspend"}))
       L = llvm::make_unique<Lowerer>(M);
     return false;
   }

Modified: llvm/trunk/lib/Transforms/Coroutines/CoroFrame.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Coroutines/CoroFrame.cpp?rev=368788&r1=368787&r2=368788&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Coroutines/CoroFrame.cpp (original)
+++ llvm/trunk/lib/Transforms/Coroutines/CoroFrame.cpp Tue Aug 13 20:53:17 2019
@@ -120,6 +120,15 @@ struct SuspendCrossingInfo {
         return false;
 
     BasicBlock *UseBB = I->getParent();
+
+    // As a special case, treat uses by an llvm.coro.suspend.retcon
+    // as if they were uses in the suspend's single predecessor: the
+    // uses conceptually occur before the suspend.
+    if (isa<CoroSuspendRetconInst>(I)) {
+      UseBB = UseBB->getSinglePredecessor();
+      assert(UseBB && "should have split coro.suspend into its own block");
+    }
+
     return hasPathCrossingSuspendPoint(DefBB, UseBB);
   }
 
@@ -128,7 +137,17 @@ struct SuspendCrossingInfo {
   }
 
   bool isDefinitionAcrossSuspend(Instruction &I, User *U) const {
-    return isDefinitionAcrossSuspend(I.getParent(), U);
+    auto *DefBB = I.getParent();
+
+    // As a special case, treat values produced by an llvm.coro.suspend.*
+    // as if they were defined in the single successor: the uses
+    // conceptually occur after the suspend.
+    if (isa<AnyCoroSuspendInst>(I)) {
+      DefBB = DefBB->getSingleSuccessor();
+      assert(DefBB && "should have split coro.suspend into its own block");
+    }
+
+    return isDefinitionAcrossSuspend(DefBB, U);
   }
 };
 } // end anonymous namespace
@@ -183,9 +202,10 @@ SuspendCrossingInfo::SuspendCrossingInfo
     B.Suspend = true;
     B.Kills |= B.Consumes;
   };
-  for (CoroSuspendInst *CSI : Shape.CoroSuspends) {
+  for (auto *CSI : Shape.CoroSuspends) {
     markSuspendBlock(CSI);
-    markSuspendBlock(CSI->getCoroSave());
+    if (auto *Save = CSI->getCoroSave())
+      markSuspendBlock(Save);
   }
 
   // Iterate propagating consumes and kills until they stop changing.
@@ -261,11 +281,13 @@ SuspendCrossingInfo::SuspendCrossingInfo
 // We build up the list of spills for every case where a use is separated
 // from the definition by a suspend point.
 
+static const unsigned InvalidFieldIndex = ~0U;
+
 namespace {
 class Spill {
   Value *Def = nullptr;
   Instruction *User = nullptr;
-  unsigned FieldNo = 0;
+  unsigned FieldNo = InvalidFieldIndex;
 
 public:
   Spill(Value *Def, llvm::User *U) : Def(Def), User(cast<Instruction>(U)) {}
@@ -280,11 +302,11 @@ public:
   // the definition the first time they encounter it. Consider refactoring
   // SpillInfo into two arrays to normalize the spill representation.
   unsigned fieldIndex() const {
-    assert(FieldNo && "Accessing unassigned field");
+    assert(FieldNo != InvalidFieldIndex && "Accessing unassigned field");
     return FieldNo;
   }
   void setFieldIndex(unsigned FieldNumber) {
-    assert(!FieldNo && "Reassigning field number");
+    assert(FieldNo == InvalidFieldIndex && "Reassigning field number");
     FieldNo = FieldNumber;
   }
 };
@@ -376,18 +398,30 @@ static StructType *buildFrameType(Functi
   SmallString<32> Name(F.getName());
   Name.append(".Frame");
   StructType *FrameTy = StructType::create(C, Name);
-  auto *FramePtrTy = FrameTy->getPointerTo();
-  auto *FnTy = FunctionType::get(Type::getVoidTy(C), FramePtrTy,
-                                 /*isVarArg=*/false);
-  auto *FnPtrTy = FnTy->getPointerTo();
-
-  // Figure out how wide should be an integer type storing the suspend index.
-  unsigned IndexBits = std::max(1U, Log2_64_Ceil(Shape.CoroSuspends.size()));
-  Type *PromiseType = Shape.PromiseAlloca
-                          ? Shape.PromiseAlloca->getType()->getElementType()
-                          : Type::getInt1Ty(C);
-  SmallVector<Type *, 8> Types{FnPtrTy, FnPtrTy, PromiseType,
-                               Type::getIntNTy(C, IndexBits)};
+  SmallVector<Type *, 8> Types;
+
+  AllocaInst *PromiseAlloca = Shape.getPromiseAlloca();
+
+  if (Shape.ABI == coro::ABI::Switch) {
+    auto *FramePtrTy = FrameTy->getPointerTo();
+    auto *FnTy = FunctionType::get(Type::getVoidTy(C), FramePtrTy,
+                                   /*IsVarArg=*/false);
+    auto *FnPtrTy = FnTy->getPointerTo();
+
+    // Figure out how wide should be an integer type storing the suspend index.
+    unsigned IndexBits = std::max(1U, Log2_64_Ceil(Shape.CoroSuspends.size()));
+    Type *PromiseType = PromiseAlloca
+                            ? PromiseAlloca->getType()->getElementType()
+                            : Type::getInt1Ty(C);
+    Type *IndexType = Type::getIntNTy(C, IndexBits);
+    Types.push_back(FnPtrTy);
+    Types.push_back(FnPtrTy);
+    Types.push_back(PromiseType);
+    Types.push_back(IndexType);
+  } else {
+    assert(PromiseAlloca == nullptr && "lowering doesn't support promises");
+  }
+
   Value *CurrentDef = nullptr;
 
   Padder.addTypes(Types);
@@ -399,7 +433,7 @@ static StructType *buildFrameType(Functi
 
     CurrentDef = S.def();
     // PromiseAlloca was already added to Types array earlier.
-    if (CurrentDef == Shape.PromiseAlloca)
+    if (CurrentDef == PromiseAlloca)
       continue;
 
     uint64_t Count = 1;
@@ -430,6 +464,22 @@ static StructType *buildFrameType(Functi
   }
   FrameTy->setBody(Types);
 
+  switch (Shape.ABI) {
+  case coro::ABI::Switch:
+    break;
+
+  // Remember whether the frame is inline in the storage.
+  case coro::ABI::Retcon:
+  case coro::ABI::RetconOnce: {
+    auto &Layout = F.getParent()->getDataLayout();
+    auto Id = Shape.getRetconCoroId();
+    Shape.RetconLowering.IsFrameInlineInStorage
+      = (Layout.getTypeAllocSize(FrameTy) <= Id->getStorageSize() &&
+         Layout.getABITypeAlignment(FrameTy) <= Id->getStorageAlignment());
+    break;
+  }
+  }
+
   return FrameTy;
 }
 
@@ -476,7 +526,7 @@ static Instruction *splitBeforeCatchSwit
 //    whatever
 //
 //
-static Instruction *insertSpills(SpillInfo &Spills, coro::Shape &Shape) {
+static Instruction *insertSpills(const SpillInfo &Spills, coro::Shape &Shape) {
   auto *CB = Shape.CoroBegin;
   LLVMContext &C = CB->getContext();
   IRBuilder<> Builder(CB->getNextNode());
@@ -488,7 +538,9 @@ static Instruction *insertSpills(SpillIn
   Value *CurrentValue = nullptr;
   BasicBlock *CurrentBlock = nullptr;
   Value *CurrentReload = nullptr;
-  unsigned Index = 0; // Proper field number will be read from field definition.
+
+  // Proper field number will be read from field definition.
+  unsigned Index = InvalidFieldIndex;
 
   // We need to keep track of any allocas that need "spilling"
   // since they will live in the coroutine frame now, all access to them
@@ -496,9 +548,11 @@ static Instruction *insertSpills(SpillIn
   // we remember allocas and their indices to be handled once we have
   // processed all the spills.
   SmallVector<std::pair<AllocaInst *, unsigned>, 4> Allocas;
-  // Promise alloca (if present) has a fixed field number (Shape::PromiseField)
-  if (Shape.PromiseAlloca)
-    Allocas.emplace_back(Shape.PromiseAlloca, coro::Shape::PromiseField);
+  // Promise alloca (if present) has a fixed field number.
+  if (auto *PromiseAlloca = Shape.getPromiseAlloca()) {
+    assert(Shape.ABI == coro::ABI::Switch);
+    Allocas.emplace_back(PromiseAlloca, coro::Shape::SwitchFieldIndex::Promise);
+  }
 
   // Create a GEP with the given index into the coroutine frame for the original
   // value Orig. Appends an extra 0 index for array-allocas, preserving the
@@ -526,7 +580,7 @@ static Instruction *insertSpills(SpillIn
   // Create a load instruction to reload the spilled value from the coroutine
   // frame.
   auto CreateReload = [&](Instruction *InsertBefore) {
-    assert(Index && "accessing unassigned field number");
+    assert(Index != InvalidFieldIndex && "accessing unassigned field number");
     Builder.SetInsertPoint(InsertBefore);
 
     auto *G = GetFramePointer(Index, CurrentValue);
@@ -558,23 +612,33 @@ static Instruction *insertSpills(SpillIn
         // coroutine frame.
 
         Instruction *InsertPt = nullptr;
-        if (isa<Argument>(CurrentValue)) {
+        if (auto Arg = dyn_cast<Argument>(CurrentValue)) {
           // For arguments, we will place the store instruction right after
           // the coroutine frame pointer instruction, i.e. bitcast of
           // coro.begin from i8* to %f.frame*.
           InsertPt = FramePtr->getNextNode();
+
+          // If we're spilling an Argument, make sure we clear 'nocapture'
+          // from the coroutine function.
+          Arg->getParent()->removeParamAttr(Arg->getArgNo(),
+                                            Attribute::NoCapture);
+
         } else if (auto *II = dyn_cast<InvokeInst>(CurrentValue)) {
           // If we are spilling the result of the invoke instruction, split the
           // normal edge and insert the spill in the new block.
           auto NewBB = SplitEdge(II->getParent(), II->getNormalDest());
           InsertPt = NewBB->getTerminator();
-        } else if (dyn_cast<PHINode>(CurrentValue)) {
+        } else if (isa<PHINode>(CurrentValue)) {
          // Skip PHI nodes and EH pad instructions.
           BasicBlock *DefBlock = cast<Instruction>(E.def())->getParent();
           if (auto *CSI = dyn_cast<CatchSwitchInst>(DefBlock->getTerminator()))
             InsertPt = splitBeforeCatchSwitch(CSI);
           else
             InsertPt = &*DefBlock->getFirstInsertionPt();
+        } else if (auto CSI = dyn_cast<AnyCoroSuspendInst>(CurrentValue)) {
+          // Don't spill immediately after a suspend; splitting assumes
+          // that the suspend will be followed by a branch.
+          InsertPt = CSI->getParent()->getSingleSuccessor()->getFirstNonPHI();
         } else {
           // For all other values, the spill is placed immediately after
           // the definition.
@@ -613,13 +677,14 @@ static Instruction *insertSpills(SpillIn
   }
 
   BasicBlock *FramePtrBB = FramePtr->getParent();
-  Shape.AllocaSpillBlock =
-      FramePtrBB->splitBasicBlock(FramePtr->getNextNode(), "AllocaSpillBB");
-  Shape.AllocaSpillBlock->splitBasicBlock(&Shape.AllocaSpillBlock->front(),
-                                          "PostSpill");
 
-  Builder.SetInsertPoint(&Shape.AllocaSpillBlock->front());
+  auto SpillBlock =
+    FramePtrBB->splitBasicBlock(FramePtr->getNextNode(), "AllocaSpillBB");
+  SpillBlock->splitBasicBlock(&SpillBlock->front(), "PostSpill");
+  Shape.AllocaSpillBlock = SpillBlock;
+
   // If we found any allocas, replace all of their remaining uses with Geps.
+  Builder.SetInsertPoint(&SpillBlock->front());
   for (auto &P : Allocas) {
     auto *G = GetFramePointer(P.second, P.first);
 
@@ -900,16 +965,17 @@ void coro::buildCoroutineFrame(Function
   // access to local variables.
   LowerDbgDeclare(F);
 
-  Shape.PromiseAlloca = Shape.CoroBegin->getId()->getPromise();
-  if (Shape.PromiseAlloca) {
-    Shape.CoroBegin->getId()->clearPromise();
+  if (Shape.ABI == coro::ABI::Switch &&
+      Shape.SwitchLowering.PromiseAlloca) {
+    Shape.getSwitchCoroId()->clearPromise();
   }
 
   // Make sure that all coro.save, coro.suspend and the fallthrough coro.end
   // intrinsics are in their own blocks to simplify the logic of building up
   // SuspendCrossing data.
-  for (CoroSuspendInst *CSI : Shape.CoroSuspends) {
-    splitAround(CSI->getCoroSave(), "CoroSave");
+  for (auto *CSI : Shape.CoroSuspends) {
+    if (auto *Save = CSI->getCoroSave())
+      splitAround(Save, "CoroSave");
     splitAround(CSI, "CoroSuspend");
   }
 
@@ -957,7 +1023,8 @@ void coro::buildCoroutineFrame(Function
       continue;
    // The coroutine promise is always included in the coroutine frame; no
    // need to check for suspend crossing.
-    if (Shape.PromiseAlloca == &I)
+    if (Shape.ABI == coro::ABI::Switch &&
+        Shape.SwitchLowering.PromiseAlloca == &I)
       continue;
 
     for (User *U : I.users())

Modified: llvm/trunk/lib/Transforms/Coroutines/CoroInstr.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Coroutines/CoroInstr.h?rev=368788&r1=368787&r2=368788&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Coroutines/CoroInstr.h (original)
+++ llvm/trunk/lib/Transforms/Coroutines/CoroInstr.h Tue Aug 13 20:53:17 2019
@@ -27,6 +27,7 @@
 
 #include "llvm/IR/GlobalVariable.h"
 #include "llvm/IR/IntrinsicInst.h"
+#include "llvm/Support/raw_ostream.h"
 
 namespace llvm {
 
@@ -77,10 +78,8 @@ public:
   }
 };
 
-/// This represents the llvm.coro.alloc instruction.
-class LLVM_LIBRARY_VISIBILITY CoroIdInst : public IntrinsicInst {
-  enum { AlignArg, PromiseArg, CoroutineArg, InfoArg };
-
+/// This represents a common base class for llvm.coro.id instructions.
+class LLVM_LIBRARY_VISIBILITY AnyCoroIdInst : public IntrinsicInst {
 public:
   CoroAllocInst *getCoroAlloc() {
     for (User *U : users())
@@ -97,6 +96,24 @@ public:
     llvm_unreachable("no coro.begin associated with coro.id");
   }
 
+  // Methods to support type inquiry through isa, cast, and dyn_cast:
+  static bool classof(const IntrinsicInst *I) {
+    auto ID = I->getIntrinsicID();
+    return ID == Intrinsic::coro_id ||
+           ID == Intrinsic::coro_id_retcon ||
+           ID == Intrinsic::coro_id_retcon_once;
+  }
+
+  static bool classof(const Value *V) {
+    return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
+  }
+};
+
+/// This represents the llvm.coro.id instruction.
+class LLVM_LIBRARY_VISIBILITY CoroIdInst : public AnyCoroIdInst {
+  enum { AlignArg, PromiseArg, CoroutineArg, InfoArg };
+
+public:
   AllocaInst *getPromise() const {
     Value *Arg = getArgOperand(PromiseArg);
     return isa<ConstantPointerNull>(Arg)
@@ -182,6 +199,80 @@ public:
   }
 };
 
+/// This represents either the llvm.coro.id.retcon or
+/// llvm.coro.id.retcon.once instruction.
+class LLVM_LIBRARY_VISIBILITY AnyCoroIdRetconInst : public AnyCoroIdInst {
+  enum { SizeArg, AlignArg, StorageArg, PrototypeArg, AllocArg, DeallocArg };
+
+public:
+  void checkWellFormed() const;
+
+  uint64_t getStorageSize() const {
+    return cast<ConstantInt>(getArgOperand(SizeArg))->getZExtValue();
+  }
+
+  uint64_t getStorageAlignment() const {
+    return cast<ConstantInt>(getArgOperand(AlignArg))->getZExtValue();
+  }
+
+  Value *getStorage() const {
+    return getArgOperand(StorageArg);
+  }
+
+  /// Return the prototype for the continuation function.  The type,
+  /// attributes, and calling convention of the continuation function(s)
+  /// are taken from this declaration.
+  Function *getPrototype() const {
+    return cast<Function>(getArgOperand(PrototypeArg)->stripPointerCasts());
+  }
+
+  /// Return the function to use for allocating memory.
+  Function *getAllocFunction() const {
+    return cast<Function>(getArgOperand(AllocArg)->stripPointerCasts());
+  }
+
+  /// Return the function to use for deallocating memory.
+  Function *getDeallocFunction() const {
+    return cast<Function>(getArgOperand(DeallocArg)->stripPointerCasts());
+  }
+
+  // Methods to support type inquiry through isa, cast, and dyn_cast:
+  static bool classof(const IntrinsicInst *I) {
+    auto ID = I->getIntrinsicID();
+    return ID == Intrinsic::coro_id_retcon
+        || ID == Intrinsic::coro_id_retcon_once;
+  }
+  static bool classof(const Value *V) {
+    return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
+  }
+};
+
+/// This represents the llvm.coro.id.retcon instruction.
+class LLVM_LIBRARY_VISIBILITY CoroIdRetconInst
+    : public AnyCoroIdRetconInst {
+public:
+  // Methods to support type inquiry through isa, cast, and dyn_cast:
+  static bool classof(const IntrinsicInst *I) {
+    return I->getIntrinsicID() == Intrinsic::coro_id_retcon;
+  }
+  static bool classof(const Value *V) {
+    return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
+  }
+};
+
+/// This represents the llvm.coro.id.retcon.once instruction.
+class LLVM_LIBRARY_VISIBILITY CoroIdRetconOnceInst
+    : public AnyCoroIdRetconInst {
+public:
+  // Methods to support type inquiry through isa, cast, and dyn_cast:
+  static bool classof(const IntrinsicInst *I) {
+    return I->getIntrinsicID() == Intrinsic::coro_id_retcon_once;
+  }
+  static bool classof(const Value *V) {
+    return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
+  }
+};
+
 /// This represents the llvm.coro.frame instruction.
 class LLVM_LIBRARY_VISIBILITY CoroFrameInst : public IntrinsicInst {
 public:
@@ -215,7 +306,9 @@ class LLVM_LIBRARY_VISIBILITY CoroBeginI
   enum { IdArg, MemArg };
 
 public:
-  CoroIdInst *getId() const { return cast<CoroIdInst>(getArgOperand(IdArg)); }
+  AnyCoroIdInst *getId() const {
+    return cast<AnyCoroIdInst>(getArgOperand(IdArg));
+  }
 
   Value *getMem() const { return getArgOperand(MemArg); }
 
@@ -261,8 +354,22 @@ public:
   }
 };
 
+/// A common base class for the llvm.coro.suspend family of intrinsics.
+class LLVM_LIBRARY_VISIBILITY AnyCoroSuspendInst : public IntrinsicInst {
+public:
+  CoroSaveInst *getCoroSave() const;
+
+  // Methods to support type inquiry through isa, cast, and dyn_cast:
+  static bool classof(const IntrinsicInst *I) {
+    return I->getIntrinsicID() == Intrinsic::coro_suspend ||
+           I->getIntrinsicID() == Intrinsic::coro_suspend_retcon;
+  }
+  static bool classof(const Value *V) {
+    return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
+  }
+};
+
 /// This represents the llvm.coro.suspend instruction.
-class LLVM_LIBRARY_VISIBILITY CoroSuspendInst : public IntrinsicInst {
+class LLVM_LIBRARY_VISIBILITY CoroSuspendInst : public AnyCoroSuspendInst {
   enum { SaveArg, FinalArg };
 
 public:
@@ -273,6 +380,7 @@ public:
     assert(isa<ConstantTokenNone>(Arg));
     return nullptr;
   }
+
   bool isFinal() const {
     return cast<Constant>(getArgOperand(FinalArg))->isOneValue();
   }
@@ -283,6 +391,37 @@ public:
   }
   static bool classof(const Value *V) {
     return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
+  }
+};
+
+inline CoroSaveInst *AnyCoroSuspendInst::getCoroSave() const {
+  if (auto Suspend = dyn_cast<CoroSuspendInst>(this))
+    return Suspend->getCoroSave();
+  return nullptr;
+}
+
+/// This represents the llvm.coro.suspend.retcon instruction.
+class LLVM_LIBRARY_VISIBILITY CoroSuspendRetconInst : public AnyCoroSuspendInst {
+public:
+  op_iterator value_begin() { return arg_begin(); }
+  const_op_iterator value_begin() const { return arg_begin(); }
+
+  op_iterator value_end() { return arg_end(); }
+  const_op_iterator value_end() const { return arg_end(); }
+
+  iterator_range<op_iterator> value_operands() {
+    return make_range(value_begin(), value_end());
+  }
+  iterator_range<const_op_iterator> value_operands() const {
+    return make_range(value_begin(), value_end());
+  }
+
+  // Methods to support type inquiry through isa, cast, and dyn_cast:
+  static bool classof(const IntrinsicInst *I) {
+    return I->getIntrinsicID() == Intrinsic::coro_suspend_retcon;
+  }
+  static bool classof(const Value *V) {
+    return isa<IntrinsicInst>(V) && classof(cast<IntrinsicInst>(V));
   }
 };
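The new `AnyCoroIdInst`/`AnyCoroIdRetconInst` hierarchy leans entirely on LLVM's `classof`-based type inquiry: each class declares which intrinsic IDs it may represent, and `isa<>`/`cast<>` consult that predicate. Here is a self-contained toy model of the pattern (LLVM's real classes key on `getIntrinsicID()`; the `Kind`-tag and toy `isa<>` below are illustrative only):

```cpp
// Minimal LLVM-style RTTI: a stored tag plus static classof predicates.
enum class IntrinsicID { CoroId, CoroIdRetcon, CoroIdRetconOnce, Other };

struct Inst {
  IntrinsicID ID;
  explicit Inst(IntrinsicID ID) : ID(ID) {}
};

// Matches any of the coro.id family, like AnyCoroIdInst in the patch.
struct AnyCoroIdInst : Inst {
  using Inst::Inst;
  static bool classof(const Inst *I) {
    return I->ID == IntrinsicID::CoroId ||
           I->ID == IntrinsicID::CoroIdRetcon ||
           I->ID == IntrinsicID::CoroIdRetconOnce;
  }
};

// Matches only the retcon variants, like AnyCoroIdRetconInst.
struct AnyCoroIdRetconInst : AnyCoroIdInst {
  using AnyCoroIdInst::AnyCoroIdInst;
  static bool classof(const Inst *I) {
    return I->ID == IntrinsicID::CoroIdRetcon ||
           I->ID == IntrinsicID::CoroIdRetconOnce;
  }
};

// Toy isa<>: dispatches to the target class's classof, as llvm::isa does.
template <typename To, typename From> bool isa(const From *V) {
  return To::classof(V);
}
```

The two-argument `classof(const Value *)` overloads in the patch exist so that `isa<>` works starting from a plain `Value*`, first checking it is an `IntrinsicInst` at all.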
 

Modified: llvm/trunk/lib/Transforms/Coroutines/CoroInternal.h
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Coroutines/CoroInternal.h?rev=368788&r1=368787&r2=368788&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Coroutines/CoroInternal.h (original)
+++ llvm/trunk/lib/Transforms/Coroutines/CoroInternal.h Tue Aug 13 20:53:17 2019
@@ -12,6 +12,7 @@
 #define LLVM_LIB_TRANSFORMS_COROUTINES_COROINTERNAL_H
 
 #include "CoroInstr.h"
+#include "llvm/IR/IRBuilder.h"
 #include "llvm/Transforms/Coroutines.h"
 
 namespace llvm {
@@ -61,37 +62,161 @@ struct LowererBase {
   Value *makeSubFnCall(Value *Arg, int Index, Instruction *InsertPt);
 };
 
+enum class ABI {
+  /// The "resume-switch" lowering, where there are separate resume and
+  /// destroy functions that are shared between all suspend points.  The
+  /// coroutine frame implicitly stores the resume and destroy functions,
+  /// the current index, and any promise value.
+  Switch,
+
+  /// The "returned-continuation" lowering, where each suspend point creates a
+  /// single continuation function that is used for both resuming and
+  /// destroying.  Does not support promises.
+  Retcon,
+
+  /// The "unique returned-continuation" lowering, where each suspend point
+  /// creates a single continuation function that is used for both resuming
+  /// and destroying.  Does not support promises.  The function is known to
+  /// suspend at most once during its execution, and the return value of
+  /// the continuation is void.
+  RetconOnce,
+};
+
 // Holds structural Coroutine Intrinsics for a particular function and other
 // values used during CoroSplit pass.
 struct LLVM_LIBRARY_VISIBILITY Shape {
   CoroBeginInst *CoroBegin;
   SmallVector<CoroEndInst *, 4> CoroEnds;
   SmallVector<CoroSizeInst *, 2> CoroSizes;
-  SmallVector<CoroSuspendInst *, 4> CoroSuspends;
+  SmallVector<AnyCoroSuspendInst *, 4> CoroSuspends;
 
-  // Field Indexes for known coroutine frame fields.
-  enum {
-    ResumeField,
-    DestroyField,
-    PromiseField,
-    IndexField,
+  // Field indexes for special fields in the switch lowering.
+  struct SwitchFieldIndex {
+    enum {
+      Resume,
+      Destroy,
+      Promise,
+      Index,
+      /// The index of the first spill field.
+      FirstSpill
+    };
   };
 
+  coro::ABI ABI;
+
   StructType *FrameTy;
   Instruction *FramePtr;
   BasicBlock *AllocaSpillBlock;
-  SwitchInst *ResumeSwitch;
-  AllocaInst *PromiseAlloca;
-  bool HasFinalSuspend;
+
+  struct SwitchLoweringStorage {
+    SwitchInst *ResumeSwitch;
+    AllocaInst *PromiseAlloca;
+    BasicBlock *ResumeEntryBlock;
+    bool HasFinalSuspend;
+  };
+
+  struct RetconLoweringStorage {
+    Function *ResumePrototype;
+    Function *Alloc;
+    Function *Dealloc;
+    BasicBlock *ReturnBlock;
+    bool IsFrameInlineInStorage;
+  };
+
+  union {
+    SwitchLoweringStorage SwitchLowering;
+    RetconLoweringStorage RetconLowering;
+  };
+
+  CoroIdInst *getSwitchCoroId() const {
+    assert(ABI == coro::ABI::Switch);
+    return cast<CoroIdInst>(CoroBegin->getId());
+  }
+
+  AnyCoroIdRetconInst *getRetconCoroId() const {
+    assert(ABI == coro::ABI::Retcon ||
+           ABI == coro::ABI::RetconOnce);
+    return cast<AnyCoroIdRetconInst>(CoroBegin->getId());
+  }
 
   IntegerType *getIndexType() const {
+    assert(ABI == coro::ABI::Switch);
     assert(FrameTy && "frame type not assigned");
-    return cast<IntegerType>(FrameTy->getElementType(IndexField));
+    return cast<IntegerType>(FrameTy->getElementType(SwitchFieldIndex::Index));
   }
   ConstantInt *getIndex(uint64_t Value) const {
     return ConstantInt::get(getIndexType(), Value);
   }
 
+  PointerType *getSwitchResumePointerType() const {
+    assert(ABI == coro::ABI::Switch);
+    assert(FrameTy && "frame type not assigned");
+    return cast<PointerType>(
+        FrameTy->getElementType(SwitchFieldIndex::Resume));
+  }
+
+  FunctionType *getResumeFunctionType() const {
+    switch (ABI) {
+    case coro::ABI::Switch: {
+      auto *FnPtrTy = getSwitchResumePointerType();
+      return cast<FunctionType>(FnPtrTy->getPointerElementType());
+    }
+    case coro::ABI::Retcon:
+    case coro::ABI::RetconOnce:
+      return RetconLowering.ResumePrototype->getFunctionType();
+    }
+  }
+
+  ArrayRef<Type*> getRetconResultTypes() const {
+    assert(ABI == coro::ABI::Retcon ||
+           ABI == coro::ABI::RetconOnce);
+    auto FTy = CoroBegin->getFunction()->getFunctionType();
+
+    // This is checked by AnyCoroIdRetconInst::checkWellFormed().
+    if (auto STy = dyn_cast<StructType>(FTy->getReturnType())) {
+      return STy->elements().slice(1);
+    } else {
+      return ArrayRef<Type*>();
+    }
+  }
+
+  CallingConv::ID getResumeFunctionCC() const {
+    switch (ABI) {
+    case coro::ABI::Switch:
+      return CallingConv::Fast;
+
+    case coro::ABI::Retcon:
+    case coro::ABI::RetconOnce:
+      return RetconLowering.ResumePrototype->getCallingConv();
+    }
+  }
+
+  unsigned getFirstSpillFieldIndex() const {
+    switch (ABI) {
+    case coro::ABI::Switch:
+      return SwitchFieldIndex::FirstSpill;
+
+    case coro::ABI::Retcon:
+    case coro::ABI::RetconOnce:
+      return 0;
+    }
+  }
+
+  AllocaInst *getPromiseAlloca() const {
+    if (ABI == coro::ABI::Switch)
+      return SwitchLowering.PromiseAlloca;
+    return nullptr;
+  }
+
+  /// Allocate memory according to the rules of the active lowering.
+  ///
+  /// \param CG - if non-null, will be updated for the new call
+  Value *emitAlloc(IRBuilder<> &Builder, Value *Size, CallGraph *CG) const;
+
+  /// Deallocate memory according to the rules of the active lowering.
+  ///
+  /// \param CG - if non-null, will be updated for the new call
+  void emitDealloc(IRBuilder<> &Builder, Value *Ptr, CallGraph *CG) const;
+
   Shape() = default;
   explicit Shape(Function &F) { buildFrom(F); }
   void buildFrom(Function &F);

Modified: llvm/trunk/lib/Transforms/Coroutines/CoroSplit.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Coroutines/CoroSplit.cpp?rev=368788&r1=368787&r2=368788&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Coroutines/CoroSplit.cpp (original)
+++ llvm/trunk/lib/Transforms/Coroutines/CoroSplit.cpp Tue Aug 13 20:53:17 2019
@@ -70,9 +70,193 @@ using namespace llvm;
 
 #define DEBUG_TYPE "coro-split"
 
+namespace {
+
+/// A little helper class for cloning a coroutine body into one of the
+/// functions produced by splitting it.
+class CoroCloner {
+public:
+  enum class Kind {
+    /// The shared resume function for a switch lowering.
+    SwitchResume,
+
+    /// The shared unwind function for a switch lowering.
+    SwitchUnwind,
+
+    /// The shared cleanup function for a switch lowering.
+    SwitchCleanup,
+
+    /// An individual continuation function.
+    Continuation,
+  };
+private:
+  Function &OrigF;
+  Function *NewF;
+  const Twine &Suffix;
+  coro::Shape &Shape;
+  Kind FKind;
+  ValueToValueMapTy VMap;
+  IRBuilder<> Builder;
+  Value *NewFramePtr = nullptr;
+
+  /// The active suspend instruction; meaningful only for continuation ABIs.
+  AnyCoroSuspendInst *ActiveSuspend = nullptr;
+
+public:
+  /// Create a cloner for a switch lowering.
+  CoroCloner(Function &OrigF, const Twine &Suffix, coro::Shape &Shape,
+             Kind FKind)
+    : OrigF(OrigF), NewF(nullptr), Suffix(Suffix), Shape(Shape),
+      FKind(FKind), Builder(OrigF.getContext()) {
+    assert(Shape.ABI == coro::ABI::Switch);
+  }
+
+  /// Create a cloner for a continuation lowering.
+  CoroCloner(Function &OrigF, const Twine &Suffix, coro::Shape &Shape,
+             Function *NewF, AnyCoroSuspendInst *ActiveSuspend)
+    : OrigF(OrigF), NewF(NewF), Suffix(Suffix), Shape(Shape),
+      FKind(Kind::Continuation), Builder(OrigF.getContext()),
+      ActiveSuspend(ActiveSuspend) {
+    assert(Shape.ABI == coro::ABI::Retcon ||
+           Shape.ABI == coro::ABI::RetconOnce);
+    assert(NewF && "need existing function for continuation");
+    assert(ActiveSuspend && "need active suspend point for continuation");
+  }
+
+  Function *getFunction() const {
+    assert(NewF != nullptr && "declaration not yet set");
+    return NewF;
+  }
+
+  void create();
+
+private:
+  bool isSwitchDestroyFunction() {
+    switch (FKind) {
+    case Kind::Continuation:
+    case Kind::SwitchResume:
+      return false;
+    case Kind::SwitchUnwind:
+    case Kind::SwitchCleanup:
+      return true;
+    }
+  }
+
+  void createDeclaration();
+  void replaceEntryBlock();
+  Value *deriveNewFramePointer();
+  void replaceCoroSuspends();
+  void replaceCoroEnds();
+  void handleFinalSuspend();
+  void maybeFreeContinuationStorage();
+};
+
+} // end anonymous namespace
+
+static void maybeFreeRetconStorage(IRBuilder<> &Builder, coro::Shape &Shape,
+                                   Value *FramePtr, CallGraph *CG) {
+  assert(Shape.ABI == coro::ABI::Retcon ||
+         Shape.ABI == coro::ABI::RetconOnce);
+  if (Shape.RetconLowering.IsFrameInlineInStorage)
+    return;
+
+  Shape.emitDealloc(Builder, FramePtr, CG);
+}
+
+/// Replace a non-unwind call to llvm.coro.end.
+static void replaceFallthroughCoroEnd(CoroEndInst *End, coro::Shape &Shape,
+                                      Value *FramePtr, bool InResume,
+                                      CallGraph *CG) {
+  // Start inserting right before the coro.end.
+  IRBuilder<> Builder(End);
+
+  // Create the return instruction.
+  switch (Shape.ABI) {
+  // The cloned functions in switch-lowering always return void.
+  case coro::ABI::Switch:
+    // coro.end doesn't immediately end the coroutine in the main function
+    // in this lowering, because we need to deallocate the coroutine.
+    if (!InResume)
+      return;
+    Builder.CreateRetVoid();
+    break;
+
+  // In unique continuation lowering, the continuations always return void.
+  // But we may have implicitly allocated storage.
+  case coro::ABI::RetconOnce:
+    maybeFreeRetconStorage(Builder, Shape, FramePtr, CG);
+    Builder.CreateRetVoid();
+    break;
+
+  // In non-unique continuation lowering, we signal completion by returning
+  // a null continuation.
+  case coro::ABI::Retcon: {
+    maybeFreeRetconStorage(Builder, Shape, FramePtr, CG);
+    auto RetTy = Shape.getResumeFunctionType()->getReturnType();
+    auto RetStructTy = dyn_cast<StructType>(RetTy);
+    PointerType *ContinuationTy =
+      cast<PointerType>(RetStructTy ? RetStructTy->getElementType(0) : RetTy);
+
+    Value *ReturnValue = ConstantPointerNull::get(ContinuationTy);
+    if (RetStructTy) {
+      ReturnValue = Builder.CreateInsertValue(UndefValue::get(RetStructTy),
+                                              ReturnValue, 0);
+    }
+    Builder.CreateRet(ReturnValue);
+    break;
+  }
+  }
+
+  // Remove the rest of the block, by splitting it into an unreachable block.
+  auto *BB = End->getParent();
+  BB->splitBasicBlock(End);
+  BB->getTerminator()->eraseFromParent();
+}
+
+/// Replace an unwind call to llvm.coro.end.
+static void replaceUnwindCoroEnd(CoroEndInst *End, coro::Shape &Shape,
+                                 Value *FramePtr, bool InResume, CallGraph *CG){
+  IRBuilder<> Builder(End);
+
+  switch (Shape.ABI) {
+  // In switch-lowering, this does nothing in the main function.
+  case coro::ABI::Switch:
+    if (!InResume)
+      return;
+    break;
+
+  // In continuation-lowering, this frees the continuation storage.
+  case coro::ABI::Retcon:
+  case coro::ABI::RetconOnce:
+    maybeFreeRetconStorage(Builder, Shape, FramePtr, CG);
+    break;
+  }
+
+  // If coro.end has an associated bundle, add cleanupret instruction.
+  if (auto Bundle = End->getOperandBundle(LLVMContext::OB_funclet)) {
+    auto *FromPad = cast<CleanupPadInst>(Bundle->Inputs[0]);
+    auto *CleanupRet = Builder.CreateCleanupRet(FromPad, nullptr);
+    End->getParent()->splitBasicBlock(End);
+    CleanupRet->getParent()->getTerminator()->eraseFromParent();
+  }
+}
+
+static void replaceCoroEnd(CoroEndInst *End, coro::Shape &Shape,
+                           Value *FramePtr, bool InResume, CallGraph *CG) {
+  if (End->isUnwind())
+    replaceUnwindCoroEnd(End, Shape, FramePtr, InResume, CG);
+  else
+    replaceFallthroughCoroEnd(End, Shape, FramePtr, InResume, CG);
+
+  auto &Context = End->getContext();
+  End->replaceAllUsesWith(InResume ? ConstantInt::getTrue(Context)
+                                   : ConstantInt::getFalse(Context));
+  End->eraseFromParent();
+}
+
 // Create an entry block for a resume function with a switch that will jump to
 // suspend points.
-static BasicBlock *createResumeEntryBlock(Function &F, coro::Shape &Shape) {
+static void createResumeEntryBlock(Function &F, coro::Shape &Shape) {
+  assert(Shape.ABI == coro::ABI::Switch);
   LLVMContext &C = F.getContext();
 
   // resume.entry:
@@ -91,15 +275,16 @@ static BasicBlock *createResumeEntryBloc
   IRBuilder<> Builder(NewEntry);
   auto *FramePtr = Shape.FramePtr;
   auto *FrameTy = Shape.FrameTy;
-  auto *GepIndex = Builder.CreateConstInBoundsGEP2_32(
-      FrameTy, FramePtr, 0, coro::Shape::IndexField, "index.addr");
+  auto *GepIndex = Builder.CreateStructGEP(
+      FrameTy, FramePtr, coro::Shape::SwitchFieldIndex::Index, "index.addr");
   auto *Index = Builder.CreateLoad(Shape.getIndexType(), GepIndex, "index");
   auto *Switch =
       Builder.CreateSwitch(Index, UnreachBB, Shape.CoroSuspends.size());
-  Shape.ResumeSwitch = Switch;
+  Shape.SwitchLowering.ResumeSwitch = Switch;
 
   size_t SuspendIndex = 0;
-  for (CoroSuspendInst *S : Shape.CoroSuspends) {
+  for (auto *AnyS : Shape.CoroSuspends) {
+    auto *S = cast<CoroSuspendInst>(AnyS);
     ConstantInt *IndexVal = Shape.getIndex(SuspendIndex);
 
     // Replace CoroSave with a store to Index:
@@ -109,14 +294,15 @@ static BasicBlock *createResumeEntryBloc
     Builder.SetInsertPoint(Save);
     if (S->isFinal()) {
       // Final suspend point is represented by storing zero in ResumeFnAddr.
-      auto *GepIndex = Builder.CreateConstInBoundsGEP2_32(FrameTy, FramePtr, 0,
-                                                          0, "ResumeFn.addr");
+      auto *GepIndex = Builder.CreateStructGEP(FrameTy, FramePtr,
+                                 coro::Shape::SwitchFieldIndex::Resume,
+                                  "ResumeFn.addr");
       auto *NullPtr = ConstantPointerNull::get(cast<PointerType>(
           cast<PointerType>(GepIndex->getType())->getElementType()));
       Builder.CreateStore(NullPtr, GepIndex);
     } else {
-      auto *GepIndex = Builder.CreateConstInBoundsGEP2_32(
-          FrameTy, FramePtr, 0, coro::Shape::IndexField, "index.addr");
+      auto *GepIndex = Builder.CreateStructGEP(
+          FrameTy, FramePtr, coro::Shape::SwitchFieldIndex::Index, "index.addr");
       Builder.CreateStore(IndexVal, GepIndex);
     }
     Save->replaceAllUsesWith(ConstantTokenNone::get(C));
@@ -164,48 +350,9 @@ static BasicBlock *createResumeEntryBloc
   Builder.SetInsertPoint(UnreachBB);
   Builder.CreateUnreachable();
 
-  return NewEntry;
+  Shape.SwitchLowering.ResumeEntryBlock = NewEntry;
 }
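Conceptually, the resume entry block built above gives the switch lowering this runtime shape: the frame carries a resume function pointer (nulled at the final suspend point, since that case is removed from the switch) and a suspend index that the entry block switches on. A hedged, self-contained model with hypothetical names:

```cpp
// Illustrative model only; the real frame layout is the FrameTy struct
// with SwitchFieldIndex::{Resume, Destroy, Promise, Index} fields.
struct Frame {
  void (*ResumeFn)(Frame *); // null once the final suspend is reached
  unsigned Index;            // which suspend point we are parked at
};

// The destroy clone's extra check from handleFinalSuspend(): a null
// ResumeFn means "parked at the final suspend", which has no switch case.
bool atFinalSuspend(const Frame &F) { return F.ResumeFn == nullptr; }

// The resume entry block: one case per non-final suspend point. The
// return values here are arbitrary stand-ins for "jump to this label".
unsigned dispatch(const Frame &F) {
  switch (F.Index) {
  case 0: return 100; // resume path for suspend point 0
  case 1: return 101; // resume path for suspend point 1
  default: return 0;  // unreachable in a well-formed lowering
  }
}
```

This is exactly the state the retcon ABIs avoid keeping: there, each suspend point returns its own continuation function directly, so no index or stored resume pointer is needed.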
 
-// In Resumers, we replace fallthrough coro.end with ret void and delete the
-// rest of the block.
-static void replaceFallthroughCoroEnd(IntrinsicInst *End,
-                                      ValueToValueMapTy &VMap) {
-  auto *NewE = cast<IntrinsicInst>(VMap[End]);
-  ReturnInst::Create(NewE->getContext(), nullptr, NewE);
-
-  // Remove the rest of the block, by splitting it into an unreachable block.
-  auto *BB = NewE->getParent();
-  BB->splitBasicBlock(NewE);
-  BB->getTerminator()->eraseFromParent();
-}
-
-// In Resumers, we replace unwind coro.end with True to force the immediate
-// unwind to caller.
-static void replaceUnwindCoroEnds(coro::Shape &Shape, ValueToValueMapTy &VMap) {
-  if (Shape.CoroEnds.empty())
-    return;
-
-  LLVMContext &Context = Shape.CoroEnds.front()->getContext();
-  auto *True = ConstantInt::getTrue(Context);
-  for (CoroEndInst *CE : Shape.CoroEnds) {
-    if (!CE->isUnwind())
-      continue;
-
-    auto *NewCE = cast<IntrinsicInst>(VMap[CE]);
-
-    // If coro.end has an associated bundle, add cleanupret instruction.
-    if (auto Bundle = NewCE->getOperandBundle(LLVMContext::OB_funclet)) {
-      Value *FromPad = Bundle->Inputs[0];
-      auto *CleanupRet = CleanupReturnInst::Create(FromPad, nullptr, NewCE);
-      NewCE->getParent()->splitBasicBlock(NewCE);
-      CleanupRet->getParent()->getTerminator()->eraseFromParent();
-    }
-
-    NewCE->replaceAllUsesWith(True);
-    NewCE->eraseFromParent();
-  }
-}
 
 // Rewrite final suspend point handling. We do not use suspend index to
 // represent the final suspend point. Instead we zero-out ResumeFnAddr in the
@@ -216,83 +363,242 @@ static void replaceUnwindCoroEnds(coro::
 // In the destroy function, we add a code sequence to check if ResumeFnAddress
 // is Null, and if so, jump to the appropriate label to handle cleanup from the
 // final suspend point.
-static void handleFinalSuspend(IRBuilder<> &Builder, Value *FramePtr,
-                               coro::Shape &Shape, SwitchInst *Switch,
-                               bool IsDestroy) {
-  assert(Shape.HasFinalSuspend);
+void CoroCloner::handleFinalSuspend() {
+  assert(Shape.ABI == coro::ABI::Switch &&
+         Shape.SwitchLowering.HasFinalSuspend);
+  auto *Switch = cast<SwitchInst>(VMap[Shape.SwitchLowering.ResumeSwitch]);
   auto FinalCaseIt = std::prev(Switch->case_end());
   BasicBlock *ResumeBB = FinalCaseIt->getCaseSuccessor();
   Switch->removeCase(FinalCaseIt);
-  if (IsDestroy) {
+  if (isSwitchDestroyFunction()) {
     BasicBlock *OldSwitchBB = Switch->getParent();
     auto *NewSwitchBB = OldSwitchBB->splitBasicBlock(Switch, "Switch");
     Builder.SetInsertPoint(OldSwitchBB->getTerminator());
-    auto *GepIndex = Builder.CreateConstInBoundsGEP2_32(Shape.FrameTy, FramePtr,
-                                                        0, 0, "ResumeFn.addr");
-    auto *Load = Builder.CreateLoad(
-        Shape.FrameTy->getElementType(coro::Shape::ResumeField), GepIndex);
-    auto *NullPtr =
-        ConstantPointerNull::get(cast<PointerType>(Load->getType()));
-    auto *Cond = Builder.CreateICmpEQ(Load, NullPtr);
+    auto *GepIndex = Builder.CreateStructGEP(Shape.FrameTy, NewFramePtr,
+                                       coro::Shape::SwitchFieldIndex::Resume,
+                                             "ResumeFn.addr");
+    auto *Load = Builder.CreateLoad(Shape.getSwitchResumePointerType(),
+                                    GepIndex);
+    auto *Cond = Builder.CreateIsNull(Load);
     Builder.CreateCondBr(Cond, ResumeBB, NewSwitchBB);
     OldSwitchBB->getTerminator()->eraseFromParent();
   }
 }
 
-// Create a resume clone by cloning the body of the original function, setting
-// new entry block and replacing coro.suspend an appropriate value to force
-// resume or cleanup pass for every suspend point.
-static Function *createClone(Function &F, Twine Suffix, coro::Shape &Shape,
-                             BasicBlock *ResumeEntry, int8_t FnIndex) {
-  Module *M = F.getParent();
-  auto *FrameTy = Shape.FrameTy;
-  auto *FnPtrTy = cast<PointerType>(FrameTy->getElementType(0));
-  auto *FnTy = cast<FunctionType>(FnPtrTy->getElementType());
+static Function *createCloneDeclaration(Function &OrigF, coro::Shape &Shape,
+                                        const Twine &Suffix,
+                                        Module::iterator InsertBefore) {
+  Module *M = OrigF.getParent();
+  auto *FnTy = Shape.getResumeFunctionType();
 
   Function *NewF =
-      Function::Create(FnTy, GlobalValue::LinkageTypes::ExternalLinkage,
-                       F.getName() + Suffix, M);
+      Function::Create(FnTy, GlobalValue::LinkageTypes::InternalLinkage,
+                       OrigF.getName() + Suffix);
   NewF->addParamAttr(0, Attribute::NonNull);
   NewF->addParamAttr(0, Attribute::NoAlias);
 
-  ValueToValueMapTy VMap;
+  M->getFunctionList().insert(InsertBefore, NewF);
+
+  return NewF;
+}
+
+void CoroCloner::replaceCoroSuspends() {
+  Value *SuspendResult;
+
+  switch (Shape.ABI) {
+  // In switch lowering, replace coro.suspend with the appropriate value
+  // for the type of function we're extracting.
+  // Replacing coro.suspend with (0) will result in control flow proceeding to
+  // a resume label associated with a suspend point, replacing it with (1) will
+  // result in control flow proceeding to a cleanup label associated with this
+  // suspend point.
+  case coro::ABI::Switch:
+    SuspendResult = Builder.getInt8(isSwitchDestroyFunction() ? 1 : 0);
+    break;
+
+  // In continuation lowering, replace all of the suspend uses with false to
+  // indicate that they're not unwinding resumes.  We've already mapped the
+  // active suspend to the appropriate argument, so any other suspend values
+  // that are still being used must be from previous suspends.  It's UB to try
+  // to suspend during unwind, so they must be from regular resumes.
+  case coro::ABI::RetconOnce:
+  case coro::ABI::Retcon:
+    SuspendResult = Builder.getInt1(false);
+    break;
+  }
+
+  for (AnyCoroSuspendInst *CS : Shape.CoroSuspends) {
+    // The active suspend was handled earlier.
+    if (CS == ActiveSuspend) continue;
+
+    auto *MappedCS = cast<AnyCoroSuspendInst>(VMap[CS]);
+    MappedCS->replaceAllUsesWith(SuspendResult);
+    MappedCS->eraseFromParent();
+  }
+}
+
+void CoroCloner::replaceCoroEnds() {
+  for (CoroEndInst *CE : Shape.CoroEnds) {
+    // We use a null call graph because there's no call graph node for
+    // the cloned function yet.  We'll just be rebuilding that later.
+    auto NewCE = cast<CoroEndInst>(VMap[CE]);
+    replaceCoroEnd(NewCE, Shape, NewFramePtr, /*in resume*/ true, nullptr);
+  }
+}
+
+void CoroCloner::replaceEntryBlock() {
+  // In the original function, the AllocaSpillBlock is a block immediately
+  // following the allocation of the frame object which defines GEPs for
+  // all the allocas that have been moved into the frame, and it ends by
+  // branching to the original beginning of the coroutine.  Make this
+  // the entry block of the cloned function.
+  auto *Entry = cast<BasicBlock>(VMap[Shape.AllocaSpillBlock]);
+  Entry->setName("entry" + Suffix);
+  Entry->moveBefore(&NewF->getEntryBlock());
+  Entry->getTerminator()->eraseFromParent();
+
+  // Clear all predecessors of the new entry block.  There should be
+  // exactly one predecessor, which we created when splitting out
+  // AllocaSpillBlock to begin with.
+  assert(Entry->hasOneUse());
+  auto BranchToEntry = cast<BranchInst>(Entry->user_back());
+  assert(BranchToEntry->isUnconditional());
+  Builder.SetInsertPoint(BranchToEntry);
+  Builder.CreateUnreachable();
+  BranchToEntry->eraseFromParent();
+
+  // TODO: move any allocas into Entry that weren't moved into the frame.
+  // (Currently we move all allocas into the frame.)
+
+  // Branch from the entry to the appropriate place.
+  Builder.SetInsertPoint(Entry);
+  switch (Shape.ABI) {
+  case coro::ABI::Switch: {
+    // In switch-lowering, we built a resume-entry block in the original
+    // function.  Make the entry block branch to this.
+    auto *SwitchBB =
+      cast<BasicBlock>(VMap[Shape.SwitchLowering.ResumeEntryBlock]);
+    Builder.CreateBr(SwitchBB);
+    break;
+  }
+
+  case coro::ABI::Retcon:
+  case coro::ABI::RetconOnce: {
+    // In continuation ABIs, we want to branch to immediately after the
+    // active suspend point.  Earlier phases will have put the suspend in its
+    // own basic block, so just thread our jump directly to its successor.
+    auto MappedCS = cast<CoroSuspendRetconInst>(VMap[ActiveSuspend]);
+    auto Branch = cast<BranchInst>(MappedCS->getNextNode());
+    assert(Branch->isUnconditional());
+    Builder.CreateBr(Branch->getSuccessor(0));
+    break;
+  }
+  }
+}
+
+/// Derive the value of the new frame pointer.
+Value *CoroCloner::deriveNewFramePointer() {
+  // Builder should be inserting to the front of the new entry block.
+
+  switch (Shape.ABI) {
+  // In switch-lowering, the argument is the frame pointer.
+  case coro::ABI::Switch:
+    return &*NewF->arg_begin();
+
+  // In continuation-lowering, the argument is the opaque storage.
+  case coro::ABI::Retcon:
+  case coro::ABI::RetconOnce: {
+    Argument *NewStorage = &*NewF->arg_begin();
+    auto FramePtrTy = Shape.FrameTy->getPointerTo();
+
+    // If the frame is inline in the storage, bitcast it to the frame type.
+    if (Shape.RetconLowering.IsFrameInlineInStorage)
+      return Builder.CreateBitCast(NewStorage, FramePtrTy);
+
+    // Otherwise, load the real frame from the opaque storage.
+    auto FramePtrPtr =
+      Builder.CreateBitCast(NewStorage, FramePtrTy->getPointerTo());
+    return Builder.CreateLoad(FramePtrPtr);
+  }
+  }
+  llvm_unreachable("bad ABI");
+}
+
+/// Clone the body of the original function into a resume function of
+/// some sort.
+void CoroCloner::create() {
+  // Create the new function if we don't already have one.
+  if (!NewF) {
+    NewF = createCloneDeclaration(OrigF, Shape, Suffix,
+                                  OrigF.getParent()->end());
+  }
+
   // Replace all args with undefs. The buildCoroutineFrame algorithm has
   // already rewritten accesses to the args that occur after suspend points
   // with loads and stores to/from the coroutine frame.
-  for (Argument &A : F.args())
+  for (Argument &A : OrigF.args())
     VMap[&A] = UndefValue::get(A.getType());
 
   SmallVector<ReturnInst *, 4> Returns;
 
-  CloneFunctionInto(NewF, &F, VMap, /*ModuleLevelChanges=*/true, Returns);
-  NewF->setLinkage(GlobalValue::LinkageTypes::InternalLinkage);
+  CloneFunctionInto(NewF, &OrigF, VMap, /*ModuleLevelChanges=*/true, Returns);
 
-  // Remove old returns.
-  for (ReturnInst *Return : Returns)
-    changeToUnreachable(Return, /*UseLLVMTrap=*/false);
-
-  // Remove old return attributes.
-  NewF->removeAttributes(
-      AttributeList::ReturnIndex,
-      AttributeFuncs::typeIncompatible(NewF->getReturnType()));
+  auto &Context = NewF->getContext();
 
-  // Make AllocaSpillBlock the new entry block.
-  auto *SwitchBB = cast<BasicBlock>(VMap[ResumeEntry]);
-  auto *Entry = cast<BasicBlock>(VMap[Shape.AllocaSpillBlock]);
-  Entry->moveBefore(&NewF->getEntryBlock());
-  Entry->getTerminator()->eraseFromParent();
-  BranchInst::Create(SwitchBB, Entry);
-  Entry->setName("entry" + Suffix);
+  // Replace the attributes of the new function:
+  auto OrigAttrs = NewF->getAttributes();
+  auto NewAttrs = AttributeList();
+
+  switch (Shape.ABI) {
+  case coro::ABI::Switch:
+    // Bootstrap attributes by copying function attributes from the
+    // original function.  This should include optimization settings and so on.
+    NewAttrs = NewAttrs.addAttributes(Context, AttributeList::FunctionIndex,
+                                      OrigAttrs.getFnAttributes());
+    break;
+
+  case coro::ABI::Retcon:
+  case coro::ABI::RetconOnce:
+    // If we have a continuation prototype, just use its attributes,
+    // full-stop.
+    NewAttrs = Shape.RetconLowering.ResumePrototype->getAttributes();
+    break;
+  }
+
+  // Make the frame parameter nonnull and noalias.
+  NewAttrs = NewAttrs.addParamAttribute(Context, 0, Attribute::NonNull);
+  NewAttrs = NewAttrs.addParamAttribute(Context, 0, Attribute::NoAlias);
+
+  switch (Shape.ABI) {
+  // In these ABIs, the cloned functions always return 'void', and the
+  // existing return sites are meaningless.  Note that for unique
+  // continuations, this includes the returns associated with suspends;
+  // this is fine because we can't suspend twice.
+  case coro::ABI::Switch:
+  case coro::ABI::RetconOnce:
+    // Remove old returns.
+    for (ReturnInst *Return : Returns)
+      changeToUnreachable(Return, /*UseLLVMTrap=*/false);
+    break;
+
+  // With multi-suspend continuations, we'll already have eliminated the
+  // original returns and inserted returns before all the suspend points,
+  // so we want to leave any returns in place.
+  case coro::ABI::Retcon:
+    break;
+  }
 
-  // Clear all predecessors of the new entry block.
-  auto *Switch = cast<SwitchInst>(VMap[Shape.ResumeSwitch]);
-  Entry->replaceAllUsesWith(Switch->getDefaultDest());
+  NewF->setAttributes(NewAttrs);
+  NewF->setCallingConv(Shape.getResumeFunctionCC());
 
-  IRBuilder<> Builder(&NewF->getEntryBlock().front());
+  // Set up the new entry block.
+  replaceEntryBlock();
+
+  Builder.SetInsertPoint(&NewF->getEntryBlock().front());
+  NewFramePtr = deriveNewFramePointer();
 
   // Remap frame pointer.
-  Argument *NewFramePtr = &*NewF->arg_begin();
-  Value *OldFramePtr = cast<Value>(VMap[Shape.FramePtr]);
+  Value *OldFramePtr = VMap[Shape.FramePtr];
   NewFramePtr->takeName(OldFramePtr);
   OldFramePtr->replaceAllUsesWith(NewFramePtr);
 
@@ -302,50 +608,55 @@ static Function *createClone(Function &F
   Value *OldVFrame = cast<Value>(VMap[Shape.CoroBegin]);
   OldVFrame->replaceAllUsesWith(NewVFrame);
 
-  // Rewrite final suspend handling as it is not done via switch (allows to
-  // remove final case from the switch, since it is undefined behavior to resume
-  // the coroutine suspended at the final suspend point.
-  if (Shape.HasFinalSuspend) {
-    auto *Switch = cast<SwitchInst>(VMap[Shape.ResumeSwitch]);
-    bool IsDestroy = FnIndex != 0;
-    handleFinalSuspend(Builder, NewFramePtr, Shape, Switch, IsDestroy);
+  switch (Shape.ABI) {
+  case coro::ABI::Switch:
+    // Rewrite final suspend handling, as it is not done via the switch
+    // (this allows us to remove the final case from the switch, since it
+    // is undefined behavior to resume a coroutine suspended at the final
+    // suspend point).
+    if (Shape.SwitchLowering.HasFinalSuspend)
+      handleFinalSuspend();
+    break;
+
+  case coro::ABI::Retcon:
+  case coro::ABI::RetconOnce:
+    // Replace the active suspend with the should-unwind argument.
+    // Coerce it to i1 if necessary.
+    assert(ActiveSuspend != nullptr &&
+           "no active suspend when lowering a continuation-style coroutine");
+    Value *ShouldUnwind = &*std::next(NewF->arg_begin());
+    if (!ShouldUnwind->getType()->isIntegerTy(1))
+      ShouldUnwind = Builder.CreateIsNotNull(ShouldUnwind);
+    VMap[ActiveSuspend]->replaceAllUsesWith(ShouldUnwind);
+    break;
   }
 
-  // Replace coro suspend with the appropriate resume index.
-  // Replacing coro.suspend with (0) will result in control flow proceeding to
-  // a resume label associated with a suspend point, replacing it with (1) will
-  // result in control flow proceeding to a cleanup label associated with this
-  // suspend point.
-  auto *NewValue = Builder.getInt8(FnIndex ? 1 : 0);
-  for (CoroSuspendInst *CS : Shape.CoroSuspends) {
-    auto *MappedCS = cast<CoroSuspendInst>(VMap[CS]);
-    MappedCS->replaceAllUsesWith(NewValue);
-    MappedCS->eraseFromParent();
-  }
+  // Handle suspends.
+  replaceCoroSuspends();
 
   // Remove coro.end intrinsics.
-  replaceFallthroughCoroEnd(Shape.CoroEnds.front(), VMap);
-  replaceUnwindCoroEnds(Shape, VMap);
+  replaceCoroEnds();
+
   // Eliminate coro.free from the clones, replacing it with 'null' in cleanup,
   // to suppress deallocation code.
-  coro::replaceCoroFree(cast<CoroIdInst>(VMap[Shape.CoroBegin->getId()]),
-                        /*Elide=*/FnIndex == 2);
-
-  NewF->setCallingConv(CallingConv::Fast);
-
-  return NewF;
+  if (Shape.ABI == coro::ABI::Switch)
+    coro::replaceCoroFree(cast<CoroIdInst>(VMap[Shape.CoroBegin->getId()]),
+                          /*Elide=*/ FKind == CoroCloner::Kind::SwitchCleanup);
 }
 
-static void removeCoroEnds(coro::Shape &Shape) {
-  if (Shape.CoroEnds.empty())
-    return;
-
-  LLVMContext &Context = Shape.CoroEnds.front()->getContext();
-  auto *False = ConstantInt::getFalse(Context);
+// Create a resume clone by cloning the body of the original function, setting
+// a new entry block, and replacing coro.suspend with an appropriate value to
+// force resumption or cleanup at every suspend point.
+static Function *createClone(Function &F, const Twine &Suffix,
+                             coro::Shape &Shape, CoroCloner::Kind FKind) {
+  CoroCloner Cloner(F, Suffix, Shape, FKind);
+  Cloner.create();
+  return Cloner.getFunction();
+}
 
-  for (CoroEndInst *CE : Shape.CoroEnds) {
-    CE->replaceAllUsesWith(False);
-    CE->eraseFromParent();
+/// Remove calls to llvm.coro.end in the original function.
+static void removeCoroEnds(coro::Shape &Shape, CallGraph *CG) {
+  for (auto End : Shape.CoroEnds) {
+    replaceCoroEnd(End, Shape, Shape.FramePtr, /*in resume*/ false, CG);
   }
 }
 
@@ -377,8 +688,12 @@ static void replaceFrameSize(coro::Shape
 //                    i8* bitcast([2 x void(%f.frame*)*] * @f.resumers to i8*))
 //
 // Assumes that all the functions have the same signature.
-static void setCoroInfo(Function &F, CoroBeginInst *CoroBegin,
-                        std::initializer_list<Function *> Fns) {
+static void setCoroInfo(Function &F, coro::Shape &Shape,
+                        ArrayRef<Function *> Fns) {
+  // This only works under the switch-lowering ABI because coro elision
+  // only works on the switch-lowering ABI.
+  assert(Shape.ABI == coro::ABI::Switch);
+
   SmallVector<Constant *, 4> Args(Fns.begin(), Fns.end());
   assert(!Args.empty());
   Function *Part = *Fns.begin();
@@ -393,29 +708,31 @@ static void setCoroInfo(Function &F, Cor
   // Update coro.begin instruction to refer to this constant.
   LLVMContext &C = F.getContext();
   auto *BC = ConstantExpr::getPointerCast(GV, Type::getInt8PtrTy(C));
-  CoroBegin->getId()->setInfo(BC);
+  Shape.getSwitchCoroId()->setInfo(BC);
 }
 
 // Store addresses of Resume/Destroy/Cleanup functions in the coroutine frame.
 static void updateCoroFrame(coro::Shape &Shape, Function *ResumeFn,
                             Function *DestroyFn, Function *CleanupFn) {
+  assert(Shape.ABI == coro::ABI::Switch);
+
   IRBuilder<> Builder(Shape.FramePtr->getNextNode());
-  auto *ResumeAddr = Builder.CreateConstInBoundsGEP2_32(
-      Shape.FrameTy, Shape.FramePtr, 0, coro::Shape::ResumeField,
+  auto *ResumeAddr = Builder.CreateStructGEP(
+      Shape.FrameTy, Shape.FramePtr, coro::Shape::SwitchFieldIndex::Resume,
       "resume.addr");
   Builder.CreateStore(ResumeFn, ResumeAddr);
 
   Value *DestroyOrCleanupFn = DestroyFn;
 
-  CoroIdInst *CoroId = Shape.CoroBegin->getId();
+  CoroIdInst *CoroId = Shape.getSwitchCoroId();
   if (CoroAllocInst *CA = CoroId->getCoroAlloc()) {
     // If there is a CoroAlloc and it returns false (meaning we elide the
     // allocation), use CleanupFn instead of DestroyFn.
     DestroyOrCleanupFn = Builder.CreateSelect(CA, DestroyFn, CleanupFn);
   }
 
-  auto *DestroyAddr = Builder.CreateConstInBoundsGEP2_32(
-      Shape.FrameTy, Shape.FramePtr, 0, coro::Shape::DestroyField,
+  auto *DestroyAddr = Builder.CreateStructGEP(
+      Shape.FrameTy, Shape.FramePtr, coro::Shape::SwitchFieldIndex::Destroy,
       "destroy.addr");
   Builder.CreateStore(DestroyOrCleanupFn, DestroyAddr);
 }
@@ -520,21 +837,34 @@ static void addMustTailToCoroResumes(Fun
 
 // Coroutine has no suspend points. Remove heap allocation for the coroutine
 // frame if possible.
-static void handleNoSuspendCoroutine(CoroBeginInst *CoroBegin, Type *FrameTy) {
+static void handleNoSuspendCoroutine(coro::Shape &Shape) {
+  auto *CoroBegin = Shape.CoroBegin;
   auto *CoroId = CoroBegin->getId();
   auto *AllocInst = CoroId->getCoroAlloc();
-  coro::replaceCoroFree(CoroId, /*Elide=*/AllocInst != nullptr);
-  if (AllocInst) {
-    IRBuilder<> Builder(AllocInst);
-    // FIXME: Need to handle overaligned members.
-    auto *Frame = Builder.CreateAlloca(FrameTy);
-    auto *VFrame = Builder.CreateBitCast(Frame, Builder.getInt8PtrTy());
-    AllocInst->replaceAllUsesWith(Builder.getFalse());
-    AllocInst->eraseFromParent();
-    CoroBegin->replaceAllUsesWith(VFrame);
-  } else {
-    CoroBegin->replaceAllUsesWith(CoroBegin->getMem());
+  switch (Shape.ABI) {
+  case coro::ABI::Switch: {
+    auto SwitchId = cast<CoroIdInst>(CoroId);
+    coro::replaceCoroFree(SwitchId, /*Elide=*/AllocInst != nullptr);
+    if (AllocInst) {
+      IRBuilder<> Builder(AllocInst);
+      // FIXME: Need to handle overaligned members.
+      auto *Frame = Builder.CreateAlloca(Shape.FrameTy);
+      auto *VFrame = Builder.CreateBitCast(Frame, Builder.getInt8PtrTy());
+      AllocInst->replaceAllUsesWith(Builder.getFalse());
+      AllocInst->eraseFromParent();
+      CoroBegin->replaceAllUsesWith(VFrame);
+    } else {
+      CoroBegin->replaceAllUsesWith(CoroBegin->getMem());
+    }
+    break;
+  }
+
+  case coro::ABI::Retcon:
+  case coro::ABI::RetconOnce:
+    CoroBegin->replaceAllUsesWith(UndefValue::get(CoroBegin->getType()));
+    break;
   }
+
   CoroBegin->eraseFromParent();
 }
 
@@ -670,12 +1000,16 @@ static bool simplifySuspendPoint(CoroSus
 
 // Remove suspend points that are simplified.
 static void simplifySuspendPoints(coro::Shape &Shape) {
+  // Currently, the only simplification we do is switch-lowering-specific.
+  if (Shape.ABI != coro::ABI::Switch)
+    return;
+
   auto &S = Shape.CoroSuspends;
   size_t I = 0, N = S.size();
   if (N == 0)
     return;
   while (true) {
-    if (simplifySuspendPoint(S[I], Shape.CoroBegin)) {
+    if (simplifySuspendPoint(cast<CoroSuspendInst>(S[I]), Shape.CoroBegin)) {
       if (--N == I)
         break;
       std::swap(S[I], S[N]);
@@ -776,6 +1110,180 @@ static void relocateInstructionBefore(Co
   }
 }
 
+static void splitSwitchCoroutine(Function &F, coro::Shape &Shape,
+                                 SmallVectorImpl<Function *> &Clones) {
+  assert(Shape.ABI == coro::ABI::Switch);
+
+  createResumeEntryBlock(F, Shape);
+  auto ResumeClone = createClone(F, ".resume", Shape,
+                                 CoroCloner::Kind::SwitchResume);
+  auto DestroyClone = createClone(F, ".destroy", Shape,
+                                  CoroCloner::Kind::SwitchUnwind);
+  auto CleanupClone = createClone(F, ".cleanup", Shape,
+                                  CoroCloner::Kind::SwitchCleanup);
+
+  postSplitCleanup(*ResumeClone);
+  postSplitCleanup(*DestroyClone);
+  postSplitCleanup(*CleanupClone);
+
+  addMustTailToCoroResumes(*ResumeClone);
+
+  // Store addresses of the resume/destroy/cleanup functions in the frame.
+  updateCoroFrame(Shape, ResumeClone, DestroyClone, CleanupClone);
+
+  assert(Clones.empty());
+  Clones.push_back(ResumeClone);
+  Clones.push_back(DestroyClone);
+  Clones.push_back(CleanupClone);
+
+  // Create a constant array referring to the resume/destroy/cleanup functions
+  // pointed to by the last argument of @llvm.coro.info, so that the CoroElide
+  // pass can determine the correct function to call.
+  setCoroInfo(F, Shape, Clones);
+}
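[Not part of the patch: the switch lowering that `splitSwitchCoroutine` implements can be modeled in a few lines of plain C++. All names here (`SwitchFrame`, `resumeImpl`, `lastResumed`) are hypothetical; this only illustrates the layout the code above manipulates. The frame header holds the resume and destroy function pointers plus a suspend index; the resume clone dispatches on the index, and the final suspend point is recognized by a nulled-out resume pointer rather than a switch case, matching `handleFinalSuspend` above.]

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the switch-lowering frame header, not LLVM API.
struct SwitchFrame {
  void (*ResumeFn)(SwitchFrame *);   // SwitchFieldIndex::Resume
  void (*DestroyFn)(SwitchFrame *);  // SwitchFieldIndex::Destroy
  uint8_t Index;                     // which suspend point to resume at
};

int lastResumed = -1;  // records which suspend point was resumed

void resumeImpl(SwitchFrame *f) {
  // A null resume pointer marks the final suspend point; resuming there
  // is undefined behavior, so the clone's dispatch never reaches it.
  if (!f->ResumeFn)
    return;
  switch (f->Index) {  // the resume-entry switch built by the pass
  case 0: lastResumed = 0; break;
  case 1: lastResumed = 1; break;
  }
}
```

The real pass clones this dispatch body three times (resume, destroy, cleanup) and stores the clone addresses into the header via `updateCoroFrame`.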
+
+static void splitRetconCoroutine(Function &F, coro::Shape &Shape,
+                                 SmallVectorImpl<Function *> &Clones) {
+  assert(Shape.ABI == coro::ABI::Retcon ||
+         Shape.ABI == coro::ABI::RetconOnce);
+  assert(Clones.empty());
+
+  // Reset various things that the optimizer might have decided it
+  // "knows" about the coroutine function due to not seeing a return.
+  F.removeFnAttr(Attribute::NoReturn);
+  F.removeAttribute(AttributeList::ReturnIndex, Attribute::NoAlias);
+  F.removeAttribute(AttributeList::ReturnIndex, Attribute::NonNull);
+
+  // Allocate the frame.
+  auto *Id = cast<AnyCoroIdRetconInst>(Shape.CoroBegin->getId());
+  Value *RawFramePtr;
+  if (Shape.RetconLowering.IsFrameInlineInStorage) {
+    RawFramePtr = Id->getStorage();
+  } else {
+    IRBuilder<> Builder(Id);
+
+    // Determine the size of the frame.
+    const DataLayout &DL = F.getParent()->getDataLayout();
+    auto Size = DL.getTypeAllocSize(Shape.FrameTy);
+
+    // Allocate.  We don't need to update the call graph node because we're
+    // going to recompute it from scratch after splitting.
+    RawFramePtr = Shape.emitAlloc(Builder, Builder.getInt64(Size), nullptr);
+    RawFramePtr =
+      Builder.CreateBitCast(RawFramePtr, Shape.CoroBegin->getType());
+
+    // Stash the allocated frame pointer in the continuation storage.
+    auto Dest = Builder.CreateBitCast(Id->getStorage(),
+                                      RawFramePtr->getType()->getPointerTo());
+    Builder.CreateStore(RawFramePtr, Dest);
+  }
+
+  // Map all uses of llvm.coro.begin to the allocated frame pointer.
+  {
+    // Make sure we don't invalidate Shape.FramePtr.
+    TrackingVH<Instruction> Handle(Shape.FramePtr);
+    Shape.CoroBegin->replaceAllUsesWith(RawFramePtr);
+    Shape.FramePtr = Handle.getValPtr();
+  }
+
+  // Create a unique return block.
+  BasicBlock *ReturnBB = nullptr;
+  SmallVector<PHINode *, 4> ReturnPHIs;
+
+  // Create all the functions in order after the main function.
+  auto NextF = std::next(F.getIterator());
+
+  // Create a continuation function for each of the suspend points.
+  Clones.reserve(Shape.CoroSuspends.size());
+  for (size_t i = 0, e = Shape.CoroSuspends.size(); i != e; ++i) {
+    auto Suspend = cast<CoroSuspendRetconInst>(Shape.CoroSuspends[i]);
+
+    // Create the clone declaration.
+    auto Continuation =
+      createCloneDeclaration(F, Shape, ".resume." + Twine(i), NextF);
+    Clones.push_back(Continuation);
+
+    // Insert a branch to the unified return block immediately before
+    // the suspend point.
+    auto SuspendBB = Suspend->getParent();
+    auto NewSuspendBB = SuspendBB->splitBasicBlock(Suspend);
+    auto Branch = cast<BranchInst>(SuspendBB->getTerminator());
+
+    // Create the unified return block.
+    if (!ReturnBB) {
+      // Place it before the first suspend.
+      ReturnBB = BasicBlock::Create(F.getContext(), "coro.return", &F,
+                                    NewSuspendBB);
+      Shape.RetconLowering.ReturnBlock = ReturnBB;
+
+      IRBuilder<> Builder(ReturnBB);
+
+      // Create PHIs for all the return values.
+      assert(ReturnPHIs.empty());
+
+      // First, the continuation.
+      ReturnPHIs.push_back(Builder.CreatePHI(Continuation->getType(),
+                                             Shape.CoroSuspends.size()));
+
+      // Next, all the directly-yielded values.
+      for (auto ResultTy : Shape.getRetconResultTypes())
+        ReturnPHIs.push_back(Builder.CreatePHI(ResultTy,
+                                               Shape.CoroSuspends.size()));
+
+      // Build the return value.
+      auto RetTy = F.getReturnType();
+
+      // Cast the continuation value if necessary.
+      // We can't rely on the types matching up because that type would
+      // have to be infinite.
+      auto CastedContinuationTy =
+        (ReturnPHIs.size() == 1 ? RetTy : RetTy->getStructElementType(0));
+      auto *CastedContinuation =
+        Builder.CreateBitCast(ReturnPHIs[0], CastedContinuationTy);
+
+      Value *RetV;
+      if (ReturnPHIs.size() == 1) {
+        RetV = CastedContinuation;
+      } else {
+        RetV = UndefValue::get(RetTy);
+        RetV = Builder.CreateInsertValue(RetV, CastedContinuation, 0);
+        for (size_t I = 1, E = ReturnPHIs.size(); I != E; ++I)
+          RetV = Builder.CreateInsertValue(RetV, ReturnPHIs[I], I);
+      }
+
+      Builder.CreateRet(RetV);
+    }
+
+    // Branch to the return block.
+    Branch->setSuccessor(0, ReturnBB);
+    ReturnPHIs[0]->addIncoming(Continuation, SuspendBB);
+    size_t NextPHIIndex = 1;
+    for (auto &VUse : Suspend->value_operands())
+      ReturnPHIs[NextPHIIndex++]->addIncoming(&*VUse, SuspendBB);
+    assert(NextPHIIndex == ReturnPHIs.size());
+  }
+
+  assert(Clones.size() == Shape.CoroSuspends.size());
+  for (size_t i = 0, e = Shape.CoroSuspends.size(); i != e; ++i) {
+    auto Suspend = Shape.CoroSuspends[i];
+    auto Clone = Clones[i];
+
+    CoroCloner(F, "resume." + Twine(i), Shape, Clone, Suspend).create();
+  }
+}
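[Not part of the patch: the "returned continuation" ABI that `splitRetconCoroutine` produces can be sketched as ordinary C++ functions. Names (`Frame`, `RampResult`, `start`, `resume1`) are hypothetical illustrations, not LLVM identifiers. Every suspend point begins its own continuation function, and each function returns the next continuation pointer directly to the caller alongside the yielded value, instead of storing it in the frame.]

```cpp
#include <cassert>

// Hypothetical model of the retcon ABI in plain C++, not LLVM IR.
struct Frame { int n; };
struct RampResult;

// A continuation takes the frame and a should-unwind flag; passing true
// asks it to destroy the frame instead of resuming normally.
using Continuation = RampResult (*)(Frame *, bool);

struct RampResult {
  Continuation next;  // null once the coroutine is done
  int value;          // directly-yielded value (no promise allocation)
};

RampResult resume1(Frame *f, bool unwind);

// The ramp runs to the first suspend and returns its continuation + value.
RampResult start(Frame *f) {
  f->n = 1;
  return {resume1, f->n};
}

RampResult resume1(Frame *f, bool unwind) {
  if (unwind)
    return {nullptr, 0};  // unwind path: tear down, yield nothing
  f->n += 1;
  return {nullptr, f->n};  // final result; no further continuation
}
```

This mirrors the unified return block built above: each suspend branches to one return that packages the continuation pointer with the yielded operands.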
+
+static void splitCoroutine(Function &F, coro::Shape &Shape,
+                           SmallVectorImpl<Function *> &Clones) {
+  switch (Shape.ABI) {
+  case coro::ABI::Switch:
+    return splitSwitchCoroutine(F, Shape, Clones);
+  case coro::ABI::Retcon:
+  case coro::ABI::RetconOnce:
+    return splitRetconCoroutine(F, Shape, Clones);
+  }
+  llvm_unreachable("bad ABI kind");
+}
+
 static void splitCoroutine(Function &F, CallGraph &CG, CallGraphSCC &SCC) {
   EliminateUnreachableBlocks(F);
 
@@ -788,41 +1296,21 @@ static void splitCoroutine(Function &F,
   buildCoroutineFrame(F, Shape);
   replaceFrameSize(Shape);
 
+  SmallVector<Function*, 4> Clones;
+
   // If there are no suspend points, no split is required; just remove
   // the allocation and deallocation blocks, as they are not needed.
   if (Shape.CoroSuspends.empty()) {
-    handleNoSuspendCoroutine(Shape.CoroBegin, Shape.FrameTy);
-    removeCoroEnds(Shape);
-    postSplitCleanup(F);
-    coro::updateCallGraph(F, {}, CG, SCC);
-    return;
+    handleNoSuspendCoroutine(Shape);
+  } else {
+    splitCoroutine(F, Shape, Clones);
   }
 
-  auto *ResumeEntry = createResumeEntryBlock(F, Shape);
-  auto ResumeClone = createClone(F, ".resume", Shape, ResumeEntry, 0);
-  auto DestroyClone = createClone(F, ".destroy", Shape, ResumeEntry, 1);
-  auto CleanupClone = createClone(F, ".cleanup", Shape, ResumeEntry, 2);
-
-  // We no longer need coro.end in F.
-  removeCoroEnds(Shape);
-
+  removeCoroEnds(Shape, &CG);
   postSplitCleanup(F);
-  postSplitCleanup(*ResumeClone);
-  postSplitCleanup(*DestroyClone);
-  postSplitCleanup(*CleanupClone);
-
-  addMustTailToCoroResumes(*ResumeClone);
-
-  // Store addresses resume/destroy/cleanup functions in the coroutine frame.
-  updateCoroFrame(Shape, ResumeClone, DestroyClone, CleanupClone);
-
-  // Create a constant array referring to resume/destroy/clone functions pointed
-  // by the last argument of @llvm.coro.info, so that CoroElide pass can
-  // determined correct function to call.
-  setCoroInfo(F, Shape.CoroBegin, {ResumeClone, DestroyClone, CleanupClone});
 
   // Update call graph and add the functions we created to the SCC.
-  coro::updateCallGraph(F, {ResumeClone, DestroyClone, CleanupClone}, CG, SCC);
+  coro::updateCallGraph(F, Clones, CG, SCC);
 }
 
 // When we see the coroutine the first time, we insert an indirect call to a
@@ -881,6 +1369,78 @@ static void createDevirtTriggerFunc(Call
   SCC.initialize(Nodes);
 }
 
+/// Replace a call to llvm.coro.prepare.retcon.
+static void replacePrepare(CallInst *Prepare, CallGraph &CG) {
+  auto CastFn = Prepare->getArgOperand(0); // as an i8*
+  auto Fn = CastFn->stripPointerCasts(); // as its original type
+
+  // Find call graph nodes for the preparation.
+  CallGraphNode *PrepareUserNode = nullptr, *FnNode = nullptr;
+  if (auto ConcreteFn = dyn_cast<Function>(Fn)) {
+    PrepareUserNode = CG[Prepare->getFunction()];
+    FnNode = CG[ConcreteFn];
+  }
+
+  // Attempt to peephole this pattern:
+  //    %0 = bitcast [[TYPE]] @some_function to i8*
+  //    %1 = call @llvm.coro.prepare.retcon(i8* %0)
+  //    %2 = bitcast %1 to [[TYPE]]
+  // ==>
+  //    %2 = @some_function
+  for (auto UI = Prepare->use_begin(), UE = Prepare->use_end();
+         UI != UE; ) {
+    // Look for bitcasts back to the original function type.
+    auto *Cast = dyn_cast<BitCastInst>((UI++)->getUser());
+    if (!Cast || Cast->getType() != Fn->getType()) continue;
+
+    // Check whether the replacement will introduce new direct calls.
+    // If so, we'll need to update the call graph.
+    if (PrepareUserNode) {
+      for (auto &Use : Cast->uses()) {
+        auto CS = CallSite(Use.getUser());
+        if (!CS || !CS.isCallee(&Use)) continue;
+        PrepareUserNode->removeCallEdgeFor(CS);
+        PrepareUserNode->addCalledFunction(CS, FnNode);
+      }
+    }
+
+    // Replace and remove the cast.
+    Cast->replaceAllUsesWith(Fn);
+    Cast->eraseFromParent();
+  }
+
+  // Replace any remaining uses with the function as an i8*.
+  // This can never directly be a callee, so we don't need to update CG.
+  Prepare->replaceAllUsesWith(CastFn);
+  Prepare->eraseFromParent();
+
+  // Kill dead bitcasts.
+  while (auto *Cast = dyn_cast<BitCastInst>(CastFn)) {
+    if (!Cast->use_empty()) break;
+    CastFn = Cast->getOperand(0);
+    Cast->eraseFromParent();
+  }
+}
+
+/// Remove calls to llvm.coro.prepare.retcon, a barrier meant to prevent
+/// IPO from operating on calls to a retcon coroutine before it's been
+/// split.  This is only safe to do after we've split all retcon
+/// coroutines in the module.  We can do this in this pass because
+/// this pass does promise to split all retcon coroutines (as opposed to
+/// switch coroutines, which are lowered in multiple stages).
+static bool replaceAllPrepares(Function *PrepareFn, CallGraph &CG) {
+  bool Changed = false;
+  for (auto PI = PrepareFn->use_begin(), PE = PrepareFn->use_end();
+         PI != PE; ) {
+    // Intrinsics can only be used in calls.
+    auto *Prepare = cast<CallInst>((PI++)->getUser());
+    replacePrepare(Prepare, CG);
+    Changed = true;
+  }
+
+  return Changed;
+}
+
 //===----------------------------------------------------------------------===//
 //                              Top Level Driver
 //===----------------------------------------------------------------------===//
@@ -899,7 +1459,9 @@ struct CoroSplit : public CallGraphSCCPa
   // A coroutine is identified by the presence of the coro.begin intrinsic;
   // if we don't have any, this pass has nothing to do.
   bool doInitialization(CallGraph &CG) override {
-    Run = coro::declaresIntrinsics(CG.getModule(), {"llvm.coro.begin"});
+    Run = coro::declaresIntrinsics(CG.getModule(),
+                                   {"llvm.coro.begin",
+                                    "llvm.coro.prepare.retcon"});
     return CallGraphSCCPass::doInitialization(CG);
   }
 
@@ -907,6 +1469,12 @@ struct CoroSplit : public CallGraphSCCPa
     if (!Run)
       return false;
 
+    // Check for uses of llvm.coro.prepare.retcon.
+    auto PrepareFn =
+      SCC.getCallGraph().getModule().getFunction("llvm.coro.prepare.retcon");
+    if (PrepareFn && PrepareFn->use_empty())
+      PrepareFn = nullptr;
+
     // Find coroutines for processing.
     SmallVector<Function *, 4> Coroutines;
     for (CallGraphNode *CGN : SCC)
@@ -914,12 +1482,17 @@ struct CoroSplit : public CallGraphSCCPa
         if (F->hasFnAttribute(CORO_PRESPLIT_ATTR))
           Coroutines.push_back(F);
 
-    if (Coroutines.empty())
+    if (Coroutines.empty() && !PrepareFn)
       return false;
 
     CallGraph &CG = getAnalysis<CallGraphWrapperPass>().getCallGraph();
+
+    if (Coroutines.empty())
+      return replaceAllPrepares(PrepareFn, CG);
+
     createDevirtTriggerFunc(CG, SCC);
 
+    // Split all the coroutines.
     for (Function *F : Coroutines) {
       Attribute Attr = F->getFnAttribute(CORO_PRESPLIT_ATTR);
       StringRef Value = Attr.getValueAsString();
@@ -932,6 +1505,10 @@ struct CoroSplit : public CallGraphSCCPa
       F->removeFnAttr(CORO_PRESPLIT_ATTR);
       splitCoroutine(*F, CG, SCC);
     }
+
+    if (PrepareFn)
+      replaceAllPrepares(PrepareFn, CG);
+
     return true;
   }
 

Modified: llvm/trunk/lib/Transforms/Coroutines/Coroutines.cpp
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/lib/Transforms/Coroutines/Coroutines.cpp?rev=368788&r1=368787&r2=368788&view=diff
==============================================================================
--- llvm/trunk/lib/Transforms/Coroutines/Coroutines.cpp (original)
+++ llvm/trunk/lib/Transforms/Coroutines/Coroutines.cpp Tue Aug 13 20:53:17 2019
@@ -123,12 +123,26 @@ Value *coro::LowererBase::makeSubFnCall(
 static bool isCoroutineIntrinsicName(StringRef Name) {
   // NOTE: Must be sorted!
   static const char *const CoroIntrinsics[] = {
-      "llvm.coro.alloc",   "llvm.coro.begin",   "llvm.coro.destroy",
-      "llvm.coro.done",    "llvm.coro.end",     "llvm.coro.frame",
-      "llvm.coro.free",    "llvm.coro.id",      "llvm.coro.noop",
-      "llvm.coro.param",   "llvm.coro.promise", "llvm.coro.resume",
-      "llvm.coro.save",    "llvm.coro.size",    "llvm.coro.subfn.addr",
+      "llvm.coro.alloc",
+      "llvm.coro.begin",
+      "llvm.coro.destroy",
+      "llvm.coro.done",
+      "llvm.coro.end",
+      "llvm.coro.frame",
+      "llvm.coro.free",
+      "llvm.coro.id",
+      "llvm.coro.id.retcon",
+      "llvm.coro.id.retcon.once",
+      "llvm.coro.noop",
+      "llvm.coro.param",
+      "llvm.coro.prepare.retcon",
+      "llvm.coro.promise",
+      "llvm.coro.resume",
+      "llvm.coro.save",
+      "llvm.coro.size",
+      "llvm.coro.subfn.addr",
       "llvm.coro.suspend",
+      "llvm.coro.suspend.retcon",
   };
   return Intrinsic::lookupLLVMIntrinsicByName(CoroIntrinsics, Name) != -1;
 }
@@ -217,9 +231,6 @@ static void clear(coro::Shape &Shape) {
   Shape.FrameTy = nullptr;
   Shape.FramePtr = nullptr;
   Shape.AllocaSpillBlock = nullptr;
-  Shape.ResumeSwitch = nullptr;
-  Shape.PromiseAlloca = nullptr;
-  Shape.HasFinalSuspend = false;
 }
 
 static CoroSaveInst *createCoroSave(CoroBeginInst *CoroBegin,
@@ -235,6 +246,7 @@ static CoroSaveInst *createCoroSave(Coro
 
 // Collect "interesting" coroutine intrinsics.
 void coro::Shape::buildFrom(Function &F) {
+  bool HasFinalSuspend = false;
   size_t FinalSuspendIndex = 0;
   clear(*this);
   SmallVector<CoroFrameInst *, 8> CoroFrames;
@@ -257,9 +269,15 @@ void coro::Shape::buildFrom(Function &F)
         if (II->use_empty())
           UnusedCoroSaves.push_back(cast<CoroSaveInst>(II));
         break;
-      case Intrinsic::coro_suspend:
-        CoroSuspends.push_back(cast<CoroSuspendInst>(II));
-        if (CoroSuspends.back()->isFinal()) {
+      case Intrinsic::coro_suspend_retcon: {
+        auto Suspend = cast<CoroSuspendRetconInst>(II);
+        CoroSuspends.push_back(Suspend);
+        break;
+      }
+      case Intrinsic::coro_suspend: {
+        auto Suspend = cast<CoroSuspendInst>(II);
+        CoroSuspends.push_back(Suspend);
+        if (Suspend->isFinal()) {
           if (HasFinalSuspend)
             report_fatal_error(
               "Only one suspend point can be marked as final");
@@ -267,18 +285,23 @@ void coro::Shape::buildFrom(Function &F)
           FinalSuspendIndex = CoroSuspends.size() - 1;
         }
         break;
+      }
       case Intrinsic::coro_begin: {
         auto CB = cast<CoroBeginInst>(II);
-        if (CB->getId()->getInfo().isPreSplit()) {
-          if (CoroBegin)
-            report_fatal_error(
+
+        // Ignore coro id's that aren't pre-split.
+        auto Id = dyn_cast<CoroIdInst>(CB->getId());
+        if (Id && !Id->getInfo().isPreSplit())
+          break;
+
+        if (CoroBegin)
+          report_fatal_error(
                 "coroutine should have exactly one defining @llvm.coro.begin");
-          CB->addAttribute(AttributeList::ReturnIndex, Attribute::NonNull);
-          CB->addAttribute(AttributeList::ReturnIndex, Attribute::NoAlias);
-          CB->removeAttribute(AttributeList::FunctionIndex,
-                              Attribute::NoDuplicate);
-          CoroBegin = CB;
-        }
+        CB->addAttribute(AttributeList::ReturnIndex, Attribute::NonNull);
+        CB->addAttribute(AttributeList::ReturnIndex, Attribute::NoAlias);
+        CB->removeAttribute(AttributeList::FunctionIndex,
+                            Attribute::NoDuplicate);
+        CoroBegin = CB;
         break;
       }
       case Intrinsic::coro_end:
@@ -310,7 +333,7 @@ void coro::Shape::buildFrom(Function &F)
 
     // Replace all coro.suspend with undef and remove related coro.saves if
     // present.
-    for (CoroSuspendInst *CS : CoroSuspends) {
+    for (AnyCoroSuspendInst *CS : CoroSuspends) {
       CS->replaceAllUsesWith(UndefValue::get(CS->getType()));
       CS->eraseFromParent();
       if (auto *CoroSave = CS->getCoroSave())
@@ -324,19 +347,87 @@ void coro::Shape::buildFrom(Function &F)
     return;
   }
 
+  auto Id = CoroBegin->getId();
+  switch (auto IdIntrinsic = Id->getIntrinsicID()) {
+  case Intrinsic::coro_id: {
+    auto SwitchId = cast<CoroIdInst>(Id);
+    this->ABI = coro::ABI::Switch;
+    this->SwitchLowering.HasFinalSuspend = HasFinalSuspend;
+    this->SwitchLowering.ResumeSwitch = nullptr;
+    this->SwitchLowering.PromiseAlloca = SwitchId->getPromise();
+    this->SwitchLowering.ResumeEntryBlock = nullptr;
+
+    for (auto AnySuspend : CoroSuspends) {
+      auto Suspend = dyn_cast<CoroSuspendInst>(AnySuspend);
+      if (!Suspend) {
+        AnySuspend->dump();
+        report_fatal_error("coro.id must be paired with coro.suspend");
+      }
+
+      if (!Suspend->getCoroSave())
+        createCoroSave(CoroBegin, Suspend);
+    }
+    break;
+  }
+
+  case Intrinsic::coro_id_retcon:
+  case Intrinsic::coro_id_retcon_once: {
+    auto ContinuationId = cast<AnyCoroIdRetconInst>(Id);
+    ContinuationId->checkWellFormed();
+    this->ABI = (IdIntrinsic == Intrinsic::coro_id_retcon
+                  ? coro::ABI::Retcon
+                  : coro::ABI::RetconOnce);
+    auto Prototype = ContinuationId->getPrototype();
+    this->RetconLowering.ResumePrototype = Prototype;
+    this->RetconLowering.Alloc = ContinuationId->getAllocFunction();
+    this->RetconLowering.Dealloc = ContinuationId->getDeallocFunction();
+    this->RetconLowering.ReturnBlock = nullptr;
+    this->RetconLowering.IsFrameInlineInStorage = false;
+
+    // Determine the result value types, and make sure they match up with
+    // the values passed to the suspends.
+    auto ResultTys = getRetconResultTypes();
+
+    for (auto AnySuspend : CoroSuspends) {
+      auto Suspend = dyn_cast<CoroSuspendRetconInst>(AnySuspend);
+      if (!Suspend) {
+        AnySuspend->dump();
+        report_fatal_error("coro.id.retcon.* must be paired with "
+                           "coro.suspend.retcon");
+      }
+
+      auto SI = Suspend->value_begin(), SE = Suspend->value_end();
+      auto RI = ResultTys.begin(), RE = ResultTys.end();
+      for (; SI != SE && RI != RE; ++SI, ++RI) {
+        if ((*SI)->getType() != *RI) {
+          Suspend->dump();
+          Prototype->getFunctionType()->dump();
+          report_fatal_error("argument to coro.suspend.retcon does not "
+                             "match corresponding prototype function result");
+        }
+      }
+      if (SI != SE || RI != RE) {
+        Suspend->dump();
+        Prototype->getFunctionType()->dump();
+        report_fatal_error("wrong number of arguments to coro.suspend.retcon");
+      }
+    }
+    break;
+  }
+
+  default:
+    llvm_unreachable("coro.begin is not dependent on a coro.id call");
+  }
+
   // The coro.free intrinsic is always lowered to the result of coro.begin.
   for (CoroFrameInst *CF : CoroFrames) {
     CF->replaceAllUsesWith(CoroBegin);
     CF->eraseFromParent();
   }
 
-  // Canonicalize coro.suspend by inserting a coro.save if needed.
-  for (CoroSuspendInst *CS : CoroSuspends)
-    if (!CS->getCoroSave())
-      createCoroSave(CoroBegin, CS);
-
   // Move final suspend to be the last element in the CoroSuspends vector.
-  if (HasFinalSuspend &&
+  if (ABI == coro::ABI::Switch &&
+      SwitchLowering.HasFinalSuspend &&
       FinalSuspendIndex != CoroSuspends.size() - 1)
     std::swap(CoroSuspends[FinalSuspendIndex], CoroSuspends.back());
 
@@ -345,6 +436,153 @@ void coro::Shape::buildFrom(Function &F)
     CoroSave->eraseFromParent();
 }
 
+static void propagateCallAttrsFromCallee(CallInst *Call, Function *Callee) {
+  Call->setCallingConv(Callee->getCallingConv());
+  // TODO: attributes?
+}
+
+static void addCallToCallGraph(CallGraph *CG, CallInst *Call, Function *Callee){
+  if (CG)
+    (*CG)[Call->getFunction()]->addCalledFunction(Call, (*CG)[Callee]);
+}
+
+Value *coro::Shape::emitAlloc(IRBuilder<> &Builder, Value *Size,
+                              CallGraph *CG) const {
+  switch (ABI) {
+  case coro::ABI::Switch:
+    llvm_unreachable("can't allocate memory in coro switch-lowering");
+
+  case coro::ABI::Retcon:
+  case coro::ABI::RetconOnce: {
+    auto Alloc = RetconLowering.Alloc;
+    Size = Builder.CreateIntCast(Size,
+                                 Alloc->getFunctionType()->getParamType(0),
+                                 /*is signed*/ false);
+    auto *Call = Builder.CreateCall(Alloc, Size);
+    propagateCallAttrsFromCallee(Call, Alloc);
+    addCallToCallGraph(CG, Call, Alloc);
+    return Call;
+  }
+  }
+}
+
+void coro::Shape::emitDealloc(IRBuilder<> &Builder, Value *Ptr,
+                              CallGraph *CG) const {
+  switch (ABI) {
+  case coro::ABI::Switch:
+    llvm_unreachable("can't deallocate memory in coro switch-lowering");
+
+  case coro::ABI::Retcon:
+  case coro::ABI::RetconOnce: {
+    auto Dealloc = RetconLowering.Dealloc;
+    Ptr = Builder.CreateBitCast(Ptr,
+                                Dealloc->getFunctionType()->getParamType(0));
+    auto *Call = Builder.CreateCall(Dealloc, Ptr);
+    propagateCallAttrsFromCallee(Call, Dealloc);
+    addCallToCallGraph(CG, Call, Dealloc);
+    return;
+  }
+  }
+}
+
+LLVM_ATTRIBUTE_NORETURN
+static void fail(const Instruction *I, const char *Reason, Value *V) {
+  I->dump();
+  if (V) {
+    errs() << "  Value: ";
+    V->printAsOperand(llvm::errs());
+    errs() << '\n';
+  }
+  report_fatal_error(Reason);
+}
+
+/// Check that the given value is a well-formed prototype for the
+/// llvm.coro.id.retcon.* intrinsics.
+static void checkWFRetconPrototype(const AnyCoroIdRetconInst *I, Value *V) {
+  auto F = dyn_cast<Function>(V->stripPointerCasts());
+  if (!F)
+    fail(I, "llvm.coro.retcon.* prototype not a Function", V);
+
+  auto FT = F->getFunctionType();
+
+  if (isa<CoroIdRetconInst>(I)) {
+    bool ResultOkay;
+    if (FT->getReturnType()->isPointerTy()) {
+      ResultOkay = true;
+    } else if (auto SRetTy = dyn_cast<StructType>(FT->getReturnType())) {
+      ResultOkay = (!SRetTy->isOpaque() &&
+                    SRetTy->getNumElements() > 0 &&
+                    SRetTy->getElementType(0)->isPointerTy());
+    } else {
+      ResultOkay = false;
+    }
+    if (!ResultOkay)
+      fail(I, "llvm.coro.retcon prototype must return pointer as first result",
+           F);
+
+    if (FT->getReturnType() !=
+          I->getFunction()->getFunctionType()->getReturnType())
+      fail(I, "llvm.coro.retcon.* prototype return type must be same as "
+              "current function return type", F);
+  } else {
+    // No meaningful validation to do here for llvm.coro.id.retcon.once.
+  }
+
+  if (FT->getNumParams() != 2)
+    fail(I, "llvm.coro.retcon.* prototype must take exactly two parameters", F);
+  if (!FT->getParamType(0)->isPointerTy())
+    fail(I, "llvm.coro.retcon.* prototype must take pointer as 1st param", F);
+  if (!FT->getParamType(1)->isIntegerTy()) // an i1, but not for ABI purposes
+    fail(I, "llvm.coro.retcon.* prototype must take integer as 2nd param", F);
+}
+
+/// Check that the given value is a well-formed allocator.
+static void checkWFAlloc(const Instruction *I, Value *V) {
+  auto F = dyn_cast<Function>(V->stripPointerCasts());
+  if (!F)
+    fail(I, "llvm.coro.* allocator not a Function", V);
+
+  auto FT = F->getFunctionType();
+  if (!FT->getReturnType()->isPointerTy())
+    fail(I, "llvm.coro.* allocator must return a pointer", F);
+
+  if (FT->getNumParams() != 1 ||
+      !FT->getParamType(0)->isIntegerTy())
+    fail(I, "llvm.coro.* allocator must take integer as only param", F);
+}
+
+/// Check that the given value is a well-formed deallocator.
+static void checkWFDealloc(const Instruction *I, Value *V) {
+  auto F = dyn_cast<Function>(V->stripPointerCasts());
+  if (!F)
+    fail(I, "llvm.coro.* deallocator not a Function", V);
+
+  auto FT = F->getFunctionType();
+  if (!FT->getReturnType()->isVoidTy())
+    fail(I, "llvm.coro.* deallocator must return void", F);
+
+  if (FT->getNumParams() != 1 ||
+      !FT->getParamType(0)->isPointerTy())
+    fail(I, "llvm.coro.* deallocator must take pointer as only param", F);
+}
+
+static void checkConstantInt(const Instruction *I, Value *V,
+                             const char *Reason) {
+  if (!isa<ConstantInt>(V)) {
+    fail(I, Reason, V);
+  }
+}
+
+void AnyCoroIdRetconInst::checkWellFormed() const {
+  checkConstantInt(this, getArgOperand(SizeArg),
+                   "size argument to coro.id.retcon.* must be constant");
+  checkConstantInt(this, getArgOperand(AlignArg),
+                   "alignment argument to coro.id.retcon.* must be constant");
+  checkWFRetconPrototype(this, getArgOperand(PrototypeArg));
+  checkWFAlloc(this, getArgOperand(AllocArg));
+  checkWFDealloc(this, getArgOperand(DeallocArg));
+}
+
 void LLVMAddCoroEarlyPass(LLVMPassManagerRef PM) {
   unwrap(PM)->add(createCoroEarlyPass());
 }

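The type-agreement loop in buildFrom zips the operands of each coro.suspend.retcon against the prototype's result types (after the leading continuation pointer) and reports a fatal error on any mismatch or length difference. The following is a rough C sketch of just that rule, not LLVM's code; the function name and the string-based type model are invented for illustration:

```c
#include <string.h>
#include <stddef.h>

/* Sketch of the agreement buildFrom enforces: values handed to
 * coro.suspend.retcon must match, in order and in number, the
 * prototype's result types after the continuation pointer.  Types are
 * modeled as plain strings; returns NULL when well-formed, else the
 * diagnostic text. */
const char *check_retcon_suspend(const char **suspend_tys, size_t nsuspend,
                                 const char **result_tys, size_t nresult) {
    size_t n = nsuspend < nresult ? nsuspend : nresult;
    for (size_t i = 0; i < n; i++)
        if (strcmp(suspend_tys[i], result_tys[i]) != 0)
            return "argument to coro.suspend.retcon does not "
                   "match corresponding prototype function result";
    if (nsuspend != nresult)
        return "wrong number of arguments to coro.suspend.retcon";
    return NULL; /* well-formed */
}
```

In the pass itself, failing either check calls report_fatal_error after dumping the suspend and the prototype's function type.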
Modified: llvm/trunk/test/Transforms/Coroutines/coro-debug.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Coroutines/coro-debug.ll?rev=368788&r1=368787&r2=368788&view=diff
==============================================================================
--- llvm/trunk/test/Transforms/Coroutines/coro-debug.ll (original)
+++ llvm/trunk/test/Transforms/Coroutines/coro-debug.ll Tue Aug 13 20:53:17 2019
@@ -127,9 +127,9 @@ attributes #7 = { noduplicate }
 !24 = !DILocation(line: 62, column: 3, scope: !6)
 
 ; CHECK: define i8* @f(i32 %x) #0 !dbg ![[ORIG:[0-9]+]]
-; CHECK: define internal fastcc void @f.resume(%f.Frame* %FramePtr) #0 !dbg ![[RESUME:[0-9]+]]
-; CHECK: define internal fastcc void @f.destroy(%f.Frame* %FramePtr) #0 !dbg ![[DESTROY:[0-9]+]]
-; CHECK: define internal fastcc void @f.cleanup(%f.Frame* %FramePtr) #0 !dbg ![[CLEANUP:[0-9]+]]
+; CHECK: define internal fastcc void @f.resume(%f.Frame* noalias nonnull %FramePtr) #0 !dbg ![[RESUME:[0-9]+]]
+; CHECK: define internal fastcc void @f.destroy(%f.Frame* noalias nonnull %FramePtr) #0 !dbg ![[DESTROY:[0-9]+]]
+; CHECK: define internal fastcc void @f.cleanup(%f.Frame* noalias nonnull %FramePtr) #0 !dbg ![[CLEANUP:[0-9]+]]
 
 ; CHECK: ![[ORIG]] = distinct !DISubprogram(name: "f", linkageName: "flink"
 ; CHECK: !DILocalVariable(name: "x", arg: 1, scope: ![[ORIG]]

Added: llvm/trunk/test/Transforms/Coroutines/coro-retcon-once-value.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Coroutines/coro-retcon-once-value.ll?rev=368788&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/Coroutines/coro-retcon-once-value.ll (added)
+++ llvm/trunk/test/Transforms/Coroutines/coro-retcon-once-value.ll Tue Aug 13 20:53:17 2019
@@ -0,0 +1,116 @@
+; RUN: opt < %s -enable-coroutines -O2 -S | FileCheck %s
+target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-apple-macosx10.12.0"
+
+define {i8*, i32} @f(i8* %buffer, i32* %array) {
+entry:
+  %id = call token @llvm.coro.id.retcon.once(i32 8, i32 8, i8* %buffer, i8* bitcast (void (i8*, i8)* @prototype to i8*), i8* bitcast (i8* (i32)* @allocate to i8*), i8* bitcast (void (i8*)* @deallocate to i8*))
+  %hdl = call i8* @llvm.coro.begin(token %id, i8* null)
+  %load = load i32, i32* %array
+  %load.pos = icmp sgt i32 %load, 0
+  br i1 %load.pos, label %pos, label %neg
+
+pos:
+  %unwind0 = call i1 (...) @llvm.coro.suspend.retcon(i32 %load)
+  br i1 %unwind0, label %cleanup, label %pos.cont
+
+pos.cont:
+  store i32 0, i32* %array, align 4
+  br label %cleanup
+
+neg:
+  %unwind1 = call i1 (...) @llvm.coro.suspend.retcon(i32 0)
+  br i1 %unwind1, label %cleanup, label %neg.cont
+
+neg.cont:
+  store i32 10, i32* %array, align 4
+  br label %cleanup
+
+cleanup:
+  call i1 @llvm.coro.end(i8* %hdl, i1 0)
+  unreachable
+}
+
+; CHECK-LABEL: define { i8*, i32 } @f(i8* %buffer, i32* %array)
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[T0:%.*]] = bitcast i8* %buffer to i32**
+; CHECK-NEXT:    store i32* %array, i32** [[T0]], align 8
+; CHECK-NEXT:    %load = load i32, i32* %array, align 4
+; CHECK-NEXT:    %load.pos = icmp sgt i32 %load, 0
+; CHECK-NEXT:    [[CONT:%.*]] = select i1 %load.pos, void (i8*, i8)* @f.resume.0, void (i8*, i8)* @f.resume.1
+; CHECK-NEXT:    [[VAL:%.*]] = select i1 %load.pos, i32 %load, i32 0
+; CHECK-NEXT:    [[CONT_CAST:%.*]] = bitcast void (i8*, i8)* [[CONT]] to i8*
+; CHECK-NEXT:    [[T0:%.*]] = insertvalue { i8*, i32 } undef, i8* [[CONT_CAST]], 0
+; CHECK-NEXT:    [[T1:%.*]] = insertvalue { i8*, i32 } [[T0]], i32 [[VAL]], 1
+; CHECK-NEXT:    ret { i8*, i32 } [[T1]]
+; CHECK-NEXT:  }
+
+; CHECK-LABEL: define internal void @f.resume.0(i8* noalias nonnull %0, i8 zeroext %1)
+; CHECK-NEXT:  :
+; CHECK-NEXT:    [[T0:%.*]] = icmp eq i8 %1, 0
+; CHECK-NEXT:    br i1 [[T0]],
+; CHECK:       :
+; CHECK-NEXT:    [[T0:%.*]] = bitcast i8* %0 to i32**
+; CHECK-NEXT:    [[RELOAD:%.*]] = load i32*, i32** [[T0]], align 8
+; CHECK-NEXT:    store i32 0, i32* [[RELOAD]], align 4
+; CHECK-NEXT:    br label
+; CHECK:       :
+; CHECK-NEXT:    ret void
+; CHECK-NEXT:  }
+
+; CHECK-LABEL: define internal void @f.resume.1(i8* noalias nonnull %0, i8 zeroext %1)
+; CHECK-NEXT:  :
+; CHECK-NEXT:    [[T0:%.*]] = icmp eq i8 %1, 0
+; CHECK-NEXT:    br i1 [[T0]],
+; CHECK:       :
+; CHECK-NEXT:    [[T0:%.*]] = bitcast i8* %0 to i32**
+; CHECK-NEXT:    [[RELOAD:%.*]] = load i32*, i32** [[T0]], align 8
+; CHECK-NEXT:    store i32 10, i32* [[RELOAD]], align 4
+; CHECK-NEXT:    br label
+; CHECK:       :
+; CHECK-NEXT:    ret void
+; CHECK-NEXT:  }
+
+define void @test(i32* %array) {
+entry:
+  %0 = alloca [8 x i8], align 8
+  %buffer = bitcast [8 x i8]* %0 to i8*
+  %prepare = call i8* @llvm.coro.prepare.retcon(i8* bitcast ({i8*, i32} (i8*, i32*)* @f to i8*))
+  %f = bitcast i8* %prepare to {i8*, i32} (i8*, i32*)*
+  %result = call {i8*, i32} %f(i8* %buffer, i32* %array)
+  %value = extractvalue {i8*, i32} %result, 1
+  call void @print(i32 %value)
+  %cont = extractvalue {i8*, i32} %result, 0
+  %cont.cast = bitcast i8* %cont to void (i8*, i8)*
+  call void %cont.cast(i8* %buffer, i8 zeroext 0)
+  ret void
+}
+
+;   Unfortunately, we don't fully optimize this right now due to a
+;   phase-ordering issue.
+; CHECK-LABEL: define void @test(i32* %array)
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[BUFFER:%.*]] = alloca i32*, align 8
+; CHECK-NEXT:    [[BUFFER_CAST:%.*]] = bitcast i32** [[BUFFER]] to i8*
+; CHECK-NEXT:    store i32* %array, i32** [[BUFFER]], align 8
+; CHECK-NEXT:    [[LOAD:%.*]] = load i32, i32* %array, align 4
+; CHECK-NEXT:    [[LOAD_POS:%.*]] = icmp sgt i32 [[LOAD]], 0
+; CHECK-NEXT:    [[CONT:%.*]] = select i1 [[LOAD_POS]], void (i8*, i8)* @f.resume.0, void (i8*, i8)* @f.resume.1
+; CHECK-NEXT:    [[VAL:%.*]] = select i1 [[LOAD_POS]], i32 [[LOAD]], i32 0
+; CHECK-NEXT:    call void @print(i32 [[VAL]])
+; CHECK-NEXT:    call void [[CONT]](i8* nonnull [[BUFFER_CAST]], i8 zeroext 0)
+; CHECK-NEXT:    ret void
+
+declare token @llvm.coro.id.retcon.once(i32, i32, i8*, i8*, i8*, i8*)
+declare i8* @llvm.coro.begin(token, i8*)
+declare i1 @llvm.coro.suspend.retcon(...)
+declare i1 @llvm.coro.end(i8*, i1)
+declare i8* @llvm.coro.prepare.retcon(i8*)
+
+declare void @prototype(i8*, i8 zeroext)
+
+declare noalias i8* @allocate(i32 %size)
+declare void @deallocate(i8* %ptr)
+
+declare void @print(i32)
+

Added: llvm/trunk/test/Transforms/Coroutines/coro-retcon-once-value2.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Coroutines/coro-retcon-once-value2.ll?rev=368788&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/Coroutines/coro-retcon-once-value2.ll (added)
+++ llvm/trunk/test/Transforms/Coroutines/coro-retcon-once-value2.ll Tue Aug 13 20:53:17 2019
@@ -0,0 +1,73 @@
+; RUN: opt < %s -coro-split -coro-cleanup -S | FileCheck %s
+target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-apple-macosx10.12.0"
+
+define {i8*, i32*} @f(i8* %buffer, i32* %ptr) "coroutine.presplit"="1" {
+entry:
+  %temp = alloca i32, align 4
+  %id = call token @llvm.coro.id.retcon.once(i32 8, i32 8, i8* %buffer, i8* bitcast (void (i8*, i8)* @prototype to i8*), i8* bitcast (i8* (i32)* @allocate to i8*), i8* bitcast (void (i8*)* @deallocate to i8*))
+  %hdl = call i8* @llvm.coro.begin(token %id, i8* null)
+  %oldvalue = load i32, i32* %ptr
+  store i32 %oldvalue, i32* %temp
+  %unwind = call i1 (...) @llvm.coro.suspend.retcon(i32* %temp)
+  br i1 %unwind, label %cleanup, label %cont
+
+cont:
+  %newvalue = load i32, i32* %temp
+  store i32 %newvalue, i32* %ptr
+  br label %cleanup
+
+cleanup:
+  call i1 @llvm.coro.end(i8* %hdl, i1 0)
+  unreachable
+}
+
+; CHECK-LABEL: define { i8*, i32* } @f(i8* %buffer, i32* %ptr)
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[ALLOC:%.*]] = call i8* @allocate(i32 16)
+; CHECK-NEXT:    [[T0:%.*]] = bitcast i8* %buffer to i8**
+; CHECK-NEXT:    store i8* [[ALLOC]], i8** [[T0]]
+; CHECK-NEXT:    [[FRAME:%.*]] = bitcast i8* [[ALLOC]] to [[FRAME_T:%.*]]*
+; CHECK-NEXT:    %temp = getelementptr inbounds [[FRAME_T]], [[FRAME_T]]* [[FRAME]], i32 0, i32 1
+; CHECK-NEXT:    [[SPILL:%.*]] = getelementptr inbounds [[FRAME_T]], [[FRAME_T]]* [[FRAME]], i32 0, i32 0
+; CHECK-NEXT:    store i32* %ptr, i32** [[SPILL]]
+; CHECK-NEXT:    %oldvalue = load i32, i32* %ptr
+; CHECK-NEXT:    store i32 %oldvalue, i32* %temp
+; CHECK-NEXT:    [[T0:%.*]] = insertvalue { i8*, i32* } { i8* bitcast (void (i8*, i8)* @f.resume.0 to i8*), i32* undef }, i32* %temp, 1
+; CHECK-NEXT:    ret { i8*, i32* } [[T0]]
+; CHECK-NEXT:  }
+
+; CHECK-LABEL: define internal void @f.resume.0(i8* noalias nonnull %0, i8 zeroext %1)
+; CHECK-NEXT:  :
+; CHECK-NEXT:    [[T0:%.*]] = bitcast i8* %0 to [[FRAME_T:%.*]]**
+; CHECK-NEXT:    [[FRAME:%.*]] = load [[FRAME_T]]*, [[FRAME_T]]** [[T0]]
+; CHECK-NEXT:    bitcast [[FRAME_T]]* [[FRAME]] to i8*
+; CHECK-NEXT:    [[T0:%.*]] = icmp ne i8 %1, 0
+; CHECK-NEXT:    %temp = getelementptr inbounds [[FRAME_T]], [[FRAME_T]]* [[FRAME]], i32 0, i32 1
+; CHECK-NEXT:    br i1 [[T0]],
+; CHECK:       :
+; CHECK-NEXT:    [[TEMP_SLOT:%.*]] = getelementptr inbounds [[FRAME_T]], [[FRAME_T]]* [[FRAME]], i32 0, i32 1
+; CHECK-NEXT:    [[PTR_SLOT:%.*]] = getelementptr inbounds [[FRAME_T]], [[FRAME_T]]* [[FRAME]], i32 0, i32 0
+; CHECK-NEXT:    [[PTR_RELOAD:%.*]] = load i32*, i32** [[PTR_SLOT]]
+; CHECK-NEXT:    %newvalue = load i32, i32* [[TEMP_SLOT]]
+; CHECK-NEXT:    store i32 %newvalue, i32* [[PTR_RELOAD]]
+; CHECK-NEXT:    br label
+; CHECK:       :
+; CHECK-NEXT:    [[T0:%.*]] = bitcast [[FRAME_T]]* [[FRAME]] to i8*
+; CHECK-NEXT:    call fastcc void @deallocate(i8* [[T0]])
+; CHECK-NEXT:    ret void
+; CHECK-NEXT:  }
+
+declare token @llvm.coro.id.retcon.once(i32, i32, i8*, i8*, i8*, i8*)
+declare i8* @llvm.coro.begin(token, i8*)
+declare i1 @llvm.coro.suspend.retcon(...)
+declare i1 @llvm.coro.end(i8*, i1)
+declare i8* @llvm.coro.prepare.retcon(i8*)
+
+declare void @prototype(i8*, i8 zeroext)
+
+declare noalias i8* @allocate(i32 %size)
+declare fastcc void @deallocate(i8* %ptr)
+
+declare void @print(i32)
+

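The CHECK lines above show the other half of the allocation story: the frame (a pointer plus the i32 %temp, 16 bytes) does not fit the 8-byte buffer, so the ramp calls @allocate(i32 16) and spills the frame pointer into the buffer's first slot, which every resume point then reloads. A rough C model of that placement decision follows; the function names and the runtime check are illustrative only (in the real pass this is a static decision, which also honors the alignment operand of coro.id.retcon.*):

```c
#include <stdlib.h>

/* Hypothetical model of where the retcon lowering puts the coroutine
 * frame: inline in the caller's opaque buffer when it fits, otherwise in
 * memory from the coroutine's allocator, with the buffer's first
 * (pointer-sized) slot holding the frame pointer. */
void *place_frame(void *buffer, size_t buffer_size,
                  size_t frame_size, int *inline_in_storage) {
    if (frame_size <= buffer_size) {
        *inline_in_storage = 1;       /* cf. RetconLowering.IsFrameInlineInStorage */
        return buffer;
    }
    *inline_in_storage = 0;
    void *frame = malloc(frame_size); /* stands in for the @allocate function */
    *(void **)buffer = frame;         /* spill the frame pointer into the buffer */
    return frame;
}

/* Each resume point recovers the frame the same way. */
void *recover_frame(void *buffer, int inline_in_storage) {
    return inline_in_storage ? buffer : *(void **)buffer;
}
```

This is why the ABI requires the buffer to be at least pointer-sized: even when the frame spills, the buffer must be able to hold the frame pointer.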
Added: llvm/trunk/test/Transforms/Coroutines/coro-retcon-value.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Coroutines/coro-retcon-value.ll?rev=368788&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/Coroutines/coro-retcon-value.ll (added)
+++ llvm/trunk/test/Transforms/Coroutines/coro-retcon-value.ll Tue Aug 13 20:53:17 2019
@@ -0,0 +1,102 @@
+; First example from Doc/Coroutines.rst (two block loop) converted to retcon
+; RUN: opt < %s -enable-coroutines -O2 -S | FileCheck %s
+
+define {i8*, i32} @f(i8* %buffer, i32 %n) {
+entry:
+  %id = call token @llvm.coro.id.retcon(i32 8, i32 4, i8* %buffer, i8* bitcast ({i8*, i32} (i8*, i8)* @prototype to i8*), i8* bitcast (i8* (i32)* @allocate to i8*), i8* bitcast (void (i8*)* @deallocate to i8*))
+  %hdl = call i8* @llvm.coro.begin(token %id, i8* null)
+  br label %loop
+
+loop:
+  %n.val = phi i32 [ %n, %entry ], [ %inc, %resume ]
+  %unwind0 = call i1 (...) @llvm.coro.suspend.retcon(i32 %n.val)
+  br i1 %unwind0, label %cleanup, label %resume
+
+resume:
+  %inc = add i32 %n.val, 1
+  br label %loop
+
+cleanup:
+  call i1 @llvm.coro.end(i8* %hdl, i1 0)
+  unreachable
+}
+
+; CHECK-LABEL: define { i8*, i32 } @f(i8* %buffer, i32 %n)
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[T0:%.*]] = bitcast i8* %buffer to i32*
+; CHECK-NEXT:    store i32 %n, i32* [[T0]], align 4
+; CHECK-NEXT:    [[RET:%.*]] = insertvalue { i8*, i32 } { i8* bitcast ({ i8*, i32 } (i8*, i8)* @f.resume.0 to i8*), i32 undef }, i32 %n, 1
+; CHECK-NEXT:    ret { i8*, i32 } [[RET]]
+; CHECK-NEXT:  }
+
+; CHECK-LABEL: define internal { i8*, i32 } @f.resume.0(i8* noalias nonnull %0, i8 zeroext %1)
+; CHECK-NEXT:  :
+; CHECK-NEXT:    [[T0:%.*]] = icmp eq i8 %1, 0
+; CHECK-NEXT:    br i1 [[T0]],
+; CHECK:       :
+; CHECK-NEXT:    [[T0:%.*]] = bitcast i8* %0 to i32*
+; CHECK-NEXT:    [[T1:%.*]] = load i32, i32* [[T0]], align 4
+; CHECK-NEXT:    %inc = add i32 [[T1]], 1
+; CHECK-NEXT:    store i32 %inc, i32* [[T0]], align 4
+; CHECK-NEXT:    [[RET:%.*]] = insertvalue { i8*, i32 } { i8* bitcast ({ i8*, i32 } (i8*, i8)* @f.resume.0 to i8*), i32 undef }, i32 %inc, 1
+; CHECK-NEXT:    ret { i8*, i32 } [[RET]]
+; CHECK:       :
+; CHECK-NEXT:    ret { i8*, i32 } { i8* null, i32 undef }
+; CHECK-NEXT:  }
+
+define i32 @main() {
+entry:
+  %0 = alloca [8 x i8], align 4
+  %buffer = bitcast [8 x i8]* %0 to i8*
+  %prepare = call i8* @llvm.coro.prepare.retcon(i8* bitcast ({i8*, i32} (i8*, i32)* @f to i8*))
+  %f = bitcast i8* %prepare to {i8*, i32} (i8*, i32)*
+  %result0 = call {i8*, i32} %f(i8* %buffer, i32 4)
+  %value0 = extractvalue {i8*, i32} %result0, 1
+  call void @print(i32 %value0)
+  %cont0 = extractvalue {i8*, i32} %result0, 0
+  %cont0.cast = bitcast i8* %cont0 to {i8*, i32} (i8*, i8)*
+  %result1 = call {i8*, i32} %cont0.cast(i8* %buffer, i8 zeroext 0)
+  %value1 = extractvalue {i8*, i32} %result1, 1
+  call void @print(i32 %value1)
+  %cont1 = extractvalue {i8*, i32} %result1, 0
+  %cont1.cast = bitcast i8* %cont1 to {i8*, i32} (i8*, i8)*
+  %result2 = call {i8*, i32} %cont1.cast(i8* %buffer, i8 zeroext 0)
+  %value2 = extractvalue {i8*, i32} %result2, 1
+  call void @print(i32 %value2)
+  %cont2 = extractvalue {i8*, i32} %result2, 0
+  %cont2.cast = bitcast i8* %cont2 to {i8*, i32} (i8*, i8)*
+  call {i8*, i32} %cont2.cast(i8* %buffer, i8 zeroext 1)
+  ret i32 0
+}
+
+;   Unfortunately, we don't fully optimize this right now due to a
+;   phase-ordering issue.
+; CHECK-LABEL: define i32 @main
+; CHECK-NEXT:  entry:
+; CHECK:         [[BUFFER:%.*]] = alloca [8 x i8], align 4
+; CHECK:         [[SLOT:%.*]] = bitcast [8 x i8]* [[BUFFER]] to i32*
+; CHECK-NEXT:    store i32 4, i32* [[SLOT]], align 4
+; CHECK-NEXT:    call void @print(i32 4)
+; CHECK-NEXT:    [[LOAD:%.*]] = load i32, i32* [[SLOT]], align 4
+; CHECK-NEXT:    [[INC:%.*]] = add i32 [[LOAD]], 1
+; CHECK-NEXT:    store i32 [[INC]], i32* [[SLOT]], align 4
+; CHECK-NEXT:    call void @print(i32 [[INC]])
+; CHECK-NEXT:    [[LOAD:%.*]] = load i32, i32* [[SLOT]], align 4
+; CHECK-NEXT:    [[INC:%.*]] = add i32 [[LOAD]], 1
+; CHECK-NEXT:    store i32 [[INC]], i32* [[SLOT]], align 4
+; CHECK-NEXT:    call void @print(i32 [[INC]])
+; CHECK-NEXT:    ret i32 0
+
+declare token @llvm.coro.id.retcon(i32, i32, i8*, i8*, i8*, i8*)
+declare i8* @llvm.coro.begin(token, i8*)
+declare i1 @llvm.coro.suspend.retcon(...)
+declare i1 @llvm.coro.end(i8*, i1)
+declare i8* @llvm.coro.prepare.retcon(i8*)
+
+declare {i8*, i32} @prototype(i8*, i8 zeroext)
+
+declare noalias i8* @allocate(i32 %size)
+declare void @deallocate(i8* %ptr)
+
+declare void @print(i32)
+

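The @f/@main pair in coro-retcon-value.ll is the clearest picture of the calling pattern this lowering produces: each call returns the next continuation directly alongside the yielded value, and the caller keeps passing back the same buffer. Below is an illustrative C model of that pattern; the real artifacts are LLVM IR functions, and all names here (result, cont_fn, counter, counter_resume) are invented for the sketch:

```c
/* The result pair the coroutine returns directly to its caller:
 * the next continuation plus the yielded value ({i8*, i32} in the IR). */
struct result;
typedef struct result (*cont_fn)(void *buffer, int is_unwind);
struct result {
    cont_fn cont;   /* next resume point; NULL once the coroutine is done */
    int value;      /* value yielded directly, no promise allocation needed */
};

/* The lone resume function: reloads the counter from the buffer,
 * advances it, and hands itself back as the next continuation. */
static struct result counter_resume(void *buffer, int is_unwind) {
    struct result r = { 0, 0 };
    if (is_unwind)                  /* caller requested abnormal exit */
        return r;
    int *slot = (int *)buffer;      /* the frame fit in the opaque buffer */
    *slot += 1;
    r.cont = counter_resume;
    r.value = *slot;
    return r;
}

/* The ramp function: stores the initial state and yields the first value. */
static struct result counter(void *buffer, int n) {
    *(int *)buffer = n;
    struct result r = { counter_resume, n };
    return r;
}
```

A caller drives it the way @main does in the test: `struct result r = counter(buf, 4);` then repeatedly `r = r.cont(buf, 0);`, observing 4, 5, 6, and finally `r.cont(buf, 1);` to unwind and release the frame.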
Added: llvm/trunk/test/Transforms/Coroutines/coro-retcon.ll
URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/test/Transforms/Coroutines/coro-retcon.ll?rev=368788&view=auto
==============================================================================
--- llvm/trunk/test/Transforms/Coroutines/coro-retcon.ll (added)
+++ llvm/trunk/test/Transforms/Coroutines/coro-retcon.ll Tue Aug 13 20:53:17 2019
@@ -0,0 +1,94 @@
+; First example from Doc/Coroutines.rst (two block loop) converted to retcon
+; RUN: opt < %s -enable-coroutines -O2 -S | FileCheck %s
+
+define i8* @f(i8* %buffer, i32 %n) {
+entry:
+  %id = call token @llvm.coro.id.retcon(i32 8, i32 4, i8* %buffer, i8* bitcast (i8* (i8*, i8)* @prototype to i8*), i8* bitcast (i8* (i32)* @allocate to i8*), i8* bitcast (void (i8*)* @deallocate to i8*))
+  %hdl = call i8* @llvm.coro.begin(token %id, i8* null)
+  br label %loop
+
+loop:
+  %n.val = phi i32 [ %n, %entry ], [ %inc, %resume ]
+  call void @print(i32 %n.val)
+  %unwind0 = call i1 (...) @llvm.coro.suspend.retcon()
+  br i1 %unwind0, label %cleanup, label %resume
+
+resume:
+  %inc = add i32 %n.val, 1
+  br label %loop
+
+cleanup:
+  call i1 @llvm.coro.end(i8* %hdl, i1 0)
+  unreachable
+}
+
+; CHECK-LABEL: define i8* @f(i8* %buffer, i32 %n)
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    [[T0:%.*]] = bitcast i8* %buffer to i32*
+; CHECK-NEXT:    store i32 %n, i32* [[T0]], align 4
+; CHECK-NEXT:    call void @print(i32 %n)
+; CHECK-NEXT:    ret i8* bitcast (i8* (i8*, i8)* @f.resume.0 to i8*)
+; CHECK-NEXT:  }
+
+; CHECK-LABEL: define internal i8* @f.resume.0(i8* noalias nonnull %0, i8 zeroext %1)
+; CHECK-NEXT:  :
+; CHECK-NEXT:    [[T0:%.*]] = icmp eq i8 %1, 0
+; CHECK-NEXT:    br i1 [[T0]],
+; CHECK:       :
+; CHECK-NEXT:    [[T0:%.*]] = bitcast i8* %0 to i32*
+; CHECK-NEXT:    [[T1:%.*]] = load i32, i32* [[T0]], align 4
+; CHECK-NEXT:    %inc = add i32 [[T1]], 1
+; CHECK-NEXT:    store i32 %inc, i32* [[T0]], align 4
+; CHECK-NEXT:    call void @print(i32 %inc)
+; CHECK-NEXT:    ret i8* bitcast (i8* (i8*, i8)* @f.resume.0 to i8*)
+; CHECK:       :
+; CHECK-NEXT:    ret i8* null
+; CHECK-NEXT:  }
+
+define i32 @main() {
+entry:
+  %0 = alloca [8 x i8], align 4
+  %buffer = bitcast [8 x i8]* %0 to i8*
+  %prepare = call i8* @llvm.coro.prepare.retcon(i8* bitcast (i8* (i8*, i32)* @f to i8*))
+  %f = bitcast i8* %prepare to i8* (i8*, i32)*
+  %cont0 = call i8* %f(i8* %buffer, i32 4)
+  %cont0.cast = bitcast i8* %cont0 to i8* (i8*, i8)*
+  %cont1 = call i8* %cont0.cast(i8* %buffer, i8 zeroext 0)
+  %cont1.cast = bitcast i8* %cont1 to i8* (i8*, i8)*
+  %cont2 = call i8* %cont1.cast(i8* %buffer, i8 zeroext 0)
+  %cont2.cast = bitcast i8* %cont2 to i8* (i8*, i8)*
+  call i8* %cont2.cast(i8* %buffer, i8 zeroext 1)
+  ret i32 0
+}
+
+;   Unfortunately, we don't fully optimize this right now due to a
+;   phase-ordering issue.
+; CHECK-LABEL: define i32 @main
+; CHECK-NEXT:  entry:
+; CHECK:         [[BUFFER:%.*]] = alloca [8 x i8], align 4
+; CHECK:         [[SLOT:%.*]] = bitcast [8 x i8]* [[BUFFER]] to i32*
+; CHECK-NEXT:    store i32 4, i32* [[SLOT]], align 4
+; CHECK-NEXT:    call void @print(i32 4)
+; CHECK-NEXT:    [[LOAD:%.*]] = load i32, i32* [[SLOT]], align 4
+; CHECK-NEXT:    [[INC:%.*]] = add i32 [[LOAD]], 1
+; CHECK-NEXT:    store i32 [[INC]], i32* [[SLOT]], align 4
+; CHECK-NEXT:    call void @print(i32 [[INC]])
+; CHECK-NEXT:    [[LOAD:%.*]] = load i32, i32* [[SLOT]], align 4
+; CHECK-NEXT:    [[INC:%.*]] = add i32 [[LOAD]], 1
+; CHECK-NEXT:    store i32 [[INC]], i32* [[SLOT]], align 4
+; CHECK-NEXT:    call void @print(i32 [[INC]])
+; CHECK-NEXT:    ret i32 0
+
+declare token @llvm.coro.id.retcon(i32, i32, i8*, i8*, i8*, i8*)
+declare i8* @llvm.coro.begin(token, i8*)
+declare i1 @llvm.coro.suspend.retcon(...)
+declare i1 @llvm.coro.end(i8*, i1)
+declare i8* @llvm.coro.prepare.retcon(i8*)
+
+declare i8* @prototype(i8*, i8 zeroext)
+
+declare noalias i8* @allocate(i32 %size)
+declare void @deallocate(i8* %ptr)
+
+declare void @print(i32)
+
