[llvm-dev] RFC: LLVM Coroutine Representation, Round 2

Vadim Chugunov via llvm-dev llvm-dev at lists.llvm.org
Thu Jul 21 09:58:19 PDT 2016


cc llvm-dev

On Thu, Jul 21, 2016 at 9:57 AM, Vadim Chugunov <vadimcn at gmail.com> wrote:

> Hi Gor,
> Does you design support resumption with parameter(s)?  (such as Python's
> generator.send(x)).  I suppose the "promise" could be used for passing data
> both ways, but if that's the plan, please mention this explicitly in the
> design doc.
> Also, how is loading/storing to promise going to be lowered?
>
> Vadim
>
> On Mon, Jul 11, 2016 at 6:47 AM, Gor Nishanov via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
>
>> Hi all:
>>
>> Thank you very much for the feedback during the first round! Special
>> thanks to
>> Eli, Sanjoy, Hal and Chandler for their detailed public and private
>> comments.
>> The model is simpler now and the intrinsics have nice symmetry to them:
>> coro.begin + coro.end, coro.alloc + coro.free. Compared to the previous
>> RFC,
>> coro.fork is gone, coro.init and coro.start are collapsed into one
>> coro.begin,
>> coro.suspend is three-way as opposed to two-way. coro.elide and
>> coro.delete
>> renamed coro.alloc and coro.free.
>>
>> All the changes implemented and tested. Full document, (nicely formatted
>> by
>> github is at:
>>
>>   https://github.com/GorNishanov/llvm/blob/coro-rfc/docs/Coroutines.rst
>>
>> Below is a summary of a proposal to add coroutine support to LLVM.
>> The proposal is motivated primarily by the desire to support C++
>> Coroutines [1],
>> however the llvm representation is language neutral and can be used
>> to support coroutines in other languages as well.
>>
>> Clang + llvm coroutines allows you to take this code:
>>
>>   generator<int> range(int from, int to) {
>>      for(int i = from; i < to; ++n)
>>         co_yield i;
>>   }
>>   int main() {
>>      int sum = 0;
>>      for (auto v: range(1,100))
>>         sum += v;
>>      return sum;
>>   }
>>
>> And translate it down to this:
>>
>>   define i32 @main() #5 {
>>   entry:
>>     ret i32 4950
>>   }
>>
>> You can also use coroutines in plain C, by calling the builtins mapping
>> to the
>> intrinsics described by this proposal, so that your coroutine can look
>> like:
>>
>>     #include "Inputs/coro.h"
>>     void print(int v);
>>
>>     void* f(int n) {
>>       CORO_BEGIN(malloc);
>>
>>       while (n-- > 0) {
>>         print(n+1);
>>         CORO_SUSPEND();
>>       }
>>
>>       CORO_END(free);
>>     }
>>
>>     // CHECK-LABEL: @main
>>     int main() {
>>       void* coro = f(4);
>>       CORO_RESUME(coro);
>>       CORO_RESUME(coro);
>>       CORO_DESTROY(coro);
>>     }
>>
>> https://github.com/GorNishanov/clang/blob/coro-rfc/test/Coroutines/coro.c
>>
>> https://github.com/GorNishanov/clang/blob/coro-rfc/test/Coroutines/Inputs/coro.h
>>
>> I prototyped llvm changes to support this proposal and extended clang
>> coroutine
>> implementation [2] to emit proposed intrinsics to smoke test the proposed
>> design. See "More Attention Required" in the doc/Coroutines.rst [4]for
>> details
>> on what additional work is required beyond cleanup and bug fixes.
>>
>> I would like to continue the discussion on the overall design,
>> representation
>> and a roadmap to start upstreaming the changes with the intention to
>> continue
>> the development in tree incrementally.
>>
>> Roadmap:
>> --------
>> 1) Get agreement on coroutine representation and overall direction (this
>> mail).
>>   .. repeat 1) until happy <=== ARE ARE HERE
>> 2) Get agreement on how coroutine transformation passes integrate into the
>>    optimizer pipeline.
>>   .. repeat 2) until happy
>> 3) update IR/Intrinsics.td + doc/Coroutines.rst + doc/LangRef.rst
>> 4) trickle in coroutine transformation passes + tests in digestible
>> chunks.
>> 5) get clang changes in (sometime after step 3).
>> 6) fix bugs / remove limitations / add functionality to coroutine passes
>>  .. repeat 6) until happy
>>
>> Background
>> ==========
>> LLVM coroutines are functions that have one or more `suspend points`.
>> When a suspend point is reached, the execution of a coroutine is
>> suspended and
>> control is returned back to its caller. A suspended coroutine can be
>> resumed
>> to continue execution from the last suspend point or it can be destroyed.
>>
>> In the following example, function `f` (which may or may not be a
>> coroutine
>> itself) returns a handle to a suspended coroutine (**coroutine handle**)
>> that is
>> used by `main` to resume the coroutine twice and then destroy it:
>>
>> .. code-block:: llvm
>>
>>   define i32 @main() {
>>   entry:
>>     %hdl = call i8* @f(i32 4)
>>     call void @llvm.coro.resume(i8* %hdl)
>>     call void @llvm.coro.resume(i8* %hdl)
>>     call void @llvm.coro.destroy(i8* %hdl)
>>     ret i32 0
>>   }
>>
>> In addition to the stack frame which exists when a coroutine is executing,
>> there is an additional region of storage that contains values that keep
>> the
>> coroutine state when a coroutine is suspended. This region of storage
>> is called **coroutine frame**. It is created when a coroutine is called
>> and
>> destroyed when a coroutine runs to completion or destroyed by a call to
>> the `coro.destroy`_ intrinsic. Unlike stackful coroutines [3] which
>> maintains a
>> stack per coroutine, an llvm coroutine frame only contains the values
>> that need
>> to persist across suspend points (there is a path from a use to the
>> definition
>> that crosses a suspend point).
>>
>> Overall idea is that a coroutine is presented to an llvm as an ordinary
>> function
>> with suspension points marked up with intrinsics. We let the optimizer
>> party
>> on the coroutine for awhile. Shortly before the coroutine is eligible to
>> be
>> inlined into its callers, we outline parts of the coroutine that
>> correspond to
>> the code that needs to get executed after the coroutine is resumed or
>> destroyed.
>> The coroutine resumption intrinsics get replaced with direct calls to
>> those
>> outlined functions where possible. Then inliner can inline much leaner
>> and nicer
>> coroutine or any of the outlined parts as desired.
>>
>> If we discover that the lifetime of a coroutine is fully enclosed in the
>> lifetime of its caller, we remove dynamic allocation for coroutine frame
>> and
>> replace it with a static `alloca` on the caller's frame.
>>
>> Please see doc/Coroutines.rst for more details:
>>
>> https://github.com/GorNishanov/llvm/blob/coro-rfc/docs/Coroutines.rst
>>
>> Concerns:
>> =========
>> (This section assumes that you looked at doc/Coroutins.rst)
>>
>> coro.begin: 4 arguments or 5?
>> -----------------------------
>>
>> Background:
>>
>> For standard allocation functions recognized by LLVM, coroutine frame
>> allocation
>> code looks like:
>>
>>   entry:
>>     %size = call i32 @llvm.coro.size.i32(i8* null)
>>     %alloc = call i8* @malloc(i32 %size)
>>     %hdl = call noalias i8* @llvm.coro.begin(i8* %alloc, i32 0, i8* null,
>>                                                                 i8* null)
>>
>> To enable coroutine heap elision with custom allocation functions,
>> `coro.alloc`
>> intrinsic is used to exclude the allocation code when not needed.
>>
>>   entry:
>>     %elide = call i8* @llvm.coro.alloc()
>>     %0 = icmp ne i8* %elide, null
>>     br i1 %0, label %coro.begin, label %coro.alloc
>>
>>   coro.alloc:
>>     %frame.size = call i32 @llvm.coro.size()
>>     %alloc = call i8* @MyAlloc(i32 %frame.size)
>>     br label %coro.begin
>>
>>   coro.begin:
>>     %phi = phi i8* [ %elide, %entry ], [ %alloc, %coro.alloc ]
>>     %frame = call i8* @llvm.coro.begin(i8* %phi, i32 0, i8* null, i8*
>> null)
>>
>> When using a static alloca for coroutine frame, coro.begin and coro.alloc
>> intrinsics are replaced with that very alloca (bitcasted to i8* to match
>> the
>> type).
>>
>> To find a coro.alloc matching a particular coro.begin, I hunt through the
>> %phi
>> until I find definition of coro.alloc. Probably more robust implementation
>> need to use alias analysis to check whether the %phi and %elide may alias.
>>
>> A simpler approach would be just to add another parameter to directly
>> point to
>> `coro.alloc`. With this change, coro.begin block above would look like:
>>
>>   coro.begin:
>>     %phi = phi i8* [ %elide, %entry ], [ %alloc, %coro.alloc ]
>>     %frame = call i8* @llvm.coro.begin(i8* %phi, i8 %elide, i32 0, i8*
>> null,
>>                                                  ^^^^^^^^^         i8*
>> null)
>>
>> All feedback and comments are greatly appreciated!!!
>>
>> Thank you,
>> Gor
>>
>> References:
>> ===========
>>
>> [1] http://open-std.org/JTC1/SC22/WG21/docs/papers/2016/p0057r4.pdf
>> [2] http://llvmweekly.org/issue/95 (Coroutine support added to clang)
>> [3]
>> http://www.boost.org/doc/libs/1_61_0/libs/coroutine/doc/html/index.html
>> [4] https://github.com/GorNishanov/llvm/blob/coro-rfc/docs/Coroutines.rst
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160721/5d3bd168/attachment.html>


More information about the llvm-dev mailing list