[LLVMdev] RFC: GSoC Project

Reid Kleckner reid.kleckner at gmail.com
Mon Apr 11 12:38:03 PDT 2011

This reminds me of Kenneth's "context" proposal from some time back:


I haven't compared the two too closely, but I thought I should throw
it out there as food for thought.


On Mon, Apr 11, 2011 at 3:26 PM, Talin <viridia at gmail.com> wrote:
> On Mon, Apr 11, 2011 at 7:34 AM, Justin Holewinski
> <justin.holewinski at gmail.com> wrote:
>> On Mon, Apr 11, 2011 at 9:07 AM, Sanjoy Das
>> <sanjoy at playingwithpointers.com> wrote:
>>> Hi!
>>> Thanks for the feedback. For context, my implementation plan is here:
>>> http://pastebin.com/e9JMZNCE
>>> First, about unwinding:
>>> In architectures like x86-64, where unwinding based on DWARF info, there
>>> shouldn't be any problems; since the DWARF info will be emitted
>>> correctly. Otherwise, if the unwinding is done by following BP, it
>>> should still be possible to have BP de-reference correctly (ref. "Frame
>>> Pointers" section in the implementation plan). SP will not always have a
>>> correct value - I don't know if this is problem.
>>> About co-routines:
>>> Here is a sketch of how I think co-routines can be implemented (I'll
>>> merge this with the main implementation plan after I get some feedback):
>>> Have a new instruction, called "yield" to return a value from a
>>> co-routine, preserving the state. Thus, we immediately know which
>>> functions are co-routines. Each co-routine will have a new stack.
>>> Associate each co-routine with a thread-local global variable (called
>>> saved_stack here, will have to be mangled with the name of the
>>> co-routine) which points to the start stack block for that co-routine.
>>> This will be the first block in the chain of blocks to follow.
>>> The structure of the block will be similar to the structure of a regular
>>> stack block, except that it will also have space to store two registers
>>> - this_ip and this_sp.
>>> The prologue of a co-routine will jump to a function similar to
>>> setup_new_block (setup_new_block_coroutine) which will work like
>>> setup_new_block, except:
>>> 1. It will first check if saved_stack is NULL. If it is NULL, it will
>>> allocate a new block and save it to saved_stack. It if isn't, it'll
>>> simply restore saved_sp, saved_ip.
>>> 2. In case a new block was allocated, it will pretty much do what
>>> setup_block does, after which it will adjust the SP to make space for
>>> the saved registers.
>>> The destroy_block procedure will also have to be a little different
>>> (mentioned below).
>>> There are four things (relevant to this discussion) a co-routine can do:
>>> Yield
>>> This returns control to the calling function, without forgetting the
>>> current state of the function. To do this, we save saved_ip and
>>> saved_sp. Every yield be padded with instructions pushing and popping
>>> the registers live at that point. Then we set the return value (register
>>> or memory), and restore saved_sp and saved_ip from the current block. We
>>> can't simply return because the actual return value has been hijacked to
>>> provide for block cleanup.
>>> Call to regular function
>>> Just a simple call - the caller's prologue will handle setting up a it's
>>> own stack space etc.
>>> Call to Co-routine
>>> This too should "just work", since all the heavy-lifting is done in the
>>> co-routine's prologue. However, the above approach will not work for
>>> nested co-routines (i.e. calling the same co-routine body with one call
>>> is still active, recursively). I'm not sure if having support for nested
>>> co-routines will add any value.
>>> Return
>>> This will be a regular return. Since the return value has been hijacked
>>> to point to a another block of code (destroy_block_coroutine), control
>>> will jump there instead.
>>> destroy_block_coroutine will free the linked-list of stack blocks (we
>>> have to free this, since we will won't have a reference to this list
>>> anymore), set saved_stack for this co-routine to NULL, and restore
>>> saved_sp and saved_ip.
>> I'm wondering how much of this should be implemented as new LLVM
>> functionality, and how much should be left to the front-end compiler.  With
>> some additional LLVM intrinsics (e.g. llvm.stack.new.block,
>> llvm.stack.delete.block, etc.), a front-end could take care of the details
>> of how co-routines are actually implemented.  This would also give the
>> front-end freedom to implement whatever semantics/conventions are
>> necessary/required for the source language.  I'm just not sure having LLVM
>> dictate how co-routines are implemented is the best way to approach this.
> I'm in agreement with Justin.
> 1) It's much easier to add new intrinsics to LLVM than it is to add new
> instructions. Library functions are even easier - in general you want to use
> intrinsics in cases where the presence of the intrinsic will affect the way
> the code is generated in the calling function.
> 2) The API that I was envisioning for creating stacks was something fairly
> simple (maybe too simple):
>    // Create a new stack, where 'stack_handle' is some opaque object, and
> 'func'
>    // is an LLVM function.
>    stack_handle = llvm.createstack(func, data)
>    // Switch to a different stack (i.e. yield)
>    llvm.switchstack(stack_handle)
>    // deallocate a stack
>    llvm.destroystack(stack_handle)
> The frontend would be responsible for managing the lifetime of the
> stack_handle object. For something like Python generators, the stack_handle
> would be stored in the iterator object that is returned from the generator,
> and deallocated when the generator gets garbage-collected. For cooperative
> threads, the handle would be stored as a private member of a Thread or
> Coroutine class. Different languages would support different ways of using
> coroutines.
> Similarly, the frontend would be responsible for passing data between the
> different coroutines, using either thread-local variables or some other
> method.
> Also I'm thinking the frontend would be responsible for propagating
> exceptions between stacks - so for example, if my coroutine throws an
> exception that is not caught, but instead propagates all the way to the top
> of it's stack, then that coroutine gets terminated, and depending on the
> language semantics, the same exception or another one gets thrown upon
> return from 'yield' in the other context. I would have no objection to
> implementing all of that logic in my frontend.
> Because the internals of the stack_handle may be different on different
> platforms, I'm thinking that the frontend should only get a void* pointer -
> it doesn't need to know what's inside it. I think the frontend only needs a
> small number of operations: create, yield, destroy, and iterate through call
> frames (for garbage collection).
> The 'yield' instruction seems somewhat inspired from Python's 'yield'
> statement, which unfortunately has some substantial drawbacks - such as the
> fact that the yield must be in the top-most call frame in order for the
> compiler to know that the function is a generator / coroutine. In other
> words, you can't in Python have a generator which calls a subroutine that
> does the yield. I'd like to avoid that limitation. Scheme and Lua, to name
> just two examples, have much more powerful models for doing continuations
> and coroutines, and I'd like to be able to support those. I think that can
> be done if we supply only the most minimal support on the LLVM side, and
> leave most of the details to the frontend.
>>> --
>>> Sanjoy Das
>>> http://playingwithpointers.com
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> --
>> Thanks,
>> Justin Holewinski
> --
> -- Talin
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

More information about the llvm-dev mailing list