[llvm-dev] [RFC] Adding CPS call support

Mon Apr 17 11:47:42 PDT 2017

On 4/17/2017 8:30 AM, Kavon Farvardin via llvm-dev wrote:
> Summary
> =======
>
> There is a need for dedicated continuation-passing style (CPS) calls in LLVM to
> support functional languages. Herein I describe the problem and propose a
> solution. Feedback and/or tips are greatly appreciated, as our goal is to
> implement these changes so they can be merged into LLVM trunk.
>
>
> Problem
> =======
>
> Implementations of functional languages like Haskell and ML (e.g., GHC and
> Manticore) use a continuation-passing style (CPS) transformation in order to
> manage the call stack explicitly. This is done prior to generating LLVM IR, so
> the implicit call stack within LLVM is not used for call and return.
>
> When making a non-tail call while in CPS, we initialize a stack frame for the
> return through our own stack pointer, and then pass that stack pointer to the
> callee when we jump to it. It is here when we run into a problem in LLVM.
>
> Consider the following CPS call to @bar and how it will return:
>
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>
> define void @foo (i8** %sp, ...) {
> someBlk:
>      ; ...
>      ; finish stack frame by writing return address
>    %retAddr = blockaddress(@foo, %retpt)
>    store i8* %retAddr, i8** %sp
>      ; jump to @bar
>    tail call void @bar(i8** %sp, ...)
>
>   retpt: 				; <- how can @bar "call" %retpt?
>     %sp2 = ???
>     %val = ???
>     ; ...
>
>   }
>
>   define void @bar (i8** %sp, ...) {
>   	  ; perform a return
>   	%retAddr0 = load i8*, i8** %sp
>   	%retAddr1 = bitcast i8* %retAddr0 to void (i8**, i64)*
>   	%val = bitcast i64 1 to i64
>   	  ; jump back to %retpt in @foo, passing %sp and %val
>   	tail call void %retAddr1(i8** %sp, i64 %val)
>   }
>
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>
> There is currently no way to jump back to %retpt from another function, as block
> addresses have restricted usage in LLVM [1]. Our main difficulty is that we
> cannot jump to a block address without knowing its calling convention, i.e., the
> particular machine registers (or memory locations) that the block expects
> incoming values to be passed in.
>
> The workaround we have been using in GHC for LLVM is to break apart every
> function, placing the code for the continuation of each call into a new
> function. We do this only so that we can store a function pointer instead of a
> block address to our stack. This particularly gross transformation inhibits
> optimizations in both GHC and LLVM, and we would like to remove the need for it.
>
>
> Proposal
> ========
>
> I believe the lowest-impact method of fixing this problem with LLVM is the
> following:
>
> First, we add a special 'cps' call instruction marker to be used on non-tail
> calls. Then, we use a specialized calling convention for these non-tail calls,
> which fix the returned values to specific locations in the machine code [2].
>
> To help illustrate what's going on, let's rewrite the above example using the
> proposed 'cps' call:
>
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>
> define { ... } @foo (i8** %sp, ...) {
> someBlk:
>      ; ...
>      ; finish stack frame by writing return address
>    %retAddr = blockaddress(@foo, %retpt)
>    store i8* %retAddr, i8** %sp
>      ; jump to @bar
>    %retVals = cps call ghccc {i8**, i64} @bar (i8** %sp, ...)
>    br label %retpt

I'm not following how explicitly representing the return address of a 
call in the IR before isel actually solves any relevant issue.  We 
already pass the return address implicitly as an argument to every call; 
you can retrieve it with llvm.returnaddress if you need it.

You can't branch across functions like you're proposing because the 
stack and callee-save registers won't be in the right state.  LLVM will 
inevitably save values to the stack for calls which are not tail calls.  
Therefore, this proposal simply doesn't work.

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project