[llvm-dev] [RFC] Adding CPS call support

Mon Apr 17 15:52:49 PDT 2017

(Sorry for the 2nd email Eli, I forgot to reply-all).

> I'm not following how explicitly representing the return address of a call in the IR before isel actually solves any relevant issue. We already pass the return address implicitly as an argument to every call; you can retrieve it with llvm.returnaddress if you need it.

Unfortunately the @llvm.returnaddress intrinsic does not solve the problem, as it only reads the return address pushed onto the LLVM stack by a call. We would then need a way to move the LLVM stack pointer back to where it was before the call, because a CPS call _must_ not grow the LLVM stack (it will never be popped!), so a 'call' instruction will not do.

> You can't branch across functions like you're proposing because the stack and callee-save registers won't be in the right state.  LLVM will inevitably save values to the stack for calls which are not tail calls.  Therefore, this proposal simply doesn't work.

It might be helpful to think of the CPS transformation as turning the program into a form such that it always moves "forward", in the sense that the tail calls _never_ return. A "return" is simply another tail call (i.e., a jump) to a return address, which looks no different than a function call. 

Thus, we have no callee-saved registers to worry about since the tail calls never come back. In addition, there are no live values in the LLVM stack frame, since they are all passed in the CPS call (which uses a different stack). Thus, retpt receives all of its values from the struct it extracts from. While the 'cps' marked call appears to be non-tail, it will be lowered to a tail call.

It's also worth noting that this proposal is based on something that already works in LLVM, using a special assembly routine. We currently this workaround in Manticore when a thread is preempted. But, it has too much overhead for GHC, since the workaround would penalize general function calls. This is why support is necessary for GHC (and related compilers) in order to use LLVM effectively.

~kavon

> On Apr 17, 2017, at 7:47 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
> 
> On 4/17/2017 8:30 AM, Kavon Farvardin via llvm-dev wrote:
>> Summary
>> =======
>> 
>> There is a need for dedicated continuation-passing style (CPS) calls in LLVM to
>> support functional languages. Herein I describe the problem and propose a
>> solution. Feedback and/or tips are greatly appreciated, as our goal is to
>> implement these changes so they can be merged into LLVM trunk.
>> 
>> 
>> Problem
>> =======
>> 
>> Implementations of functional languages like Haskell and ML (e.g., GHC and
>> Manticore) use a continuation-passing style (CPS) transformation in order to
>> manage the call stack explicitly. This is done prior to generating LLVM IR, so
>> the implicit call stack within LLVM is not used for call and return.
>> 
>> When making a non-tail call while in CPS, we initialize a stack frame for the
>> return through our own stack pointer, and then pass that stack pointer to the
>> callee when we jump to it. It is here when we run into a problem in LLVM.
>> 
>> Consider the following CPS call to @bar and how it will return:
>> 
>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>> 
>> define void @foo (i8** %sp, ...) {
>> someBlk:
>>     ; ...
>>     ; finish stack frame by writing return address
>>   %retAddr = blockaddress(@foo, %retpt)
>>   store i8* %retAddr, i8** %sp
>>     ; jump to @bar
>>   tail call void @bar(i8** %sp, ...)
>> 
>>  retpt: 				; <- how can @bar "call" %retpt?
>>    %sp2 = ???
>>    %val = ???
>>    ; ...
>> 
>>  }
>> 
>>  define void @bar (i8** %sp, ...) {
>>  	  ; perform a return
>>  	%retAddr0 = load i8*, i8** %sp
>>  	%retAddr1 = bitcast i8* %retAddr0 to void (i8**, i64)*
>>  	%val = bitcast i64 1 to i64
>>  	  ; jump back to %retpt in @foo, passing %sp and %val
>>  	tail call void %retAddr1(i8** %sp, i64 %val)
>>  }
>> 
>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>> 
>> There is currently no way to jump back to %retpt from another function, as block
>> addresses have restricted usage in LLVM [1]. Our main difficulty is that we
>> cannot jump to a block address without knowing its calling convention, i.e., the
>> particular machine registers (or memory locations) that the block expects
>> incoming values to be passed in.
>> 
>> The workaround we have been using in GHC for LLVM is to break apart every
>> function, placing the code for the continuation of each call into a new
>> function. We do this only so that we can store a function pointer instead of a
>> block address to our stack. This particularly gross transformation inhibits
>> optimizations in both GHC and LLVM, and we would like to remove the need for it.
>> 
>> 
>> Proposal
>> ========
>> 
>> I believe the lowest-impact method of fixing this problem with LLVM is the
>> following:
>> 
>> First, we add a special 'cps' call instruction marker to be used on non-tail
>> calls. Then, we use a specialized calling convention for these non-tail calls,
>> which fix the returned values to specific locations in the machine code [2].
>> 
>> To help illustrate what's going on, let's rewrite the above example using the
>> proposed 'cps' call:
>> 
>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>> 
>> define { ... } @foo (i8** %sp, ...) {
>> someBlk:
>>     ; ...
>>     ; finish stack frame by writing return address
>>   %retAddr = blockaddress(@foo, %retpt)
>>   store i8* %retAddr, i8** %sp
>>     ; jump to @bar
>>   %retVals = cps call ghccc {i8**, i64} @bar (i8** %sp, ...)
>>   br label %retpt
> 
> I'm not following how explicitly representing the return address of a call in the IR before isel actually solves any relevant issue.  We already pass the return address implicitly as an argument to every call; you can retrieve it with llvm.returnaddress if you need it.
> 
> You can't branch across functions like you're proposing because the stack and callee-save registers won't be in the right state.  LLVM will inevitably save values to the stack for calls which are not tail calls.  Therefore, this proposal simply doesn't work.
> 
> -Eli
> 
> -- 
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project
>