[llvm-dev] [RFC] Adding CPS call support

Tue Apr 18 03:47:50 PDT 2017

> Most architectures have a call instruction which does not push anything onto the stack; e.g. on ARM, the "BL" instruction saves the return address in LR.  On other architectures, you can emulate this (for example, you could lower an IR "call" to LEA+JMP on x86-64).

This is similar to what I was originally thinking, since the end goal of all of this is the following machine code for a call (I'll use an x86-64 example):

    leaq  _retpt(%rip), %scratchReg
    movq  %scratchReg, (%ghcSPReg)
    jmp   _bar

But, if we want to end up with the above machine code using a custom lowering of just the following IR instruction,

    %retVals = cps call ghccc @bar (... args ...) 

_without_ explicitly taking the block address and writing it to the GHC stack pointer beforehand, there are some things to consider:

1. How do we prevent IR optimizations from eliminating the %retpt block?

  If we do not explicitly take the address of the %retpt block, a pass such as simplifycfg has no reason to preserve the block, and may merge it with its only predecessor.

2. Where in the GHC stack frame do we write the block address? 

  We could hardcode the fact that, when making a 'cps' IR call, the stack pointer must point to the field where the return address should be written. This is fine for GHC, but seems like an unnecessary restriction.

3. How do we know which argument is the GHC stack pointer?  

  Similarly, we could hardcode the fact that, say, the first argument of every 'cps' IR call is the GHC stack pointer. Then we know from the calling convention which register it is in. This is again fine for GHC, but reduces the generality of the feature.
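
To make the three questions above concrete, here is a rough C model of the intended call sequence. It is only a sketch under assumptions, not how LLVM or GHC implement anything: the names Machine, Code, Word, foo, bar, foo_retpt, halt_k, and run_model are made up for illustration, the explicit stack pointer is assumed to always point at the slot holding the return address (the restriction from question 2), it is threaded through every function (question 3), and a driver loop stands in for real jumps because C cannot guarantee tail calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Machine Machine;
/* A code pointer: running it yields the next code pointer to jump to. */
typedef struct Code { struct Code (*fn)(Machine *); } Code;
/* One slot of the explicit stack: a return address or a plain value. */
typedef union Word { Code code; int64_t val; } Word;

struct Machine {
    Word   *sp;  /* explicit stack pointer (stands in for %ghcSPReg) */
    int64_t r1;  /* a single argument/result register */
};

static Code bar(Machine *m);
static Code foo_retpt(Machine *m);
static Code halt_k(Machine *m);

/* foo performs the "cps call" to bar without growing the C stack. */
static Code foo(Machine *m) {
    m->sp -= 1;                            /* make room for the return address */
    /* The address of foo_retpt is taken explicitly here, so nothing can
       treat it as dead or fold it into foo (compare question 1). */
    m->sp[0].code = (Code){ foo_retpt };   /* leaq _retpt; movq ..., (%ghcSPReg) */
    m->r1 = 20;                            /* argument for bar */
    return (Code){ bar };                  /* jmp _bar */
}

/* bar "returns" by jumping to whatever address sits at sp[0]. */
static Code bar(Machine *m) {
    m->r1 = m->r1 * 2 + 2;
    return m->sp[0].code;
}

/* The return point: it sees only the register and the explicit stack. */
static Code foo_retpt(Machine *m) {
    m->r1 += 1;
    m->sp += 1;                            /* pop foo's one-word frame */
    return m->sp[0].code;                  /* foo's own "return" is another jump */
}

static Code halt_k(Machine *m) {
    (void)m;
    return (Code){ NULL };                 /* a null code pointer stops the driver */
}

/* Driver loop standing in for the chain of jumps. */
int64_t run_model(void) {
    Word stack[8];
    Machine m = { &stack[7], 0 };
    stack[7].code = (Code){ halt_k };      /* return address handed to foo */
    Code pc = { foo };
    while (pc.fn != NULL) {
        pc = pc.fn(&m);
    }
    assert(m.sp == &stack[7]);             /* the explicit stack is balanced */
    return m.r1;
}
```

Calling run_model() walks foo, bar, foo_retpt, and halt_k in that order. The point of the model is that the only state surviving each jump is the explicit stack and the register, which is what the hardcoded conventions in questions 2 and 3 would let a custom lowering rely on.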

> The return address for the current function is fundamentally live across any non-tail call.  

I'm not quite sure what you mean by this.

While there may be a return address handed to the main function (and thus passed to every other function), it's either unused or not known to LLVM.

> And LLVM will hoist other operations across non-tail calls, and in the process introduce values which are live across calls.  You need to save those values somewhere.  The key question is where.  Your proposal tries to address that by explicitly saving/restoring the return address onto the GHC stack, but you're working at the wrong level; you can't get a complete list of which values are live across calls until register allocation.

The part I had omitted from the proposal is the initialization of the rest of the GHC stack frame, which is laid out in GHC by finding all of the values that are live across the call before CPS conversion. 

While the full list of values preserved by LLVM is not known until register allocation, there are some properties of the LLVM IR we will initially generate that are important to consider:

1. There are no IR values live across the non-tail 'cps' IR call, since they are all passed to the callee.
2. All IR values used after the 'cps' IR call come from the struct returned by that call.

Thus, it should not be possible to hoist instructions above this particular non-tail call, as they are all derived from the values returned in the struct. Any register spills allocated to the LLVM stack before the non-tail call are dead, since any value that was still live has already been copied to the GHC stack frame.
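
As a rough illustration of those two properties, here is a small C model in which a value needed after the call is copied into the explicit frame before the jump and reloaded at the return point. This is a sketch under assumptions, not GHC's or LLVM's actual behavior: the names Machine, Code, Word, caller, callee, caller_retpt, stop_k, and run_frame_model are hypothetical, the two-word frame layout is invented for the example, and a driver loop again stands in for real jumps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct Machine Machine;
/* A code pointer: running it yields the next code pointer to jump to. */
typedef struct Code { struct Code (*fn)(Machine *); } Code;
/* One slot of the explicit stack: a return address or a saved value. */
typedef union Word { Code code; int64_t val; } Word;

struct Machine {
    Word   *sp;  /* explicit stack pointer, passed to every function */
    int64_t r1;  /* a single argument/result register */
};

static Code callee(Machine *m);
static Code caller_retpt(Machine *m);
static Code stop_k(Machine *m);

static Code caller(Machine *m) {
    int64_t x = m->r1 + 1;                  /* x is needed after the call */
    m->sp -= 2;                             /* assumed layout: [retaddr, x] */
    m->sp[0].code = (Code){ caller_retpt };
    m->sp[1].val  = x;                      /* copy x into the explicit frame */
    m->r1 = 10;                             /* argument for callee */
    return (Code){ callee };                /* the C local x is dead from here on */
}

static Code callee(Machine *m) {
    m->r1 = m->r1 * 3;                      /* result goes back in the register */
    return m->sp[0].code;                   /* "return" is a jump to sp[0] */
}

/* Everything used here comes from the returned register or the frame. */
static Code caller_retpt(Machine *m) {
    int64_t result = m->r1;
    int64_t x = m->sp[1].val;               /* reload x from the explicit frame */
    m->sp += 2;                             /* pop the frame */
    m->r1 = result + x;
    return m->sp[0].code;
}

static Code stop_k(Machine *m) {
    (void)m;
    return (Code){ NULL };                  /* a null code pointer stops the driver */
}

int64_t run_frame_model(int64_t input) {
    Word stack[8];
    Machine m = { &stack[7], input };
    stack[7].code = (Code){ stop_k };       /* return address handed to caller */
    Code pc = { caller };
    while (pc.fn != NULL) {
        pc = pc.fn(&m);
    }
    assert(m.sp == &stack[7]);              /* the explicit stack is balanced */
    return m.r1;
}
```

In this model no C local survives from caller into caller_retpt, which mirrors property 1, and caller_retpt reads only the register and the frame, which mirrors property 2. It does not show what LLVM's register allocator might do on its own, which is the part the question above is about.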

If you have a particular example in mind, that would be great.

~kavon

> On Apr 18, 2017, at 2:32 AM, Friedman, Eli <efriedma at codeaurora.org> wrote:
> 
> On 4/17/2017 3:52 PM, Kavon Farvardin wrote:
>> (Sorry for the 2nd email Eli, I forgot to reply-all).
>> 
>>> I'm not following how explicitly representing the return address of a call in the IR before isel actually solves any relevant issue. We already pass the return address implicitly as an argument to every call; you can retrieve it with llvm.returnaddress if you need it.
>> 
>> Unfortunately the @llvm.returnaddress intrinsic does not solve the problem, as it only reads the return address pushed onto the LLVM stack by a call. We would then need a way to move the LLVM stack pointer back to where it was before the call, because a CPS call _must_ not grow the LLVM stack (it will never be popped!), so a 'call' instruction will not do.
> 
> Most architectures have a call instruction which does not push anything onto the stack; e.g. on ARM, the "BL" instruction saves the return address in LR.  On other architectures, you can emulate this (for example, you could lower an IR "call" to LEA+JMP on x86-64).
> 
>>> You can't branch across functions like you're proposing because the stack and callee-save registers won't be in the right state.  LLVM will inevitably save values to the stack for calls which are not tail calls.  Therefore, this proposal simply doesn't work.
>> It might be helpful to think of the CPS transformation as turning the program into a form such that it always moves "forward", in the sense that the tail calls _never_ return. A "return" is simply another tail call (i.e., a jump) to a return address, which looks no different than a function call.
>> 
>> Thus, we have no callee-saved registers to worry about since the tail calls never come back. In addition, there are no live values in the LLVM stack frame, since they are all passed in the CPS call (which uses a different stack). Thus, retpt receives all of its values from the struct it extracts from. While the 'cps' marked call appears to be non-tail, it will be lowered to a tail call.
> 
> The return address for the current function is fundamentally live across any non-tail call.  And LLVM will hoist other operations across non-tail calls, and in the process introduce values which are live across calls.  You need to save those values somewhere.  The key question is where.  Your proposal tries to address that by explicitly saving/restoring the return address onto the GHC stack, but you're working at the wrong level; you can't get a complete list of which values are live across calls until register allocation.
> 
> -Eli
> 
> -- 
> Employee of Qualcomm Innovation Center, Inc.
> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project