[llvm-dev] [RFC] Adding CPS call support

Friedman, Eli via llvm-dev llvm-dev at lists.llvm.org
Tue Apr 18 13:50:54 PDT 2017


The cps marker marker itself isn't important.  I'm much more concerned 
that your proposal simply doesn't work in its current form: it will lead 
to miscompiles and won't effectively expose optimization opportunities.

-Eli

On 4/18/2017 1:08 PM, Kavon Farvardin wrote:
> Before I try to respond to all of these points, let's consider limiting the scope of the proposed changes:
>
> Instead of adding a 'cps' call marker, what if I were to add a custom lowering (during isel) for calls marked with the 'ghccc' calling convention? There is already language specific lowering in isel for Swift, so I imagine this would more acceptable?
>
> ~kavon
>
>> On Apr 18, 2017, at 7:27 PM, Philip Reames <listmail at philipreames.com> wrote:
>>
>>
>>
>> On 04/17/2017 04:04 PM, Kavon Farvardin via llvm-dev wrote:
>>>> Is there a reason you can't use the algorithm from the paper "A Correspondence between Continuation Passing Style and Static Single Assignment Form" to convert your IR to LLVM's SSA IR?
>>> Yes, there are a few reasons.
>>>
>>> Undoing the CPS transformation earlier in the pipeline would mean that we are using LLVM's built-in stack. The special layout and usage of the stack in GHC is achieved through CPS, so it is baked the compiler and garbage-collected runtime system.
>> Can you give a bit more detail here?  LLVM does provide support for describing GC frame maps.
>>
>> p.s. You're going to have to justify the design of the runtime a bit here. Extending the IR to workaround a buggy or poorly structured runtime is not going to be sufficient justification.  *Why* does the runtime need the specific runtime stack structure used?  What alternatives exist and why should those be rejected?
>>> ~kavon
>>>
>>>> On Apr 17, 2017, at 8:56 PM, Manuel Jacob <me at manueljacob.de> wrote:
>>>>
>>>> Hi Kavon,
>>>>
>>>> Is there a reason you can't use the algorithm from the paper "A Correspondence between Continuation Passing Style and Static Single Assignment Form" to convert your IR to LLVM's SSA IR?
>>>>
>>>> -Manuel
>>>>
>>>> On 2017-04-17 17:30, Kavon Farvardin via llvm-dev wrote:
>>>>> Summary
>>>>> =======
>>>>> There is a need for dedicated continuation-passing style (CPS) calls in LLVM to
>>>>> support functional languages. Herein I describe the problem and propose a
>>>>> solution. Feedback and/or tips are greatly appreciated, as our goal is to
>>>>> implement these changes so they can be merged into LLVM trunk.
>>>>> Problem
>>>>> =======
>>>>> Implementations of functional languages like Haskell and ML (e.g., GHC and
>>>>> Manticore) use a continuation-passing style (CPS) transformation in order to
>>>>> manage the call stack explicitly. This is done prior to generating LLVM IR, so
>>>>> the implicit call stack within LLVM is not used for call and return.
>>>>> When making a non-tail call while in CPS, we initialize a stack frame for the
>>>>> return through our own stack pointer, and then pass that stack pointer to the
>>>>> callee when we jump to it. It is here when we run into a problem in LLVM.
>>>>> Consider the following CPS call to @bar and how it will return:
>>>>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>>>> define void @foo (i8** %sp, ...) {
>>>>> someBlk:
>>>>>     ; ...
>>>>>     ; finish stack frame by writing return address
>>>>>   %retAddr = blockaddress(@foo, %retpt)
>>>>>   store i8* %retAddr, i8** %sp
>>>>>     ; jump to @bar
>>>>>   tail call void @bar(i8** %sp, ...)
>>>>> retpt: 				; <- how can @bar "call" %retpt?
>>>>>    %sp2 = ???
>>>>>    %val = ???
>>>>>    ; ...
>>>>> }
>>>>> define void @bar (i8** %sp, ...) {
>>>>> 	  ; perform a return
>>>>> 	%retAddr0 = load i8*, i8** %sp
>>>>> 	%retAddr1 = bitcast i8* %retAddr0 to void (i8**, i64)*
>>>>> 	%val = bitcast i64 1 to i64
>>>>> 	  ; jump back to %retpt in @foo, passing %sp and %val
>>>>> 	tail call void %retAddr1(i8** %sp, i64 %val)
>>>>> }
>>>>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>>>> There is currently no way to jump back to %retpt from another function, as block
>>>>> addresses have restricted usage in LLVM [1]. Our main difficulty is that we
>>>>> cannot jump to a block address without knowing its calling convention, i.e., the
>>>>> particular machine registers (or memory locations) that the block expects
>>>>> incoming values to be passed in.
>>>>> The workaround we have been using in GHC for LLVM is to break apart every
>>>>> function, placing the code for the continuation of each call into a new
>>>>> function. We do this only so that we can store a function pointer instead of a
>>>>> block address to our stack. This particularly gross transformation inhibits
>>>>> optimizations in both GHC and LLVM, and we would like to remove the need for it.
>>>>> Proposal
>>>>> ========
>>>>> I believe the lowest-impact method of fixing this problem with LLVM is the
>>>>> following:
>>>>> First, we add a special 'cps' call instruction marker to be used on non-tail
>>>>> calls. Then, we use a specialized calling convention for these non-tail calls,
>>>>> which fix the returned values to specific locations in the machine code [2].
>>>>> To help illustrate what's going on, let's rewrite the above example using the
>>>>> proposed 'cps' call:
>>>>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>>>> define { ... } @foo (i8** %sp, ...) {
>>>>> someBlk:
>>>>>     ; ...
>>>>>     ; finish stack frame by writing return address
>>>>>   %retAddr = blockaddress(@foo, %retpt)
>>>>>   store i8* %retAddr, i8** %sp
>>>>>     ; jump to @bar
>>>>>   %retVals = cps call ghccc {i8**, i64} @bar (i8** %sp, ...)
>>>>>   br label %retpt
>>>>> retpt:
>>>>>    %sp2 = extractvalue {i8**, i64} %retVals, 0
>>>>>    %val = extractvalue {i8**, i64} %retVals, 1
>>>>>    ; ...
>>>>> }
>>>>> define {i8**, i64} @bar (i8** %sp, ...) {
>>>>> 	  ; perform a return
>>>>> 	%retAddr0 = load i8*, i8** %sp
>>>>> 	%retAddr1 = bitcast i8* %retAddr0 to {i8**, i64} (i8**, i64)*
>>>>> 	%val = bitcast i64 1 to i64
>>>>> 	  ; jump back to %retpt in @foo, passing %sp and %val
>>>>> 	tail call ghccc void %retAddr1(i8** %sp, i64 %val)
>>>>> 	unreachable   ; <- ideally this would be our terminator,
>>>>> 	              ; but 'unreachable' breaks TCO, so we will
>>>>> 	              ; emit a ret of the struct "returned" by the call.
>>>>> }
>>>>> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
>>>>> The important point here is that the 'cps' marked call will lower to a jump. The
>>>>> 'cps' call marker means that the callee knows how to return using the arguments
>>>>> explicitly passed to it, i.e., the stack pointer %sp. The callee cannot use a
>>>>> 'ret' instruction if it is 'cps' called.
>>>>> Either before or during 'cps' call lowering, any instructions following the
>>>>> 'cps' call to @bar are sunk into the the block %retpt, and the unconditional
>>>>> branch to %retpt is deleted/ignored. We include that branch to preserve
>>>>> control-flow information for LLVM IR optimization passes.
>>>>> The 'extractvalue' instructions are what ensure the calling convention of
>>>>> %retpt, since the fields of the struct %retVals are returned in physical
>>>>> registers dictated by the (modified) ghccc convention. Those same physical
>>>>> registers are used by the ghccc tail call in @bar when it jumps back to %retpt.
>>>>> So, the call & return convention of ghccc ensures that everything matches up.
>>>>> Interaction with LLVM
>>>>> =====================
>>>>> (1) Caller-saved Values
>>>>> One may wonder how this would work if there are caller-saved values of the 'cps'
>>>>> call. But, in our first example, which closely matches what CPS code looks like,
>>>>> the call to @bar was in tail position. Thus, in the second example, there are no
>>>>> caller-saved values for the 'cps' call to @bar, as all live values were passed
>>>>> as arguments in the call.
>>>>> This caller-saved part is a bit subtle. It works fine in my experience [2] when
>>>>> @bar is a function not visible to LLVM. My impression is that even if @bar is
>>>>> visible to LLVM, there is still no issue, but if you can think of any corner
>>>>> cases that would be great!
>>>>> (2) Inlining
>>>>> My gut feeling is that we cannot inline a 'cps' marked call-site without more
>>>>> effort. This is because we might end up with something odd like this once the
>>>>> dust settles:
>>>>>     %retAddr = blockaddress(@foo, %retpt)
>>>>>     %retAddr1 = bitcast i8* %retAddr to {i8**, i64} (i8**, i64)*
>>>>>     tail call ghccc %retAddr1 ( %sp, ... )
>>>>> We could add a pass that turns the above sequence into just an unconditional
>>>>> branch to %retpt, using a phi-node to replace each 'extractvalue' instruction in
>>>>> that block.
>>>>> I'm not sure whether inlining in LLVM is important for us yet, as we tend to
>>>>> inline quite a lot before generating LLVM IR. I don't think this additional fix-
>>>>> up pass would be too difficult to implement if it's desired.
>>>>> Implementation Sketch and Conclusion
>>>>> ====================================
>>>>> My current plan is to add this special lowering of 'cps' calls during the
>>>>> translation from LLVM IR to the SelectionDAG. I welcome any suggestions or tips
>>>>> on the best way to approach this. An important goal for us is to merge this into
>>>>> trunk since we do not want to bundle a special version of LLVM with GHC.
>>>>> Please let me know soon if you have any objections to this feature.
>>>>> Thanks for reading,
>>>>> Kavon
>>>>> References
>>>>> ==========
>>>>> [1] http://llvm.org/docs/LangRef.html#blockaddress
>>>>> [2] http://kavon.farvard.in/papers/ml16-cwc-llvm.pdf
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>> _______________________________________________
>>> LLVM Developers mailing list
>>> llvm-dev at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev


-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project



More information about the llvm-dev mailing list