[llvm-dev] [RFC] Adding CPS call support

Reid Kleckner via llvm-dev llvm-dev at lists.llvm.org
Wed Apr 19 11:34:02 PDT 2017


I don't think you need a terminator at the IR level, you need one at the MI
level. You can lower a CPS-call to some call pseudo-instruction during ISel
and then break the MBB in two at some later point.

On Wed, Apr 19, 2017 at 10:38 AM, Kavon Farvardin <kavon at farvard.in> wrote:

> > The semantics around inlining alone are problematic enough to warrant
> serious hesitation.
>
> There are nicer ways to embed CPS call/return into LLVM; I just figured
> that there would not be much support for adding a new terminator because it
> would change a lot of code. Ideally we would have a block terminator like:
>
>     cps call ghccc @bar (.. args ..) returnsto label %retpt
>
> Where the "returnsto" is optional to cover both CPS calls and returns.
> Another alternative would be to emulate this terminator using an intrinsic.
>
> So, my question is: would there be more support for a version of this
> terminator (whether as an intrinsic or not) instead of what I was proposing
> earlier?
>
> ~kavon
>
> > On Apr 18, 2017, at 7:24 PM, Philip Reames via llvm-dev <
> llvm-dev at lists.llvm.org> wrote:
> >
> >
> >
> > On 04/17/2017 08:30 AM, Kavon Farvardin via llvm-dev wrote:
> >> Summary
> >> =======
> >>
> >> There is a need for dedicated continuation-passing style (CPS) calls in
> LLVM to
> >> support functional languages. Herein I describe the problem and propose
> a
> >> solution. Feedback and/or tips are greatly appreciated, as our goal is
> to
> >> implement these changes so they can be merged into LLVM trunk.
> >>
> >>
> >> Problem
> >> =======
> >>
> >> Implementations of functional languages like Haskell and ML (e.g., GHC
> and
> >> Manticore) use a continuation-passing style (CPS) transformation in
> order to
> >> manage the call stack explicitly. This is done prior to generating LLVM
> IR, so
> >> the implicit call stack within LLVM is not used for call and return.
> >>
> >> When making a non-tail call while in CPS, we initialize a stack frame
> for the
> >> return through our own stack pointer, and then pass that stack pointer
> to the
> >> callee when we jump to it. It is here when we run into a problem in
> LLVM.
> >>
> >> Consider the following CPS call to @bar and how it will return:
> >>
> >> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> >>
> >> define void @foo (i8** %sp, ...) {
> >> someBlk:
> >>     ; ...
> >>     ; finish stack frame by writing return address
> >>   %retAddr = blockaddress(@foo, %retpt)
> >>   store i8* %retAddr, i8** %sp
> >>     ; jump to @bar
> >>   tail call void @bar(i8** %sp, ...)
> >>
> >>  retpt:                              ; <- how can @bar "call" %retpt?
> >>    %sp2 = ???
> >>    %val = ???
> >>    ; ...
> >>
> >>  }
> >>
> >>  define void @bar (i8** %sp, ...) {
> >>        ; perform a return
> >>      %retAddr0 = load i8*, i8** %sp
> >>      %retAddr1 = bitcast i8* %retAddr0 to void (i8**, i64)*
> >>      %val = bitcast i64 1 to i64
> >>        ; jump back to %retpt in @foo, passing %sp and %val
> >>      tail call void %retAddr1(i8** %sp, i64 %val)
> >>  }
> >>
> >> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> >>
> >> There is currently no way to jump back to %retpt from another function,
> as block
> >> addresses have restricted usage in LLVM [1]. Our main difficulty is
> that we
> >> cannot jump to a block address without knowing its calling convention,
> i.e., the
> >> particular machine registers (or memory locations) that the block
> expects
> >> incoming values to be passed in.
> >>
> >> The workaround we have been using in GHC for LLVM is to break apart
> every
> >> function, placing the code for the continuation of each call into a new
> >> function. We do this only so that we can store a function pointer
> instead of a
> >> block address to our stack. This particularly gross transformation
> inhibits
> >> optimizations in both GHC and LLVM, and we would like to remove the
> need for it.
> > Can you give a bit more information on what this transformation breaks
> optimization wise?  You're essentially explicitly representing the
> continuation and converting all calls into tail calls with an explicit
> continuation argument. I would expect the LLVM optimizer to do reasonable
> well with such a representation.
> >>
> >>
> >> Proposal
> >> ========
> >>
> >> I believe the lowest-impact method of fixing this problem with LLVM is
> the
> >> following:
> >>
> >> First, we add a special 'cps' call instruction marker to be used on
> non-tail
> >> calls. Then, we use a specialized calling convention for these non-tail
> calls,
> >> which fix the returned values to specific locations in the machine code
> [2].
> >>
> >> To help illustrate what's going on, let's rewrite the above example
> using the
> >> proposed 'cps' call:
> >>
> >> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> >>
> >> define { ... } @foo (i8** %sp, ...) {
> >> someBlk:
> >>     ; ...
> >>     ; finish stack frame by writing return address
> >>   %retAddr = blockaddress(@foo, %retpt)
> >>   store i8* %retAddr, i8** %sp
> >>     ; jump to @bar
> >>   %retVals = cps call ghccc {i8**, i64} @bar (i8** %sp, ...)
> >>   br label %retpt
> >>
> >>  retpt:
> >>    %sp2 = extractvalue {i8**, i64} %retVals, 0
> >>    %val = extractvalue {i8**, i64} %retVals, 1
> >>    ; ...
> >>
> >>  }
> >>
> >>  define {i8**, i64} @bar (i8** %sp, ...) {
> >>        ; perform a return
> >>      %retAddr0 = load i8*, i8** %sp
> >>      %retAddr1 = bitcast i8* %retAddr0 to {i8**, i64} (i8**, i64)*
> >>      %val = bitcast i64 1 to i64
> >>        ; jump back to %retpt in @foo, passing %sp and %val
> >>      tail call ghccc void %retAddr1(i8** %sp, i64 %val)
> >>
> >>      unreachable   ; <- ideally this would be our terminator,
> >>                    ; but 'unreachable' breaks TCO, so we will
> >>                    ; emit a ret of the struct "returned" by the call.
> >>  }
> >>
> >> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> >>
> >> The important point here is that the 'cps' marked call will lower to a
> jump. The
> >> 'cps' call marker means that the callee knows how to return using the
> arguments
> >> explicitly passed to it, i.e., the stack pointer %sp. The callee cannot
> use a
> >> 'ret' instruction if it is 'cps' called.
> >>
> >> Either before or during 'cps' call lowering, any instructions following
> the
> >> 'cps' call to @bar are sunk into the the block %retpt, and the
> unconditional
> >> branch to %retpt is deleted/ignored. We include that branch to preserve
> >> control-flow information for LLVM IR optimization passes.
> >>
> >> The 'extractvalue' instructions are what ensure the calling convention
> of
> >> %retpt, since the fields of the struct %retVals are returned in physical
> >> registers dictated by the (modified) ghccc convention. Those same
> physical
> >> registers are used by the ghccc tail call in @bar when it jumps back to
> %retpt.
> >> So, the call & return convention of ghccc ensures that everything
> matches up.
> >>
> >>
> >> Interaction with LLVM
> >> =====================
> >>
> >> (1) Caller-saved Values
> >>
> >> One may wonder how this would work if there are caller-saved values of
> the 'cps'
> >> call. But, in our first example, which closely matches what CPS code
> looks like,
> >> the call to @bar was in tail position. Thus, in the second example,
> there are no
> >> caller-saved values for the 'cps' call to @bar, as all live values were
> passed
> >> as arguments in the call.
> >>
> >> This caller-saved part is a bit subtle. It works fine in my experience
> [2] when
> >> @bar is a function not visible to LLVM. My impression is that even if
> @bar is
> >> visible to LLVM, there is still no issue, but if you can think of any
> corner
> >> cases that would be great!
> >>
> >> (2) Inlining
> >>
> >> My gut feeling is that we cannot inline a 'cps' marked call-site
> without more
> >> effort. This is because we might end up with something odd like this
> once the
> >> dust settles:
> >>
> >>     %retAddr = blockaddress(@foo, %retpt)
> >>     %retAddr1 = bitcast i8* %retAddr to {i8**, i64} (i8**, i64)*
> >>     tail call ghccc %retAddr1 ( %sp, ... )
> >>
> >> We could add a pass that turns the above sequence into just an
> unconditional
> >> branch to %retpt, using a phi-node to replace each 'extractvalue'
> instruction in
> >> that block.
> >>
> >> I'm not sure whether inlining in LLVM is important for us yet, as we
> tend to
> >> inline quite a lot before generating LLVM IR. I don't think this
> additional fix-
> >> up pass would be too difficult to implement if it's desired.
> >>
> >>
> >> Implementation Sketch and Conclusion
> >> ====================================
> >>
> >> My current plan is to add this special lowering of 'cps' calls during
> the
> >> translation from LLVM IR to the SelectionDAG. I welcome any suggestions
> or tips
> >> on the best way to approach this. An important goal for us is to merge
> this into
> >> trunk since we do not want to bundle a special version of LLVM with GHC.
> >>
> >> Please let me know soon if you have any objections to this feature.
> >>
> >> Thanks for reading,
> >> Kavon
> > Honestly, this proposed design seems seriously problematic.  The
> semantics around inlining alone are problematic enough to warrant serious
> hesitation.  I think you need to take a step back, describe the original
> problem you're trying to solve and why the standard conversion approaches
> don't work out.  You'd need to strongly motivate a change of this magnitude
> and I really don't feel like you've done so so far.
> >>
> >>
> >> References
> >> ==========
> >>
> >> [1] http://llvm.org/docs/LangRef.html#blockaddress
> >> [2] http://kavon.farvard.in/papers/ml16-cwc-llvm.pdf
> >>
> >>
> >> _______________________________________________
> >> LLVM Developers mailing list
> >> llvm-dev at lists.llvm.org
> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> >
> > _______________________________________________
> > LLVM Developers mailing list
> > llvm-dev at lists.llvm.org
> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20170419/0fcbf034/attachment.html>


More information about the llvm-dev mailing list