[llvm-dev] [RFC] Adding CPS call support

Mon Apr 24 16:47:26 PDT 2017

On Fri, Apr 21, 2017 at 5:47 AM, Kavon Farvardin <kavon at farvard.in> wrote:

> > I don't think you need a terminator at the IR level, you need one at the
> > MI level. You can lower a CPS-call to some call pseudo-instruction during
> > ISel and then break the MBB in two at some later point.
>
> Thanks for this suggestion, Reid! This has turned out to be a lot easier
> than doing it during isel. I'm currently working on modifying the MBB at
> the CPS pseudo-instruction during ExpandISelPseudos.
>

Glad to hear it! :)
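
To make the "break the MBB in two" idea concrete, here is a toy sketch of
splitting a block right after a CPS pseudo. Everything in it (ToyInstr,
ToyBlock, splitAtCpsCall, the isReturnPoint flag) is hypothetical and only
models the shape of the transformation; it is not LLVM's MachineBasicBlock
API and not what the actual patch does.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; NOT LLVM's MachineInstr/MachineBasicBlock.
struct ToyInstr {
    std::string opcode;
    bool isCpsCallPseudo;  // hypothetical marker for the CPS call pseudo
};

struct ToyBlock {
    std::vector<ToyInstr> instrs;
    bool isReturnPoint = false;  // hypothetical flag: block begins at a CPS return point
};

// Split `block` after its first CPS pseudo. The head ends with the pseudo;
// the tail holds everything after it and is marked as a return point.
// If the block has no CPS pseudo, the head is a copy of the whole block and
// the tail stays empty.
std::pair<ToyBlock, ToyBlock> splitAtCpsCall(const ToyBlock &block) {
    ToyBlock head;
    ToyBlock tail;
    head.isReturnPoint = block.isReturnPoint;

    std::size_t i = 0;
    bool found = false;
    while (i < block.instrs.size() && !found) {
        head.instrs.push_back(block.instrs[i]);
        found = block.instrs[i].isCpsCallPseudo;
        ++i;
    }
    if (found) {
        tail.isReturnPoint = true;
        for (; i < block.instrs.size(); ++i) {
            tail.instrs.push_back(block.instrs[i]);
        }
    }
    return std::make_pair(head, tail);
}
```

In this model the tail block is where a "new live-in physical registers"
marker would have to live, since control reaches it from the callee's
return rather than by falling through from the head.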

> > It seems that what you need to support here is multi-prologue functions.
>
>
> Basically, yes. I think of it more like needing functions that have
> multiple entry blocks. If we could attach a physical-register convention to
> an IR block, that would solve everything.
>

I avoid the phrase "multi-entrypoint functions" because there is
still only one entry block that dominates the rest.

You should be able to add new live-in physical registers like we do for
landingpads, but that may be a special case. I could imagine adding a new
MBB flag that allows new live-in physical registers.

> > Current coroutines assume a large shared prologue and switch on entry. I
> > was speaking with Reid earlier today, and that is something that should be
> > possible.
>
>
> Yes, we could also manually generate, in each function, that large switch
> that dispatches to these return points, but of course that's a lot of
> overhead to pay for call/return.
>

Large is relative, though. What's one indirect branch in the prologue
compared to the cost of duplicating the prologue/epilogue pair at every CPS
call point when you use all 5 GPR callee-saved registers and some XMM
callee-saved registers? Especially in the context of C++, where coroutines
are vanishingly rare, and usually occur around I/O points or other
expensive blocking operations.

Anyway, that's why, for now, we haven't seriously considered duplicating
the prologue in LLVM. It just didn't seem worth the complexity (we'd also
need to invent call frame pseudos to unwind the stack...)
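
For readers who have not seen the "shared prologue and switch on entry"
shape, here is a minimal hand-written sketch of it. The names (ToyFrame,
resumeIndex, step) are made up for illustration, and the arithmetic in each
case is arbitrary filler; this is not what LLVM's coroutine lowering emits,
just the control-flow pattern being discussed.

```cpp
#include <vector>

// Hypothetical frame: which return point to resume at, plus one value that
// has to survive across the suspension.
struct ToyFrame {
    int resumeIndex = 0;
    int acc = 0;
};

// One function with one (conceptual) prologue. Every entry goes through the
// same switch, which costs a single indirect branch, and then lands on the
// right return point. Returns true while there is more work to do.
bool step(ToyFrame &f, std::vector<int> &log) {
    // A shared prologue would sit here: callee-saved registers are saved
    // once, no matter which return point we are about to resume.
    switch (f.resumeIndex) {
    case 0:
        f.acc = 1;
        log.push_back(f.acc);
        f.resumeIndex = 1;  // the "CPS call": leave and come back at case 1
        return true;
    case 1:
        f.acc += 10;
        log.push_back(f.acc);
        f.resumeIndex = 2;
        return true;
    case 2:
        f.acc *= 2;
        log.push_back(f.acc);
        f.resumeIndex = 3;  // finished; later calls fall into default
        return false;
    default:
        return false;
    }
}
```

The alternative under discussion would turn each case label into its own
entry with its own prologue/epilogue pair, which removes the switch but
duplicates the register save/restore code at every return point.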

> > You would still need to do something similar to the coroutine support to
> > find out which values are live across a cps call and put them into some
> > frame that gets restored in the prologue.
>
>
> Luckily, we've already done this in order to lay out our stack (this is
> part of the CPS transformation in GHC), so, we don't have anything live
> across the call once we reach the point where we generate LLVM IR. We'd
> also want to avoid putting anything in LLVM coroutine frames anyway, since
> the garbage collector needs to know the layout of the frame to know which
> slots contain pointers.
>
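
To illustrate why the collector cares about frame layout, here is a toy
sketch of a frame descriptor with a pointer bitmap. ToyFrameLayout,
pointerMask, and pointerSlots are hypothetical names, and this is an
assumption about the general technique only; it does not describe GHC's
actual frame descriptors.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical descriptor for one stack frame: how many slots it has and
// which of them hold heap pointers (bit i set means slot i is a pointer).
struct ToyFrameLayout {
    std::size_t numSlots;
    std::uint64_t pointerMask;
};

// Return the indices of the slots a collector would have to trace.
// Only the first 64 slots are representable in this toy bitmap.
std::vector<std::size_t> pointerSlots(const ToyFrameLayout &layout) {
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < layout.numSlots && i < 64; ++i) {
        if ((layout.pointerMask >> i) & 1u) {
            out.push_back(i);
        }
    }
    return out;
}
```

A frame whose layout is chosen by the compiler backend, rather than by the
front end that also emits the descriptor, would leave a collector like this
with no reliable mask to consult.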