[LLVMdev] Starting implementation of 'inalloca' parameter attribute for MS C++ ABI pass-by-value

Tue Oct 22 14:14:58 PDT 2013

I wanted to mention that I'm planning to start writing and sending out
patches for this.

Naming the attribute 'alloca' was really confusing, so I'd like to change
it to 'inalloca', which follows the preposition pattern of inreg and byval.

After discussion, we decided it was silly to add stackbase uses to alloca
instructions.  They should stay simple.

Instead, we'll clarify that it is illegal for an optimization to hoist an
alloca used as an inalloca argument across a stacksave, and fix any
transforms that do this.  In particular, I'll look at the inliner, which is
the most likely to move allocas.

Furthermore, any call that uses an inalloca argument must have an
associated stackrestore field, regardless of whether it's callee cleanup
(thiscall) or caller cleanup (cdecl).  The backend will be responsible for
restoring the stack pointer on both return and exception edges.

On Thu, Jul 25, 2013 at 2:38 PM, Reid Kleckner <rnk at google.com> wrote:

> Hi LLVM folks,
>
> To properly implement pass-by-value in the Microsoft C++ ABI, we need to be
> able to take the address of an outgoing call argument slot.  This is
> http://llvm.org/PR5064 .
>
> Problem
> -------
>
> On Windows, C structs are pushed right onto the stack in line with the other
> arguments.  In LLVM, we use byval to model this, and it works for C structs.
> However, C++ records are also passed this way, and reusing byval for C++
> records breaks C++ object identity rules.
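
As a rough illustration of the object identity problem, here is a minimal C++ sketch. The type `SelfAware` and the helper functions are hypothetical names made up for this note, not anything from LLVM or Clang. The bitwise copy only approximates what a byval-style memcpy into the argument slot would do: the bytes arrive in the new slot, but no constructor runs there, so any address the object recorded about itself still points at the old location.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical type: each constructor records the address of the object
// being constructed, so the object "knows" where it lives.
struct SelfAware {
    const SelfAware *self;
    SelfAware() : self(this) {}
    SelfAware(const SelfAware &) : self(this) {}
    bool identity_ok() const { return self == this; }
};

// A copy made through the copy constructor records its own new address.
inline bool copy_ctor_preserves_identity() {
    SelfAware original;
    SelfAware copy(original);
    return copy.identity_ok();
}

// Assumption: a byval-style copy behaves like copying raw bytes into a
// separate slot.  The recorded pointer is read back out of the bytes and
// compared with the slot's address.
inline bool bitwise_copy_preserves_identity() {
    SelfAware original;
    alignas(SelfAware) unsigned char slot[sizeof(SelfAware)];
    std::memcpy(slot, &original, sizeof(SelfAware));
    const SelfAware *recorded;
    std::memcpy(&recorded, slot, sizeof(recorded));  // 'self' is the only member
    return recorded == reinterpret_cast<const SelfAware *>(slot);
}

// Small usage helper: the constructor path keeps identity, the byte copy
// does not.
inline void run_identity_demo() {
    assert(copy_ctor_preserves_identity());
    assert(!bitwise_copy_preserves_identity());
}
```

Calling `run_identity_demo()` exercises both paths. The second helper reports a mismatch because the bytes still hold the address of `original`, not the address of the slot.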
>
> In order to implement the ABI properly, we need a way to get the address of
> the argument slot *before* we start the call, so that we can either construct
> the object in place on the stack or at least call its copy constructor.
>
> This is further complicated by the possibility of nested calls passing
> arguments by value.  A good general case to think about is a binary tree of
> calls that take two arguments by value and return by value:
>
>   struct A { int a; };
>   A foo(A, A);
>   foo(foo(A(), A()), foo(A(), A()));
>
> To complete the outer call to foo, we have to adjust the stack for its
> outgoing arguments before the inner calls to foo, and arrange for the sret
> pointers to point to those slots.
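
The source-level effect of pointing the sret pointers at the outer argument slots can be seen with an instrumented variant of the example. This is only a sketch: `TracedA`, its counters, and the default value of 1 are hypothetical additions for counting, not part of the original `struct A`. With copy elision (guaranteed since C++17, and performed by the common compilers in earlier modes too), each inner result is constructed directly as the outer call's parameter, so the copy constructor should not run at all.

```cpp
#include <cassert>

// Hypothetical instrumented version of 'struct A { int a; };'.
struct TracedA {
    int a;
    static int constructed;  // objects built by the non-copy constructors
    static int copied;       // objects built by the copy constructor
    TracedA() : a(1) { ++constructed; }
    explicit TracedA(int v) : a(v) { ++constructed; }
    TracedA(const TracedA &other) : a(other.a) { ++copied; }
};
int TracedA::constructed = 0;
int TracedA::copied = 0;

// Takes two arguments by value and returns by value, as in the example.
TracedA foo(TracedA x, TracedA y) { return TracedA(x.a + y.a); }

// Evaluates the binary tree of calls and returns the final value.
int nested_call_result() {
    TracedA::constructed = 0;
    TracedA::copied = 0;
    int result = foo(foo(TracedA(), TracedA()), foo(TracedA(), TracedA())).a;
    // Seven objects in total: four leaf temporaries plus three results.
    assert(TracedA::constructed == 7);
    return result;
}
```

After `nested_call_result()` returns, `TracedA::copied` can be inspected to see whether any argument or return value went through the copy constructor on a given compiler.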
>
> To make this even more complicated, C++ methods are typically callee
> cleanup (thiscall), but free functions are caller cleanup (cdecl).
>
> Features
> --------
>
> A few weeks ago, I sat down with some folks at Google and we came up with this
> proposal, which tries to add the minimum set of LLVM IR features to make this
> possible.
>
> 1. Allow alloca instructions to use llvm.stacksave values to indicate scoping.
>
> This creates an SSA dependence between the alloca instruction and the
> stackrestore instruction that prevents optimizers from accidentally reordering
> them in ways that don't verify.  llvm.stacksave in this case is taking on a
> role similar to CALLSEQ_START in the SelectionDAG.
>
> LLVM can also apply this to dynamic allocas from inline functions to ensure
> that optimizers don't move them.
>
> 2. Add an 'alloca' attribute for parameters.
>
> Only an alloca value can be passed to a parameter with this attribute.  It
> cannot be bitcast or GEPed.  An alloca can only be passed in this way once.
> It can be passed as a normal pointer to any number of other functions.
>
> Aside from allocas bounded by llvm.stacksave and llvm.stackrestore calls,
> there can be no allocas between the creation of an alloca passed with this
> attribute and its associated call.
>
> 3. Add a stackrestore field to call and invoke instructions.
>
> This models calling conventions which do their own cleanup, and ensures that
> even after optimizations have perturbed the IR, we don't consider the allocas
> to be live.  For caller cleanup conventions, while the callee may have called
> destructors on its arguments, the allocas can be considered live until the
> stack restore.
>
> Example
> -------
>
> A single call to foo, assuming it is stdcall, would be lowered something like:
>
> %res = alloca %struct.A
> %base = llvm.stacksave()
> %arg1 = alloca %struct.A, stackbase %base
> %arg2 = alloca %struct.A, stackbase %base
> call @A_ctor(%arg1)
> call @A_ctor(%arg2)
> call x86_stdcallcc @foo(%res sret, %arg1 alloca, %arg2 alloca),
>     stackrestore %base
>
> If control does not flow through a call or invoke with a stackrestore field,
> then manual calls to llvm.stackrestore must be emitted before another call or
> invoke can use an 'alloca' argument.  The manual stack restore call ends the
> lifetime of the allocas.  This is necessary to handle unwind edges from
> argument expression evaluation as well as the case where foo is not callee
> cleanup.
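
The save/alloca/restore discipline in the example can be modeled with a toy bump allocator. This is only a mental model, under stated assumptions: `ToyStack` and `model_single_call` are hypothetical names, the stack grows upward here for simplicity (the real x86 stack grows down), and each slot is assumed to be 4 bytes. It shows that restoring to the saved mark ends the lifetime of everything allocated after the save, while `%res`, allocated before it, survives.

```cpp
#include <cassert>
#include <cstddef>

// Toy model of a stack pointer with save/restore marks.
class ToyStack {
public:
    // Analogous to llvm.stacksave: remember the current top.
    std::size_t save() const { return top_; }

    // Analogous to a dynamic alloca: reserve n bytes, return their offset.
    std::size_t alloca_bytes(std::size_t n) {
        std::size_t at = top_;
        top_ += n;
        return at;
    }

    // Analogous to llvm.stackrestore: drop everything above the mark.
    void restore(std::size_t mark) {
        assert(mark <= top_);
        top_ = mark;
    }

    std::size_t top() const { return top_; }

private:
    std::size_t top_ = 0;
};

// Mirrors the order of operations in the lowered example and returns the
// number of bytes still allocated afterwards.
inline std::size_t model_single_call() {
    ToyStack stack;
    stack.alloca_bytes(4);             // %res, outside the saved region
    std::size_t base = stack.save();   // %base
    stack.alloca_bytes(4);             // %arg1
    stack.alloca_bytes(4);             // %arg2
    // The constructor calls and the call to foo would happen here.
    stack.restore(base);               // ends the lifetime of %arg1 and %arg2
    return stack.top();
}
```

In this model only the 4 bytes for `%res` remain after the restore, whether the restore comes from the call's stackrestore field or from a manual llvm.stackrestore on another path.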
>
> Implementation
> --------------
>
> By starting out with the stack save and restore intrinsics, we can hopefully
> approach a slow but working implementation sooner rather than later.  The
> work should mostly be in the verifier, the IR, its parser, and the x86
> backend.
>
> I don't plan to start working on this immediately, but over the long run
> this will be really important to support well.
>
> ---
>
> That's all!  Please send feedback!  This is admittedly a really complicated
> feature and I'm sorry for inflicting it on the LLVM community, but it's
> obviously beyond my control.
>
>