[PATCH] [ms-cxxabi] Use inalloca on win32 when passing non-trivial C++ objects

Tue Jan 28 16:05:26 PST 2014


================
Comment at: lib/CodeGen/CGCall.cpp:2276-2279
@@ +2275,6 @@
+    if (RD->hasNonTrivialDestructor()) {
+      // Create a no-op GEP between the placeholder and the cleanup so we can
+      // RAUW it successfully.  It also serves as a marker of the first
+      // instruction where the cleanup is active.
+      llvm::Value *Addr = Builder.CreateConstInBoundsGEP1_32(Slot.getAddr(), 0);
+      pushFullExprCleanup<DestroyUnpassedArg>(EHCleanup, Addr, type);
----------------
(For posterity...) As discussed offline, we should defer the RAUW until after we've finished emitting IR for the function, rather than inserting a dummy GEP here.

================
Comment at: lib/CodeGen/CGDecl.cpp:1671
@@ -1666,1 +1670,3 @@
+  LValue lv = MakeAddrLValue(DeclPtr, Ty, Align);
+  if (hasScalarEvaluationKind(Ty)) {
     Qualifiers qs = Ty.getQualifiers();
----------------
You sometimes also call this on line 1655. Maybe avoid computing it twice on that path?

================
Comment at: lib/CodeGen/CodeGenFunction.cpp:602
@@ -595,1 +601,3 @@
+    llvm::Value *Addr = Builder.CreateStructGEP(EI, Idx);
+    ReturnValue = Builder.CreateLoad(Addr, "agg.result");
   } else {
----------------
I wonder if we can delay this load, or mark this memory as invariant. I'm worried that in some cases we won't be able to move the load to the end of the function (or wherever we initialize the return object), and if we can't, we'll end up wasting a register or a spill slot on it.

================
Comment at: lib/CodeGen/CGCall.cpp:1933-1935
@@ +1932,5 @@
+
+    // FIXME: Either emit a copy constructor call, or figure out how to do
+    // guaranteed tail calls with perfect forwarding in LLVM.
+    CGM.ErrorUnsupported(param, "non-trivial argument copy for thunk");
+    EmitNullInitialization(Slot.getAddr(), type);
----------------
Reid Kleckner wrote:
> Richard Smith wrote:
> > Let's move ahead with this as-is, but... in every case where we emit a delegate call, we (should) have a code path which does the same thing but duplicates the function body instead (in order to support varargs functions), and we should probably use that codepath in any case where we have an inalloca argument.
> > 
> > I say "(should)" because this is slightly broken for vtable thunks, and entirely missing for lambda static invoker delegates.
> This doesn't work in general because we don't always have the definition of the function we're trying to delegate to available in this TU.
> 
> A good example is a pointer to a virtual member function that takes something non-trivial by value.
> 
> When we emit a vtable, we might need to emit vtable thunks without having the definition, and we can't rely on the other TU to emit the thunk for us and still claim to be ABI compatible with MSVC.
> 
> Basically, duplicating the body isn't a complete solution.  The only complete solution is to add something to LLVM that guarantees a tail call with perfect argument forwarding without requiring fastcc.
Oh, yuck. Is this even possible in the fully general case? Do we ever need to insert code after the call (for covariant return thunks maybe)?
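
To make the covariant case concrete, here is a minimal standalone sketch. The type and function names are made up for illustration and are not from the patch. It assumes a typical ABI where the B subobject of C sits at a non-zero offset. Under that assumption, a call to get() through Base has to convert the C * result to a B * after the callee returns, so a thunk for that slot could not be a pure tail call.

```cpp
#include <cassert>

// Hypothetical types for illustration only; none of these names come from the patch.
struct A { virtual ~A() = default; int a = 1; };
struct B { virtual ~B() = default; int b = 2; };
// B is a non-primary base of C, so a C* and the address of its B subobject
// are expected to differ on common ABIs (an assumption, checked below).
struct C : A, B { int c = 3; };

struct Base {
  virtual ~Base() = default;
  virtual B *get() { return nullptr; }
};

struct Derived : Base {
  C obj;
  // Covariant return: overrides B *Base::get() with a C * result.
  C *get() override { return &obj; }
};

// Returns true when the pointer seen through Base::get() has a different
// address from the C object itself, i.e. the result was adjusted after the
// call to Derived::get() returned.
bool covariantReturnIsAdjusted() {
  Derived d;
  Base *base = &d;
  B *viaBase = base->get();
  // The language guarantees this: the caller sees the B subobject of obj.
  assert(viaBase == static_cast<B *>(&d.obj));
  return static_cast<void *>(viaBase) != static_cast<void *>(&d.obj);
}
```

If covariantReturnIsAdjusted() returns true on the target, the adjustment is real work that happens after the call, which is the case a guaranteed-tail-call design would have to handle or exclude.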


http://llvm-reviews.chandlerc.com/D2636