[cfe-commits] [PATCH] Implement C++11 in-class member initializers

Fri Jun 10 11:06:39 PDT 2011

On May 23, 2011, at 5:48 PM, Richard Smith wrote:

> Hi all,
> 
> The attached patch adds support for C++11's in-class initialization of
> non-static data members:
> 
>  struct S {
>    int n = 0;
>  };

Very cool!

> For the most part, this is straightforward: the initializer is stored in
> the FieldDecl, parsing is delayed until the outermost class is complete
> (as is effectively required by the standard), and CXXConstructorInit has
> been extended to optionally pick the initializer from the FieldDecl (this
> allows the constructor definition to be built before the initializer is
> parsed).
> 
> There is a hotch-potch of horrifying holes here, however:
> 
> 1) C++11 doesn't allow in-class initialization of bit-fields. This would
> be a useful extension, and I think it's simply an oversight in the
> standard. I'm currently packing the initializer and bitfield width into
> the same pointer, so some rework is required to support this.
> 
> 2) There's an ordering issue with the implicit exception specification for
> a defaulted default constructor and in-class initializers. Consider:
> 
>  template<bool b> struct T { typedef int f; };
>  template<> struct T<true> { static void f(); };
>  struct S {
>    bool b = T<noexcept(S())>::f();
>  };
> 
> This is well-formed, but is meaningless: if S::S() is noexcept, then it is
> not, and vice versa. With a little tweaking, we can make it impossible to
> parse such things at all. I have implemented an ad-hoc rule to allow clang
> to try to escape from this pit with some dignity: while performing the
> deferred parsing of in-class initializers, the exception specification of
> the constructor (if needed) is temporarily set to a new value,
> EST_Unknown, and any expression which depends on whether it is noexcept is
> considered ill-formed and diagnosed.

EST_Unknown is fairly nebulous; how about EST_Delayed?

> Eventually, the same rule will need to be used while performing deferred
> parsing for default arguments and exception specifications, which suffer
> from the same problem: see test/SemaCXX/implicit-exception-spec.cpp.
> Implicit constexpr on defaulted default constructors will also suffer a
> similar problem, as will exception specifications for inherited
> constructors (if the standard is fixed to compute them correctly).
> 
> 3) Similarly to issue (2), we have a problem with the
> __has_nothrow_constructor type trait within the definition of a class with
> an in-class initializer. The specification for this trait says that it
> shall be true only if the constructor is known to be non-throwing, which
> isn't the case until we have parsed all the in-class initializers.
> Therefore it currently produces 'false' in such cases (and the value can
> subsequently be 'true' once the class is fully parsed). I'm not
> particularly happy with this, and would prefer for us to diagnose such
> uses.
> 
> 4) The implicit exception specifications of defaulted default constructors
> for classes containing in-class initializers, as implemented in this
> patch, do not conform to the standard. The standard says that an implicit
> exception specification throws an exception E iff any function it directly
> invokes does. This is wrong in two ways:
> 
>  void f();
>  struct S { int n = (throw "eep", 0); };
>  struct T { bool b = nothrow(f()); };
> 
> In the first example, S::S() should be implicitly declared as noexcept,
> because it does not invoke any throwing functions (despite containing a
> throw-expression!). In the second example, S::S() should be implicitly
> declared as noexcept(false), since the standard does not require a
> function to be potentially-evaluated for it to be invoked ("invoked" isn't
> formally defined, so this point is debatable, but that itself is a
> defect).
> 
> I think clang should aim to implement the intent of the standard here: the
> exception specification should include an exception E iff the implicit
> body could throw E. This is partially implemented: the default constructor
> is set to nothrow(false) if the in-class initializer can throw. A full
> solution would require extending the Expr::CanThrow mechanism to provide a
> list of potentially-thrown exceptions, but that is beyond the scope of
> this patch (especially considering that it does not match the standard).
> 
> Another corner case we currently get wrong here is that the implicit
> default constructor can invoke functions in a default argument for a base
> or member default constructor, and such functions' exception
> specifications are not currently being considered. This patch does nothing
> to address that. That's PR9918.
> 
> 5) This patch incorporates an interpretation of the proposed resolution to
> C++ core issue 325. Specifically, we reject this:
> 
>  struct S {
>    int n = T<0, 0>::f();
>    template<int, int> struct T { static int f(); };
>  };
> 
> because the ',' is interpreted as the end of the initializer.
> 
> 
> 
> A couple of other notes on the implementation:
> 
> * There's an outstanding issue with initializers being provided for
> typedefs in a class: currently clang asserts. This will be addressed by a
> later patch.
> * There's a pre-existing issue with clang accepting initializers for
> multiple members of a union: PR9996. This issue affects in-class
> initializers too.
> * This patch also addresses in-class initialization of static members of
> auto type.
> * We were parsing __attributes__/__asm__ and initializers in a
> non-gcc-compatible order; I've swapped them over.
> 
> Apologies for the length of this email; I lack the time to make it shorter.
> 
> I'd welcome any and all comments!



Index: test/CXX/class/p6-0x.cpp
===================================================================

--- test/CXX/class/p6-0x.cpp	(revision 0)
+++ test/CXX/class/p6-0x.cpp	(revision 0)
@@ -0,0 +1,15 @@
+// RUN: %clang_cc1 -fsyntax-only -verify %s -std=c++0x
+
+class Trivial { int n; void f(); };
+class NonTrivial1 { NonTrivial1(const NonTrivial1 &); };
+class NonTrivial2 { NonTrivial2(NonTrivial2 &&); };
+class NonTrivial3 { NonTrivial3 operator=(const NonTrivial3 &); };
+class NonTrivial4 { NonTrivial4 operator=(NonTrivial4 &&); };
+class NonTrivial5 { ~NonTrivial5(); };
+
+static_assert(__is_trivial(Trivial), "Trivial is not trivial");
+static_assert(!__is_trivial(NonTrivial1), "NonTrivial1 is trivial");
+static_assert(!__is_trivial(NonTrivial2), "NonTrivial2 is trivial");
+static_assert(!__is_trivial(NonTrivial3), "NonTrivial3 is trivial");
+static_assert(!__is_trivial(NonTrivial4), "NonTrivial4 is trivial");
+static_assert(!__is_trivial(NonTrivial5), "NonTrivial5 is trivial");

This seems like a separate commit?

@@ -2770,10 +2786,9 @@
   //// ActOnCXXThis -  Parse 'this' pointer.
   ExprResult ActOnCXXThis(SourceLocation loc);
 
-  /// tryCaptureCXXThis - Try to capture a 'this' pointer.  Returns a
-  /// pointer to an instance method whose 'this' pointer is
-  /// capturable, or null if this is not possible.
-  CXXMethodDecl *tryCaptureCXXThis();
+  /// tryCaptureCXXThis - Try to capture a 'this' pointer.  Returns the type
+  /// of the 'this' pointer, or a null type if this is not possible.
+  QualType tryCaptureCXXThis();
 
   /// ActOnCXXBoolLiteral - Parse {true,false} literals.
   ExprResult ActOnCXXBoolLiteral(SourceLocation OpLoc, tok::TokenKind Kind);

tryCaptureCXXThis() is a pretty weird name... getCurrentThisType() would make a bit more sense, IMO, since you're changing the signature anyway.

+/// ActOnCXXInClassMemberInitializer - This is invoked after parsing an
+/// in-class initializer for a non-static C++ class member. Such parsing
+/// is deferred until the class is complete.
+void
+Sema::ActOnCXXInClassMemberInitializer(Decl *D, SourceLocation EqualLoc,
+                                       Expr *InitExpr) {
+  FieldDecl *FD = cast<FieldDecl>(D);
+
+  if (!InitExpr) {
+    FD->setInvalidDecl();
+    return;
+  }

I suggest also clearing out the "this field has an initializer" bit in the invalid-expression case, so the AST is consistent.

+  if (FD->getType()->isDependentType() || InitExpr->isTypeDependent()) {
+    // We can't check dependent initializers until instantiation.
+
+    // Erase any temporaries within this evaluation context; we're not
+    // going to track them in the AST, since we'll be rebuilding the
+    // ASTs during template instantiation.
+    ExprTemporaries.erase(
+      ExprTemporaries.begin() + ExprEvalContexts.back().NumTemporaries,
+      ExprTemporaries.end());

Rather than directly poking at ExprTemporaries, please just move the MaybeCreateExprWithCleanups() call (and subsequent Invalid check) out of the next "else" block, so we perform the corresponding checking (e.g., for destructors) in all cases. Plus, we want as few places as possible to modify ExprTemporaries.

Index: lib/Sema/SemaExprCXX.cpp
===================================================================
--- lib/Sema/SemaExprCXX.cpp	(revision 131936)
+++ lib/Sema/SemaExprCXX.cpp	(working copy)
@@ -17,6 +17,7 @@
 #include "clang/Sema/Lookup.h"
 #include "clang/Sema/ParsedTemplate.h"
 #include "clang/Sema/ScopeInfo.h"
+#include "clang/Sema/Scope.h"
 #include "clang/Sema/TemplateDeduction.h"
 #include "clang/AST/ASTContext.h"
 #include "clang/AST/CXXInheritance.h"
@@ -575,7 +576,7 @@
   return Owned(E);
 }
 
-CXXMethodDecl *Sema::tryCaptureCXXThis() {
+QualType Sema::tryCaptureCXXThis() {
   // Ignore block scopes: we can capture through them.
   // Ignore nested enum scopes: we'll diagnose non-constant expressions
   // where they're invalid, and other uses are legitimate.
@@ -587,18 +588,27 @@
     else break;
   }
 
-  // If we're not in an instance method, error out.
-  CXXMethodDecl *method = dyn_cast<CXXMethodDecl>(DC);
-  if (!method || !method->isInstance())
-    return 0;
+  QualType ThisTy;
+  if (CXXMethodDecl *method = dyn_cast<CXXMethodDecl>(DC)) {
+    if (method && method->isInstance())
+      ThisTy = method->getThisType(Context);
+  } else if (CXXRecordDecl *RD = dyn_cast<CXXRecordDecl>(DC)) {
+    // C++0x [expr.prim]p4:
+    //   Otherwise, if a member-declarator declares a non-static data member
+    // of a class X, the expression this is a prvalue of type "pointer to X"
+    // within the optional brace-or-equal-initializer.
+    if (CurScope->getFlags() & Scope::ThisScope)
+      ThisTy = Context.getPointerType(Context.getRecordType(RD));
+  }

Please don't access CurScope directly... it's parser state that may not represent the current state of Sema (e.g., because Sema could be in the middle of instantiation). Instead, Sema routines driven by the parser should pass a Scope* through to tryCaptureCXXThis(). TreeTransform, on the other hand, will probably need to just instantiate the type of CXXThisExpr rather than calling tryCaptureCXXThis(), since we won't have a useful Scope*.

   // Mark that we're closing on 'this' in all the block scopes, if applicable.
-  for (unsigned idx = FunctionScopes.size() - 1;
-       isa<BlockScopeInfo>(FunctionScopes[idx]);
-       --idx)
-    cast<BlockScopeInfo>(FunctionScopes[idx])->CapturesCXXThis = true;
+  if (!ThisTy.isNull())
+    for (unsigned idx = FunctionScopes.size() - 1;
+         isa<BlockScopeInfo>(FunctionScopes[idx]);
+         --idx)
+      cast<BlockScopeInfo>(FunctionScopes[idx])->CapturesCXXThis = true;

We should only be doing this when 'this' is captured because we're in a CXXMethodDecl. We might incorrectly think we're capturing the outer 'this' in crazy code such as:

	struct X {
		void f() {
			^ {
				struct Nested { Nested *ptr = this; };
			}
		};
	};

Incidentally, it seems that we shouldn't be doing this when we're in an unevaluated context. Hrm.

+ExprResult Parser::ParseCXXMemberInitializer(bool IsFunction,
+                                             SourceLocation &EqualLoc) {
+  assert((Tok.is(tok::equal) || Tok.is(tok::l_brace))
+         && "Data member initializer not starting with '=' or '{'");
+
+  if (Tok.is(tok::equal)) {
+    EqualLoc = ConsumeToken();
+    if (Tok.is(tok::kw_delete)) {
+      // In principle, an initializer of '= delete p;' is legal, but it will
+      // never type-check. It's better to diagnose it as an ill-formed expression
+      // than as an ill-formed deleted non-function member.

This logic only holds because if we try to use the comma to provide a non-void expression, e.g.,

	int field = delete p, 17;

the comma is taken to mean that there's another declarator coming, rather than as part of a comma expression. I think it would help if the comment reflected this.

This patch is looking great; I expected that it would be more complicated to implement this feature (but I'm glad it isn't!).

With the changes I've requested above (especially the CurScope and ExprTemporaries ones), this patch looks ready to go in. Thanks!

	- Doug