[cfe-dev] [PATCH] PCH Support for C++ decls

Douglas Gregor dgregor at apple.com
Tue Mar 2 14:13:26 PST 2010


On Mar 2, 2010, at 8:26 AM, Andrew Sutton wrote:

> All,
> 
> Since I noticed Doug commit PCH support for NamespaceDecl's the other day (week?), I thought I'd send an update to my last PCH patch. I've been sitting on this work for quite some time while I've been busy with other things.
> 
> The attached implements serialization for 
> - namespaces, using directives, namespace aliases, using (and shadow) declarations
> - classes
> - methods, constructors, destructors, conversion functions
> - and linkage specifications
> There's some support for serializing friend declarations, but only if it's not a friend type. Fixing that should be relatively easy. There is also some work on serializing templates and dependent types, but it's asserted out for now. I integrated Doug's implementation of VisitNamespaceDecl (I wasn't serializing the AnonymousNamespace) into this patch.

Nifty. 

> The patch adds a large number of getters and setters to AST declarations since the PCH code has to construct them piecemeal. I've had to expose some new functionality for creating CXXRecord::DefinitionData objects (see createDefinition() and shareDefinition() in the attached patch). I think it's kind of nice. 

This didn't seem necessary to me; see comments below.

> I had to add new Create() functions for CXXMethod and its derivations since the normal Create() function asserts when you try to create these objects without names. As per a previous suggestion, these take an "EmptyShell" type, which is added as a public empty class in Decl (should it be private?)

It's fine as a public empty class; that's what we do for expressions.

> WRT to the PCH reader/writer classes, I added some new functions for reading and writing base class specifiers and nested name specifiers. I also added a ReadSourceLocation and ReadSourceRange methods to the PCHReader to make their reading consistent with other operations (and easier in some cases).

Good, good. At some point (but not as part of this patch!) we should go replace all of the manually-deserialized source locations and ranges with uses of these functions.

> I modified PCH/namespaces test to include some of the other namespace and using decls and linkage specs. There should be a new classes test in the patch with (let's just say... "not exhaustive") tests for serializing classes.

I think we really need better tests. Due to PCH's laziness, it is extremely annoying to debug when it breaks, so small tests for each kind of declaration/statement/expression are important for catching problems quickly.

> Please let me know if there is anything that I can do to help get this patch applied.

Some comments:

Index: include/clang/Frontend/PCHBitCodes.h
===================================================================
--- include/clang/Frontend/PCHBitCodes.h	(revision 97565)
+++ include/clang/Frontend/PCHBitCodes.h	(working copy)
+      // FIXME: Serialize these or not?
+      DECL_UNRESOLVED_USING_VALUE,
+      DECL_UNRESOLVED_USING_TYPENAME,

Yes, we will have to serialize these. Every bit in the AST gets serialized.

@@ -524,9 +565,7 @@
       /// associates a declaration name with one or more declaration
       /// IDs. This data is used when performing qualified name lookup
       /// into a DeclContext via DeclContext::lookup.
-      DECL_CONTEXT_VISIBLE,
-      /// \brief A NamespaceDecl record.
-      DECL_NAMESPACE
+      DECL_CONTEXT_VISIBLE
     };

We should keep the existing DECL_* constants in order, since each reshuffle is (unnecessary) breakage in backward-compatibility for the PCH format.

Index: include/clang/AST/DeclCXX.h
===================================================================
--- include/clang/AST/DeclCXX.h	(revision 97565)
+++ include/clang/AST/DeclCXX.h	(working copy)
@@ -369,6 +390,21 @@
 
   bool hasDefinition() const { return DefinitionData != 0; }
 
+  // FIXME: This method is required for PCH reading.
+  // Initialize the definition for this class and propagate the definition
+  // to any previous declarartions.
+  void createDefinition();
+
+  // FIXME: This method ios requried for PCH reading.
+  // Force this class to use the definition of the other. This is an error
+  // if the class is already defined.
+  // FIXME: Are there other conditions we should check? I think it's required
+  // that D be a previous declaration of this. Not sure.
+  void shareDefinition(CXXRecordDecl *D) {
+    assert((D && !DefinitionData) && "class already has definition data");
+    DefinitionData = D->DefinitionData;
+  }

I don't see why we should need this. We already have startDefinition()/completeDefinition(), which takes care of allocating the DefinitionData and linking it in.

Index: lib/Frontend/PCHWriter.cpp
===================================================================
--- lib/Frontend/PCHWriter.cpp	(revision 97565)
+++ lib/Frontend/PCHWriter.cpp	(working copy)
@@ -2188,6 +2193,15 @@
   }
 }
 
+void PCHWriter::AddBaseClassRef(const CXXBaseSpecifier &Base,
+                                RecordData &Record) {
+  AddSourceRange(Base.getSourceRange(), Record);
+  Record.push_back(Base.isBaseOfClass());
+  Record.push_back(Base.isVirtual());
+  Record.push_back(Base.getAccessSpecifierAsWritten());
+  AddTypeRef(Base.getType(), Record);
+}

Please store/load fields in the PCH record in the same order as they occur in the corresponding AST; it makes it easier to verify that everything is there. In this case, BaseOfClass and Virtual should be swapped (the same goes for the reader).

+void PCHWriter::AddNestedNameSpecifier(const NestedNameSpecifier *Spec,
+                                       RecordData &Record) {

Why const-qualify the NestedNameSpecifier? You're just const_cast'ing it away anyway.

+  // We need to serialized the number of NNS components since it doesn't
+  // seem to be possible to read linked lists--we need to know where the list
+  // actually stops.
+  unsigned N = 0;
+  NestedNameSpecifier *P = const_cast<NestedNameSpecifier*>(Spec);
+  while (P) {
+    ++N;
+    P = P->getPrefix();
+  }
+  Record.push_back(N);
+
+  P = const_cast<NestedNameSpecifier*>(Spec);
+  while (P) {
+    NestedNameSpecifier::SpecifierKind Kind = P->getKind();
+    Record.push_back(Kind);
+    switch (Kind) {
+    case NestedNameSpecifier::Identifier:
+      AddIdentifierRef(P->getAsIdentifier(), Record);
+      break;
+
+    case NestedNameSpecifier::Namespace:
+      AddDeclRef(P->getAsNamespace(), Record);
+      break;
+
+    case NestedNameSpecifier::TypeSpec:
+    case NestedNameSpecifier::TypeSpecWithTemplate:
+      AddTypeRef(QualType(P->getAsType(), 0), Record);
+      Record.push_back(Kind == NestedNameSpecifier::TypeSpecWithTemplate);
+      break;
+
+    case NestedNameSpecifier::Global:
+      // Don't need to write an associated value.
+      break;
+    }
+    P = P->getPrefix();
+  }
+}

Why not push all of the NestedNameSpecifiers onto a stack and then emit them in reverse order? That would make it far easier for the reader to reconstruct them properly, since the prefix will come before the nested-name-specifier itself.

+NestedNameSpecifier *
+PCHReader::ReadNestedNameSpecifier(const RecordData &Record, unsigned &Idx) {
+  unsigned N = Record[Idx++];
+  NestedNameSpecifier *NNS = 0, *P = 0;
+  for (unsigned I = 0; I != N; ++I) {
+    NestedNameSpecifier::SpecifierKind Kind
+      = (NestedNameSpecifier::SpecifierKind)Record[Idx++];
+    switch (Kind) {
+    case NestedNameSpecifier::Identifier: {
+      IdentifierInfo *II = GetIdentifierInfo(Record, Idx);
+      P = NestedNameSpecifier::Create(*Context, P, II);
+      break;
+    }
+
+    case NestedNameSpecifier::Namespace: {
+      NamespaceDecl *NS = cast<NamespaceDecl>(GetDecl(Record[Idx++]));
+      P = NestedNameSpecifier::Create(*Context, P, NS);
+      break;
+    }
+
+    case NestedNameSpecifier::TypeSpec:
+    case NestedNameSpecifier::TypeSpecWithTemplate: {
+      Type *T = GetType(Record[Idx++]).getTypePtr();
+      bool Template = Record[Idx++];
+      P = NestedNameSpecifier::Create(*Context, P, Template, T);
+      break;
+    }
+
+    case NestedNameSpecifier::Global: {
+      P = NestedNameSpecifier::GlobalSpecifier(*Context);
+      // No associated value.
+      break;
+    }
+
+    // The "first" prefix will be the return value.
+    if (!NNS)
+      NNS = P;
+    }
+  }
+  return NNS;
+}

This won't work with nested-name-specifiers that have prefixes, because we see the nested-name-specifier before its prefix, and end up trying to build the nested-name-specifier backwards. Use the stack I mentioned previously to address this problem, and you can then get rid of the NNS/P dance, since the last nested-name-specifier you build will be the complete one.

Index: lib/Frontend/PCHReaderDecl.cpp
===================================================================
--- lib/Frontend/PCHReaderDecl.cpp	(revision 97565)
+++ lib/Frontend/PCHReaderDecl.cpp	(working copy)
@@ -39,7 +41,6 @@
     void VisitDecl(Decl *D);
     void VisitTranslationUnitDecl(TranslationUnitDecl *TU);
     void VisitNamedDecl(NamedDecl *ND);
-    void VisitNamespaceDecl(NamespaceDecl *D);
     void VisitTypeDecl(TypeDecl *TD);
     void VisitTypedefDecl(TypedefDecl *TD);
     void VisitTagDecl(TagDecl *TD);
@@ -56,6 +57,31 @@
     void VisitFileScopeAsmDecl(FileScopeAsmDecl *AD);
     void VisitBlockDecl(BlockDecl *BD);
     std::pair<uint64_t, uint64_t> VisitDeclContext(DeclContext *DC);
+
+    void VisitLinkageSpecDecl(LinkageSpecDecl *D);
+    void VisitNamespaceDecl(NamespaceDecl *D);
+    void VisitNamespaceAliasDecl(NamespaceAliasDecl *D);
+    void VisitUsing(UsingDecl *D);
+    void VisitUsingShadow(UsingShadowDecl *D);
+    void VisitUsingDirectiveDecl(UsingDirectiveDecl *D);
+    void VisitUnresolvedUsingValue(UnresolvedUsingValueDecl *D);
+    void VisitUnresolvedUsingTypename(UnresolvedUsingTypenameDecl *D);
+    void VisitCXXRecordDecl(CXXRecordDecl *D);
+    void VisitCXXMethodDecl(CXXMethodDecl *D);
+    void VisitCXXConstructorDecl(CXXConstructorDecl *D);
+    void VisitCXXDestructorDecl(CXXDestructorDecl *D);
+    void VisitCXXConversionDecl(CXXConversionDecl *D);
+    void VisitFriendDecl(FriendDecl *D);
+    void VisitTemplateDecl(TemplateDecl *D);
+    void VisitClassTemplateDecl(ClassTemplateDecl *D);
+    void VisitClassTemplateSpecializationDecl(ClassTemplateSpecializationDecl *D);
+    void VisitClassTemplatePartialSpecializationDecl(ClassTemplatePartialSpecializationDecl *D);
+    void visitFunctionTemplateDecl(FunctionTemplateDecl *D);
+    void VisitTemplateTypeParmDecl(TemplateTypeParmDecl *D);
+    void VisitNonTypeTemplateParmDecl(NonTypeTemplateParmDecl *D);
+    void VisitTemplateTemplateParmDecl(TemplateTemplateParmDecl *D);
+    void VisitStaticAssertDecl(StaticAssertDecl *D);

FWIW, I've typically been ordering the Visit functions the same way as the Decl nodes are ordered in DeclNodes.def. 

+void PCHDeclReader::VisitNamespaceAliasDecl(NamespaceAliasDecl *D) {
+  VisitNamedDecl(D);
+  D->setAliasLoc(Reader.ReadSourceLocation(Record, Idx));
+  D->setTargetNameLoc(Reader.ReadSourceLocation(Record, Idx));

Please serialize in the same order as the declarations in the Decl node (here, TargetNameLoc should follow the qualifier).

+void PCHDeclReader::VisitUsing(UsingDecl *D) {
+  VisitNamedDecl(D);
+  D->setUsingLocation(Reader.ReadSourceLocation(Record, Idx));
+  D->setNestedNameRange(Reader.ReadSourceRange(Record, Idx));

Same request here :)

+void PCHDeclReader::VisitUsingDirectiveDecl(UsingDirectiveDecl *D) {
+  VisitNamedDecl(D);
+  D->setNamespaceKeyLocation(Reader.ReadSourceLocation(Record, Idx));
+  D->setIdentLocation(Reader.ReadSourceLocation(Record, Idx));
+  D->setQualifierRange(Reader.ReadSourceRange(Record, Idx));
+  Reader.ReadNestedNameSpecifier(Record, Idx);
+  D->setNominatedNamespace(cast<NamedDecl>(Reader.GetDecl(Record[Idx++])));
+  D->setCommonAncestor(dyn_cast<DeclContext>(Reader.GetDecl(Record[Idx++])));
+}

... and here.

Also, why use dyn_cast? If the common ancestor can be NULL, use cast_or_null. (Same comment for PCHDeclWriter::VisitUsingDirectiveDecl).

Aside: anonymous namespaces are somewhat interesting for PCH, because we need to make sure that an anonymous namespace defined in the source file gets linked to an anonymous namespace defined in the PCH file, since they are effectively the same anonymous namespace. One fun way to test this is:

  // In PCH:
  namespace {
    class X;
  }

  // In source file (that includes PCH):
  namespace {
    class X;
  }

  X *x; // okay, but will likely fail with an ambiguity if the two anonymous namespaces don't get linked.

+void PCHDeclReader::VisitCXXRecordDecl(CXXRecordDecl *D) {
+  VisitRecordDecl(D);
+  // Injected class names are, I think, actually empty.

Yes, they are. Aside from making sure to get them into the right declaration context and link up their previous declaration chains, there shouldn't be much to serialize for them.

+  if (!D->isInjectedClassName()) {
+    // We may have to explicitly initialize the classe's internal definition
+    // object if it hasn't already been declared.
+    CXXRecordDecl *Prev =
+      dyn_cast_or_null<CXXRecordDecl>(D->getPreviousDeclaration());
+    if (!Prev) {
+      D->createDefinition();
+    } else {
+      D->shareDefinition(Prev);
+    }

isInjectedClassName() is the wrong check for this. Instead, PCHDeclWriter::VisitCXXRecordDecl should serialize all of the bits in the definition when D->getDefinition() == D, i.e., the declaration you're serializing is the definition.

+    // Read in the numerous class properties.
+    D->setAggregate(Record[Idx++]);
+    D->setPOD(Record[Idx++]);
+    D->setEmpty(Record[Idx++]);
+    D->setPolymorphic(Record[Idx++]);
+    D->setAbstract(Record[Idx++]);
+    D->setHasTrivialConstructor(Record[Idx++]);
+    D->setHasTrivialCopyConstructor(Record[Idx++]);
+    D->setHasTrivialCopyAssignment(Record[Idx++]);
+    D->setHasTrivialDestructor(Record[Idx++]);

'twould be good to follow class-declaration order here.

+void PCHDeclReader::VisitCXXConstructorDecl(CXXConstructorDecl *D) {
+  VisitCXXMethodDecl(D);
+  D->setExplicitSpecified(Record[Idx++]);
+  if (D->isThisDeclarationADefinition())
+    D->setImplicitlyDefined(Record[Idx++]);
+  // FIXME: Serialize member initializers?
+}

I suggest that the PCH reader/writer avoid cleverness, and serialize exactly what is in the AST (rather than doing the isThisDeclarationADefinition() check).

As for the FIXME, the answer is definitely "yes". We need to serialize everything.

+void PCHDeclReader::VisitCXXDestructorDecl(CXXDestructorDecl *D) {
+  VisitCXXMethodDecl(D);
+  if (D->isThisDeclarationADefinition())
+    D->setImplicitlyDefined(Record[Idx++]);
+  D->setOperatorDelete(cast_or_null<FunctionDecl>(Reader.GetDecl(Record[Idx++])));
+}

Same comment here: PCH shouldn't be clever w.r.t. isThisDeclarationADefinition().

+void PCHDeclReader::VisitFriendDecl(FriendDecl *D) {
+  VisitDecl(D);
+  D->setFriendLoc(Reader.ReadSourceLocation(Record, Idx));
+  bool IsType = Record[Idx++];
+  if (IsType) {
+    // FIXME: Not sure how to translate a QualType into a Type*.
+    assert(false && "Cannot serialize friend type declarations yet");
+  } else {
+    D->setFriendDecl(cast<NamedDecl>(Reader.GetDecl(Record[Idx++])));
+  }
+}

If you have a QualType, use getTypePtr() to get the Type*. If you have a Type*, use QualType(the-Type*, 0) to build the QualType.

The WasSpecialization bit is a hack, but it still needs to be stored in the PCH file.

Index: lib/AST/Decl.cpp
===================================================================
--- lib/AST/Decl.cpp	(revision 97565)
+++ lib/AST/Decl.cpp	(working copy)
@@ -1363,7 +1363,9 @@
     TagT->decl.setInt(1);
   }
 
-  if (isa<CXXRecordDecl>(this)) {
+  if (CXXRecordDecl *D = dyn_cast<CXXRecordDecl>(this)) {
+    D->createDefinition();
+    /*
     CXXRecordDecl *D = cast<CXXRecordDecl>(this);
     struct CXXRecordDecl::DefinitionData *Data = 
       new (getASTContext()) struct CXXRecordDecl::DefinitionData(D);
@@ -1371,6 +1373,7 @@
       D->DefinitionData = Data;
       D = cast_or_null<CXXRecordDecl>(D->getPreviousDeclaration());
     } while (D);
+  */
   }
 }

???

@@ -94,11 +103,12 @@
     // Keep track of inherited vbases for this base class.
     const CXXBaseSpecifier *Base = Bases[i];
     QualType BaseType = Base->getType();
+
     // Skip dependent types; we can't do any checking on them now.
     if (BaseType->isDependentType())
       continue;
     CXXRecordDecl *BaseClassDecl
-      = cast<CXXRecordDecl>(BaseType->getAs<RecordType>()->getDecl());
+      = cast<CXXRecordDecl>(BaseType->getAs<RecordType>()->getDecl());\

Stray '\'

Thanks for working on this!

	- Doug



More information about the cfe-dev mailing list