[cfe-commits] r68732 - in /cfe/trunk: include/clang/AST/ include/clang/Frontend/ lib/AST/ lib/Frontend/ lib/Sema/ test/ test/PCH/ tools/clang-cc/

Thu Apr 9 22:52:36 PDT 2009

> URL: http://llvm.org/viewvc/llvm-project?rev=68732&view=rev
> Log:
> Implementation of pre-compiled headers (PCH) based on lazy
> de-serialization of abstract syntax trees.

Awesome Doug! Thank you for splitting out the astcontext-everwhere  
part of the patch, it definitely makes it easier to review.

> +++ cfe/trunk/include/clang/AST/DeclContextInternals.h Thu Apr  9  
> 17:27:44 2009
>   /// VectorTy - When in vector form, this is what the Data pointer  
> points to.
> +  typedef llvm::SmallVector<uintptr_t, 4> VectorTy;
> +
> +  /// \brief The stored data, which will be either a declaration  
> ID, a
> +  /// pointer to a NamedDecl, or a pointer to a vector.
> +  uintptr_t Data;

Why the custom bitmangling?

> +  void setFromDeclIDs(const llvm::SmallVectorImpl<unsigned> &Vec) {
> +    if (Vec.size() > 1) {
> +      VectorTy *Vector = getAsVector();
> +      if (!Vector) {
> +        Vector = new VectorTy;
> +        Data = reinterpret_cast<uintptr_t>(Vector) | DK_Decl_Vector;
> +      }
> +
> +      Vector->resize(Vec.size());
> +      std::copy(Vec.begin(), Vec.end(), Vector->begin());
> +      return;
> +    }
> +
> +    if (VectorTy *Vector = getAsVector())
> +      delete Vector;
> +
> +    if (Vec.size() == 0)

Vec.empty() ?

> +++ cfe/trunk/include/clang/Frontend/PCHBitCodes.h Thu Apr  9  
> 17:27:44 2009
> @@ -0,0 +1,239 @@

>
> +#ifndef LLVM_CLANG_FRONTEND_PCHBITCODES_H
> +#define LLVM_CLANG_FRONTEND_PCHBITCODES_H
> +
> +#include "llvm/Bitcode/BitCodes.h"
> +#include "llvm/Support/DataTypes.h"
> +
> +namespace clang {
> +  namespace pch {
> +    const int IDBits = 32;

IDBits is dead?

> +++ cfe/trunk/include/clang/Frontend/PCHWriter.h Thu Apr  9 17:27:44  
> 2009
>
> +  /// The ID numbers of declarations are consecutive (in order of
> +  /// discovery) and start at 2. 1 is reserved for the translation
> +  /// unit, while 0 is reserved for NULL.

Please repeat or move this comment to the definition of the pch::ID  
typedef.

> +  /// \brief Map that provides the ID numbers of each type within the
> +  /// output stream.
> +  ///
> +  /// The ID numbers of types are consecutive (in order of discovery)
> +  /// and start at 1. 0 is reserved for NULL. When types are actually
> +  /// stored in the stream, the ID number is shifted by 3 bits to
> +  /// allow for the const/volatile/restrict qualifiers.

Okay, maybe not.  Would it make sense to have two different typedefs  
for TypeID vs DeclID?  If that happened, you could have an appropriate  
comment for each.

> +++ cfe/trunk/lib/Frontend/PCHReader.cpp Thu Apr  9 17:27:44 2009

> +void PCHDeclReader::VisitVarDecl(VarDecl *VD) {
> +  VisitValueDecl(VD);
> +  VD->setStorageClass((VarDecl::StorageClass)Record[Idx++]);
> +  VD->setThreadSpecified(Record[Idx++]);
> +  VD->setCXXDirectInitializer(Record[Idx++]);
> +  VD->setDeclaredInCondition(Record[Idx++]);

This is a very cute idiom.  One potential problem to think about is  
that a malformed pch file could crash the reader by having a record  
with too-few fields: this would cause invalid indexes of Record to be  
used.
>
> +/// \brief Read the type-offsets block.
> +bool PCHReader::ReadTypeOffsets() {

Would it make sense to just make TYPE_OFFSETS be a record in the top- 
level block instead of a block that only contains one record type?   
Likewise with DECL_OFFSETS.  That would get rid of the boilerplate to  
read a block.

> +/// \brief Read and return the type at the given offset.
> +///
> +/// This routine actually reads the record corresponding to the type
> +/// at the given offset in the bitstream. It is a helper routine for
> +/// GetType, which deals with reading type IDs.
> +QualType PCHReader::ReadTypeRecord(uint64_t Offset) {
> +  Stream.JumpToBit(Offset);
> +  RecordData Record;
> +  unsigned Code = Stream.ReadCode();
> +  switch ((pch::TypeCode)Stream.ReadRecord(Code, Record)) {
> +  case pch::TYPE_FIXED_WIDTH_INT: {
> +    assert(Record.size() == 2 && "Incorrect encoding of fixed-width  
> int type");

Instead of asserts, should these emit an error somehow and return  
something bogus?  Having an invalid PCH file crash the compiler is  
acceptable, but suboptimal.

> +    // FIXME: Several other kinds of types to deserialize here!
> +  default:
> +    assert("Unable to deserialize this type");

0 && "unable...

> +void PCHDeclWriter::VisitDecl(Decl *D) {
> +  Writer.AddDeclRef(cast_or_null<Decl>(D->getDeclContext()), Record);
> +  Writer.AddDeclRef(cast_or_null<Decl>(D->getLexicalDeclContext()),  
> Record);
> +  Writer.AddSourceLocation(D->getLocation(), Record);
> +  Record.push_back(D->isInvalidDecl());
> +  // FIXME: hasAttrs

Interesting question: would it be better to unique attribute lists and  
output an attr ID # with each decl, or would it be better to emit a  
"has attr" bit and then emit a separate declid->attrlist datastructure  
independent of this?

> +/// \brief Write a block containing all of the types.
> +void PCHWriter::WriteTypesBlock(ASTContext &Context) {
> +  // Enter the types block
> +  S.EnterSubblock(pch::TYPES_BLOCK_ID, 2);
> +
> +  // Emit all of the types in the ASTContext
> +  for (std::vector<Type*>::const_iterator T =  
> Context.getTypes().begin(),
> +                                       TEnd =  
> Context.getTypes().end();
> +       T != TEnd; ++T) {
> +    // Builtin types are never serialized.
> +    if (isa<BuiltinType>(*T))
> +      continue;
> +
> +    WriteType(*T);
> +  }

Why not just emit all types referred to by decls?  Is there a  
difference?

> +/// \brief Write the block containing all of the declaration IDs
> +/// lexically declared within the given DeclContext.
> +///
> +/// \returns the offset of the DECL_CONTEXT_LEXICAL block within the
> +/// bistream, or 0 if no block was written.
> +uint64_t PCHWriter::WriteDeclContextLexicalBlock(ASTContext &Context,
> +                                                 DeclContext *DC) {
> +  if (DC->decls_begin(Context) == DC->decls_end(Context))
> +    return 0;

How about a new DC->decls_empty(Context) method?

> +++ cfe/trunk/lib/Sema/SemaDecl.cpp Thu Apr  9 17:27:44 2009
> @@ -524,6 +524,12 @@
>       return false;
>   }
>
> +  // __builtin_va_list gets redeclared in the built-in definitions
> +  // buffer when using PCH. Don't complain about such redefinitions.
> +  if (Context.getExternalSource() &&
> +      strcmp(SourceMgr.getBufferName(New->getLocation()), "<built- 
> in>") == 0)
> +    return false;

Eww.  We should really just fix sema to not do this.  Can you add a  
verbose fixme explaining the issue and the fix we discussed?

Overall, this is great Doug!

-Chris