r233332 - [Modules] Make the AST serialization always use lexicographic order when

Chandler Carruth chandlerc at gmail.com
Thu Mar 26 16:54:15 PDT 2015


Author: chandlerc
Date: Thu Mar 26 18:54:15 2015
New Revision: 233332

URL: http://llvm.org/viewvc/llvm-project?rev=233332&view=rev
Log:
[Modules] Make the AST serialization always use lexicographic order when
traversing the identifier table.

No easy test case as this table is somewhere between hard and impossible
to observe as non-deterministically ordered. The table is a hash table
but we hash the string contents and never remove entries from the table
so the growth pattern, etc, is all completely fixed. However, relying on
the hash function being deterministic is specifically against the
long-term direction of LLVM's hashing datastructures, which are intended
to provide *no* ordering guarantees. As such, this defends against these
things by sorting the identifiers. Sorting identifiers right before we
emit them to a serialized form seems a low cost for predictability here.

Modified:
    cfe/trunk/include/clang/Basic/IdentifierTable.h
    cfe/trunk/lib/Serialization/ASTWriter.cpp

Modified: cfe/trunk/include/clang/Basic/IdentifierTable.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/IdentifierTable.h?rev=233332&r1=233331&r2=233332&view=diff
==============================================================================
--- cfe/trunk/include/clang/Basic/IdentifierTable.h (original)
+++ cfe/trunk/include/clang/Basic/IdentifierTable.h Thu Mar 26 18:54:15 2015
@@ -308,7 +308,12 @@ public:
     else
       RecomputeNeedsHandleIdentifier();
   }
-  
+
+  /// \brief Provide less than operator for lexicographical sorting.
+  bool operator<(const IdentifierInfo &RHS) const {
+    return getName() < RHS.getName();
+  }
+
 private:
   /// The Preprocessor::HandleIdentifier does several special (but rare)
   /// things to identifiers of various sorts.  For example, it changes the

Modified: cfe/trunk/lib/Serialization/ASTWriter.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Serialization/ASTWriter.cpp?rev=233332&r1=233331&r2=233332&view=diff
==============================================================================
--- cfe/trunk/lib/Serialization/ASTWriter.cpp (original)
+++ cfe/trunk/lib/Serialization/ASTWriter.cpp Thu Mar 26 18:54:15 2015
@@ -3504,10 +3504,16 @@ void ASTWriter::WriteIdentifierTable(Pre
     // table to enable checking of the predefines buffer in the case
     // where the user adds new macro definitions when building the AST
     // file.
+    SmallVector<const IdentifierInfo *, 128> IIs;
     for (IdentifierTable::iterator ID = PP.getIdentifierTable().begin(),
                                 IDEnd = PP.getIdentifierTable().end();
          ID != IDEnd; ++ID)
-      getIdentifierRef(ID->second);
+      IIs.push_back(ID->second);
+    // Sort the identifiers lexicographically before getting them references so
+    // that their order is stable.
+    std::sort(IIs.begin(), IIs.end(), llvm::less_ptr<IdentifierInfo>());
+    for (const IdentifierInfo *II : IIs)
+      getIdentifierRef(II);
 
     // Create the on-disk hash table representation. We only store offsets
     // for identifiers that appear here for the first time.
@@ -4504,15 +4510,17 @@ void ASTWriter::WriteASTCore(Sema &SemaR
 
   // Make sure all decls associated with an identifier are registered for
   // serialization.
-  llvm::SmallVector<const IdentifierInfo*, 256> IIsToVisit;
+  llvm::SmallVector<const IdentifierInfo*, 256> IIs;
   for (IdentifierTable::iterator ID = PP.getIdentifierTable().begin(),
                               IDEnd = PP.getIdentifierTable().end();
        ID != IDEnd; ++ID) {
     const IdentifierInfo *II = ID->second;
     if (!Chain || !II->isFromAST() || II->hasChangedSinceDeserialization())
-      IIsToVisit.push_back(II);
+      IIs.push_back(II);
   }
-  for (const IdentifierInfo *II : IIsToVisit) {
+  // Sort the identifiers to visit based on their name.
+  std::sort(IIs.begin(), IIs.end(), llvm::less_ptr<IdentifierInfo>());
+  for (const IdentifierInfo *II : IIs) {
     for (IdentifierResolver::iterator D = SemaRef.IdResolver.begin(II),
                                    DEnd = SemaRef.IdResolver.end();
          D != DEnd; ++D) {





More information about the cfe-commits mailing list