[PATCH] D35923: Fix PR32332 - PCM nondeterminism with builtins caused by global module index

Adrian Prantl via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jul 26 15:54:29 PDT 2017


aprantl created this revision.

Long story short: I think that it is too dangerous to rely on the global module index for anything but diagnostics.

The global module index is only rebuilt in two cases:

  a. When triggered by Sema for diagnosing errors
  b. At the very end of FrontendAction::Execute() 

This patch is deleting a shortcut in ModuleManager::visit() that uses the global module index as a negative cache to determine that it does not make sense to open a module when looking for a specific identifier.

In the test case (see also the PR https://bugs.llvm.org/show_bug.cgi?id=32332)  we get different PCM output for B.pcm when

1. building B.pcm (which also builds the dependency A.pcm in one go)
2. building A.pcm, then building B.pcm in a separate clang invocation

Because of when the global module index is built, (1) does not use the above shortcut, but (2) does. The symbol were this makes a difference is a built-in macro __FLT_EPSILON__ which does not belong to any module but still makes it into the global module index as not being defined in any module. This causes the symbol to be serialized in (1) A.pcm and (2) A.pcm and B.pcm.

I have considered the alternative of hunting down why builtins are serialized into the PCM in the first place, but frankly I feel like this is just the tip of the iceberg, and that relying on the global module index this way is just asking for trouble in concurrent or incremental builds. I have looked at the performance impact of the change, and in the project I built (a large project, ~30min build time) the wall-clock numbers were in the noise.

rdar://problem/31122421


https://reviews.llvm.org/D35923

Files:
  lib/Serialization/ModuleManager.cpp
  test/Modules/Inputs/builtins/a.h
  test/Modules/Inputs/builtins/b.h
  test/Modules/Inputs/builtins/module.modulemap
  test/Modules/builtins-incremental.m


Index: test/Modules/builtins-incremental.m
===================================================================
--- /dev/null
+++ test/Modules/builtins-incremental.m
@@ -0,0 +1,28 @@
+// RUN: rm -rf %T/cache %T/cache1 %T/cache2 
+//
+// Build B which imports A.
+// RUN: %clang_cc1 -cc1 -fsyntax-only -fmodules -fimplicit-module-maps \
+// RUN:   -fmodules-cache-path=%T/cache -fdisable-module-hash \
+// RUN:   -I %S/Inputs/builtins %s
+// RUN: mv %T/cache %T/cache1
+//
+// Build only A.
+// RUN: echo '@import A;' >%t.m
+// RUN: %clang_cc1 -cc1 -fsyntax-only -fmodules -fimplicit-module-maps \
+// RUN:   -fmodules-cache-path=%T/cache -fdisable-module-hash \
+// RUN:   -I %S/Inputs/builtins %t.m
+//
+// Incrementally build B which imports A.
+// RUN: %clang_cc1 -cc1 -fsyntax-only -fmodules -fimplicit-module-maps \
+// RUN:   -fmodules-cache-path=%T/cache -fdisable-module-hash \
+// RUN:   -I %S/Inputs/builtins %s
+// RUN: mv %T/cache %T/cache2
+//
+// RUN: llvm-bcanalyzer %T/cache1/B.pcm >%t.dump
+// RUN: llvm-bcanalyzer %T/cache2/B.pcm >>%t.dump
+// RUN: cat %t.dump | FileCheck %s
+
+// Verify the identifier table has the same size in both versions of B.pcm:
+// CHECK:  1 [[INDENT_SIZE:[0-9]+]] 100.00 IDENTIFIER_TABLE
+// CHECK:  1 [[INDENT_SIZE]] 100.00 IDENTIFIER_TABLE
+ at import B;
Index: test/Modules/Inputs/builtins/module.modulemap
===================================================================
--- /dev/null
+++ test/Modules/Inputs/builtins/module.modulemap
@@ -0,0 +1,7 @@
+module A {
+  header "a.h"
+}
+
+module B {
+  header "b.h"
+}
Index: test/Modules/Inputs/builtins/b.h
===================================================================
--- /dev/null
+++ test/Modules/Inputs/builtins/b.h
@@ -0,0 +1,3 @@
+#include "a.h"
+ at interface B : A
+ at end
Index: test/Modules/Inputs/builtins/a.h
===================================================================
--- /dev/null
+++ test/Modules/Inputs/builtins/a.h
@@ -0,0 +1,3 @@
+#include <float.h>
+ at interface A
+ at end
Index: lib/Serialization/ModuleManager.cpp
===================================================================
--- lib/Serialization/ModuleManager.cpp
+++ lib/Serialization/ModuleManager.cpp
@@ -363,19 +363,6 @@
 
   VisitState *State = allocateVisitState();
   unsigned VisitNumber = State->NextVisitNumber++;
-
-  // If the caller has provided us with a hit-set that came from the global
-  // module index, mark every module file in common with the global module
-  // index that is *not* in that set as 'visited'.
-  if (ModuleFilesHit && !ModulesInCommonWithGlobalIndex.empty()) {
-    for (unsigned I = 0, N = ModulesInCommonWithGlobalIndex.size(); I != N; ++I)
-    {
-      ModuleFile *M = ModulesInCommonWithGlobalIndex[I];
-      if (!ModuleFilesHit->count(M))
-        State->VisitNumber[M->Index] = VisitNumber;
-    }
-  }
-
   for (unsigned I = 0, N = VisitOrder.size(); I != N; ++I) {
     ModuleFile *CurrentModule = VisitOrder[I];
     // Should we skip this module file?


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D35923.108367.patch
Type: text/x-patch
Size: 2983 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/cfe-commits/attachments/20170726/5ab050df/attachment-0001.bin>


More information about the cfe-commits mailing list