[clang-tools-extra] [clangd] [Modules] Support Reusable Modules Builder (PR #106683)

Chuanqi Xu via cfe-commits cfe-commits at lists.llvm.org
Fri Aug 30 01:13:01 PDT 2024


https://github.com/ChuanqiXu9 created https://github.com/llvm/llvm-project/pull/106683

This is the following patch of https://github.com/llvm/llvm-project/pull/66462 to optimize its performance.

# Motivation

To avoid data races, we choose  "per file owns its dependent modules" model. That said, every TU will own all the required module files so that we don't need to worry about thread safety. And it looks like we succeeded that we focus on the interfaces and structure of modules support in clangd. But after all, this model is not good for performance. Image we have 10000 TUs import std, we will have 10000 std.pcm in the memory. That is terrible both in time and space. 

Given the current modules support in clangd works pretty well (almost every issue report I received is more or less a clang's issue), I'd like to improve the performance.

# High Level Changes

After this patch, the built module files will be owned by the module builder and each TU will only have a reference to the built module files. 

The module builder have a map from module names to built module files. When a new TU ask for a module file, the module builder will check if the module file lives in the map  and if the module file are up to date. If yes, the module file will be returned. If no, the module file entry would be erased in the module builder. We use `shared_ptr<>` to track module file here so that the other TU owning the out dated module file won't be affected. The out dated module file will be removed automatically if other TU gets update or closed.

(I know the out dated module file may not exist due to the `CanReuse` mechanism. But the design here is natural and can be seen as a redundant design to make it more robust.)

When we a build a module, we will use the mutex and the condition variable in the working thread to build it exclusively. All other threads that also want the module file would have to wait for that working thread. It might not sounds great but I think if we want to make it asynchronous, we have to refactor TUScheduler as far as I know.

# Code Structure Changes

Thanks for the previous hard working reviewing, the interfaces almost don't change in this patch. Almost all the work are isolated in ModulesBuilder.cpp. A outliner is that we convert `ModulesBuilder` to an abstract class since the implementation class needs to own the module files.

And the core function to review is `ReusableModulesBuilder::getOrBuildModuleFile`. It implements the core logic to fetch the module file from the cache or build it if the module file is not in the cache or out of date. And other important entities are `BuildingModuleMutexes`, `BuildingModuleCVs`, `BuildingModules` and `ModulesBuildingMutex`. These are mutexes and condition variables to make sure the thread safety.

# User experience

I've implemented this in our downstream and ask our users to use it. I also sent it https://github.com/ChuanqiXu9/clangd-for-modules here as pre-version. The feedbacks are pretty good. And I didn't receive any bug reports (about the reusable modules builder) yet.

# Other potential improvement

The are other two potential improvements can be done:
1. Scanning cache and a mechanism to get the required module information more quickly.  (Like the module maps in https://github.com/ChuanqiXu9/clangd-for-modules)
2. Persist the module files. So that after we close the vscode and reopen it, we can reuse the built module files since the last invocation.


>From 1cd6f553f8c8f0f0a8dc691d3dcb413db9f4c051 Mon Sep 17 00:00:00 2001
From: Chuanqi Xu <yedeng.yd at linux.alibaba.com>
Date: Fri, 30 Aug 2024 15:11:07 +0800
Subject: [PATCH] [clangd] [Modules] Support Reusable Modules Builder

---
 clang-tools-extra/clangd/ClangdLSPServer.cpp  |   4 +-
 clang-tools-extra/clangd/ClangdLSPServer.h    |   2 +-
 clang-tools-extra/clangd/ModulesBuilder.cpp   | 357 ++++++++++++++----
 clang-tools-extra/clangd/ModulesBuilder.h     |  11 +-
 clang-tools-extra/clangd/tool/Check.cpp       |   6 +-
 .../unittests/PrerequisiteModulesTest.cpp     |  70 +++-
 6 files changed, 362 insertions(+), 88 deletions(-)

diff --git a/clang-tools-extra/clangd/ClangdLSPServer.cpp b/clang-tools-extra/clangd/ClangdLSPServer.cpp
index 06573a57554245..53a18bd4ecd1a7 100644
--- a/clang-tools-extra/clangd/ClangdLSPServer.cpp
+++ b/clang-tools-extra/clangd/ClangdLSPServer.cpp
@@ -567,8 +567,8 @@ void ClangdLSPServer::onInitialize(const InitializeParams &Params,
               std::move(Mangler));
 
   if (Opts.EnableExperimentalModulesSupport) {
-    ModulesManager.emplace(*CDB);
-    Opts.ModulesManager = &*ModulesManager;
+    ModulesManager = ModulesBuilder::getModulesBuilder(*CDB);
+    Opts.ModulesManager = ModulesManager.get();
   }
 
   {
diff --git a/clang-tools-extra/clangd/ClangdLSPServer.h b/clang-tools-extra/clangd/ClangdLSPServer.h
index 0b8e4720f53236..4b8f048b430fe9 100644
--- a/clang-tools-extra/clangd/ClangdLSPServer.h
+++ b/clang-tools-extra/clangd/ClangdLSPServer.h
@@ -327,7 +327,7 @@ class ClangdLSPServer : private ClangdServer::Callbacks,
   // The ClangdServer is created by the "initialize" LSP method.
   std::optional<ClangdServer> Server;
   // Manages to build module files.
-  std::optional<ModulesBuilder> ModulesManager;
+  std::unique_ptr<ModulesBuilder> ModulesManager;
 };
 } // namespace clangd
 } // namespace clang
diff --git a/clang-tools-extra/clangd/ModulesBuilder.cpp b/clang-tools-extra/clangd/ModulesBuilder.cpp
index 1eeff468ef1236..82aa7d181f3f5e 100644
--- a/clang-tools-extra/clangd/ModulesBuilder.cpp
+++ b/clang-tools-extra/clangd/ModulesBuilder.cpp
@@ -12,6 +12,7 @@
 #include "clang/Frontend/FrontendAction.h"
 #include "clang/Frontend/FrontendActions.h"
 #include "clang/Serialization/ASTReader.h"
+#include "llvm/ADT/ScopeExit.h"
 
 namespace clang {
 namespace clangd {
@@ -173,33 +174,27 @@ bool IsModuleFilesUpToDate(
   });
 }
 
-// StandalonePrerequisiteModules - stands for PrerequisiteModules for which all
+// ReusablePrerequisiteModules - stands for PrerequisiteModules for which all
 // the required modules are built successfully. All the module files
-// are owned by the StandalonePrerequisiteModules class.
-//
-// Any of the built module files won't be shared with other instances of the
-// class. So that we can avoid worrying thread safety.
-//
-// We don't need to worry about duplicated module names here since the standard
-// guarantees the module names should be unique to a program.
-class StandalonePrerequisiteModules : public PrerequisiteModules {
+// are owned by the modules builder.
+class ReusablePrerequisiteModules : public PrerequisiteModules {
 public:
-  StandalonePrerequisiteModules() = default;
+  ReusablePrerequisiteModules() = default;
 
-  StandalonePrerequisiteModules(const StandalonePrerequisiteModules &) = delete;
-  StandalonePrerequisiteModules
-  operator=(const StandalonePrerequisiteModules &) = delete;
-  StandalonePrerequisiteModules(StandalonePrerequisiteModules &&) = delete;
-  StandalonePrerequisiteModules
-  operator=(StandalonePrerequisiteModules &&) = delete;
+  ReusablePrerequisiteModules(const ReusablePrerequisiteModules &) = delete;
+  ReusablePrerequisiteModules
+  operator=(const ReusablePrerequisiteModules &) = delete;
+  ReusablePrerequisiteModules(ReusablePrerequisiteModules &&) = delete;
+  ReusablePrerequisiteModules
+  operator=(ReusablePrerequisiteModules &&) = delete;
 
-  ~StandalonePrerequisiteModules() override = default;
+  ~ReusablePrerequisiteModules() override = default;
 
   void adjustHeaderSearchOptions(HeaderSearchOptions &Options) const override {
     // Appending all built module files.
     for (auto &RequiredModule : RequiredModules)
       Options.PrebuiltModuleFiles.insert_or_assign(
-          RequiredModule.ModuleName, RequiredModule.ModuleFilePath);
+          RequiredModule->ModuleName, RequiredModule->ModuleFilePath);
   }
 
   bool canReuse(const CompilerInvocation &CI,
@@ -209,54 +204,31 @@ class StandalonePrerequisiteModules : public PrerequisiteModules {
     return BuiltModuleNames.contains(ModuleName);
   }
 
-  void addModuleFile(llvm::StringRef ModuleName,
-                     llvm::StringRef ModuleFilePath) {
-    RequiredModules.emplace_back(ModuleName, ModuleFilePath);
-    BuiltModuleNames.insert(ModuleName);
+  void addModuleFile(std::shared_ptr<ModuleFile> BMI) {
+    BuiltModuleNames.insert(BMI->ModuleName);
+    RequiredModules.emplace_back(std::move(BMI));
   }
 
 private:
-  llvm::SmallVector<ModuleFile, 8> RequiredModules;
+  mutable llvm::SmallVector<std::shared_ptr<ModuleFile>, 8> RequiredModules;
   // A helper class to speedup the query if a module is built.
   llvm::StringSet<> BuiltModuleNames;
 };
 
-// Build a module file for module with `ModuleName`. The information of built
-// module file are stored in \param BuiltModuleFiles.
-llvm::Error buildModuleFile(llvm::StringRef ModuleName,
+/// Build a module file for module with `ModuleName`. The information of built
+/// module file are stored in \param BuiltModuleFiles.
+llvm::Expected<ModuleFile> buildModuleFile(llvm::StringRef ModuleName,
+                            PathRef ModuleUnitFileName,
                             const GlobalCompilationDatabase &CDB,
-                            const ThreadsafeFS &TFS, ProjectModules &MDB,
+                            const ThreadsafeFS &TFS,
                             PathRef ModuleFilesPrefix,
-                            StandalonePrerequisiteModules &BuiltModuleFiles) {
-  if (BuiltModuleFiles.isModuleUnitBuilt(ModuleName))
-    return llvm::Error::success();
-
-  PathRef ModuleUnitFileName = MDB.getSourceForModuleName(ModuleName);
-  // It is possible that we're meeting third party modules (modules whose
-  // source are not in the project. e.g, the std module may be a third-party
-  // module for most projects) or something wrong with the implementation of
-  // ProjectModules.
-  // FIXME: How should we treat third party modules here? If we want to ignore
-  // third party modules, we should return true instead of false here.
-  // Currently we simply bail out.
-  if (ModuleUnitFileName.empty())
-    return llvm::createStringError("Failed to get the primary source");
-
+                            const ReusablePrerequisiteModules &BuiltModuleFiles) {
   // Try cheap operation earlier to boil-out cheaply if there are problems.
   auto Cmd = CDB.getCompileCommand(ModuleUnitFileName);
   if (!Cmd)
     return llvm::createStringError(
         llvm::formatv("No compile command for {0}", ModuleUnitFileName));
 
-  for (auto &RequiredModuleName : MDB.getRequiredModules(ModuleUnitFileName)) {
-    // Return early if there are errors building the module file.
-    if (llvm::Error Err = buildModuleFile(RequiredModuleName, CDB, TFS, MDB,
-                                          ModuleFilesPrefix, BuiltModuleFiles))
-      return llvm::createStringError(
-          llvm::formatv("Failed to build dependency {0}: {1}",
-                        RequiredModuleName, llvm::toString(std::move(Err))));
-  }
-
   Cmd->Output = getModuleFilePath(ModuleName, ModuleFilesPrefix);
 
   ParseInputs Inputs;
@@ -297,14 +269,164 @@ llvm::Error buildModuleFile(llvm::StringRef ModuleName,
   if (Clang->getDiagnostics().hasErrorOccurred())
     return llvm::createStringError("Compilation failed");
 
-  BuiltModuleFiles.addModuleFile(ModuleName, Inputs.CompileCommand.Output);
-  return llvm::Error::success();
+  return ModuleFile{ModuleName, Inputs.CompileCommand.Output};
+}
+
+class ReusableModulesBuilder : public ModulesBuilder {
+public:
+  ReusableModulesBuilder(const GlobalCompilationDatabase &CDB) : CDB(CDB) {}
+
+  ReusableModulesBuilder(const ReusableModulesBuilder &) = delete;
+  ReusableModulesBuilder(ReusableModulesBuilder &&) = delete;
+
+  ReusableModulesBuilder &operator=(const ReusableModulesBuilder &) = delete;
+  ReusableModulesBuilder &operator=(ReusableModulesBuilder &&) = delete;
+
+  std::unique_ptr<PrerequisiteModules>
+  buildPrerequisiteModulesFor(PathRef File, const ThreadsafeFS &TFS) override;
+
+private:
+  llvm::Error getOrBuildModuleFile(StringRef ModuleName, const ThreadsafeFS &TFS,
+                            ProjectModules &MDB,
+                            ReusablePrerequisiteModules &RequiredModules);
+
+  std::shared_ptr<ModuleFile>
+  getValidModuleFile(StringRef ModuleName, ProjectModules &MDB,
+                     const ThreadsafeFS &TFS,
+                     PrerequisiteModules &BuiltModuleFiles);
+  /// This should only be called by getValidModuleFile. This is unlocked version
+  /// of getValidModuleFile. The function is extracted to avoid dead locks when
+  /// recursing.
+  std::shared_ptr<ModuleFile>
+  isValidModuleFileUnlocked(StringRef ModuleName, ProjectModules &MDB,
+                            const ThreadsafeFS &TFS,
+                            PrerequisiteModules &BuiltModuleFiles);
+
+  llvm::StringMap<std::shared_ptr<ModuleFile>> ModuleFiles;
+  // Mutex to guard accesses to ModuleFiles.
+  std::mutex ModuleFilesMutex;
+
+  // We should only build a unique module at most at the same time.
+  // When we want to build a module, use the mutex to lock it and use the
+  // condition variable to notify other threads the status of the build results.
+  //
+  // Store the mutex and the condition_variable in shared_ptr since they may be
+  // accessed by many threads.
+  llvm::StringMap<std::shared_ptr<std::mutex>> BuildingModuleMutexes;
+  llvm::StringMap<std::shared_ptr<std::condition_variable>> BuildingModuleCVs;
+  // The building modules set. A successed built module or a failed module or
+  // an unbuilt module shouldn't be in this set.
+  // This set is helpful to control the behavior of the condition variables.
+  llvm::StringSet<> BuildingModules;
+  // Lock when we access BuildingModules, BuildingModuleMutexes and BuildingModuleCVs.
+  std::mutex ModulesBuildingMutex;
+
+  void startBuildingModule(StringRef ModuleName) {
+    std::lock_guard<std::mutex> _(ModulesBuildingMutex);
+    BuildingModules.insert(ModuleName);
+  }
+  void endBuildingModule(StringRef ModuleName) {
+    std::lock_guard<std::mutex> _(ModulesBuildingMutex);
+    BuildingModules.erase(ModuleName);
+  }
+  bool isBuildingModule(StringRef ModuleName) {
+    std::lock_guard<std::mutex> _(ModulesBuildingMutex);
+    return BuildingModules.contains(ModuleName);
+  }
+
+  /// An RAII object to guard the process to build a specific module.
+  struct ModuleBuildingSharedOwner {
+  public:
+    ModuleBuildingSharedOwner(StringRef ModuleName,
+                              std::shared_ptr<std::mutex> &Mutex,
+                              std::shared_ptr<std::condition_variable> &CV,
+                              ReusableModulesBuilder &Builder)
+        : ModuleName(ModuleName), Mutex(Mutex), CV(CV), Builder(Builder) {
+      IsFirstTask = (Mutex.use_count() == 2);
+    }
+
+    ~ModuleBuildingSharedOwner();
+
+    bool isUniqueBuildingOwner() { return IsFirstTask; }
+
+    std::mutex &getMutex() { return *Mutex; }
+
+    std::condition_variable &getCV() { return *CV; }
+
+  private:
+    StringRef ModuleName;
+    std::shared_ptr<std::mutex> Mutex;
+    std::shared_ptr<std::condition_variable> CV;
+    ReusableModulesBuilder &Builder;
+    bool IsFirstTask;
+  };
+
+  ModuleBuildingSharedOwner
+  getOrCreateModuleBuildingOwner(StringRef ModuleName);
+
+  const GlobalCompilationDatabase &CDB;
+};
+
+ReusableModulesBuilder::ModuleBuildingSharedOwner::
+    ~ModuleBuildingSharedOwner() {
+  std::lock_guard<std::mutex> _(Builder.ModulesBuildingMutex);
+
+  Mutex.reset();
+  CV.reset();
+
+  // Try to release the memory in builder if possible.
+  if (auto Iter = Builder.BuildingModuleCVs.find(ModuleName);
+      Iter != Builder.BuildingModuleCVs.end() &&
+      Iter->getValue().use_count() == 1)
+    Builder.BuildingModuleCVs.erase(Iter);
+
+  if (auto Iter = Builder.BuildingModuleMutexes.find(ModuleName);
+      Iter != Builder.BuildingModuleMutexes.end() &&
+      Iter->getValue().use_count() == 1)
+    Builder.BuildingModuleMutexes.erase(Iter);
+}
+
+std::shared_ptr<ModuleFile> ReusableModulesBuilder::isValidModuleFileUnlocked(
+    StringRef ModuleName, ProjectModules &MDB, const ThreadsafeFS &TFS,
+    PrerequisiteModules &BuiltModuleFiles) {
+  auto Iter = ModuleFiles.find(ModuleName);
+  if (Iter != ModuleFiles.end()) {
+    if (!IsModuleFileUpToDate(Iter->second->ModuleFilePath, BuiltModuleFiles)) {
+      log("Found not-up-date module file {0} for module {1} in cache",
+          Iter->second->ModuleFilePath, ModuleName);
+      ModuleFiles.erase(Iter);
+      return nullptr;
+    }
+
+    if (llvm::any_of(
+            MDB.getRequiredModules(MDB.getSourceForModuleName(ModuleName)),
+            [&MDB, &TFS, &BuiltModuleFiles, this](auto &&RequiredModuleName) {
+              return !isValidModuleFileUnlocked(RequiredModuleName, MDB, TFS,
+                                                BuiltModuleFiles);
+            })) {
+      ModuleFiles.erase(Iter);
+      return nullptr;
+    }
+
+    return Iter->second;
+  }
+
+  log("Don't find {0} in cache", ModuleName);
+
+  return nullptr;
+}
+
+std::shared_ptr<ModuleFile> ReusableModulesBuilder::getValidModuleFile(
+    StringRef ModuleName, ProjectModules &MDB, const ThreadsafeFS &TFS,
+    PrerequisiteModules &BuiltModuleFiles) {
+  std::lock_guard<std::mutex> _(ModuleFilesMutex);
+
+  return isValidModuleFileUnlocked(ModuleName, MDB, TFS, BuiltModuleFiles);
 }
-} // namespace
 
 std::unique_ptr<PrerequisiteModules>
-ModulesBuilder::buildPrerequisiteModulesFor(PathRef File,
-                                            const ThreadsafeFS &TFS) const {
+ReusableModulesBuilder::buildPrerequisiteModulesFor(PathRef File,
+                                            const ThreadsafeFS &TFS) {
   std::unique_ptr<ProjectModules> MDB = CDB.getProjectModules(File);
   if (!MDB) {
     elog("Failed to get Project Modules information for {0}", File);
@@ -313,20 +435,19 @@ ModulesBuilder::buildPrerequisiteModulesFor(PathRef File,
 
   std::vector<std::string> RequiredModuleNames = MDB->getRequiredModules(File);
   if (RequiredModuleNames.empty())
-    return std::make_unique<StandalonePrerequisiteModules>();
+    return std::make_unique<ReusablePrerequisiteModules>();
 
   llvm::SmallString<256> ModuleFilesPrefix = getUniqueModuleFilesPath(File);
 
   log("Trying to build required modules for {0} in {1}", File,
       ModuleFilesPrefix);
 
-  auto RequiredModules = std::make_unique<StandalonePrerequisiteModules>();
+  auto RequiredModules = std::make_unique<ReusablePrerequisiteModules>();
 
   for (llvm::StringRef RequiredModuleName : RequiredModuleNames) {
     // Return early if there is any error.
-    if (llvm::Error Err =
-            buildModuleFile(RequiredModuleName, CDB, TFS, *MDB.get(),
-                            ModuleFilesPrefix, *RequiredModules.get())) {
+    if (llvm::Error Err = 
+        getOrBuildModuleFile(RequiredModuleName, TFS, *MDB.get(), *RequiredModules.get())) {
       elog("Failed to build module {0}; due to {1}", RequiredModuleName,
            toString(std::move(Err)));
       return std::make_unique<FailedPrerequisiteModules>();
@@ -338,7 +459,111 @@ ModulesBuilder::buildPrerequisiteModulesFor(PathRef File,
   return std::move(RequiredModules);
 }
 
-bool StandalonePrerequisiteModules::canReuse(
+ReusableModulesBuilder::ModuleBuildingSharedOwner
+ReusableModulesBuilder::getOrCreateModuleBuildingOwner(
+    StringRef ModuleName) {
+  std::lock_guard<std::mutex> _(ModulesBuildingMutex);
+
+  auto MutexIter = BuildingModuleMutexes.find(ModuleName);
+  if (MutexIter == BuildingModuleMutexes.end())
+    MutexIter = BuildingModuleMutexes
+                    .try_emplace(ModuleName, std::make_shared<std::mutex>())
+                    .first;
+
+  auto CVIter = BuildingModuleCVs.find(ModuleName);
+  if (CVIter == BuildingModuleCVs.end())
+    CVIter = BuildingModuleCVs
+                 .try_emplace(ModuleName,
+                              std::make_shared<std::condition_variable>())
+                 .first;
+
+  return ModuleBuildingSharedOwner(ModuleName, MutexIter->getValue(),
+                                   CVIter->getValue(), *this);
+}
+
+llvm::Error ReusableModulesBuilder::getOrBuildModuleFile(
+    StringRef ModuleName, const ThreadsafeFS &TFS, ProjectModules &MDB,
+    ReusablePrerequisiteModules &BuiltModuleFiles) {
+  if (BuiltModuleFiles.isModuleUnitBuilt(ModuleName))
+    return llvm::Error::success();
+
+  PathRef ModuleUnitFileName = MDB.getSourceForModuleName(ModuleName);
+  /// It is possible that we're meeting third party modules (modules whose
+  /// source are not in the project. e.g, the std module may be a third-party
+  /// module for most project) or something wrong with the implementation of
+  /// ProjectModules.
+  /// FIXME: How should we treat third party modules here? If we want to ignore
+  /// third party modules, we should return true instead of false here.
+  /// Currently we simply bail out.
+  if (ModuleUnitFileName.empty())
+    return llvm::createStringError(llvm::formatv("Don't get the module unit for module {0}", ModuleName));
+
+  for (auto &RequiredModuleName : MDB.getRequiredModules(ModuleUnitFileName))
+    // Return early if there are errors building the module file.
+    if (!getOrBuildModuleFile(RequiredModuleName, TFS, MDB, BuiltModuleFiles))
+      return llvm::createStringError(llvm::formatv("Failed to build module {0}",
+                                                   RequiredModuleName));
+
+  if (std::shared_ptr<ModuleFile> Cached =
+          getValidModuleFile(ModuleName, MDB, TFS, BuiltModuleFiles)) {
+    log("Reusing module {0} from {1}", ModuleName, Cached->ModuleFilePath);
+    BuiltModuleFiles.addModuleFile(Cached);
+    return llvm::Error::success();
+  }
+
+  ModuleBuildingSharedOwner ModuleBuildingOwner =
+      getOrCreateModuleBuildingOwner(ModuleName);
+
+  std::condition_variable &CV = ModuleBuildingOwner.getCV();
+  std::unique_lock<std::mutex> lk(ModuleBuildingOwner.getMutex());
+  if (!ModuleBuildingOwner.isUniqueBuildingOwner()) {
+    log("Waiting other task for module {0}", ModuleName);
+    CV.wait(lk, [this, ModuleName] { return !isBuildingModule(ModuleName); });
+
+    // Try to access the built module files from other threads manually.
+    // We don't call getValidModuleFile here since it may be too heavy.
+    std::lock_guard<std::mutex> _(ModuleFilesMutex);
+    auto Iter = ModuleFiles.find(ModuleName);
+    if (Iter != ModuleFiles.end()) {
+      log("Got module file from other task building {0}", ModuleName);
+      BuiltModuleFiles.addModuleFile(Iter->second);
+      return llvm::Error::success();
+    }
+
+    // If the module file is not in the cache, it indicates that the building
+    // from other thread failed, so we give up earlier in this case to avoid
+    // wasting time.
+    return llvm::createStringError(llvm::formatv("The module file {0} may be failed to build in other thread.",
+                                   ModuleName));
+  }
+
+  log("Building module {0}", ModuleName);
+  startBuildingModule(ModuleName);
+
+  auto _ = llvm::make_scope_exit([&]() {
+    endBuildingModule(ModuleName);
+    CV.notify_all();
+  });
+
+  llvm::SmallString<256> ModuleFilesPrefix =
+      getUniqueModuleFilesPath(ModuleUnitFileName);
+
+  llvm::Expected<ModuleFile> MF = buildModuleFile(ModuleName, ModuleUnitFileName, CDB, TFS,
+                      ModuleFilesPrefix, BuiltModuleFiles);
+  if (llvm::Error Err = MF.takeError())
+    return Err;
+
+  log("Built module {0} to {1}", ModuleName, MF->ModuleFilePath);
+
+  std::lock_guard<std::mutex> __(ModuleFilesMutex);
+  auto BuiltModuleFile = std::make_shared<ModuleFile>(std::move(*MF));
+  ModuleFiles.insert_or_assign(ModuleName, BuiltModuleFile);
+  BuiltModuleFiles.addModuleFile(std::move(BuiltModuleFile));
+
+  return llvm::Error::success();
+}
+
+bool ReusablePrerequisiteModules::canReuse(
     const CompilerInvocation &CI,
     llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem> VFS) const {
   if (RequiredModules.empty())
@@ -346,9 +571,15 @@ bool StandalonePrerequisiteModules::canReuse(
 
   SmallVector<StringRef> BMIPaths;
   for (auto &MF : RequiredModules)
-    BMIPaths.push_back(MF.ModuleFilePath);
+    BMIPaths.push_back(MF->ModuleFilePath);
   return IsModuleFilesUpToDate(BMIPaths, *this);
 }
+} // namespace
+
+std::unique_ptr<ModulesBuilder>
+ModulesBuilder::getModulesBuilder(const GlobalCompilationDatabase &CDB) {
+  return std::make_unique<ReusableModulesBuilder>(CDB);
+}
 
 } // namespace clangd
 } // namespace clang
diff --git a/clang-tools-extra/clangd/ModulesBuilder.h b/clang-tools-extra/clangd/ModulesBuilder.h
index 0514e7486475d0..4ccefbed1e8807 100644
--- a/clang-tools-extra/clangd/ModulesBuilder.h
+++ b/clang-tools-extra/clangd/ModulesBuilder.h
@@ -85,7 +85,8 @@ class PrerequisiteModules {
 /// different versions and different source files.
 class ModulesBuilder {
 public:
-  ModulesBuilder(const GlobalCompilationDatabase &CDB) : CDB(CDB) {}
+  ModulesBuilder() = default;
+  virtual ~ModulesBuilder() = default;
 
   ModulesBuilder(const ModulesBuilder &) = delete;
   ModulesBuilder(ModulesBuilder &&) = delete;
@@ -93,11 +94,11 @@ class ModulesBuilder {
   ModulesBuilder &operator=(const ModulesBuilder &) = delete;
   ModulesBuilder &operator=(ModulesBuilder &&) = delete;
 
-  std::unique_ptr<PrerequisiteModules>
-  buildPrerequisiteModulesFor(PathRef File, const ThreadsafeFS &TFS) const;
+  virtual std::unique_ptr<PrerequisiteModules>
+  buildPrerequisiteModulesFor(PathRef File, const ThreadsafeFS &TFS) = 0;
 
-private:
-  const GlobalCompilationDatabase &CDB;
+  static std::unique_ptr<ModulesBuilder>
+  getModulesBuilder(const GlobalCompilationDatabase &CDB);
 };
 
 } // namespace clangd
diff --git a/clang-tools-extra/clangd/tool/Check.cpp b/clang-tools-extra/clangd/tool/Check.cpp
index bc2eaa77a66eec..88638905a6b4de 100644
--- a/clang-tools-extra/clangd/tool/Check.cpp
+++ b/clang-tools-extra/clangd/tool/Check.cpp
@@ -152,7 +152,7 @@ class Checker {
   ParseInputs Inputs;
   std::unique_ptr<CompilerInvocation> Invocation;
   format::FormatStyle Style;
-  std::optional<ModulesBuilder> ModulesManager;
+  std::unique_ptr<ModulesBuilder> ModulesManager;
   // from buildAST
   std::shared_ptr<const PreambleData> Preamble;
   std::optional<ParsedAST> AST;
@@ -218,8 +218,8 @@ class Checker {
     }
     if (Opts.EnableExperimentalModulesSupport) {
       if (!ModulesManager)
-        ModulesManager.emplace(*CDB);
-      Inputs.ModulesManager = &*ModulesManager;
+        ModulesManager = ModulesBuilder::getModulesBuilder(*CDB);
+      Inputs.ModulesManager = ModulesManager.get();
     }
     log("Parsing command...");
     Invocation =
diff --git a/clang-tools-extra/clangd/unittests/PrerequisiteModulesTest.cpp b/clang-tools-extra/clangd/unittests/PrerequisiteModulesTest.cpp
index 7bbb95c8b8d67f..3ad11c47158464 100644
--- a/clang-tools-extra/clangd/unittests/PrerequisiteModulesTest.cpp
+++ b/clang-tools-extra/clangd/unittests/PrerequisiteModulesTest.cpp
@@ -147,11 +147,12 @@ void use() {
 }
   )cpp");
 
-  ModulesBuilder Builder(CDB);
+  std::unique_ptr<ModulesBuilder> Builder =
+      ModulesBuilder::getModulesBuilder(CDB);
 
   // NonModular.cpp is not related to modules. So nothing should be built.
   auto NonModularInfo =
-      Builder.buildPrerequisiteModulesFor(getFullPath("NonModular.cpp"), FS);
+      Builder->buildPrerequisiteModulesFor(getFullPath("NonModular.cpp"), FS);
   EXPECT_TRUE(NonModularInfo);
 
   HeaderSearchOptions HSOpts;
@@ -176,9 +177,10 @@ module;
 export module M;
   )cpp");
 
-  ModulesBuilder Builder(CDB);
+  std::unique_ptr<ModulesBuilder> Builder =
+      ModulesBuilder::getModulesBuilder(CDB);
 
-  auto MInfo = Builder.buildPrerequisiteModulesFor(getFullPath("M.cppm"), FS);
+  auto MInfo = Builder->buildPrerequisiteModulesFor(getFullPath("M.cppm"), FS);
   EXPECT_TRUE(MInfo);
 
   // Nothing should be built since M doesn't dependent on anything.
@@ -215,9 +217,10 @@ import M;
 export module N:Part;
   )cpp");
 
-  ModulesBuilder Builder(CDB);
+  std::unique_ptr<ModulesBuilder> Builder =
+      ModulesBuilder::getModulesBuilder(CDB);
 
-  auto NInfo = Builder.buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
+  auto NInfo = Builder->buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
   EXPECT_TRUE(NInfo);
 
   ParseInputs NInput = getInputs("N.cppm", CDB);
@@ -276,9 +279,10 @@ import M;
 export module N:Part;
   )cpp");
 
-  ModulesBuilder Builder(CDB);
+  std::unique_ptr<ModulesBuilder> Builder =
+      ModulesBuilder::getModulesBuilder(CDB);
 
-  auto NInfo = Builder.buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
+  auto NInfo = Builder->buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
   EXPECT_TRUE(NInfo);
   EXPECT_TRUE(NInfo);
 
@@ -314,7 +318,7 @@ export int mm = 44;
   )cpp");
     EXPECT_FALSE(NInfo->canReuse(*Invocation, FS.view(TestDir)));
 
-    NInfo = Builder.buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
+    NInfo = Builder->buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
     EXPECT_TRUE(NInfo->canReuse(*Invocation, FS.view(TestDir)));
 
     CDB.addFile("foo.h", R"cpp(
@@ -323,7 +327,7 @@ inline void foo(int) {}
   )cpp");
     EXPECT_FALSE(NInfo->canReuse(*Invocation, FS.view(TestDir)));
 
-    NInfo = Builder.buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
+    NInfo = Builder->buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
     EXPECT_TRUE(NInfo->canReuse(*Invocation, FS.view(TestDir)));
   }
 
@@ -333,7 +337,7 @@ export module N:Part;
 export int NPart = 4LIdjwldijaw
   )cpp");
   EXPECT_FALSE(NInfo->canReuse(*Invocation, FS.view(TestDir)));
-  NInfo = Builder.buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
+  NInfo = Builder->buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
   EXPECT_TRUE(NInfo);
   EXPECT_FALSE(NInfo->canReuse(*Invocation, FS.view(TestDir)));
 
@@ -343,7 +347,7 @@ export int NPart = 43;
   )cpp");
   EXPECT_TRUE(NInfo);
   EXPECT_FALSE(NInfo->canReuse(*Invocation, FS.view(TestDir)));
-  NInfo = Builder.buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
+  NInfo = Builder->buildPrerequisiteModulesFor(getFullPath("N.cppm"), FS);
   EXPECT_TRUE(NInfo);
   EXPECT_TRUE(NInfo->canReuse(*Invocation, FS.view(TestDir)));
 
@@ -379,10 +383,11 @@ export void printA();
 import A;
 )cpp");
 
-  ModulesBuilder Builder(CDB);
+  std::unique_ptr<ModulesBuilder> Builder =
+      ModulesBuilder::getModulesBuilder(CDB);
 
   ParseInputs Use = getInputs("Use.cpp", CDB);
-  Use.ModulesManager = &Builder;
+  Use.ModulesManager = Builder.get();
 
   std::unique_ptr<CompilerInvocation> CI =
       buildCompilerInvocation(Use, DiagConsumer);
@@ -402,6 +407,43 @@ import A;
   EXPECT_TRUE(D.isFromASTFile());
 }
 
+TEST_F(PrerequisiteModulesTests, ReusablePrerequisiteModulesTest) {
+  MockDirectoryCompilationDatabase CDB(TestDir, FS);
+
+  CDB.addFile("M.cppm", R"cpp(
+export module M;
+export int M = 43;
+  )cpp");
+  CDB.addFile("A.cppm", R"cpp(
+export module A;
+import M;
+export int A = 43 + M;
+  )cpp");
+  CDB.addFile("B.cppm", R"cpp(
+export module B;
+import M;
+export int B = 44 + M;
+  )cpp");
+
+  std::unique_ptr<ModulesBuilder> Builder =
+      ModulesBuilder::getModulesBuilder(CDB);
+
+  auto AInfo = Builder->buildPrerequisiteModulesFor(getFullPath("A.cppm"), FS);
+  EXPECT_TRUE(AInfo);
+  auto BInfo = Builder->buildPrerequisiteModulesFor(getFullPath("B.cppm"), FS);
+  EXPECT_TRUE(BInfo);
+  HeaderSearchOptions HSOptsA(TestDir);
+  HeaderSearchOptions HSOptsB(TestDir);
+  AInfo->adjustHeaderSearchOptions(HSOptsA);
+  BInfo->adjustHeaderSearchOptions(HSOptsB);
+
+  EXPECT_FALSE(HSOptsA.PrebuiltModuleFiles.empty());
+  EXPECT_FALSE(HSOptsB.PrebuiltModuleFiles.empty());
+
+  // Check that we're reusing the module files.
+  EXPECT_EQ(HSOptsA.PrebuiltModuleFiles, HSOptsB.PrebuiltModuleFiles);
+}
+
 } // namespace
 } // namespace clang::clangd
 



More information about the cfe-commits mailing list