[clang] 6a1f50b - [clang][deps] Prune unused header search paths

Jan Svoboda via cfe-commits cfe-commits at lists.llvm.org
Tue Oct 12 03:39:28 PDT 2021


Author: Jan Svoboda
Date: 2021-10-12T12:39:23+02:00
New Revision: 6a1f50b84ae8f8a8087fcdbe5f27dae8c76878f1

URL: https://github.com/llvm/llvm-project/commit/6a1f50b84ae8f8a8087fcdbe5f27dae8c76878f1
DIFF: https://github.com/llvm/llvm-project/commit/6a1f50b84ae8f8a8087fcdbe5f27dae8c76878f1.diff

LOG: [clang][deps] Prune unused header search paths

To reduce the number of explicit builds of a single module, we can try to squash multiple occurrences of the module with different command-lines (and context hashes) by removing benign command-line options. The greatest contributors to benign differences between command-lines are the header search paths.
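
As a rough illustration of the squashing idea (not the code in this patch), the sketch below treats a module's list of search paths as a stand-in for its context hash. The names `PathList`, `pruneUnused` and `countVariants` are hypothetical, and the set of used paths is assumed to be known already.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical alias: the search paths of one module invocation.
using PathList = std::vector<std::string>;

// Keep only the paths that are known to have been used, preserving order.
PathList pruneUnused(const PathList &Paths, const std::set<std::string> &Used) {
  PathList Kept;
  for (const std::string &P : Paths)
    if (Used.count(P))
      Kept.push_back(P);
  return Kept;
}

// Count distinct invocations. The path list stands in for the context hash,
// which is an assumption made only for this sketch.
std::size_t countVariants(const std::vector<PathList> &Invocations) {
  return std::set<PathList>(Invocations.begin(), Invocations.end()).size();
}
```

Two invocations that differ only in unused paths count as two variants before pruning and as one afterwards, which is the reduction in explicit builds the paragraph above is after.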

In this patch, the lookup cache in `HeaderSearch` is used to identify paths that were actually used when implicitly building the module during scanning. This information is serialized into the unhashed control block of the implicitly-built PCM. The dependency scanner then loads this and may use it to prune the header search paths before computing the context hash of the module and generating the command-line.
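
A minimal standalone sketch of that round trip, assuming a usage mask with one bit per search path entry, packed least significant bit first (loosely following the byte layout of the `bytes` helper in the diff below). The names `packBits`, `unpackBits` and `keepUsed` are hypothetical and are not part of the patch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Pack a usage mask into bytes, least significant bit first.
std::string packBits(const std::vector<bool> &Bits) {
  std::string Out;
  Out.reserve((Bits.size() + 7) / 8);
  for (std::size_t I = 0, E = Bits.size(); I < E;) {
    unsigned char Byte = 0;
    for (unsigned Bit = 0; Bit < 8 && I < E; ++Bit, ++I)
      if (Bits[I])
        Byte |= static_cast<unsigned char>(1u << Bit);
    Out.push_back(static_cast<char>(Byte));
  }
  return Out;
}

// Recover Count bits from the packed bytes. Bits beyond the end of the
// buffer are left as false.
std::vector<bool> unpackBits(const std::string &Bytes, std::size_t Count) {
  std::vector<bool> Bits(Count, false);
  for (std::size_t I = 0; I < Count && I / 8 < Bytes.size(); ++I) {
    unsigned char Byte = static_cast<unsigned char>(Bytes[I / 8]);
    Bits[I] = ((Byte >> (I % 8)) & 1u) != 0;
  }
  return Bits;
}

// Keep only the entries whose bit is set in the usage mask.
std::vector<std::string> keepUsed(const std::vector<std::string> &Entries,
                                  const std::vector<bool> &Used) {
  std::vector<std::string> Kept;
  for (std::size_t I = 0; I < Entries.size() && I < Used.size(); ++I)
    if (Used[I])
      Kept.push_back(Entries[I]);
  return Kept;
}
```

With the ten `-I` entries from the test's compilation database and a mask marking `begin`, `a` and `end` as used, the mask packs into two bytes, and pruning after unpacking leaves exactly those three entries.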

We could also prune the header search paths when serializing `HeaderSearchOptions` into the PCM. That way, we would do it only once instead of on every load of the PCM file by the dependency scanner. However, that would result in a PCM file whose contents don't produce the same context hash as the original build, which would probably be highly surprising.

There is an alternative to storing extra information in the PCM: wire up preprocessor callbacks to capture the used header search paths on-the-fly during preprocessing of modularized headers (similar to what we currently do for the main source file and textual headers). Right now, that's not compatible with the fact that we do an actual implicit build producing PCM files during dependency scanning. A second run of the dependency scanner loads the PCM from the first run and skips the preprocessing altogether, so the two runs would produce different results. We can revisit this approach when we stop building implicitly during dependency scanning.

Depends on D102923.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D102488

Added: 
    clang/test/ClangScanDeps/Inputs/header-search-pruning/a/a.h
    clang/test/ClangScanDeps/Inputs/header-search-pruning/b/b.h
    clang/test/ClangScanDeps/Inputs/header-search-pruning/begin/begin.h
    clang/test/ClangScanDeps/Inputs/header-search-pruning/cdb.json
    clang/test/ClangScanDeps/Inputs/header-search-pruning/end/end.h
    clang/test/ClangScanDeps/Inputs/header-search-pruning/mod.h
    clang/test/ClangScanDeps/Inputs/header-search-pruning/module.modulemap
    clang/test/ClangScanDeps/header-search-pruning.cpp

Modified: 
    clang/include/clang/Serialization/ASTBitCodes.h
    clang/include/clang/Serialization/ModuleFile.h
    clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
    clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
    clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
    clang/lib/Serialization/ASTReader.cpp
    clang/lib/Serialization/ASTWriter.cpp
    clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp
    clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
    clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
    clang/tools/clang-scan-deps/ClangScanDeps.cpp

Removed: 
    


################################################################################
diff  --git a/clang/include/clang/Serialization/ASTBitCodes.h b/clang/include/clang/Serialization/ASTBitCodes.h
index e771aa3d07aa5..68520cd9b3e36 100644
--- a/clang/include/clang/Serialization/ASTBitCodes.h
+++ b/clang/include/clang/Serialization/ASTBitCodes.h
@@ -402,6 +402,9 @@ enum UnhashedControlBlockRecordTypes {
 
   /// Record code for \#pragma diagnostic mappings.
   DIAG_PRAGMA_MAPPINGS,
+
+  /// Record code for the indices of used header search entries.
+  HEADER_SEARCH_ENTRY_USAGE,
 };
 
 /// Record code for extension blocks.

diff  --git a/clang/include/clang/Serialization/ModuleFile.h b/clang/include/clang/Serialization/ModuleFile.h
index b1c8a8c8e72b6..b275f8b8db5d3 100644
--- a/clang/include/clang/Serialization/ModuleFile.h
+++ b/clang/include/clang/Serialization/ModuleFile.h
@@ -20,6 +20,7 @@
 #include "clang/Serialization/ASTBitCodes.h"
 #include "clang/Serialization/ContinuousRangeMap.h"
 #include "clang/Serialization/ModuleFileExtension.h"
+#include "llvm/ADT/BitVector.h"
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/PointerIntPair.h"
 #include "llvm/ADT/SetVector.h"
@@ -173,6 +174,9 @@ class ModuleFile {
   /// unique module files based on AST contents.
   ASTFileSignature ASTBlockHash;
 
+  /// The bit vector denoting usage of each header search entry (true = used).
+  llvm::BitVector SearchPathUsage;
+
   /// Whether this module has been directly imported by the
   /// user.
   bool DirectlyImported = false;

diff  --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
index 76edf150dbeee..d58e736ab6a66 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningService.h
@@ -48,7 +48,8 @@ class DependencyScanningService {
 public:
   DependencyScanningService(ScanningMode Mode, ScanningOutputFormat Format,
                             bool ReuseFileManager = true,
-                            bool SkipExcludedPPRanges = true);
+                            bool SkipExcludedPPRanges = true,
+                            bool OptimizeArgs = false);
 
   ScanningMode getMode() const { return Mode; }
 
@@ -58,6 +59,8 @@ class DependencyScanningService {
 
   bool canSkipExcludedPPRanges() const { return SkipExcludedPPRanges; }
 
+  bool canOptimizeArgs() const { return OptimizeArgs; }
+
   DependencyScanningFilesystemSharedCache &getSharedCache() {
     return SharedCache;
   }
@@ -70,6 +73,8 @@ class DependencyScanningService {
   /// ranges by bumping the buffer pointer in the lexer instead of lexing the
   /// tokens in the range until reaching the corresponding directive.
   const bool SkipExcludedPPRanges;
+  /// Whether to optimize the modules' command-line arguments.
+  const bool OptimizeArgs;
   /// The global file system cache.
   DependencyScanningFilesystemSharedCache SharedCache;
 };

diff  --git a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
index fb0d923a53ac9..0f3a5369a0213 100644
--- a/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
+++ b/clang/include/clang/Tooling/DependencyScanning/DependencyScanningWorker.h
@@ -83,6 +83,8 @@ class DependencyScanningWorker {
   /// worker. If null, the file manager will not be reused.
   llvm::IntrusiveRefCntPtr<FileManager> Files;
   ScanningOutputFormat Format;
+  /// Whether to optimize the modules' command-line arguments.
+  bool OptimizeArgs;
 };
 
 } // end namespace dependencies

diff  --git a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
index c539d7a586476..a15353dbf11b6 100644
--- a/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
+++ b/clang/include/clang/Tooling/DependencyScanning/ModuleDepCollector.h
@@ -194,7 +194,7 @@ class ModuleDepCollector final : public DependencyCollector {
 public:
   ModuleDepCollector(std::unique_ptr<DependencyOutputOptions> Opts,
                      CompilerInstance &I, DependencyConsumer &C,
-                     CompilerInvocation &&OriginalCI);
+                     CompilerInvocation &&OriginalCI, bool OptimizeArgs);
 
   void attachToPreprocessor(Preprocessor &PP) override;
   void attachToASTReader(ASTReader &R) override;
@@ -219,6 +219,8 @@ class ModuleDepCollector final : public DependencyCollector {
   std::unique_ptr<DependencyOutputOptions> Opts;
   /// The original Clang invocation passed to dependency scanner.
   CompilerInvocation OriginalInvocation;
+  /// Whether to optimize the modules' command-line arguments.
+  bool OptimizeArgs;
 
   /// Checks whether the module is known as being prebuilt.
   bool isPrebuiltModule(const Module *M);
@@ -226,8 +228,9 @@ class ModuleDepCollector final : public DependencyCollector {
   /// Constructs a CompilerInvocation that can be used to build the given
   /// module, excluding paths to discovered modular dependencies that are yet to
   /// be built.
-  CompilerInvocation
-  makeInvocationForModuleBuildWithoutPaths(const ModuleDeps &Deps) const;
+  CompilerInvocation makeInvocationForModuleBuildWithoutPaths(
+      const ModuleDeps &Deps,
+      llvm::function_ref<void(CompilerInvocation &)> Optimize) const;
 };
 
 } // end namespace dependencies

diff  --git a/clang/lib/Serialization/ASTReader.cpp b/clang/lib/Serialization/ASTReader.cpp
index ebb58d623348b..563a76cd380a3 100644
--- a/clang/lib/Serialization/ASTReader.cpp
+++ b/clang/lib/Serialization/ASTReader.cpp
@@ -4726,7 +4726,9 @@ ASTReader::ASTReadResult ASTReader::readUnhashedControlBlockImpl(
 
     // Read and process a record.
     Record.clear();
-    Expected<unsigned> MaybeRecordType = Stream.readRecord(Entry.ID, Record);
+    StringRef Blob;
+    Expected<unsigned> MaybeRecordType =
+        Stream.readRecord(Entry.ID, Record, &Blob);
     if (!MaybeRecordType) {
       // FIXME this drops the error.
       return Failure;
@@ -4758,6 +4760,17 @@ ASTReader::ASTReadResult ASTReader::readUnhashedControlBlockImpl(
         F->PragmaDiagMappings.insert(F->PragmaDiagMappings.end(),
                                      Record.begin(), Record.end());
       break;
+    case HEADER_SEARCH_ENTRY_USAGE:
+      if (!F)
+        break;
+      unsigned Count = Record[0];
+      const char *Byte = Blob.data();
+      F->SearchPathUsage = llvm::BitVector(Count, 0);
+      for (unsigned I = 0; I < Count; ++Byte)
+        for (unsigned Bit = 0; Bit < 8 && I < Count; ++Bit, ++I)
+          if (*Byte & (1 << Bit))
+            F->SearchPathUsage[I] = 1;
+      break;
     }
   }
 }

diff  --git a/clang/lib/Serialization/ASTWriter.cpp b/clang/lib/Serialization/ASTWriter.cpp
index 66c207ad9243d..7c500f30e271e 100644
--- a/clang/lib/Serialization/ASTWriter.cpp
+++ b/clang/lib/Serialization/ASTWriter.cpp
@@ -132,6 +132,18 @@ static StringRef bytes(const SmallVectorImpl<T> &v) {
                          sizeof(T) * v.size());
 }
 
+static std::string bytes(const std::vector<bool> &V) {
+  std::string Str;
+  Str.reserve(V.size() / 8);
+  for (unsigned I = 0, E = V.size(); I < E;) {
+    char Byte = 0;
+    for (unsigned Bit = 0; Bit < 8 && I < E; ++Bit, ++I)
+      Byte |= V[I] << Bit;
+    Str += Byte;
+  }
+  return Str;
+}
+
 //===----------------------------------------------------------------------===//
 // Type serialization
 //===----------------------------------------------------------------------===//
@@ -1050,6 +1062,8 @@ ASTWriter::createSignature(StringRef AllBytes, StringRef ASTBlockBytes) {
 
 ASTFileSignature ASTWriter::writeUnhashedControlBlock(Preprocessor &PP,
                                                       ASTContext &Context) {
+  using namespace llvm;
+
   // Flush first to prepare the PCM hash (signature).
   Stream.FlushToWord();
   auto StartOfUnhashedControl = Stream.GetCurrentBitNo() >> 3;
@@ -1093,10 +1107,24 @@ ASTFileSignature ASTWriter::writeUnhashedControlBlock(Preprocessor &PP,
   // Note: we don't serialize the log or serialization file names, because they
   // are generally transient files and will almost always be overridden.
   Stream.EmitRecord(DIAGNOSTIC_OPTIONS, Record);
+  Record.clear();
 
   // Write out the diagnostic/pragma mappings.
   WritePragmaDiagnosticMappings(Diags, /* isModule = */ WritingModule);
 
+  // Header search entry usage.
+  auto HSEntryUsage = PP.getHeaderSearchInfo().computeUserEntryUsage();
+  auto Abbrev = std::make_shared<BitCodeAbbrev>();
+  Abbrev->Add(BitCodeAbbrevOp(HEADER_SEARCH_ENTRY_USAGE));
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 32)); // Number of bits.
+  Abbrev->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));      // Bit vector.
+  unsigned HSUsageAbbrevCode = Stream.EmitAbbrev(std::move(Abbrev));
+  {
+    RecordData::value_type Record[] = {HEADER_SEARCH_ENTRY_USAGE,
+                                       HSEntryUsage.size()};
+    Stream.EmitRecordWithBlob(HSUsageAbbrevCode, Record, bytes(HSEntryUsage));
+  }
+
   // Leave the options block.
   Stream.ExitBlock();
   return Signature;

diff  --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp
index 4f3e574719d2b..4b6c87aba62f1 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningService.cpp
@@ -15,9 +15,9 @@ using namespace dependencies;
 
 DependencyScanningService::DependencyScanningService(
     ScanningMode Mode, ScanningOutputFormat Format, bool ReuseFileManager,
-    bool SkipExcludedPPRanges)
+    bool SkipExcludedPPRanges, bool OptimizeArgs)
     : Mode(Mode), Format(Format), ReuseFileManager(ReuseFileManager),
-      SkipExcludedPPRanges(SkipExcludedPPRanges) {
+      SkipExcludedPPRanges(SkipExcludedPPRanges), OptimizeArgs(OptimizeArgs) {
   // Initialize targets for object file support.
   llvm::InitializeAllTargets();
   llvm::InitializeAllTargetMCs();

diff  --git a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
index e4d49756ce6b4..2a0943c16d88c 100644
--- a/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
+++ b/clang/lib/Tooling/DependencyScanning/DependencyScanningWorker.cpp
@@ -151,10 +151,11 @@ class DependencyScanningAction : public tooling::ToolAction {
       StringRef WorkingDirectory, DependencyConsumer &Consumer,
       llvm::IntrusiveRefCntPtr<DependencyScanningWorkerFilesystem> DepFS,
       ExcludedPreprocessorDirectiveSkipMapping *PPSkipMappings,
-      ScanningOutputFormat Format, llvm::Optional<StringRef> ModuleName = None)
+      ScanningOutputFormat Format, bool OptimizeArgs,
+      llvm::Optional<StringRef> ModuleName = None)
       : WorkingDirectory(WorkingDirectory), Consumer(Consumer),
         DepFS(std::move(DepFS)), PPSkipMappings(PPSkipMappings), Format(Format),
-        ModuleName(ModuleName) {}
+        OptimizeArgs(OptimizeArgs), ModuleName(ModuleName) {}
 
   bool runInvocation(std::shared_ptr<CompilerInvocation> Invocation,
                      FileManager *FileMgr,
@@ -243,15 +244,16 @@ class DependencyScanningAction : public tooling::ToolAction {
       break;
     case ScanningOutputFormat::Full:
       Compiler.addDependencyCollector(std::make_shared<ModuleDepCollector>(
-          std::move(Opts), Compiler, Consumer, std::move(OriginalInvocation)));
+          std::move(Opts), Compiler, Consumer, std::move(OriginalInvocation),
+          OptimizeArgs));
       break;
     }
 
     // Consider different header search and diagnostic options to create
     // different modules. This avoids the unsound aliasing of module PCMs.
     //
-    // TODO: Implement diagnostic bucketing and header search pruning to reduce
-    // the impact of strict context hashing.
+    // TODO: Implement diagnostic bucketing to reduce the impact of strict
+    // context hashing.
     Compiler.getHeaderSearchOpts().ModulesStrictContextHash = true;
 
     std::unique_ptr<FrontendAction> Action;
@@ -273,6 +275,7 @@ class DependencyScanningAction : public tooling::ToolAction {
   llvm::IntrusiveRefCntPtr<DependencyScanningWorkerFilesystem> DepFS;
   ExcludedPreprocessorDirectiveSkipMapping *PPSkipMappings;
   ScanningOutputFormat Format;
+  bool OptimizeArgs;
   llvm::Optional<StringRef> ModuleName;
 };
 
@@ -280,7 +283,7 @@ class DependencyScanningAction : public tooling::ToolAction {
 
 DependencyScanningWorker::DependencyScanningWorker(
     DependencyScanningService &Service)
-    : Format(Service.getFormat()) {
+    : Format(Service.getFormat()), OptimizeArgs(Service.canOptimizeArgs()) {
   PCHContainerOps = std::make_shared<PCHContainerOperations>();
   PCHContainerOps->registerReader(
       std::make_unique<ObjectFilePCHContainerReader>());
@@ -352,7 +355,8 @@ llvm::Error DependencyScanningWorker::computeDependencies(
                       [&](DiagnosticConsumer &DC, DiagnosticOptions &DiagOpts) {
                         DependencyScanningAction Action(
                             WorkingDirectory, Consumer, DepFS,
-                            PPSkipMappings.get(), Format, ModuleName);
+                            PPSkipMappings.get(), Format, OptimizeArgs,
+                            ModuleName);
                         // Create an invocation that uses the underlying file
                         // system to ensure that any file system requests that
                         // are made by the driver do not go through the

diff  --git a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
index 1e23ad945c839..919c7d175362d 100644
--- a/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
+++ b/clang/lib/Tooling/DependencyScanning/ModuleDepCollector.cpp
@@ -17,8 +17,20 @@ using namespace clang;
 using namespace tooling;
 using namespace dependencies;
 
+static void optimizeHeaderSearchOpts(HeaderSearchOptions &Opts,
+                                     ASTReader &Reader,
+                                     const serialization::ModuleFile &MF) {
+  // Only preserve search paths that were used during the dependency scan.
+  std::vector<HeaderSearchOptions::Entry> Entries = Opts.UserEntries;
+  Opts.UserEntries.clear();
+  for (unsigned I = 0; I < Entries.size(); ++I)
+    if (MF.SearchPathUsage[I])
+      Opts.UserEntries.push_back(Entries[I]);
+}
+
 CompilerInvocation ModuleDepCollector::makeInvocationForModuleBuildWithoutPaths(
-    const ModuleDeps &Deps) const {
+    const ModuleDeps &Deps,
+    llvm::function_ref<void(CompilerInvocation &)> Optimize) const {
   // Make a deep copy of the original Clang invocation.
   CompilerInvocation CI(OriginalInvocation);
 
@@ -41,6 +53,8 @@ CompilerInvocation ModuleDepCollector::makeInvocationForModuleBuildWithoutPaths(
     CI.getFrontendOpts().ModuleMapFiles.push_back(PrebuiltModule.ModuleMapFile);
   }
 
+  Optimize(CI);
+
   return CI;
 }
 
@@ -235,7 +249,12 @@ ModuleID ModuleDepCollectorPP::handleTopLevelModule(const Module *M) {
   llvm::DenseSet<const Module *> SeenModules;
   addAllSubmodulePrebuiltDeps(M, MD, SeenModules);
 
-  MD.Invocation = MDC.makeInvocationForModuleBuildWithoutPaths(MD);
+  MD.Invocation = MDC.makeInvocationForModuleBuildWithoutPaths(
+      MD, [&](CompilerInvocation &CI) {
+        if (MDC.OptimizeArgs)
+          optimizeHeaderSearchOpts(CI.getHeaderSearchOpts(),
+                                   *MDC.Instance.getASTReader(), *MF);
+      });
   MD.ID.ContextHash = MD.Invocation.getModuleHash();
 
   llvm::DenseSet<const Module *> AddedModules;
@@ -287,9 +306,9 @@ void ModuleDepCollectorPP::addModuleDep(
 
 ModuleDepCollector::ModuleDepCollector(
     std::unique_ptr<DependencyOutputOptions> Opts, CompilerInstance &I,
-    DependencyConsumer &C, CompilerInvocation &&OriginalCI)
+    DependencyConsumer &C, CompilerInvocation &&OriginalCI, bool OptimizeArgs)
     : Instance(I), Consumer(C), Opts(std::move(Opts)),
-      OriginalInvocation(std::move(OriginalCI)) {}
+      OriginalInvocation(std::move(OriginalCI)), OptimizeArgs(OptimizeArgs) {}
 
 void ModuleDepCollector::attachToPreprocessor(Preprocessor &PP) {
   PP.addPPCallbacks(std::make_unique<ModuleDepCollectorPP>(Instance, *this));

diff  --git a/clang/test/ClangScanDeps/Inputs/header-search-pruning/a/a.h b/clang/test/ClangScanDeps/Inputs/header-search-pruning/a/a.h
new file mode 100644
index 0000000000000..e69de29bb2d1d

diff  --git a/clang/test/ClangScanDeps/Inputs/header-search-pruning/b/b.h b/clang/test/ClangScanDeps/Inputs/header-search-pruning/b/b.h
new file mode 100644
index 0000000000000..e69de29bb2d1d

diff  --git a/clang/test/ClangScanDeps/Inputs/header-search-pruning/begin/begin.h b/clang/test/ClangScanDeps/Inputs/header-search-pruning/begin/begin.h
new file mode 100644
index 0000000000000..e69de29bb2d1d

diff  --git a/clang/test/ClangScanDeps/Inputs/header-search-pruning/cdb.json b/clang/test/ClangScanDeps/Inputs/header-search-pruning/cdb.json
new file mode 100644
index 0000000000000..f7e16773e16d3
--- /dev/null
+++ b/clang/test/ClangScanDeps/Inputs/header-search-pruning/cdb.json
@@ -0,0 +1,7 @@
+[
+  {
+    "directory": "DIR",
+    "command": "clang -E DIR/header-search-pruning.cpp -Ibegin -I1 -Ia -I3 -I4 -I5 -I6 -Ib -I8 -Iend DEFINES -fmodules -fcxx-modules -fmodules-cache-path=DIR/module-cache -fimplicit-modules -fmodule-map-file=DIR/module.modulemap",
+    "file": "DIR/header-search-pruning.cpp"
+  }
+]

diff  --git a/clang/test/ClangScanDeps/Inputs/header-search-pruning/end/end.h b/clang/test/ClangScanDeps/Inputs/header-search-pruning/end/end.h
new file mode 100644
index 0000000000000..e69de29bb2d1d

diff  --git a/clang/test/ClangScanDeps/Inputs/header-search-pruning/mod.h b/clang/test/ClangScanDeps/Inputs/header-search-pruning/mod.h
new file mode 100644
index 0000000000000..539ee5b3a05ae
--- /dev/null
+++ b/clang/test/ClangScanDeps/Inputs/header-search-pruning/mod.h
@@ -0,0 +1,11 @@
+#include "begin.h"
+
+#ifdef INCLUDE_A
+#include "a.h"
+#endif
+
+#ifdef INCLUDE_B
+#include "b.h"
+#endif
+
+#include "end.h"

diff  --git a/clang/test/ClangScanDeps/Inputs/header-search-pruning/module.modulemap b/clang/test/ClangScanDeps/Inputs/header-search-pruning/module.modulemap
new file mode 100644
index 0000000000000..30de4cda76c82
--- /dev/null
+++ b/clang/test/ClangScanDeps/Inputs/header-search-pruning/module.modulemap
@@ -0,0 +1,4 @@
+module mod {
+    header "mod.h"
+    export *
+}

diff  --git a/clang/test/ClangScanDeps/header-search-pruning.cpp b/clang/test/ClangScanDeps/header-search-pruning.cpp
new file mode 100644
index 0000000000000..c966c20b4a4c8
--- /dev/null
+++ b/clang/test/ClangScanDeps/header-search-pruning.cpp
@@ -0,0 +1,85 @@
+// RUN: rm -rf %t && mkdir -p %t
+// RUN: cp -r %S/Inputs/header-search-pruning/* %t
+// RUN: cp %S/header-search-pruning.cpp %t/header-search-pruning.cpp
+// RUN: sed -e "s|DIR|%/t|g" -e "s|DEFINES|-DINCLUDE_A|g"             %S/Inputs/header-search-pruning/cdb.json > %t/cdb_a.json
+// RUN: sed -e "s|DIR|%/t|g" -e "s|DEFINES|-DINCLUDE_B|g"             %S/Inputs/header-search-pruning/cdb.json > %t/cdb_b.json
+// RUN: sed -e "s|DIR|%/t|g" -e "s|DEFINES|-DINCLUDE_A -DINCLUDE_B|g" %S/Inputs/header-search-pruning/cdb.json > %t/cdb_ab.json
+//
+// RUN: clang-scan-deps -compilation-database %t/cdb_a.json -format experimental-full -optimize-args >> %t/result_a.json
+// RUN: cat %t/result_a.json | sed 's/\\/\//g' | FileCheck --check-prefixes=CHECK_A %s
+//
+// RUN: clang-scan-deps -compilation-database %t/cdb_b.json -format experimental-full -optimize-args >> %t/result_b.json
+// RUN: cat %t/result_b.json | sed 's/\\/\//g' | FileCheck --check-prefixes=CHECK_B %s
+//
+// RUN: clang-scan-deps -compilation-database %t/cdb_ab.json -format experimental-full -optimize-args >> %t/result_ab.json
+// RUN: cat %t/result_ab.json | sed 's/\\/\//g' | FileCheck --check-prefixes=CHECK_AB %s
+
+#include "mod.h"
+
+// CHECK_A:        {
+// CHECK_A-NEXT:     "modules": [
+// CHECK_A-NEXT:       {
+// CHECK_A-NEXT:         "clang-module-deps": [],
+// CHECK_A-NEXT:         "clang-modulemap-file": "{{.*}}",
+// CHECK_A-NEXT:         "command-line": [
+// CHECK_A-NEXT:           "-cc1"
+// CHECK_A:                "-I",
+// CHECK_A-NEXT:           "begin",
+// CHECK_A-NEXT:           "-I",
+// CHECK_A-NEXT:           "a",
+// CHECK_A-NEXT:           "-I",
+// CHECK_A-NEXT:           "end"
+// CHECK_A:              ],
+// CHECK_A-NEXT:         "context-hash": "{{.*}}",
+// CHECK_A-NEXT:         "file-deps": [
+// CHECK_A:              ],
+// CHECK_A-NEXT:         "name": "mod"
+// CHECK_A-NEXT:       }
+// CHECK_A-NEXT:     ]
+// CHECK_A:        }
+
+// CHECK_B:        {
+// CHECK_B-NEXT:     "modules": [
+// CHECK_B-NEXT:       {
+// CHECK_B-NEXT:         "clang-module-deps": [],
+// CHECK_B-NEXT:         "clang-modulemap-file": "{{.*}}",
+// CHECK_B-NEXT:         "command-line": [
+// CHECK_B-NEXT:           "-cc1"
+// CHECK_B:                "-I",
+// CHECK_B-NEXT:           "begin",
+// CHECK_B-NEXT:           "-I",
+// CHECK_B-NEXT:           "b",
+// CHECK_B-NEXT:           "-I",
+// CHECK_B-NEXT:           "end"
+// CHECK_B:              ],
+// CHECK_B-NEXT:         "context-hash": "{{.*}}",
+// CHECK_B-NEXT:         "file-deps": [
+// CHECK_B:              ],
+// CHECK_B-NEXT:         "name": "mod"
+// CHECK_B-NEXT:       }
+// CHECK_B-NEXT:     ]
+// CHECK_B:        }
+
+// CHECK_AB:       {
+// CHECK_AB-NEXT:    "modules": [
+// CHECK_AB-NEXT:      {
+// CHECK_AB-NEXT:        "clang-module-deps": [],
+// CHECK_AB-NEXT:        "clang-modulemap-file": "{{.*}}",
+// CHECK_AB-NEXT:        "command-line": [
+// CHECK_AB-NEXT:          "-cc1"
+// CHECK_AB:               "-I",
+// CHECK_AB-NEXT:          "begin",
+// CHECK_AB-NEXT:          "-I",
+// CHECK_AB-NEXT:          "a",
+// CHECK_AB-NEXT:          "-I",
+// CHECK_AB-NEXT:          "b",
+// CHECK_AB-NEXT:          "-I",
+// CHECK_AB-NEXT:          "end"
+// CHECK_AB:             ],
+// CHECK_AB-NEXT:        "context-hash": "{{.*}}",
+// CHECK_AB-NEXT:        "file-deps": [
+// CHECK_AB:             ],
+// CHECK_AB-NEXT:        "name": "mod"
+// CHECK_AB-NEXT:      }
+// CHECK_AB-NEXT:    ]
+// CHECK_AB:       }

diff  --git a/clang/tools/clang-scan-deps/ClangScanDeps.cpp b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
index 6595efa182ce9..b77abeacc195f 100644
--- a/clang/tools/clang-scan-deps/ClangScanDeps.cpp
+++ b/clang/tools/clang-scan-deps/ClangScanDeps.cpp
@@ -170,6 +170,11 @@ static llvm::cl::opt<std::string> ModuleFilesDir(
                    "specified directory instead the module cache directory."),
     llvm::cl::cat(DependencyScannerCategory));
 
+static llvm::cl::opt<bool> OptimizeArgs(
+    "optimize-args",
+    llvm::cl::desc("Whether to optimize command-line arguments of modules."),
+    llvm::cl::init(false), llvm::cl::cat(DependencyScannerCategory));
+
 llvm::cl::opt<unsigned>
     NumThreads("j", llvm::cl::Optional,
                llvm::cl::desc("Number of worker threads to use (default: use "
@@ -507,7 +512,7 @@ int main(int argc, const char **argv) {
   SharedStream DependencyOS(llvm::outs());
 
   DependencyScanningService Service(ScanMode, Format, ReuseFileManager,
-                                    SkipExcludedPPRanges);
+                                    SkipExcludedPPRanges, OptimizeArgs);
   llvm::ThreadPool Pool(llvm::hardware_concurrency(NumThreads));
   std::vector<std::unique_ptr<DependencyScanningTool>> WorkerTools;
   for (unsigned I = 0; I < Pool.getThreadCount(); ++I)


        

