[clang] [clang][modules] Avoid calling expensive `SourceManager::translateFile()` (PR #86216)

Jan Svoboda via cfe-commits cfe-commits at lists.llvm.org
Mon Mar 25 15:53:55 PDT 2024


================
@@ -1337,9 +1337,24 @@ static bool compileModule(CompilerInstance &ImportingInstance,
   // Get or create the module map that we'll use to build this module.
   ModuleMap &ModMap
     = ImportingInstance.getPreprocessor().getHeaderSearchInfo().getModuleMap();
+  SourceManager &SourceMgr = ImportingInstance.getSourceManager();
   bool Result;
-  if (OptionalFileEntryRef ModuleMapFile =
-          ModMap.getContainingModuleMapFile(Module)) {
+  if (FileID ModuleMapFID = ModMap.getContainingModuleMapFileID(Module);
+      ModuleMapFID.isValid()) {
+    // We want to use the top-level module map. If we don't, the compiling
----------------
jansvoboda11 wrote:

Consider that `SourceManager::translateFile()` always prefers local `SLocEntries`. Now let's say that a module is defined in a module map that gets included via the `extern` directive from a top-level module map Clang find via header search. Before loading PCM for this module, the following code will resolve to the local `SLocEntry` (there are no loaded entries yet).

```
SourceManager::translateFile(
  SourceManager::getFileEntryRefForID( // ModuleMap::getContainingModuleMapFile(Module)
    SourceManager::getFileID(          //
      Module::DefinitionLoc)))         //
```

After compiling and loading the PCM for such module, `Module::DefinitionLoc` gets updated and now points into the loaded `SLocEntry`. Therefore, `ModuleMap::getContainingModuleMapFile(Module)` returns the loaded `SLocEntry`. Then, `SourceManager::translateFile()` maps that to the **local** `SLocEntry`, so we end up with a result identical to the previous one.

This patch gets rid of the unnecessary conversions and does this instead.

```
SourceManager::getFileID(Module::DefinitionLoc)))
```

This means that after loading the PCM for the module, this points into the table of loaded `SLocEntries`. The importer knows that the containing module map is not a top level module map. When it attempts to walk up the include chain to get the the top-level module map for this module (using `SourceManager::getIncludeLoc()`) it gets an invalid `SourceLocation`. This is because the PCM was not compiled from the top-level module map, its `SourceManager` never knew about the file, and therefore can't walk up.

This means that the true top-level module map never makes it into the set of affecting module maps we compute in `ASTWriter`. It's in turn left out from the `INPUT_FILE` block that only has the containing module map that's marked as not-top-level file. This ultimately breaks the dependency scanner. When generating `-fmodule-map-file=` arguments for the explicit compilation, it only uses files that made it into the `INPUT_FILE` block (are affecting) and are top-level (to avoid exploding the command line with transitive module maps). This then breaks the explicit compile, which doesn't have any (neither the top-level nor the containing module map) for the module in question.

So this piece of code is here to ensure implicit builds are always invoked with the top-level module map, not the containing one. This fixes the walk over the include chain of loaded `SLocEntries` and fixes `clang/test/ClangScanDeps/modules-extern-unrelated.m`.

https://github.com/llvm/llvm-project/pull/86216


More information about the cfe-commits mailing list