[PATCH] D111543: [clang][modules] Stop creating `IdentifierInfo` for names of explicit modules

Richard Smith - zygoloid via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Oct 15 14:42:27 PDT 2021


rsmith added a comment.

I've stepped through this in a debugger. The problem is that the `CompilerInstance` is setting up an `ASTReaderListener` that is informed whenever a module is loaded, and that listener is called when the AST file has been added to the module manager but hasn't had its header read yet. That listener then then attempts to create an `IdentifierInfo`, which won't work, because the `ASTReader` is in a weird state, with modules that are loaded but not yet set up properly. Here's a backtrace:

  #5  0x0000000009ddff20 in clang::IdentifierTable::get (this=0xffb60e8, Name=...) at include/clang/Basic/IdentifierTable.h:600
  #6  0x000000000a81d41c in clang::Preprocessor::getIdentifierInfo (this=0xffb5ef8, Name=...) at clang/include/clang/Lex/Preprocessor.h:1244
  #7  0x000000000a819464 in (anonymous namespace)::ReadModuleNames::ReadModuleName (this=0xffc3100, ModuleName=...) at clang/lib/Frontend/CompilerInstance.cpp:573
  #8  0x000000000ab3d0f6 in clang::ChainedASTReaderListener::ReadModuleName (this=0x1000bdd0, ModuleName=...) at clang/lib/Serialization/ASTReader.cpp:158
  #9  0x000000000ab5dd42 in clang::ASTReader::ReadControlBlock (this=0xffd2fe0, F=..., Loaded=..., ImportedBy=0x0, ClientLoadCapabilities=0) at clang/lib/Serialization/ASTReader.cpp:2906
  #10 0x000000000ab5ec50 in clang::ASTReader::ReadASTCore (this=0xffd2fe0, FileName=..., Type=clang::serialization::MK_ExplicitModule, ImportLoc=..., ImportedBy=0x0, Loaded=..., ExpectedSize=0, ExpectedModTime=0, ExpectedSignature=..., ClientLoadCapabilities=0) at clang/lib/Serialization/ASTReader.cpp:4572
  #11 0x000000000ab6684b in clang::ASTReader::ReadAST (this=0xffd2fe0, FileName=..., Type=clang::serialization::MK_ExplicitModule, ImportLoc=..., ClientLoadCapabilities=0, Imported=0x0)

What actually fails here is that we've not yet loaded the information about the identifier table in the `Interface` module, so we mark that slot in the identifier resolver as being up to date even though it absolutely isn't.

So, given that `ReadModuleName` needs to be robust against being called when half-way through loading modules, when it's not safe to touch any part of the AST or identifier table, I agree that it shouldn't be calling `getIdentifierInfo`. (I wish we didn't have these incredibly fragile callbacks that are run at a time when it's not safe to do most things, but it's not clear that's really fixable.) But I think we can still avoid the two different forms of cache here. What do you think about reverting the changes to `ModuleMap.h` and localizing the fix to the `ASTReaderListener`:

  --- a/clang/lib/Frontend/CompilerInstance.cpp
  +++ b/clang/lib/Frontend/CompilerInstance.cpp
  @@ -565,25 +565,28 @@ namespace {
   // the files we were handed.
   struct ReadModuleNames : ASTReaderListener {
     Preprocessor &PP;
  -  llvm::SmallVector<IdentifierInfo*, 8> LoadedModules;
  +  llvm::SmallVector<std::string, 8> LoadedModules;
   
     ReadModuleNames(Preprocessor &PP) : PP(PP) {}
   
     void ReadModuleName(StringRef ModuleName) override {
  -    LoadedModules.push_back(PP.getIdentifierInfo(ModuleName));
  +    // Keep the module name as a string for now. It's not safe to create a new
  +    // IdentifierInfo from an ASTReader callback.
  +    LoadedModules.push_back(ModuleName.str());
     }
   
     void registerAll() {
       ModuleMap &MM = PP.getHeaderSearchInfo().getModuleMap();
  -    for (auto *II : LoadedModules)
  -      MM.cacheModuleLoad(*II, MM.findModule(II->getName()));
  +    for (const std::string &ModuleName : LoadedModules)
  +      MM.cacheModuleLoad(*PP.getIdentifierInfo(ModuleName),
  +                         MM.findModule(ModuleName));
       LoadedModules.clear();
     }
   
     void markAllUnavailable() {
  -    for (auto *II : LoadedModules) {
  -      if (Module *M = PP.getHeaderSearchInfo().getModuleMap().findModule(
  -              II->getName())) {
  +    for (const std::string &ModuleName : LoadedModules) {
  +      if (Module *M =
  +              PP.getHeaderSearchInfo().getModuleMap().findModule(ModuleName)) {
           M->HasIncompatibleModuleFile = true;
   
           // Mark module as available if the only reason it was unavailable


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111543/new/

https://reviews.llvm.org/D111543



More information about the cfe-commits mailing list