[flang-commits] [flang] [flang] Use module file hashes for more checking and disambiguation (PR #80354)

Peter Klausler via flang-commits flang-commits at lists.llvm.org
Thu Feb 29 08:22:01 PST 2024


https://github.com/klausler updated https://github.com/llvm/llvm-project/pull/80354

From 9f850db066512fa92fa615da912f3178f9f35332 Mon Sep 17 00:00:00 2001
From: Peter Klausler <pklausler at nvidia.com>
Date: Wed, 31 Jan 2024 14:58:58 -0800
Subject: [PATCH] [flang] Use module file hashes for more checking and
 disambiguation

f18's module files are Fortran with a leading header comment
containing the module file format version and a hash of the
following contents.  This hash is currently used only to protect
module files against corruption and truncation.
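
As a rough illustration of the header described above, here is a
minimal sketch of the FNV-1a hash and the header line.  It is not
the code in this patch: the names ending in "Sketch" are
hypothetical, the 16-digit width is taken from the test inputs in
this patch, and the byte-order mark that precedes the header is
omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Assumed width: a 64-bit hash rendered as 16 hexadecimal digits.
constexpr std::size_t kSumLenSketch{16};

// FNV-1a over the bytes of the module file body.
std::uint64_t ComputeCheckSumSketch(std::string_view contents) {
  std::uint64_t hash{0xcbf29ce484222325ull}; // FNV offset basis
  for (char c : contents) {
    hash ^= static_cast<unsigned char>(c);
    hash *= 0x100000001b3ull; // FNV prime
  }
  return hash;
}

// Fixed-width, zero-padded, lower-case hexadecimal rendering.
std::string CheckSumStringSketch(std::uint64_t hash) {
  static const char *digits{"0123456789abcdef"};
  std::string result(kSumLenSketch, '0');
  for (std::size_t i{kSumLenSketch}; hash != 0; hash >>= 4) {
    assert(i > 0); // 64 bits never need more than 16 digits
    result[--i] = digits[hash & 0xf];
  }
  return result;
}

// First line of a module file, computed from the text that follows it.
std::string HeaderLineSketch(std::string_view body) {
  return "!mod$ v1 sum:" +
      CheckSumStringSketch(ComputeCheckSumSketch(body)) + '\n';
}
```

A reader can verify a module file by recomputing the hash of
everything after the header line and comparing it with the digits
in the header.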

Extend the use of these hashes to catch or avoid some error
cases.  When one module file depends upon another, record the
hash of the dependency in an additional header comment in the
dependent module file.  This allows the compiler to detect
when a dependency has been updated since the dependent module
file was written.  Further, it allows the compiler to find the
right module file dependency when the same module file name
appears in multiple directories on the module search path.
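
The dependency comments have the form
"!need$ <16 hex digits> <i|n> <module name>", as in the test
inputs of this patch.  The following is a minimal sketch of parsing
one such line; the type and function names are hypothetical and
this is not the parser in mod-file.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical record for one "!need$" line.
struct NeedSketch {
  std::uint64_t hash; // checksum the dependency must have
  bool intrinsic;     // 'i' for intrinsic, 'n' for non-intrinsic
  std::string name;   // module name
};

// Parse exactly 16 lower-case hexadecimal digits.
std::optional<std::uint64_t> ParseHex16Sketch(std::string_view digits) {
  if (digits.size() != 16) {
    return std::nullopt;
  }
  std::uint64_t value{0};
  for (char ch : digits) {
    value <<= 4;
    if (ch >= '0' && ch <= '9') {
      value += ch - '0';
    } else if (ch >= 'a' && ch <= 'f') {
      value += ch - 'a' + 10;
    } else {
      return std::nullopt;
    }
  }
  return value;
}

// Parse one header line, given without its trailing newline.
std::optional<NeedSketch> ParseNeedLineSketch(std::string_view line) {
  constexpr std::string_view prefix{"!need$ "};
  if (line.substr(0, prefix.size()) != prefix) {
    return std::nullopt;
  }
  line.remove_prefix(prefix.size());
  auto hash{ParseHex16Sketch(line.substr(0, 16))};
  if (!hash) {
    return std::nullopt;
  }
  assert(line.size() >= 16); // guaranteed by the successful parse
  line.remove_prefix(16);
  bool intrinsic{line.substr(0, 3) == " i "};
  if (!intrinsic && line.substr(0, 3) != " n ") {
    return std::nullopt;
  }
  line.remove_prefix(3);
  if (line.empty()) {
    return std::nullopt; // a module name is required
  }
  return NeedSketch{*hash, intrinsic, std::string{line}};
}
```

For example, "!need$ cbe36d213d935559 n modfile63a" yields the hash
0xcbe36d213d935559 for the non-intrinsic module modfile63a.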

The order in which module files are written, when multiple
modules appear in a source file, is such that every dependency
is written before the module(s) that depend upon it, so that
their hashes are known.
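
The ordering can be pictured as a depth-first walk that writes
each dependency before its dependents and skips modules that
already have a hash.  This is a simplified sketch with
hypothetical types, not the ModFileWriter code; it assumes every
used module is defined in the same source file and uses a
placeholder in place of the real content hash.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a module symbol and its details.
struct ModuleSketch {
  std::vector<std::string> uses;     // names of modules this one USEs
  std::optional<std::uint64_t> hash; // set once the module file is written
};

using ModuleMapSketch = std::map<std::string, ModuleSketch>;

// Write one module after all of its dependencies.  Fortran forbids
// circular module dependencies, so the recursion terminates.
void WriteSketch(const std::string &name, ModuleMapSketch &modules,
    std::vector<std::string> &order) {
  ModuleSketch &module{modules.at(name)};
  if (module.hash) {
    return; // already written
  }
  for (const std::string &dep : module.uses) {
    WriteSketch(dep, modules, order);
    assert(modules.at(dep).hash); // now available for a !need$ line
  }
  // Placeholder: the real hash is computed from the file contents.
  module.hash = std::hash<std::string>{}(name);
  order.push_back(name);
}

// Write every module of a source file; returns the order of writing.
std::vector<std::string> WriteAllSketch(ModuleMapSketch modules) {
  std::vector<std::string> order;
  for (const auto &entry : modules) {
    WriteSketch(entry.first, modules, order);
  }
  return order;
}
```

With a module "app" that uses "base", the walk visits "app" first
but writes "base" first, so the hash of "base" is known when the
header of "app" is emitted.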

A warning is emitted when the module file with the expected
hash is not the first hit for its name on the module file
search path.
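
The selection among same-named files can be sketched as follows.
This is a simplified model with hypothetical types, not the
ModFileReader code: candidates are given in search-path order with
their header checksums already read, and the result records
whether a warning (match found after misses) or an error (misses
but no match) would apply.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical description of one file found on the search path.
struct CandidateSketch {
  std::string path;
  std::uint64_t hash; // checksum read from the file's header
};

struct SelectionSketch {
  std::optional<std::string> path; // chosen file, if any
  std::vector<std::string> misses; // same-named files with other hashes
  bool warn{false};                // chosen file was not the first hit
};

SelectionSketch SelectByHashSketch(
    const std::vector<CandidateSketch> &inSearchOrder,
    std::uint64_t required) {
  SelectionSketch result;
  for (const CandidateSketch &candidate : inSearchOrder) {
    if (candidate.hash == required) {
      result.path = candidate.path;
      result.warn = !result.misses.empty();
      return result;
    }
    result.misses.push_back(candidate.path);
  }
  assert(!result.path); // no match; non-empty misses mean an error
  return result;
}
```

With the test inputs of this patch, searching dir2 before dir1 for
the hash cbe36d213d935559 selects dir1/modfile63a.mod and sets the
warning flag; searching dir2 alone leaves one miss and no match.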

Further work is needed to add a compiler option that emits
(larger) stand-alone module files that incorporate copies of
their dependencies rather than relying on search paths.
This will be desirable for application libraries that want
to ship only "top-level" module files without needing to
include their dependencies.

Another future work item would be to admit multiple modules
in the same compilation with the same name if they have
distinct hashes.
---
 flang/include/flang/Parser/source.h           |   2 +
 .../flang/Semantics/module-dependences.h      |  48 ++++
 flang/include/flang/Semantics/semantics.h     |   3 +
 flang/include/flang/Semantics/symbol.h        |   8 +-
 flang/lib/Parser/source.cpp                   |  18 ++
 flang/lib/Semantics/mod-file.cpp              | 207 +++++++++++++++---
 flang/lib/Semantics/mod-file.h                |  11 +-
 flang/lib/Semantics/resolve-names.cpp         |   3 +-
 flang/lib/Semantics/resolve-names.h           |   4 -
 flang/lib/Semantics/semantics.cpp             |   2 +-
 .../test/Semantics/Inputs/dir1/modfile63a.mod |   6 +
 .../test/Semantics/Inputs/dir1/modfile63b.mod |   8 +
 .../test/Semantics/Inputs/dir2/modfile63a.mod |   6 +
 .../test/Semantics/Inputs/dir2/modfile63b.mod |   8 +
 flang/test/Semantics/getsymbols02.f90         |   2 +-
 flang/test/Semantics/modfile63.f90            |  19 ++
 flang/test/Semantics/test_modfile.py          |   2 +-
 17 files changed, 316 insertions(+), 41 deletions(-)
 create mode 100644 flang/include/flang/Semantics/module-dependences.h
 create mode 100644 flang/test/Semantics/Inputs/dir1/modfile63a.mod
 create mode 100644 flang/test/Semantics/Inputs/dir1/modfile63b.mod
 create mode 100644 flang/test/Semantics/Inputs/dir2/modfile63a.mod
 create mode 100644 flang/test/Semantics/Inputs/dir2/modfile63b.mod
 create mode 100644 flang/test/Semantics/modfile63.f90

diff --git a/flang/include/flang/Parser/source.h b/flang/include/flang/Parser/source.h
index f0ae97a3ef0485..a6efdf9546c7f3 100644
--- a/flang/include/flang/Parser/source.h
+++ b/flang/include/flang/Parser/source.h
@@ -36,6 +36,8 @@ namespace Fortran::parser {
 std::string DirectoryName(std::string path);
 std::optional<std::string> LocateSourceFile(
     std::string name, const std::list<std::string> &searchPath);
+std::vector<std::string> LocateSourceFileAll(
+    std::string name, const std::vector<std::string> &searchPath);
 
 class SourceFile;
 
diff --git a/flang/include/flang/Semantics/module-dependences.h b/flang/include/flang/Semantics/module-dependences.h
new file mode 100644
index 00000000000000..b9ed9fcc1d83b5
--- /dev/null
+++ b/flang/include/flang/Semantics/module-dependences.h
@@ -0,0 +1,48 @@
+//===-- include/flang/Semantics/module-dependences.h ------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef FORTRAN_SEMANTICS_MODULE_DEPENDENCES_H_
+#define FORTRAN_SEMANTICS_MODULE_DEPENDENCES_H_
+
+#include <cinttypes>
+#include <map>
+#include <optional>
+#include <string>
+
+namespace Fortran::semantics {
+
+using ModuleCheckSumType = std::uint64_t;
+
+class ModuleDependences {
+public:
+  void AddDependence(std::string &&name, bool intrinsic, ModuleCheckSumType hash) {
+    if (intrinsic) {
+      intrinsicMap_.emplace(std::move(name), hash);
+    } else {
+      nonIntrinsicMap_.emplace(std::move(name), hash);
+    }
+  }
+  std::optional<ModuleCheckSumType> GetRequiredHash(const std::string &name, bool intrinsic) {
+    if (intrinsic) {
+      if (auto iter{intrinsicMap_.find(name)}; iter != intrinsicMap_.end()) {
+        return iter->second;
+      }
+    } else {
+      if (auto iter{nonIntrinsicMap_.find(name)}; iter != nonIntrinsicMap_.end()) {
+        return iter->second;
+      }
+    }
+    return std::nullopt;
+  }
+
+private:
+  std::map<std::string, ModuleCheckSumType> intrinsicMap_, nonIntrinsicMap_;
+};
+
+} // namespace Fortran::semantics
+#endif // FORTRAN_SEMANTICS_MODULE_DEPENDENCES_H_
diff --git a/flang/include/flang/Semantics/semantics.h b/flang/include/flang/Semantics/semantics.h
index 4e8b71fa652f5c..c8ee71945d8bde 100644
--- a/flang/include/flang/Semantics/semantics.h
+++ b/flang/include/flang/Semantics/semantics.h
@@ -16,6 +16,7 @@
 #include "flang/Evaluate/intrinsics.h"
 #include "flang/Evaluate/target.h"
 #include "flang/Parser/message.h"
+#include "flang/Semantics/module-dependences.h"
 #include <iosfwd>
 #include <set>
 #include <string>
@@ -108,6 +109,7 @@ class SemanticsContext {
   parser::Messages &messages() { return messages_; }
   evaluate::FoldingContext &foldingContext() { return foldingContext_; }
   parser::AllCookedSources &allCookedSources() { return allCookedSources_; }
+  ModuleDependences &moduleDependences() { return moduleDependences_; }
 
   SemanticsContext &set_location(
       const std::optional<parser::CharBlock> &location) {
@@ -293,6 +295,7 @@ class SemanticsContext {
   const Scope *ppcBuiltinsScope_{nullptr}; // module __ppc_intrinsics
   std::list<parser::Program> modFileParseTrees_;
   std::unique_ptr<CommonBlockMap> commonBlockMap_;
+  ModuleDependences moduleDependences_;
 };
 
 class Semantics {
diff --git a/flang/include/flang/Semantics/symbol.h b/flang/include/flang/Semantics/symbol.h
index 4535a92ce3dd8e..d8b372bea10f04 100644
--- a/flang/include/flang/Semantics/symbol.h
+++ b/flang/include/flang/Semantics/symbol.h
@@ -14,6 +14,7 @@
 #include "flang/Common/enum-set.h"
 #include "flang/Common/reference.h"
 #include "flang/Common/visit.h"
+#include "flang/Semantics/module-dependences.h"
 #include "llvm/ADT/DenseMapInfo.h"
 
 #include <array>
@@ -86,11 +87,16 @@ class ModuleDetails : public WithOmpDeclarative {
   void set_scope(const Scope *);
   bool isDefaultPrivate() const { return isDefaultPrivate_; }
   void set_isDefaultPrivate(bool yes = true) { isDefaultPrivate_ = yes; }
+  std::optional<ModuleCheckSumType> moduleFileHash() const {
+    return moduleFileHash_;
+  }
+  void set_moduleFileHash(ModuleCheckSumType x) { moduleFileHash_ = x; }
 
 private:
   bool isSubmodule_;
   bool isDefaultPrivate_{false};
   const Scope *scope_{nullptr};
+  std::optional<ModuleCheckSumType> moduleFileHash_;
 };
 
 class MainProgramDetails : public WithOmpDeclarative {
@@ -1035,7 +1041,7 @@ struct SymbolAddressCompare {
 // Symbol comparison is usually based on the order of cooked source
 // stream creation and, when both are from the same cooked source,
 // their positions in that cooked source stream.
-// Don't use this comparator or OrderedSymbolSet to hold
+// Don't use this comparator or SourceOrderedSymbolSet to hold
 // Symbols that might be subject to ReplaceName().
 struct SymbolSourcePositionCompare {
   // These functions are implemented in Evaluate/tools.cpp to
diff --git a/flang/lib/Parser/source.cpp b/flang/lib/Parser/source.cpp
index 4b4fed64a1a40a..ae834dc2416529 100644
--- a/flang/lib/Parser/source.cpp
+++ b/flang/lib/Parser/source.cpp
@@ -75,6 +75,24 @@ std::optional<std::string> LocateSourceFile(
   return std::nullopt;
 }
 
+std::vector<std::string> LocateSourceFileAll(
+    std::string name, const std::vector<std::string> &searchPath) {
+  if (name == "-" || llvm::sys::path::is_absolute(name)) {
+    return {name};
+  }
+  std::vector<std::string> result;
+  for (const std::string &dir : searchPath) {
+    llvm::SmallString<128> path{dir};
+    llvm::sys::path::append(path, name);
+    bool isDir{false};
+    auto er = llvm::sys::fs::is_directory(path, isDir);
+    if (!er && !isDir) {
+      result.emplace_back(path.str().str());
+    }
+  }
+  return result;
+}
+
 std::size_t RemoveCarriageReturns(llvm::MutableArrayRef<char> buf) {
   std::size_t wrote{0};
   char *buffer{buf.data()};
diff --git a/flang/lib/Semantics/mod-file.cpp b/flang/lib/Semantics/mod-file.cpp
index 7072ddee18ebef..840c78d5372103 100644
--- a/flang/lib/Semantics/mod-file.cpp
+++ b/flang/lib/Semantics/mod-file.cpp
@@ -41,11 +41,13 @@ struct ModHeader {
   static constexpr const char magic[magicLen + 1]{"!mod$ v1 sum:"};
   static constexpr char terminator{'\n'};
   static constexpr int len{magicLen + 1 + sumLen};
+  static constexpr int needLen{7};
+  static constexpr const char need[needLen + 1]{"!need$ "};
 };
 
 static std::optional<SourceName> GetSubmoduleParent(const parser::Program &);
 static void CollectSymbols(const Scope &, SymbolVector &, SymbolVector &,
-    std::map<const Symbol *, SourceName> &);
+    std::map<const Symbol *, SourceName> &, UnorderedSymbolSet &);
 static void PutPassName(llvm::raw_ostream &, const std::optional<SourceName> &);
 static void PutInit(llvm::raw_ostream &, const Symbol &, const MaybeExpr &,
     const parser::Expr *, const std::map<const Symbol *, SourceName> &);
@@ -58,11 +60,12 @@ static void PutShape(
 static llvm::raw_ostream &PutAttr(llvm::raw_ostream &, Attr);
 static llvm::raw_ostream &PutType(llvm::raw_ostream &, const DeclTypeSpec &);
 static llvm::raw_ostream &PutLower(llvm::raw_ostream &, std::string_view);
-static std::error_code WriteFile(
-    const std::string &, const std::string &, bool = true);
+static std::error_code WriteFile(const std::string &, const std::string &,
+    ModuleCheckSumType &, bool debug = true);
 static bool FileContentsMatch(
     const std::string &, const std::string &, const std::string &);
-static std::string CheckSum(const std::string_view &);
+static ModuleCheckSumType ComputeCheckSum(const std::string_view &);
+static std::string CheckSumString(ModuleCheckSumType);
 
 // Collect symbols needed for a subprogram interface
 class SubprogramSymbolCollector {
@@ -129,17 +132,23 @@ static std::string ModFileName(const SourceName &name,
 
 // Write the module file for symbol, which must be a module or submodule.
 void ModFileWriter::Write(const Symbol &symbol) {
-  auto *ancestor{symbol.get<ModuleDetails>().ancestor()};
+  auto &module{symbol.get<ModuleDetails>()};
+  if (module.moduleFileHash()) {
+    return; // already written
+  }
+  auto *ancestor{module.ancestor()};
   isSubmodule_ = ancestor != nullptr;
   auto ancestorName{ancestor ? ancestor->GetName().value().ToString() : ""s};
   auto path{context_.moduleDirectory() + '/' +
       ModFileName(symbol.name(), ancestorName, context_.moduleFileSuffix())};
   PutSymbols(DEREF(symbol.scope()));
-  if (std::error_code error{
-          WriteFile(path, GetAsString(symbol), context_.debugModuleWriter())}) {
+  ModuleCheckSumType checkSum;
+  if (std::error_code error{WriteFile(
+          path, GetAsString(symbol), checkSum, context_.debugModuleWriter())}) {
     context_.Say(
         symbol.name(), "Error writing %s: %s"_err_en_US, path, error.message());
   }
+  const_cast<ModuleDetails &>(module).set_moduleFileHash(checkSum);
 }
 
 // Return the entire body of the module file
@@ -147,6 +156,8 @@ void ModFileWriter::Write(const Symbol &symbol) {
 std::string ModFileWriter::GetAsString(const Symbol &symbol) {
   std::string buf;
   llvm::raw_string_ostream all{buf};
+  all << needs_.str();
+  needs_.str().clear();
   auto &details{symbol.get<ModuleDetails>()};
   if (!details.isSubmodule()) {
     all << "module " << symbol.name();
@@ -258,7 +269,17 @@ void ModFileWriter::PutSymbols(const Scope &scope) {
   SymbolVector sorted;
   SymbolVector uses;
   PrepareRenamings(scope);
-  CollectSymbols(scope, sorted, uses, renamings_);
+  UnorderedSymbolSet modules;
+  CollectSymbols(scope, sorted, uses, renamings_, modules);
+  // Write module files for dependencies first so that their
+  // hashes are known.
+  for (auto ref : modules) {
+    Write(*ref);
+    needs_ << ModHeader::need
+           << CheckSumString(ref->get<ModuleDetails>().moduleFileHash().value())
+           << (ref->owner().IsIntrinsicModules() ? " i " : " n ")
+           << ref->name().ToString() << '\n';
+  }
   std::string buf; // stuff after CONTAINS in derived type
   llvm::raw_string_ostream typeBindings{buf};
   for (const Symbol &symbol : sorted) {
@@ -730,16 +751,26 @@ static inline SourceName NameInModuleFile(const Symbol &symbol) {
 // Collect the symbols of this scope sorted by their original order, not name.
 // Generics and namelists are exceptions: they are sorted after other symbols.
 void CollectSymbols(const Scope &scope, SymbolVector &sorted,
-    SymbolVector &uses, std::map<const Symbol *, SourceName> &renamings) {
+    SymbolVector &uses, std::map<const Symbol *, SourceName> &renamings,
+    UnorderedSymbolSet &modules) {
   SymbolVector namelist, generics;
   auto symbols{scope.GetSymbols()};
   std::size_t commonSize{scope.commonBlocks().size()};
   sorted.reserve(symbols.size() + commonSize);
   for (SymbolRef symbol : symbols) {
+    const auto *generic{symbol->detailsIf<GenericDetails>()};
+    if (generic) {
+      uses.insert(uses.end(), generic->uses().begin(), generic->uses().end());
+      for (auto ref : generic->uses()) {
+        modules.insert(GetUsedModule(ref->get<UseDetails>()));
+      }
+    } else if (const auto *use{symbol->detailsIf<UseDetails>()}) {
+      modules.insert(GetUsedModule(*use));
+    }
     if (symbol->test(Symbol::Flag::ParentComp)) {
     } else if (symbol->has<NamelistDetails>()) {
       namelist.push_back(symbol);
-    } else if (const auto *generic{symbol->detailsIf<GenericDetails>()}) {
+    } else if (generic) {
       if (generic->specific() &&
           &generic->specific()->owner() == &symbol->owner()) {
         sorted.push_back(*generic->specific());
@@ -751,9 +782,6 @@ void CollectSymbols(const Scope &scope, SymbolVector &sorted,
     } else {
       sorted.push_back(symbol);
     }
-    if (const auto *details{symbol->detailsIf<GenericDetails>()}) {
-      uses.insert(uses.end(), details->uses().begin(), details->uses().end());
-    }
   }
   // Sort most symbols by name: use of Symbol::ReplaceName ensures the source
   // location of a symbol's name is the first "real" use.
@@ -1100,10 +1128,11 @@ static llvm::ErrorOr<Temp> MkTemp(const std::string &path) {
 
 // Write the module file at path, prepending header. If an error occurs,
 // return errno, otherwise 0.
-static std::error_code WriteFile(
-    const std::string &path, const std::string &contents, bool debug) {
+static std::error_code WriteFile(const std::string &path,
+    const std::string &contents, ModuleCheckSumType &checkSum, bool debug) {
+  checkSum = ComputeCheckSum(contents);
   auto header{std::string{ModHeader::bom} + ModHeader::magic +
-      CheckSum(contents) + ModHeader::terminator};
+      CheckSumString(checkSum) + ModHeader::terminator};
   if (debug) {
     llvm::dbgs() << "Processing module " << path << ": ";
   }
@@ -1155,12 +1184,16 @@ static bool FileContentsMatch(const std::string &path,
 // Compute a simple hash of the contents of a module file and
 // return it as a string of hex digits.
 // This uses the Fowler-Noll-Vo hash function.
-static std::string CheckSum(const std::string_view &contents) {
-  std::uint64_t hash{0xcbf29ce484222325ull};
+static ModuleCheckSumType ComputeCheckSum(const std::string_view &contents) {
+  ModuleCheckSumType hash{0xcbf29ce484222325ull};
   for (char c : contents) {
     hash ^= c & 0xff;
     hash *= 0x100000001b3;
   }
+  return hash;
+}
+
+static std::string CheckSumString(ModuleCheckSumType hash) {
   static const char *digits = "0123456789abcdef";
   std::string result(ModHeader::sumLen, '0');
   for (size_t i{ModHeader::sumLen}; hash != 0; hash >>= 4) {
@@ -1169,18 +1202,74 @@ static std::string CheckSum(const std::string_view &contents) {
   return result;
 }
 
-static bool VerifyHeader(llvm::ArrayRef<char> content) {
+std::optional<ModuleCheckSumType> ExtractCheckSum(const std::string_view &str) {
+  if (str.size() == ModHeader::sumLen) {
+    ModuleCheckSumType hash{0};
+    for (size_t j{0}; j < ModHeader::sumLen; ++j) {
+      hash <<= 4;
+      char ch{str.at(j)};
+      if (ch >= '0' && ch <= '9') {
+        hash += ch - '0';
+      } else if (ch >= 'a' && ch <= 'f') {
+        hash += ch - 'a' + 10;
+      } else {
+        return std::nullopt;
+      }
+    }
+    return hash;
+  }
+  return std::nullopt;
+}
+
+static std::optional<ModuleCheckSumType> VerifyHeader(
+    llvm::ArrayRef<char> content) {
   std::string_view sv{content.data(), content.size()};
   if (sv.substr(0, ModHeader::magicLen) != ModHeader::magic) {
-    return false;
+    return std::nullopt;
   }
+  ModuleCheckSumType checkSum{ComputeCheckSum(sv.substr(ModHeader::len))};
   std::string_view expectSum{sv.substr(ModHeader::magicLen, ModHeader::sumLen)};
-  std::string actualSum{CheckSum(sv.substr(ModHeader::len))};
-  return expectSum == actualSum;
+  if (auto extracted{ExtractCheckSum(expectSum)};
+      extracted && *extracted == checkSum) {
+    return checkSum;
+  } else {
+    return std::nullopt;
+  }
 }
 
-Scope *ModFileReader::Read(const SourceName &name,
-    std::optional<bool> isIntrinsic, Scope *ancestor, bool silent) {
+static void GetModuleDependences(
+    ModuleDependences &dependences, llvm::ArrayRef<char> content) {
+  std::size_t limit{content.size()};
+  std::string_view str{content.data(), limit};
+  for (std::size_t j{ModHeader::len};
+       str.substr(j, ModHeader::needLen) == ModHeader::need;) {
+    j += ModHeader::needLen;
+    auto checkSum{ExtractCheckSum(str.substr(j, ModHeader::sumLen))};
+    if (!checkSum) {
+      break;
+    }
+    j += ModHeader::sumLen;
+    bool intrinsic{false};
+    if (str.substr(j, 3) == " i ") {
+      intrinsic = true;
+    } else if (str.substr(j, 3) != " n ") {
+      break;
+    }
+    j += 3;
+    std::size_t start{j};
+    for (; j < limit && str.at(j) != '\n'; ++j) {
+    }
+    if (j > start && j < limit && str.at(j) == '\n') {
+      dependences.AddDependence(
+          std::string{str.substr(start, j - start)}, intrinsic, *checkSum);
+    } else {
+      break;
+    }
+  }
+}
+
+Scope *ModFileReader::Read(SourceName name, std::optional<bool> isIntrinsic,
+    Scope *ancestor, bool silent) {
   std::string ancestorName; // empty for module
   Symbol *notAModule{nullptr};
   bool fatalError{false};
@@ -1190,12 +1279,26 @@ Scope *ModFileReader::Read(const SourceName &name,
     }
     ancestorName = ancestor->GetName().value().ToString();
   }
+  auto requiredHash{
+      context_.moduleDependences().GetRequiredHash(name.ToString(), isIntrinsic.value_or(false))};
   if (!isIntrinsic.value_or(false) && !ancestor) {
     // Already present in the symbol table as a usable non-intrinsic module?
     auto it{context_.globalScope().find(name)};
     if (it != context_.globalScope().end()) {
       Scope *scope{it->second->scope()};
       if (scope->kind() == Scope::Kind::Module) {
+        if (requiredHash) {
+          if (const Symbol * foundModule{scope->symbol()}) {
+            if (const auto *module{foundModule->detailsIf<ModuleDetails>()};
+                module && module->moduleFileHash() &&
+                *requiredHash != *module->moduleFileHash()) {
+              Say(name, ancestorName,
+                  "Multiple versions of the module '%s' cannot be required by the same compilation"_err_en_US,
+                  name.ToString());
+              return nullptr;
+            }
+          }
+        }
         return scope;
       } else {
         notAModule = scope->symbol();
@@ -1249,7 +1352,49 @@ Scope *ModFileReader::Read(const SourceName &name,
     for (const auto &dir : context_.intrinsicModuleDirectories()) {
       options.searchDirectories.push_back(dir);
     }
+    if (!requiredHash) {
+      requiredHash = context_.moduleDependences().GetRequiredHash(name.ToString(), true);
+    }
   }
+
+  // Look for the right module file if its hash is known
+  if (requiredHash && !fatalError) {
+    std::vector<std::string> misses;
+    for (const std::string &maybe :
+        parser::LocateSourceFileAll(path, options.searchDirectories)) {
+      if (const auto *srcFile{context_.allCookedSources().allSources().Open(
+              maybe, llvm::errs())}) {
+        if (auto checkSum{VerifyHeader(srcFile->content())}) {
+          if (*checkSum == *requiredHash) {
+            path = maybe;
+            if (!misses.empty()) {
+              auto &msg{context_.Say(name,
+                  "Module file for '%s' appears later in the module search path than conflicting modules with different checksums"_warn_en_US,
+                  name.ToString())};
+              for (const std::string &m : misses) {
+                msg.Attach(
+                    name, "Module file with a conflicting name: '%s'"_en_US, m);
+              }
+            }
+            misses.clear();
+            break;
+          } else {
+            misses.emplace_back(maybe);
+          }
+        }
+      }
+    }
+    if (!misses.empty()) {
+      auto &msg{Say(name, ancestorName,
+          "Could not find a module file for '%s' in the module search path with the expected checksum"_err_en_US,
+          name.ToString())};
+      for (const std::string &m : misses) {
+        msg.Attach(name, "Module file with different checksum: '%s'"_en_US, m);
+      }
+      return nullptr;
+    }
+  }
+
   const auto *sourceFile{fatalError ? nullptr : parsing.Prescan(path, options)};
   if (fatalError || parsing.messages().AnyFatalError()) {
     if (!silent) {
@@ -1270,10 +1415,17 @@ Scope *ModFileReader::Read(const SourceName &name,
     return nullptr;
   }
   CHECK(sourceFile);
-  if (!VerifyHeader(sourceFile->content())) {
+  std::optional<ModuleCheckSumType> checkSum{
+      VerifyHeader(sourceFile->content())};
+  if (!checkSum) {
     Say(name, ancestorName, "File has invalid checksum: %s"_warn_en_US,
         sourceFile->path());
     return nullptr;
+  } else if (requiredHash && *requiredHash != *checkSum) {
+    Say(name, ancestorName,
+        "File is not the right module file for %s"_warn_en_US,
+        "'"s + name.ToString() + "': "s + sourceFile->path());
+    return nullptr;
   }
   llvm::raw_null_ostream NullStream;
   parsing.Parse(NullStream);
@@ -1316,6 +1468,7 @@ Scope *ModFileReader::Read(const SourceName &name,
   // Process declarations from the module file
   bool wasInModuleFile{context_.foldingContext().inModuleFile()};
   context_.foldingContext().set_inModuleFile(true);
+  GetModuleDependences(context_.moduleDependences(), sourceFile->content());
   ResolveNames(context_, parseTree, topScope);
   context_.foldingContext().set_inModuleFile(wasInModuleFile);
   if (!moduleSymbol) {
@@ -1331,8 +1484,8 @@ Scope *ModFileReader::Read(const SourceName &name,
     }
   }
   if (moduleSymbol) {
-    CHECK(moduleSymbol->has<ModuleDetails>());
     CHECK(moduleSymbol->test(Symbol::Flag::ModFile));
+    moduleSymbol->get<ModuleDetails>().set_moduleFileHash(checkSum.value());
     if (isIntrinsic.value_or(false)) {
       moduleSymbol->attrs().set(Attr::INTRINSIC);
     }
@@ -1342,7 +1495,7 @@ Scope *ModFileReader::Read(const SourceName &name,
   }
 }
 
-parser::Message &ModFileReader::Say(const SourceName &name,
+parser::Message &ModFileReader::Say(SourceName name,
     const std::string &ancestor, parser::MessageFixedText &&msg,
     const std::string &arg) {
   return context_.Say(name, "Cannot read module file for %s: %s"_err_en_US,
diff --git a/flang/lib/Semantics/mod-file.h b/flang/lib/Semantics/mod-file.h
index 5be117153dd4d1..b4ece4018c054d 100644
--- a/flang/lib/Semantics/mod-file.h
+++ b/flang/lib/Semantics/mod-file.h
@@ -38,7 +38,8 @@ class ModFileWriter {
 
 private:
   SemanticsContext &context_;
-  // Buffer to use with raw_string_ostream
+  // Buffers to use with raw_string_ostream
+  std::string needsBuf_;
   std::string usesBuf_;
   std::string useExtraAttrsBuf_;
   std::string declsBuf_;
@@ -46,6 +47,7 @@ class ModFileWriter {
   // Tracks nested DEC structures and fields of that type
   UnorderedSymbolSet emittedDECStructures_, emittedDECFields_;
 
+  llvm::raw_string_ostream needs_{needsBuf_};
   llvm::raw_string_ostream uses_{usesBuf_};
   llvm::raw_string_ostream useExtraAttrs_{
       useExtraAttrsBuf_}; // attrs added to used entity
@@ -83,18 +85,17 @@ class ModFileWriter {
 
 class ModFileReader {
 public:
-  // directories specifies where to search for module files
   ModFileReader(SemanticsContext &context) : context_{context} {}
   // Find and read the module file for a module or submodule.
   // If ancestor is specified, look for a submodule of that module.
   // Return the Scope for that module/submodule or nullptr on error.
-  Scope *Read(const SourceName &, std::optional<bool> isIntrinsic,
-      Scope *ancestor, bool silent = false);
+  Scope *Read(SourceName, std::optional<bool> isIntrinsic, Scope *ancestor,
+      bool silent);
 
 private:
   SemanticsContext &context_;
 
-  parser::Message &Say(const SourceName &, const std::string &,
+  parser::Message &Say(SourceName, const std::string &,
       parser::MessageFixedText &&, const std::string &);
 };
 
diff --git a/flang/lib/Semantics/resolve-names.cpp b/flang/lib/Semantics/resolve-names.cpp
index 0cbe0b492fa44a..7b40e2bbc53328 100644
--- a/flang/lib/Semantics/resolve-names.cpp
+++ b/flang/lib/Semantics/resolve-names.cpp
@@ -3375,7 +3375,8 @@ void ModuleVisitor::BeginModule(const parser::Name &name, bool isSubmodule) {
 Scope *ModuleVisitor::FindModule(const parser::Name &name,
     std::optional<bool> isIntrinsic, Scope *ancestor) {
   ModFileReader reader{context()};
-  Scope *scope{reader.Read(name.source, isIntrinsic, ancestor)};
+  Scope *scope{
+      reader.Read(name.source, isIntrinsic, ancestor, /*silent=*/false)};
   if (!scope) {
     return nullptr;
   }
diff --git a/flang/lib/Semantics/resolve-names.h b/flang/lib/Semantics/resolve-names.h
index 78fdc2edc54a99..a6797b45635936 100644
--- a/flang/lib/Semantics/resolve-names.h
+++ b/flang/lib/Semantics/resolve-names.h
@@ -9,10 +9,6 @@
 #ifndef FORTRAN_SEMANTICS_RESOLVE_NAMES_H_
 #define FORTRAN_SEMANTICS_RESOLVE_NAMES_H_
 
-#include <iosfwd>
-#include <string>
-#include <vector>
-
 namespace llvm {
 class raw_ostream;
 }
diff --git a/flang/lib/Semantics/semantics.cpp b/flang/lib/Semantics/semantics.cpp
index a76c42ae4f44f5..e58a8f3b22c06c 100644
--- a/flang/lib/Semantics/semantics.cpp
+++ b/flang/lib/Semantics/semantics.cpp
@@ -515,7 +515,7 @@ bool SemanticsContext::IsTempName(const std::string &name) {
 
 Scope *SemanticsContext::GetBuiltinModule(const char *name) {
   return ModFileReader{*this}.Read(SourceName{name, std::strlen(name)},
-      true /*intrinsic*/, nullptr, true /*silence errors*/);
+      true /*intrinsic*/, nullptr, /*silent=*/true);
 }
 
 void SemanticsContext::UseFortranBuiltinsModule() {
diff --git a/flang/test/Semantics/Inputs/dir1/modfile63a.mod b/flang/test/Semantics/Inputs/dir1/modfile63a.mod
new file mode 100644
index 00000000000000..acaa125819b39e
--- /dev/null
+++ b/flang/test/Semantics/Inputs/dir1/modfile63a.mod
@@ -0,0 +1,6 @@
+!mod$ v1 sum:cbe36d213d935559
+module modfile63a
+contains
+subroutine s1()
+end
+end
diff --git a/flang/test/Semantics/Inputs/dir1/modfile63b.mod b/flang/test/Semantics/Inputs/dir1/modfile63b.mod
new file mode 100644
index 00000000000000..af5fec9e69bf64
--- /dev/null
+++ b/flang/test/Semantics/Inputs/dir1/modfile63b.mod
@@ -0,0 +1,8 @@
+!mod$ v1 sum:ddea620dc2aa0520
+!need$ cbe36d213d935559 n modfile63a
+module modfile63b
+use modfile63a,only:s1
+contains
+subroutine s2()
+end
+end
diff --git a/flang/test/Semantics/Inputs/dir2/modfile63a.mod b/flang/test/Semantics/Inputs/dir2/modfile63a.mod
new file mode 100644
index 00000000000000..8236d36c575888
--- /dev/null
+++ b/flang/test/Semantics/Inputs/dir2/modfile63a.mod
@@ -0,0 +1,6 @@
+!mod$ v1 sum:00761f8b3a4c5780
+module modfile63a
+contains
+subroutine s1a()
+end
+end
diff --git a/flang/test/Semantics/Inputs/dir2/modfile63b.mod b/flang/test/Semantics/Inputs/dir2/modfile63b.mod
new file mode 100644
index 00000000000000..af5fec9e69bf64
--- /dev/null
+++ b/flang/test/Semantics/Inputs/dir2/modfile63b.mod
@@ -0,0 +1,8 @@
+!mod$ v1 sum:ddea620dc2aa0520
+!need$ cbe36d213d935559 n modfile63a
+module modfile63b
+use modfile63a,only:s1
+contains
+subroutine s2()
+end
+end
diff --git a/flang/test/Semantics/getsymbols02.f90 b/flang/test/Semantics/getsymbols02.f90
index 25a4c30809fb29..2605a593e814de 100644
--- a/flang/test/Semantics/getsymbols02.f90
+++ b/flang/test/Semantics/getsymbols02.f90
@@ -11,4 +11,4 @@ PROGRAM helloworld
 ! RUN: %flang_fc1 -fsyntax-only %S/Inputs/getsymbols02-b.f90
 ! RUN: %flang_fc1 -fget-symbols-sources %s 2>&1 | FileCheck %s
 ! CHECK: callget5: .{{[/\\]}}mm2b.mod,
-! CHECK: get5: .{{[/\\]}}mm2a.mod,
+! CHECK: get5: .{{[/\\]}}.{{[/\\]}}mm2a.mod,
diff --git a/flang/test/Semantics/modfile63.f90 b/flang/test/Semantics/modfile63.f90
new file mode 100644
index 00000000000000..aaf1f7beaa48fa
--- /dev/null
+++ b/flang/test/Semantics/modfile63.f90
@@ -0,0 +1,19 @@
+! RUN: %flang_fc1 -fsyntax-only -I%S/Inputs/dir1 %s
+! RUN: not %flang_fc1 -fsyntax-only -I%S/Inputs/dir2 %s 2>&1 | FileCheck --check-prefix=ERROR %s
+! RUN: %flang_fc1 -Werror -fsyntax-only -I%S/Inputs/dir1 -I%S/Inputs/dir2 %s
+! RUN: not %flang_fc1 -Werror -fsyntax-only -I%S/Inputs/dir2 -I%S/Inputs/dir1 %s 2>&1 | FileCheck  --check-prefix=WARNING %s
+
+! Inputs/dir1 and Inputs/dir2 each have identical copies of modfile63b.mod.
+! modfile63b.mod depends on Inputs/dir1/modfile63a.mod - the version in
+! Inputs/dir2/modfile63a.mod has a distinct checksum and should be
+! ignored with a warning.
+
+! If it becomes necessary to recompile those modules, just use the
+! module files as Fortran source.
+
+use modfile63b
+call s2
+end
+
+! ERROR: Could not find a module file for 'modfile63a' in the module search path with the expected checksum
+! WARNING: Module file for 'modfile63a' appears later in the module search path than conflicting modules with different checksums
diff --git a/flang/test/Semantics/test_modfile.py b/flang/test/Semantics/test_modfile.py
index 87bd7dd0b55b80..0e7806f27aa90f 100755
--- a/flang/test/Semantics/test_modfile.py
+++ b/flang/test/Semantics/test_modfile.py
@@ -65,7 +65,7 @@
                 sys.exit(1)
             with open(mod, "r", encoding="utf-8", errors="strict") as f:
                 for line in f:
-                    if "!mod$" in line:
+                    if "!mod$" in line or "!need$" in line:
                         continue
                     actual += line
 


