[all-commits] [llvm/llvm-project] 84df7a: [clang][modules] Do not resolve `HeaderFileInfo` e...
Jan Svoboda via All-commits
all-commits at lists.llvm.org
Thu Apr 11 14:46:10 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 84df7a09f8da5809b85fd097015e5ac6cc8a3f88
https://github.com/llvm/llvm-project/commit/84df7a09f8da5809b85fd097015e5ac6cc8a3f88
Author: Jan Svoboda <jan_svoboda at apple.com>
Date: 2024-04-11 (Thu, 11 Apr 2024)
Changed paths:
M clang-tools-extra/clangd/index/SymbolCollector.cpp
M clang/include/clang/Lex/HeaderSearch.h
M clang/lib/Lex/HeaderSearch.cpp
M clang/lib/Serialization/ASTWriter.cpp
Log Message:
-----------
[clang][modules] Do not resolve `HeaderFileInfo` externally in `ASTWriter` (#87848)

Clang uses the `HeaderFileInfo` struct to track bits of information on
header files, which get used throughout the compiler. We also use this
to compute the set of affecting module maps in `ASTWriter` and in the
end serialize the information into the `HEADER_SEARCH_TABLE` record of a
PCM file, allowing clients to learn about headers from the module. In
doing so, Clang asks for existing `HeaderFileInfo` for all known
`FileEntries`. Note that this asks the loaded PCM files for the
information they have on each header file in question. This seems
unnecessary: we only want to serialize information on header files that
either belong to the current module or that got included textually.
Loaded PCM files can't provide us with any useful information.

For explicit modules with lazy loading (using `-fmodule-map-file=<path>`
with `-fmodule-file=<name>=<path>`) the compiler knows about header
files listed in the module map files on the command-line. This can be a
large number.

Asking for existing `HeaderFileInfo` can trigger deserialization of
`HEADER_SEARCH_TABLE` from loaded PCM files. Keys of the on-disk hash
table consist of the header file size and modification time. However,
with explicit modules Clang zeroes out the modification time. Moreover,
if you import lots of modules, some of their header files end up having
identical sizes. This means lots of hash collisions that can only be
resolved by running the serialized filename through `FileManager` and
comparing equality of the `FileEntry`. This ends up being super
expensive, essentially calling `stat` again on lots of the transitively
loaded SDK header files.

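To illustrate the collision problem described above, here is a minimal,
self-contained C++ sketch. It is not Clang's on-disk hash table:
`SerializedHeader`, `ToyHeaderTable` and `resolutionsForMiss` are
hypothetical names, and a counter stands in for running a filename
through `FileManager`. It only shows that once the modification time is
zeroed, all same-sized headers share one key, so a single lookup may
have to resolve every colliding filename.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one serialized header entry.
// This is NOT the real HEADER_SEARCH_TABLE layout.
struct SerializedHeader {
  std::string Filename;
  std::uint64_t Size;
  std::int64_t ModTime; // Always 0 in this sketch, as with explicit modules.
};

// Toy table keyed by (size, mtime) only, mirroring the key described above.
class ToyHeaderTable {
public:
  void insert(const SerializedHeader &H) {
    Buckets[{H.Size, H.ModTime}].push_back(H.Filename);
  }

  // Every candidate sharing the key costs one filename resolution
  // (an assumed stand-in for a stat issued through a file manager).
  bool contains(const SerializedHeader &Query) {
    auto It = Buckets.find({Query.Size, Query.ModTime});
    if (It == Buckets.end())
      return false;
    for (const std::string &Candidate : It->second) {
      ++NumResolutions;
      if (Candidate == Query.Filename)
        return true;
    }
    return false;
  }

  std::size_t numResolutions() const { return NumResolutions; }

private:
  std::map<std::pair<std::uint64_t, std::int64_t>, std::vector<std::string>>
      Buckets;
  std::size_t NumResolutions = 0;
};

// Fills the table with same-sized headers, then looks up one header that is
// not in the table. Returns how many filename resolutions that lookup cost.
std::size_t resolutionsForMiss(std::size_t NumSameSizedHeaders) {
  ToyHeaderTable Table;
  for (std::size_t I = 0; I != NumSameSizedHeaders; ++I)
    Table.insert({"sdk/header" + std::to_string(I) + ".h", 1024, 0});
  assert(Table.numResolutions() == 0 && "inserting must not resolve anything");
  Table.contains({"local/other.h", 1024, 0});
  return Table.numResolutions();
}
```

Under these assumptions, `resolutionsForMiss(100)` returns 100: one
lookup for a header the table does not contain still walks all 100
colliding entries.
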
This patch cleans up the API for getting `HeaderFileInfo` and makes sure
`ASTWriter` uses the version that doesn't ask loaded PCM files for more
information. This removes the excessive stat traffic coming from
`ASTWriter` hopefully without changing observable behavior.
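
The difference between the two lookup paths can be sketched roughly as
follows. This is a toy model, not the actual `HeaderSearch` interface:
`ToyHeaderSearch`, `getExistingLocalInfo`, `getExistingInfo`,
`writeHeaderTable` and `runToyWriter` are hypothetical names, and the
external source is assumed to be a callback that never has anything
useful to add, matching the observation about loaded PCM files above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, heavily reduced stand-in for per-header information.
struct ToyHeaderInfo {
  bool IsModuleHeader = false;
  bool IncludedTextually = false;
};

class ToyHeaderSearch {
public:
  // Assumed shape of an external source (e.g. loaded PCM files): fills in
  // Info and returns true if it knows the header, returns false otherwise.
  using ExternalSource =
      std::function<bool(const std::string &, ToyHeaderInfo &)>;

  explicit ToyHeaderSearch(ExternalSource Source)
      : External(std::move(Source)) {}

  void setLocalInfo(const std::string &Path, const ToyHeaderInfo &Info) {
    Local[Path] = Info;
  }

  // Local-only lookup: never consults the external source.
  const ToyHeaderInfo *getExistingLocalInfo(const std::string &Path) const {
    auto It = Local.find(Path);
    return It == Local.end() ? nullptr : &It->second;
  }

  // Resolving lookup: falls back to the external source on a local miss.
  const ToyHeaderInfo *getExistingInfo(const std::string &Path) {
    if (const ToyHeaderInfo *Info = getExistingLocalInfo(Path))
      return Info;
    ++NumExternalQueries;
    ToyHeaderInfo Loaded;
    if (External && External(Path, Loaded)) {
      ToyHeaderInfo &Slot = Local[Path];
      Slot = Loaded;
      return &Slot;
    }
    return nullptr;
  }

  std::size_t numExternalQueries() const { return NumExternalQueries; }

private:
  std::unordered_map<std::string, ToyHeaderInfo> Local;
  ExternalSource External;
  std::size_t NumExternalQueries = 0;
};

struct WriteStats {
  std::size_t NumSerialized;
  std::size_t NumExternalQueries;
};

// Toy "writer": keeps only headers that belong to the current module or were
// included textually, using either the local-only or the resolving lookup.
WriteStats writeHeaderTable(ToyHeaderSearch &HS,
                            const std::vector<std::string> &KnownHeaders,
                            bool LocalOnly) {
  WriteStats Stats = {0, 0};
  for (const std::string &Path : KnownHeaders) {
    const ToyHeaderInfo *Info = LocalOnly ? HS.getExistingLocalInfo(Path)
                                          : HS.getExistingInfo(Path);
    if (Info && (Info->IsModuleHeader || Info->IncludedTextually))
      ++Stats.NumSerialized;
  }
  assert(Stats.NumSerialized <= KnownHeaders.size());
  Stats.NumExternalQueries = HS.numExternalQueries();
  return Stats;
}

// Two locally known headers plus two headers known only by name, as if they
// came from module maps given on the command line.
WriteStats runToyWriter(bool LocalOnly) {
  ToyHeaderSearch HS([](const std::string &, ToyHeaderInfo &) { return false; });
  ToyHeaderInfo ModuleHeader;
  ModuleHeader.IsModuleHeader = true;
  ToyHeaderInfo TextualHeader;
  TextualHeader.IncludedTextually = true;
  HS.setLocalInfo("mod/a.h", ModuleHeader);
  HS.setLocalInfo("textual/b.h", TextualHeader);
  const std::vector<std::string> Known = {"mod/a.h", "textual/b.h", "sdk/c.h",
                                          "sdk/d.h"};
  return writeHeaderTable(HS, Known, LocalOnly);
}
```

In this model both variants serialize the same two headers; only the
resolving variant issues external queries (one per header without local
information), which is the traffic the patch aims to remove.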