[all-commits] [llvm/llvm-project] f02b1c: [ASTWriter] Detect more non-affecting FileIDs to r...

Ilya Biryukov via All-commits all-commits at lists.llvm.org
Fri Nov 8 00:10:58 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: f02b1cc99e12ac0147d5c334f130a305d85e477a
      https://github.com/llvm/llvm-project/commit/f02b1cc99e12ac0147d5c334f130a305d85e477a
  Author: Ilya Biryukov <ibiryukov at google.com>
  Date:   2024-11-08 (Fri, 08 Nov 2024)

  Changed paths:
    M clang/include/clang/Serialization/ASTWriter.h
    M clang/lib/Serialization/ASTReader.cpp
    M clang/lib/Serialization/ASTWriter.cpp
    A clang/test/Modules/prune-non-affecting-module-map-repeated.cpp

  Log Message:
  -----------
  [ASTWriter] Detect more non-affecting FileIDs to reduce source location duplication (#112015)

Currently, any FileID that references a module map file that was
required for a compilation is considered as affecting. This misses an
important opportunity to reduce the source location space taken by the
resulting PCM.

In particular, consider the situation where the same module map file is
passed multiple times in the dependency chain:

```shell
$ clang -fmodule-map-file=foo.modulemap ... -o mod1.pcm
$ clang -fmodule-map-file=foo.modulemap -fmodule-file=mod1.pcm ... -o mod2.pcm
...
$ clang -fmodule-map-file=foo.modulemap -fmodule-file=mod$((N-1)).pcm ... -o mod$N.pcm
```

Because `foo.modulemap` is read before reading any of the `.pcm` files,
we have to create a unique `FileID` for it when creating each module.
However, when reading the `.pcm` files, we will reuse the `FileID`
loaded from it for the same module map file and the `FileID` we created
can never be used again, but we will still mark it as affecting and it
will take the source location space in the output PCM.

For a chain of N dependencies, this results in the file taking `N *
(size of file)` source location space, which could be significant. For
examples, we observer internally that some targets that run out of 2GB
of source location space end up wasting up to 20% of that space in
module maps as described above.

I take extra care to still write the InputFile entries for those files that occupied
source location space before. It is required for correctness of clang-scan-deps.



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list