[all-commits] [llvm/llvm-project] 5a0181: [serialization] no transitive decl change (#92083)

Chuanqi Xu via All-commits all-commits at lists.llvm.org
Fri Jun 7 05:22:53 PDT 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 5a0181f568e56e37df80d0f74eca4775776fa8cd
      https://github.com/llvm/llvm-project/commit/5a0181f568e56e37df80d0f74eca4775776fa8cd
  Author: Chuanqi Xu <yedeng.yd at linux.alibaba.com>
  Date:   2024-06-07 (Fri, 07 Jun 2024)

  Changed paths:
    M clang/include/clang/AST/ASTUnresolvedSet.h
    M clang/include/clang/AST/DeclAccessPair.h
    M clang/include/clang/AST/DeclBase.h
    M clang/include/clang/AST/DeclID.h
    M clang/include/clang/AST/UnresolvedSet.h
    M clang/include/clang/Serialization/ASTBitCodes.h
    M clang/include/clang/Serialization/ASTReader.h
    M clang/include/clang/Serialization/ModuleFile.h
    M clang/include/clang/Serialization/ModuleManager.h
    M clang/lib/AST/DeclBase.cpp
    M clang/lib/AST/DeclCXX.cpp
    M clang/lib/Serialization/ASTReader.cpp
    M clang/lib/Serialization/ASTReaderDecl.cpp
    M clang/lib/Serialization/ASTWriter.cpp
    M clang/lib/Serialization/ModuleFile.cpp
    A clang/test/Modules/no-transitive-decls-change.cppm

  Log Message:
  -----------
  [serialization] no transitive decl change (#92083)

Following of https://github.com/llvm/llvm-project/pull/86912

The motivation of the patch series is that, for a module interface unit
`X`, when the dependent modules of `X` changes, if the changes is not
relevant with `X`, we hope the BMI of `X` won't change. For the specific
patch, we hope if the changes was about irrelevant declaration changes,
we hope the BMI of `X` won't change. **However**, I found the patch
itself is not very useful in practice, since the adding or removing
declarations, will change the state of identifiers and types in most
cases.

That said, for the most simple example,

```
// partA.cppm
export module m:partA;

// partA.v1.cppm
export module m:partA;
export void a() {}

// partB.cppm
export module m:partB;
export void b() {}

// m.cppm
export module m;
export import :partA;
export import :partB;

// onlyUseB;
export module onlyUseB;
import m;
export inline void onluUseB() {
    b();
}
```

the BMI of `onlyUseB` will change after we change the implementation of
`partA.cppm` to `partA.v1.cppm`. Since `partA.v1.cppm` introduces new
identifiers and types (the function prototype).

So in this patch, we have to write the tests as:

```
// partA.cppm
export module m:partA;
export int getA() { ... }
export int getA2(int) { ... }

// partA.v1.cppm
export module m:partA;
export int getA() { ... }
export int getA(int) { ... }
export int getA2(int) { ... }

// partB.cppm
export module m:partB;
export void b() {}

// m.cppm
export module m;
export import :partA;
export import :partB;

// onlyUseB;
export module onlyUseB;
import m;
export inline void onluUseB() {
    b();
}
```

so that the new introduced declaration `int getA(int)` doesn't introduce
new identifiers and types, then the BMI of `onlyUseB` can keep
unchanged.

While it looks not so great, the patch should be the base of the patch
to erase the transitive change for identifiers and types since I don't
know how can we introduce new types and identifiers without introducing
new declarations. Given how tightly the relationship between
declarations, types and identifiers, I think we can only reach the ideal
state after we made the series for all of the three entties.

The design of the patch is similar to
https://github.com/llvm/llvm-project/pull/86912, which extends the
32-bit DeclID to 64-bit and use the higher bits to store the module file
index and the lower bits to store the Local Decl ID.

A slight difference is that we only use 48 bits to store the new DeclID
since we try to use the higher 16 bits to store the module ID in the
prefix of Decl class. Previously, we use 32 bits to store the module ID
and 32 bits to store the DeclID. I don't want to allocate additional
space so I tried to make the additional space the same as 64 bits. An
potential interesting thing here is about the relationship between the
module ID and the module file index. I feel we can get the module file
index by the module ID. But I didn't prove it or implement it. Since I
want to make the patch itself as small as possible. We can make it in
the future if we want.

Another change in the patch is the new concept Decl Index, which means
the index of the very big array `DeclsLoaded` in ASTReader. Previously,
the index of a loaded declaration is simply the Decl ID minus
PREDEFINED_DECL_NUMs. So there are some places they got used
ambiguously. But this patch tried to split these two concepts.

As https://github.com/llvm/llvm-project/pull/86912 did, the change will
increase the on-disk PCM file sizes. As the declaration ID may be the
most IDs in the PCM file, this can have the biggest impact on the size.
In my experiments, this change will bring 6.6% increase of the on-disk
PCM size. No compile-time performance regression observed. Given the
benefits in the motivation example, I think the cost is worthwhile.



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list