[PATCH] D142092: [include-mapping] Allow multiple headers for the same symbol. Choose the first header of available ones.
Haojian Wu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Mon Jan 23 06:06:48 PST 2023
hokein added a comment.
Can you add a unittest in the `StandardLibraryTest.cpp`?
================
Comment at: clang/include/clang/Tooling/Inclusions/StdSymbolMap.inc:1053
SYMBOL(remainder, std::, <cmath>)
+SYMBOL(remove, std::, <cstdio>)
SYMBOL(remove_all_extents, std::, <type_traits>)
----------------
I think `std::remove` should not be in the scope of the patch, it has two variants:
- std::remove from `<cstdio>`
- and std::remove from `<algorithm>`.
================
Comment at: clang/include/clang/Tooling/Inclusions/StdSymbolMap.inc:100
SYMBOL(atoll, std::, <cstdlib>)
+SYMBOL(atomic, std::, <atomic>)
+SYMBOL(atomic, std::, <memory>)
----------------
Conceptually, this (and other `atomic_*` symbols) doesn't feel correct:
- `<atomic>` provides the generic template `template<class T> struct atomic;`
- `<memory>` provides partial template specializations for `std::shared_ptr` and `std::weak_ptr`
They are variant symbols (ideally, they should be treat as the `std::move()`). The issue here is that unlike `std::move` which has two individual entries in the index page, we only have one entry for `std::atomic` (extending the cppreference_parser.py script to perfectly distinguish these two cases in the [page](https://en.cppreference.com/w/cpp/atomic/atomic) seems non-trivial). Some options:
1) treat them as multiple-header symbols (like what this patch does now)
2) special-case these symbols like `std::move()`
3) always prefer the header providing generic templates
@kadircet, what do you think?
================
Comment at: clang/include/clang/Tooling/Inclusions/StdSymbolMap.inc:369
+SYMBOL(div, std::, <cinttypes>)
+SYMBOL(div, std::, <cstdlib>)
SYMBOL(div_t, std::, <cstdlib>)
----------------
this one as well, both headers provide different overloads.
================
Comment at: clang/lib/Tooling/Inclusions/Stdlib/StandardLibrary.cpp:20
static llvm::StringRef *HeaderNames;
-static std::pair<llvm::StringRef, llvm::StringRef> *SymbolNames;
-static unsigned *SymbolHeaderIDs;
+static llvm::DenseMap<int, std::pair<llvm::StringRef, llvm::StringRef>>
+ *SymbolNames;
----------------
int => unsigned.
Any reason to change it to a map? My understanding is that SymbolNames is a mapping from the SymbolID => {Scope, Name}, it should not be affected if we add multiple-header symbols.
================
Comment at: clang/lib/Tooling/Inclusions/Stdlib/StandardLibrary.cpp:57
+ bool IsNewSymbol = true;
+ int SymIndex = NextAvailSymIndex;
+ if (NSSymbolMap *NSSymbols = NamespaceSymbols->lookup(NS)) {
----------------
int => unsigned.
================
Comment at: clang/lib/Tooling/Inclusions/Stdlib/StandardLibrary.cpp:58
+ int SymIndex = NextAvailSymIndex;
+ if (NSSymbolMap *NSSymbols = NamespaceSymbols->lookup(NS)) {
+ auto It = NSSymbols->find(Name);
----------------
Given the fact that multiple-header symbols are grouped in the .inc file, we could simplify the code of checking a new symbol by looking at the last available SymIndex:
```
if (NextAvailSymIndex > 0 && SymbolNames[NextAvailSymIndex-1].first == NS && SymbolNames[NextAvailSymIndex-1].second == Name) {
// this is not a new symbol.
}
```
================
Comment at: clang/lib/Tooling/Inclusions/Stdlib/StandardLibrary.cpp:69
+ unsigned HeaderID = AddHeader(HeaderName);
+ SymbolHeaderIDs[SymIndex].emplace_back(HeaderID);
----------------
nit: inline the `HeaderID`.
================
Comment at: clang/lib/Tooling/Inclusions/Stdlib/StandardLibrary.cpp:103
+llvm::StringRef Symbol::scope() const {
+ auto It = SymbolNames->find(ID);
+ if (It != SymbolNames->end()) {
----------------
The `ID` field is guaranteed to be valid, so no need to do the sanity check.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D142092/new/
https://reviews.llvm.org/D142092
More information about the cfe-commits
mailing list