[PATCH] D142092: [include-mapping] Allow multiple headers for the same symbol. Choose the first header of available ones.
Haojian Wu via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Feb 1 01:00:54 PST 2023
hokein added a comment.
As discussed offline, we decided to stop spending effort on improving the cppreference_parser.py, instead, we allow human edits in the generated `StdSymbolMap.inc` and `CSymbolMap.inc`, it gives us more flexibility: sort the headers for multi-header symbol in a humn-desired order rather than suboptimal alphabet order; symbols (e.g. `std::experimental::filesystem`) that are not available in cppreference website; symbols (e.g. `PRIX16`) from a complicated cppreference HTML page which is hard to parse in cppreference_parser.py etc.
We have two major things in this patch:
1. extend the Stdlib API to support multiple-header symbols
2. emits multiple-header symbols in the *.inc files
For 1), this change is good. We need to make sure the change not break other users of the `.inc` files. clangd is the main user, the behavior will be changed in https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clangd/index/CanonicalIncludes.cpp#L736 as now we have duplicated qualified names (the ideal solution is to ignore multiple-header symbols, I think it is fine now if we just add a single `std::size_t` symbol, see my comment below)
For 2), I have some concerns, it looks like some multiple-headers symbols that are handled incorrectly by the `cppreference_parser.py` (see my other comments). I think the best way for us is to list them manually (no action required, I will make a followup)
I'd suggest to narrow the scope of the patch:
- keep 1) and add a unittest in `llvm-project/clang/unittests/Tooling/StandardLibraryTest.cpp`;
- drop all generated multi-header symbols in *.inc files, instead, we manually add a single one `std::size_t` symbol to the `StdSymbolMap.inc` for the StandardLibrary unittest. This means we need to revert the current change of `gen_std.py`;
================
Comment at: clang/include/clang/Tooling/Inclusions/CSymbolMap.inc:80
SYMBOL(FOPEN_MAX, None, <stdio.h>)
+SYMBOL(FP_FAST_FMA, None, <math.h>)
+SYMBOL(FP_FAST_FMAF, None, <math.h>)
----------------
These symbols seem to only have a single header, I think it is an improvement effect of the new change of parsing logic in cppreference_parser.py.
================
Comment at: clang/include/clang/Tooling/Inclusions/CSymbolMap.inc:96
+SYMBOL(INT16_C, None, <inttypes.h>)
+SYMBOL(INT16_C, None, <stdint.h>)
SYMBOL(INT16_MAX, None, <stdint.h>)
----------------
This type of symbol (`INTX_C`) seems not correct, they are only from <stdint.h> -- (this is a limitation of our current cppreference parser (it can not handle the complex page: https://en.cppreference.com/w/c/types/integer)
================
Comment at: clang/include/clang/Tooling/Inclusions/CSymbolMap.inc:172
SYMBOL(ONCE_FLAG_INIT, None, <threads.h>)
+SYMBOL(PRIX16, None, <inttypes.h>)
+SYMBOL(PRIX16, None, <stdint.h>)
----------------
I think this kind of symbol (`PRIdN`, `PRIdLEASTN`, `PRIdFASTN`, `PRIdMAX`, `PRIdPTR` etc) is only from `<inttypes.h>` (another limitation of the cppreference parser).
================
Comment at: clang/lib/Tooling/Inclusions/Stdlib/StandardLibrary.cpp:33
#undef SYMBOL
SymbolNames = new std::remove_reference_t<decltype(*SymbolNames)>[SymCount];
SymbolHeaderIDs =
----------------
It is sad that we're allocating more memory for these two arrays now (as we have multiple SYMBOLs for the same qualified name). No action needed, can you add a comment saying that we're allocate more space here?
================
Comment at: clang/tools/include-mapping/test.py:56
"""
- self.assertEqual(_ParseSymbolPage(html, 'foo'), set(['<cmath>']))
+ self.assertEqual(_ParseSymbolPage(html, 'foo', ""), set(['<cmath>']))
----------------
The test needs to update as well, as you remove the change of `_ParseSymbolPage`.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D142092/new/
https://reviews.llvm.org/D142092
More information about the cfe-commits
mailing list