[PATCH] D142092: [include-mapping] Allow multiple headers for the same symbol. Choose the first header of available ones.

Haojian Wu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 1 01:00:54 PST 2023


hokein added a comment.

As discussed offline, we decided to stop spending effort on improving the cppreference_parser.py, instead, we allow human edits in the generated `StdSymbolMap.inc` and `CSymbolMap.inc`, it gives us more flexibility: sort the headers for multi-header symbol in a humn-desired order rather than suboptimal alphabet order; symbols (e.g. `std::experimental::filesystem`) that are not available in cppreference website; symbols (e.g. `PRIX16`) from a complicated cppreference HTML page which is hard to parse in cppreference_parser.py etc.

We have two major things in this patch:

1. extend the Stdlib API to support multiple-header symbols
2. emits multiple-header symbols in the *.inc files

For 1), this change is good. We need to make sure the change not break other users of the `.inc` files. clangd is the main user, the behavior will be changed in https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clangd/index/CanonicalIncludes.cpp#L736 as now we have duplicated qualified names (the ideal solution is to ignore multiple-header symbols, I think it is fine now if we just add a single `std::size_t` symbol, see my comment below)
For 2), I have some concerns, it looks like some multiple-headers symbols that are handled incorrectly by the `cppreference_parser.py` (see my other comments). I think the best way for us is to list them manually (no action required, I will make a followup)

I'd suggest to narrow the scope of the patch:

- keep 1) and add a unittest in `llvm-project/clang/unittests/Tooling/StandardLibraryTest.cpp`;
- drop all generated multi-header symbols in *.inc files, instead, we manually add a single one `std::size_t` symbol to the `StdSymbolMap.inc` for the StandardLibrary unittest. This means we need to revert the current change of `gen_std.py`;



================
Comment at: clang/include/clang/Tooling/Inclusions/CSymbolMap.inc:80
 SYMBOL(FOPEN_MAX, None, <stdio.h>)
+SYMBOL(FP_FAST_FMA, None, <math.h>)
+SYMBOL(FP_FAST_FMAF, None, <math.h>)
----------------
These symbols seem to only have a single header, I think it is an improvement effect of  the new change of parsing logic in cppreference_parser.py.


================
Comment at: clang/include/clang/Tooling/Inclusions/CSymbolMap.inc:96
+SYMBOL(INT16_C, None, <inttypes.h>)
+SYMBOL(INT16_C, None, <stdint.h>)
 SYMBOL(INT16_MAX, None, <stdint.h>)
----------------
This type of symbol (`INTX_C`) seems not correct, they are only from <stdint.h> -- (this is a limitation of our current cppreference parser (it can not handle the complex page: https://en.cppreference.com/w/c/types/integer)


================
Comment at: clang/include/clang/Tooling/Inclusions/CSymbolMap.inc:172
 SYMBOL(ONCE_FLAG_INIT, None, <threads.h>)
+SYMBOL(PRIX16, None, <inttypes.h>)
+SYMBOL(PRIX16, None, <stdint.h>)
----------------
I think this kind of symbol (`PRIdN`, `PRIdLEASTN`, `PRIdFASTN`, `PRIdMAX`, `PRIdPTR` etc) is only from `<inttypes.h>` (another limitation of the cppreference parser). 


================
Comment at: clang/lib/Tooling/Inclusions/Stdlib/StandardLibrary.cpp:33
 #undef SYMBOL
   SymbolNames = new std::remove_reference_t<decltype(*SymbolNames)>[SymCount];
   SymbolHeaderIDs =
----------------
It is sad that we're allocating more memory for these two arrays now (as we have multiple SYMBOLs for the same qualified name). No action needed, can you add a comment saying that we're allocate more space here?


================
Comment at: clang/tools/include-mapping/test.py:56
 """
-    self.assertEqual(_ParseSymbolPage(html, 'foo'), set(['<cmath>']))
+    self.assertEqual(_ParseSymbolPage(html, 'foo', ""), set(['<cmath>']))
 
----------------
The test needs to update as well, as you remove the change of `_ParseSymbolPage`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142092/new/

https://reviews.llvm.org/D142092



More information about the cfe-commits mailing list