[PATCH] D142092: [include-mapping] Allow multiple headers for the same symbol. Choose the first header of available ones.

Haojian Wu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jan 27 05:15:24 PST 2023


hokein added a comment.

I haven't read though all the change of `_ParseSymbolPage`, left some comments.

I think we need to update&add proper tests in `include-mapping/test.py` as we has changed the parser.



================
Comment at: clang/include/clang/Tooling/Inclusions/StdSymbolMap.inc:1162
 SYMBOL(subtract_with_carry_engine, std::, <random>)
+SYMBOL(swap, std::, <algorithm>)
 SYMBOL(swap_ranges, std::, <algorithm>)
----------------
Is this intended? it seems to me the cppparser doesn't handle this case correctly, in the swap symbol page, we have two headers in a single `t-dsc-header` tr, the parser should consider both (like the above `size_t`).

```
Defined in header <algorithm>
Defined in header <utility>
```


================
Comment at: clang/lib/Tooling/Inclusions/Stdlib/StandardLibrary.cpp:57
+    // ordered alpahbetically in the map.
+    unsigned SymIndex = NextAvailSymIndex;
+    if (NextAvailSymIndex > 0 &&
----------------
We can avoid a local variable by

```
auto Add = [.. SymIndex(-1)] (....) {
if (SymIndex >=0 &&
    SymbolNames[SymIndex].first == NS && ...) {
} else {
    // the first symbol, or new symbol.
    ++SymIndex;
}

SymbolNames[SymIndex] = {NS, Name};
....
}
```


================
Comment at: clang/tools/include-mapping/cppreference_parser.py:39
 
-def _ParseSymbolPage(symbol_page_html, symbol_name):
+def _ParseSymbolPage(symbol_page_html, symbol_name, header_to_accept):
   """Parse symbol page and retrieve the include header defined in this page.
----------------
If the `header_to_accept` is set, I think we can just return {header_to_accept} directly.


================
Comment at: clang/tools/include-mapping/cppreference_parser.py:96
+        if symbol_name in found_symbols:
+          # FIXME: only update symbol headers on first symbol discovery, assume
+          # same symbol occurence in a subsequent header block is an overload.
----------------
IIUC, this is the major and intended change of the function, this is not a FIXME.


================
Comment at: clang/tools/include-mapping/cppreference_parser.py:105
+
+  # If the symbol was never named, consider all named headers.      
+  # FIXME: only consider first symbol, assume further symbols are overloads
----------------
The comment doesn't match the current behavior. IIUC, iof the symbol was never named, we only consider the first header now.


================
Comment at: clang/tools/include-mapping/cppreference_parser.py:107
+  # FIXME: only consider first symbol, assume further symbols are overloads
+  all_headers = sorted(list(all_headers))
+  if len(all_headers) > 0:
----------------
why sort the headers?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142092/new/

https://reviews.llvm.org/D142092



More information about the cfe-commits mailing list