[PATCH] D142092: [include-mapping] Allow multiple headers for the same symbol. Choose the first header of available ones.

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jan 27 10:39:05 PST 2023


kadircet added inline comments.


================
Comment at: clang/tools/include-mapping/cppreference_parser.py:196
+      "std::remove$": "algorithm",
+      "std::atomic.*": "atomic",
+
----------------
there's no variant of "std::atomic.*" called "atomic", in https://en.cppreference.com/w/cpp/symbol_index. is it something in 2018 version of the html book? otherwise there's only the unnamed variant and std::shared_ptr variant, and we should be preferring unnamed variant.


================
Comment at: clang/tools/include-mapping/cppreference_parser.py:107
+  # FIXME: only consider first symbol, assume further symbols are overloads
+  all_headers = sorted(list(all_headers))
+  if len(all_headers) > 0:
----------------
VitaNuo wrote:
> hokein wrote:
> > why sort the headers?
> This was a hack to make the generator return `atomic` rather than `memory` for `std::atomic.*` symbols. But I've noticed now that this adds some trouble to the C symbol map. I think now that cutting `all_headers` to the first element is wrong.
> I will special-case the `std::atomic.*` symbols instead, because so far I don't see a way to solve this programmatically without other collateral effects.
> 
> Note that the problem is not `std::atomic` itself (for which Kadir extensively described the solution in some of the other comments), but rather `atomic_bool`, `atomic_char` etc. mentioned further in the page. The headers for them come from `all_headers` rather than `symbol_headers`, and there seems to be no way to limit them to `<atomic>` without other collateral effects.
> I think now that cutting all_headers to the first element is wrong.

What's exactly breaking once you do that? (I guess by `first element` you still mean "all the headers we've seen until the first declaration in the page")

>  but rather atomic_bool, atomic_char etc. mentioned further in the page. The headers for them come from all_headers rather than symbol_headers, and there seems to be no way to limit them to <atomic> without other collateral effects.

I guess this is happening because we're trying to find headers that are for the declaration block containing the interesting symbol name. Any idea if that logic is winning us anything? e.g. if we dropped the `        if symbol_name in found_symbols:` what does change?


================
Comment at: clang/tools/include-mapping/cppreference_parser.py:165
       # FIXME: use these as a fallback rather than ignoring entirely.
-      variants_for_symbol = variants_to_accept.get(
-          (namespace or "") + symbol_name, ())
-      if variant and variant not in variants_for_symbol:
+      header_to_accept = variants_to_accept.get(
+          (namespace or "") + symbol_name, "")
----------------
VitaNuo wrote:
> kadircet wrote:
> > `variant` is not necessarily the `header`, eg:
> > ```
> > acos()
> > acos<>() (std::complex) (since C++11)
> > acos<>() (std::valarray)
> > ```
> > 
> > in this case variants are `std::complex` and `std::valarray`. 
> > 
> > hence we're not trying to "infer" the header we want to preserve but rather decide on which symbol page we want to parse. we should still accept "all the headers" mentioned in that variant symbol page.
> Ah ok, I was mostly looking at `std::remove (algorithm)`. Ok, I will use the term "variant" as before. Thanks.
sorry i wasn't just trying to nitpick on the name of the variable/parameters.

i was trying to say `_ReadSymbolPage` should not be a function of `variant_to_accept`.
we should "completely skip" parsing of symbol pages for variants we don't want to take in (that's what this `continue` here is trying to achieve)

later on when we're parsing a symbol page, the name of the variant shouldn't effect the headers we're picking in any way (especially in the way you're relying right now that says variant name and the header name will be the same, as i pointed out this might as well be `std::complex`).

Is this trying to work around a different problem? (especially around `std::atomic_.*` I assume, i'll follow up on that thread)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142092/new/

https://reviews.llvm.org/D142092



More information about the cfe-commits mailing list