[PATCH] D142092: [include-mapping] Allow multiple headers for the same symbol. Choose the first header of available ones.

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Jan 24 08:08:34 PST 2023


kadircet added inline comments.


================
Comment at: clang/tools/include-mapping/cppreference_parser.py:174
-      # std::remove<> has variant algorithm.
-      "std::remove": ("algorithm"),
-  }
----------------
VitaNuo wrote:
> VitaNuo wrote:
> > kadircet wrote:
> > > VitaNuo wrote:
> > > > kadircet wrote:
> > > > > this is actually checking for something else (sorry for the confusing naming).
> > > > > 
> > > > > the `variant` here refers to library name mentioned in parentheses (this is same problem as `std::move`) on the std symbol index page https://en.cppreference.com/w/cpp/symbol_index (e.g. `remove<>() (algorithm)`). by getting rid of this we're introducing a regression, as previously `std::remove` wouldn't be recognized by the library, but now it'll be recognized and we'll keep suggesting `<cstdio>` for it.
> > > > > 
> > > > > so we should actually keep this around.
> > > > Ok, I can keep this out of this patch, but we'll have to remove this logic evetually when we deal with overloads.
> > > > 
> > > > I have a slight suspicion that this code might be buggy, because it suggests that one _of_ the variants should be accepted. What is does in reality, though, is it keeps `algorithm` in the list of headers suitable for `std::remove` alongside `cstdio`, and then in the last step `std::remove` is ignored by the generator because of being defined in two headers.
> > > > 
> > > > With this patch, the result will be both `{cstdio, algorithm}`. Is this (more) satisfactory for now compared to skipping `algorithm` due to being an overload?
> > > > 
> > > > Ok, I can keep this out of this patch, but we'll have to remove this logic evetually when we deal with overloads.
> > > 
> > > Surely, I wasn't saying this should stay here forever, i am just saying that what's done in the scope of this patch doesn't really address the issues "worked around" by this piece.
> > > 
> > > > I have a slight suspicion that this code might be buggy, because it suggests that one _of_ the variants should be accepted. What is does in reality, though, is it keeps algorithm in the list of headers suitable for std::remove alongside cstdio, and then in the last step std::remove is ignored by the generator because of being defined in two headers.
> > > 
> > > right, it's because we have logic to prefer "non-variant" versions of symbols when available (i.e. in the absence of this logic, we'd prefer std::remove from cstdio). this logic enables us to preserve certain variants (in addition to non-variants). that way we treat std::remove as ambigious rather than always resolving to <cstdio>, hence it's marked as "missing", similar to `std::move`.
> > > 
> > > > With this patch, the result will be both {cstdio, algorithm}. Is this (more) satisfactory for now compared to skipping algorithm due to being an overload?
> > > 
> > > in the end this should probably look like {algorithm, cstdio}, but as mentioned elsewhere, this is not the same problem as "same symbol being provided by multiple header" but rather "different symbols having same name and different headers". so treatment of it shouldn't change by this patch.
> > On second thought, I think we'd rather special-case the overloads for now (until we get to them). See the newest patch version.
> > right, it's because we have logic to prefer "non-variant" versions of symbols when available (i.e. in the absence of this logic, we'd prefer std::remove from cstdio).
> 
> Where is this logic? AFAICS the generator in the current state doesn't generate anything for std::remove. 
> Where is this logic? AFAICS the generator in the current state doesn't generate anything for std::remove.

It's the logic that you're deleting:
```
    if variant and variant not in variants_for_symbol:
        continue
```

we ignore any symbols that has a variant we shouldn't accept and always prefer the non-variant versions. to be more concrete when parsing c++ symbol index we'll see two alternatives for `std::remove`:

```
remove()
remove<>() (algorithm)
```

first one is a non-variant, hence it's accepted by default. second one is `algorithm` variant, and is accepted per this deleted logic. in the end we **used** to didn't generate anything because we now have multiple headers providing `std::remove`.

i think it creates more confusion to special case only std::remove down below, and not other symbols whose non-variant versions we accept, e.g. sin also has non-variant versions, but we don't "special case" it to keep it out of the map.

i'd suggest treating std::remove similar to other symbols with an accepted variant, rather than creating a divergence. so my suggestion is changing the logic above to **only** accept variants mentioned in the map when there's a mapping for the symbol (and reject the non-variant version). net effect will be that we'll now include the mapping from `std::remove` to `<algorithm>`, which is currently inline with our behaviour against other symbols with different variants. we try to pick the most common variant, and that's almost always the "non-variant" version, apart from `std::remove`.

that way we should get rid of the special casing below completely. does that make sense?


================
Comment at: clang/tools/include-mapping/gen_std.py:112
   for symbol in symbols:
-    if len(symbol.headers) == 1:
-      # SYMBOL(unqualified_name, namespace, header)
-      print("SYMBOL(%s, %s, %s)" % (symbol.name, symbol.namespace,
-                                    symbol.headers[0]))
-    elif len(symbol.headers) == 0:
+    if re.match("^remove$|^swap$", symbol.name) and symbol.namespace == "std::":
+      continue
----------------
different headers providing `std::swap` don't provide different variants of it, these are just ADL extension points, similar to std::begin (which also shouldn't be missing from the mapping).

for `remove`, i've provided more context in the discussion above, about keeping a different variant of `std::remove` around.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142092/new/

https://reviews.llvm.org/D142092



More information about the cfe-commits mailing list