[PATCH] D143640: [Tooling/Inclusion] Support overload symbols in the stdlib.

Haojian Wu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Feb 13 04:50:31 PST 2023


hokein added a comment.

Thanks for the comment, putting more thoughts.

In D143640#4121998 <https://reviews.llvm.org/D143640#4121998>, @kadircet wrote:

> All this complexity for handling only 4 symbols feels wrong. is this the whole set? are there any other symbols that fall under this class? can we disambiguate all of them solely based on number of parameters?

No, these 4 symbols is not the whole set, there are other symbols (e.g. `isspace` from `<cctype>`, or `<locale>`), but these are not important, IMO.

The number of parameters is the most critical information to disambiguate -- there are some exceptions, e.g. `std::get()` has a few 1-parameter overloads for array, vector, etc. We will add more information in the `SymbolName` structure for disambiguation in the future.

> Downsides:
>
> - We are relying heavily on number of parameters in certain cases now

Yeah, I think this is expected if we want to do disambiguations.

> - We only have that information available for symbols with variants

My intention of this information is internal, and merely used for disambiguation. And we're not exposing them to public APIs.

> - We're creating a discrepancy in Symbol::all() vs Symbol::named() views of the world

Oh, this is not intended, we can remove this discrepancy, and make `Symbol::all()` and `Symbol::named()` aligned.

> - In addition to generator and hand curated list, now there's a 3rd place to augment symbol mapping

Right, this bit is untweaked in this current patch (there is a FIXME), we can merge it with the hand-curated list.

> This complexity both in terms of mental load when introducing new symbols, and also possible discrepancies that can be encountered when using the public APIs makes me question if we're doing the sensible thing that'll generalize for the future variant symbols, or just trying to patch up a solution for couple cases we know to be not working.
> I'd suggest:
> 1- doing a little bit more design work here and at least getting rid of the burden on the public APIs and a way to introduce these symbols through hand curated lists (e.g. we should be returning all alternatives in Symbol::named, with enough details for the caller to have a chance to disambiguate and have maybe new syntax in raw symbol map files to indicate this variant situation).

OK, looks like we have some different ideas on the design here. I'm not sure returning all alternatives in `Symbol::named` and exposing all information for disambiguation is a good idea (I'm also not sure it is an important usecase). This seems complicating the public APIs too much.

My original design is

- full disambiguation is *only* performed in the `Recognizer` API where the input is the AST node
- other qualified-name-based APIs `Symbol::named` remains the same (or best effort returning the symbol with most common header)

I share your concern about the current approach, it is not perfect, but I think this is moving towards a right direction, the tradeoff seems OK to me (rather than putting the special-case handling to *every* application, we isolate it in one place).

> 2- or handling this in include-cleaner only, similar to how we do it in clangd and other applications, until we encounter more problematic symbols. as handling this in the general case seems hard (especially in the absence of an AST), and we haven't seen enough signals for symbols other than `std::move` to cause trouble.
>
> i feel like 2nd alternative gets us a long way with minimal effort. WDYT?

The 2nd alternative definitely sounds good to me. Before we settle down the design, I will prepare a patch to special-case these symbols in include-cleaner.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D143640/new/

https://reviews.llvm.org/D143640



More information about the cfe-commits mailing list