[PATCH] D58345: [clangd] Using symbol name to map includes for STL symbols.

Haojian Wu via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Feb 20 11:10:12 PST 2019


hokein marked 2 inline comments as done.
hokein added inline comments.


================
Comment at: clangd/index/CanonicalIncludes.cpp:47
+  // headers (e.g. std::move from <utility> and <algorithm>), using
+  // qualified name can not disambiguate headers. Instead we should return all
+  // headers and do the disambiguation in clangd side.
----------------
sammccall wrote:
> Looking at the actual data, I'm not sure this is the right strategy. (and it's not clear what "the clangd side" is :-)
> 
> The ambiguous cases seem to break down into:
>  - cases where we should just pick one, e.g. cstdlib vs cmath for std::abs
>  - cases where a function is overloaded for a particular type (e.g. complex), where passing more information in here (strawman: "the signature as a string") would let us get this correct. Alternatively (and easier), I believe always using the "primary" version of this function is going to be correct if not *strictly* iwyu, as the type comes from the same header. So In the short term I'd suggest just blacklisting the variants.
>  - cases like std::move and... maybe one or two more? In the short term always returning `<utility>` seems good enough. In the long term, passing in the signature again would let us disambiguate here.
> Looking at the actual data, I'm not sure this is the right strategy. (and it's not clear what "the clangd side" is :-)

`clangd side` I mean the consumer side.

> cases where we should just pick one, e.g. cstdlib vs cmath for std::abs

For these cases, I think we can address them in generator side by using a hardcoded whitelist.

> cases where a function is overloaded for a particular type (e.g. complex), where passing more information in here (strawman: "the signature as a string") would let us get this correct. Alternatively (and easier), I believe always using the "primary" version of this function is going to be correct if not *strictly* iwyu, as the type comes from the same header. So In the short term I'd suggest just blacklisting the variants.

Using primary version seems promising, and our indexer doesn't index the template specification symbols.

> cases like std::move and... maybe one or two more? In the short term always returning <utility> seems good enough. In the long term, passing in the signature again would let us disambiguate here.

For std::move this particular case, I don't think using signature (as a string) really helps to disambiguate, since both of them are templates, the signature string doesn't contain any word specific to the header names <utility>/<algorithm>. The best way to disambiguate them is to use the number of parameters IMO.

I'm +1 on returning `utility`, since `std::move` in `utility` is more widely-used.


Or another way to disambiguate: use symbol name => header for symbols with unique headers; otherwise fallback to use header  mapping.


================
Comment at: clangd/index/CanonicalIncludes.cpp:123
 
   static const std::vector<std::pair<const char *, const char *>>
       SystemHeaderMap = {
----------------
sammccall wrote:
> hokein wrote:
> > ioeric wrote:
> > > Can we remove the suffix header mapping now? Is it for the `std::chrono::` symbols? What are the reasons not to include them in this patch? 
> > Ideally, we could remove it once we get a perfect symbol mapping but I'd prefer to keep it (it serves as a fallback when we don't find symbols in the symbol mapping), so that we could add new symbol mapping increasingly without changing the current behavior.
> > 
> > Removing it should be done carefully without introducing regression, and is outside the scope of this patch.
> We do need fallback heuristics. We should conisder cases:
>  - for known `std::` symbols mapped to one system header, we don't need a fallback
>  - for known `std::` mapped to multiple system headers (`std::move`), we need some fallback. There are probably few enough of these that it doesn't justify listing *all* system header paths.
>  - for known `std::` symbols with a primary template in one file and specializations/overloads in another (swap?) always inserting the primary seems fine 
>  - For "random" unknown symbols from system header `foo/bits/_bar.h`, we can not insert, correct to `<bar>`, or use the full path name. I don't have a strong preference here, maybe we should do what's easiest.
>  - for c standard library (`::printf` --> `<stdio.h>` etc) we probably want mappings just like in this patch, I think cppreference has them.
>  - other "standardish" headers (POSIX etc) - hmm, unclear.
> 
> Thinking about all these cases, I'm thinking the nicest solution would be having the symbol -> header mapping, having a small (symbol,header) -> header mapping **only** for ambiguous cases, and not inserting "system" headers that aren't in the main mapping at all. Then we can improve coverage of posix, windows etc headers over time.
> Question is, how can we recognize "system" headers?
I think we were talking about C++ std symbols.

> for known std:: symbols mapped to one system header, we don't need a fallback
> for known std:: mapped to multiple system headers (std::move), we need some fallback. There are probably few enough of these that it doesn't justify listing *all* system header paths.
> for known std:: symbols with a primary template in one file and specializations/overloads in another (swap?) always inserting the primary seems fine

As discussed in this patch, we have other ways to disambiguate these symbols (rather than relying on the header mapping), so it is possible to remove STL headers mapping, but we'll lose correct headers for STL implement-related symbols (e.g. `__hash_code`), ideally we should drop these symbols in the indexer...

> for c standard library (::printf --> <stdio.h> etc) we probably want mappings just like in this patch, I think cppreference has them.

Yes, cppreference has them.

> other "standardish" headers (POSIX etc) - hmm, unclear.

I think we should still keep the header mapping. Not sure whether there are some sources like cppreference that list all symbols.




Repository:
  rCTE Clang Tools Extra

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D58345/new/

https://reviews.llvm.org/D58345





More information about the cfe-commits mailing list