[PATCH] D155819: [include-cleaner] Loose matching for verbatim headers

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jul 26 05:36:43 PDT 2023


sammccall added inline comments.


================
Comment at: clang-tools-extra/include-cleaner/lib/Types.cpp:110-112
+  while (!P.empty() && path::is_separator(P.back()))
+    P.pop_back();
+  return P;
----------------
kadircet wrote:
> nit: `return P.rtrim('/');` // only separator we can encounter is forward slash at this point.
Oops, yes!
Kept the loop though to avoid making an extra copy, APIs are awkward


================
Comment at: clang-tools-extra/include-cleaner/lib/Types.cpp:137
+    // verbatim "e/f" should match (spelled=c/d/e/f, resolved=/a/b/c/d/e/f).
+    // We assume entry's (normalized) name will match the search dirs.
+    auto Path = normalizePath(I.Resolved->getName());
----------------
kadircet wrote:
> i think this assumption breaks when we have an absolute include search path, which is a subdirectory of the CWD of file manager. as we might now have a resolved include which is relative, but a search path that'll never match a prefix of those includes.
> 
> i guess this is fine, as we're trying to heuristically match. but it would be useful to mention it here and add a unittest for that. (unless I am missing something and headersearch already normalizes such paths)
This is part of the contract of the function, pulled in the doc comment from D155878 and added a test for the cases that don't work.


================
Comment at: clang-tools-extra/include-cleaner/lib/Types.cpp:171-173
+    for (unsigned I : BySpellingAlternate.lookup(Spelling))
+      if (!llvm::is_contained(Result, &All[I]))
+        Result.push_back(&All[I]);
----------------
kadircet wrote:
> i think we shouldn't look into alternate spellings if we've already found an exact spelling. we would be matching wrong headers for sure in those cases.
I had it this way initially and changed my mind. Having the presence of some includes affect whether others match at all breaks an "obvious" part of the contract. (So obvious that I assumed it in the test ten minutes after deliberately breaking it). I wouldn't add a special case here unless there's some observed bad-results motivation for it.

> we would be matching wrong headers for sure in those cases.

You could be including the same header under two paths (and in fact that's what this testcase is doing). I don't know why you would, but we also don't have real examples of this happening with *different* headers.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D155819/new/

https://reviews.llvm.org/D155819



More information about the cfe-commits mailing list