[PATCH] D135953: [IncludeCleaner] Introduce decl to location mapping

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Nov 17 08:27:12 PST 2022


sammccall added inline comments.


================
Comment at: clang-tools-extra/include-cleaner/lib/AnalysisInternal.h:87
+/// FIXME: Expose in public API for decision making (ranking, ignoring, etc.).
+enum class Hint : uint8_t {
+  /// Declaration for the symbol, which is provided by all locations.
----------------
can you explain the design of hints a bit?

In particular:
 - are hints emitted by every mapping layer (node -> symbol -> location -> header -> include)? if so, are they the same kind (i.e. same enum?)
 - are the hints from one stage provided to subsequent stages?
 - node -> symbol mapping already produces hints in the form of RefKind, do we reconcile these concepts somehow?
 - do hints for the same entity found on different paths get merged, or will we ultimately report a list of vector<HintedHeader> where the same header may repeat with different hints?


================
Comment at: clang-tools-extra/include-cleaner/lib/AnalysisInternal.h:89
+  /// Declaration for the symbol, which is provided by all locations.
+  Declaration = 0,
+  /// Complete definition for the symbol.
----------------
Is this intended to be:
 - a bit? needs to be nonzero
 - a set of bits? should be `None` or so as it does not set any bits
 - something else? I don't understand



================
Comment at: clang-tools-extra/include-cleaner/lib/AnalysisInternal.h:91
+  /// Complete definition for the symbol.
+  Complete = 1,
+  // FIXME: Add header based hints, e.g. IWYU mappings, header-symbol name.
----------------
You're only setting this in cases where incompleteness is possible and significant. e.g. for AliasDecls it is not set even though they are (shallowly) complete.

I think the most self-documenting solution is to invert it to `Incomplete`.
Otherwise the docs here should explain when it is set vs not set.


================
Comment at: clang-tools-extra/include-cleaner/lib/AnalysisInternal.h:108
 
+using HintedLocation = std::pair<SymbolLocation, Hint>;
+/// A set of locations that provides the declaration.
----------------
Having been down this path, I think using a pair for this is too clumsy for such an important data type (and this may be repeated for each mapping layer).

I'd suggest a template that can attach hints to another type, and acts as a smart pointer to that type: `Loc->kind(), Loc.hint()` rather than `Loc.first.kind(), Loc.second`, can define a decent print function, etc.


================
Comment at: clang-tools-extra/include-cleaner/lib/LocateSymbol.cpp:24
+// Looks for a visible definition of \p D. Returns nullptr if none is available,
+// or there are possibly multiple definitions (e.g. namespaces).
+const Decl *getDefinition(const Decl *D) {
----------------
Hmm, this isn't list (e.g. ObjCCategoryDecl), is hard to complete, and we don't actually have to complete it to get the correct behavior. This suggests it's not the right primitive, e.g. it's hard to look at the code and see whether there's a bug.

One option would be to make this function `isIncomplete`, call it on each redecl, and use isThisDeclarationADefinition.
But we can also avoid the chain of `dyn_cast`s for each redecl by making this function something like `Optional<Decl*> getCompletingDefinition()` where `None` means any redecl is complete (usable without definition) and `nullptr` means no redecl is complete (definition needed but not found), and a pointer means that redecl is complete and no other is.


================
Comment at: clang-tools-extra/include-cleaner/lib/LocateSymbol.cpp:28
+    return TD->getDefinition();
+  if (const auto *VD = dyn_cast<VarDecl>(D))
+    return VD->getDefinition();
----------------
the net result of include VarDecl and FunctionDecl here is that we prefer headers defining these things over headers declaring them. How sure are we that this is a good heuristic?

I thought the idea here was "prefer declarations that are complete enough to use", which is basically for classes & templates, but not functions and variables.

If it's rather "definitions are more likely to be the thing we want" then the hint should probably be named `Definition` rather than `Complete`.


================
Comment at: clang-tools-extra/include-cleaner/lib/LocateSymbol.cpp:43
+  std::vector<HintedLocation> Result;
+  // FIXME: Should we also provide physical locations?
+  if (auto SS = tooling::stdlib::Recognizer()(&D))
----------------
I think the same logic that says we include a header that is `// IWYU pragma private` says we should, probably with some `Private` hint.


================
Comment at: clang-tools-extra/include-cleaner/lib/LocateSymbol.cpp:45
+  if (auto SS = tooling::stdlib::Recognizer()(&D))
+    return {{*SS, Hint::Complete}};
+  SourceLocation DefLoc;
----------------
nit: I think {{...}} is less confusing for both compilers and humans if you spell the inner type


================
Comment at: clang-tools-extra/include-cleaner/lib/WalkAST.cpp:20
 #include "clang/Basic/SourceManager.h"
+#include "clang/Tooling/Inclusions/StandardLibrary.h"
 #include "llvm/ADT/STLExtras.h"
----------------
unused? and the other two


================
Comment at: clang-tools-extra/include-cleaner/unittests/LocateSymbolTest.cpp:36
+// generated.
+void testLocate(llvm::StringRef Code, llvm::StringRef SymbolName = "foo") {
+  llvm::Annotations Target(Code);
----------------
this is a heavy helper that embeds the assertions inside itself, which is a pattern that has gotten us into trouble in the past.

locateSymbol has a pretty simple signature and seems possible to test more directly with something like:

```
LocateExample X("struct $decl^foo; struct $def^foo {};");
EXPECT_THAT(
    locateSymbol(X.findDecl("decl", "foo")),
    ElementsAre(
        Pair(X.loc("decl"), 0),
        Pair(X.loc("def"), Hint::Complete));
```

it's a little longer, but also more flexible (tests hints without hacks, can test standard library, can test macros), is attached to the right line, etc


================
Comment at: clang-tools-extra/include-cleaner/unittests/LocateSymbolTest.cpp:71
+          {Offset,
+           llvm::StringRef(bool(Loc.second & Hint::Complete) ? "$def" : "")});
+      break;
----------------
`def` seems like a strange annotation for `Complete` - again definition/complete should be different concepts or have the same name


================
Comment at: clang-tools-extra/include-cleaner/unittests/LocateSymbolTest.cpp:91
+  testLocate("enum class ^foo; enum class $def^foo {};");
+  // Check for stdlib matching.
+  testLocate("namespace std { struct vector; }", "vector");
----------------
this doesn't actually test that it does the right thing: it would pass if it returns nothing


================
Comment at: clang-tools-extra/include-cleaner/unittests/LocateSymbolTest.cpp:92
+  // Check for stdlib matching.
+  testLocate("namespace std { struct vector; }", "vector");
+}
----------------
should we have at least a simple testcase for macros?


================
Comment at: clang-tools-extra/include-cleaner/unittests/WalkASTTest.cpp:17
 #include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/Twine.h"
 #include "llvm/Support/Error.h"
----------------
I can't see where twine/variant are used, is a tool suggesting these?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D135953/new/

https://reviews.llvm.org/D135953



More information about the cfe-commits mailing list