[PATCH] D136071: [include-cleaner] WIP: Add PragmaIncludes which handles include-mapping pragmas.

Sam McCall via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Oct 17 13:51:01 PDT 2022


sammccall added a comment.

(It's a bit hard to evaluate this without the other pieces, but I suppose we need to start somewhere)



================
Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Hooks.h:38
+// spelling header rather than the header directly defines the symbol.
+class PragmaIncludes {
+public:
----------------
IWYU pragmas requires a PP listener, we'll also need PP events for other purposes (determining which files are self-contained, finding #includes in the main file). Should these be the same listener and write into the same class?

The argument for fewer components is usability: the embedder (standalone tool, clangd, clang-tidy) is responsible for connecting these hooks to the compiler and feeding the results back into the analysis.

So the library is somewhat simpler to use if the number of parts is small.
We need (I think) at least one for the AST and one for the preprocessor, because they need to be connected at different parts of the lifecycle.

How much do we gain by splitting up further than that?


================
Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Hooks.h:44
+
+  bool shouldKeep(FileEntryRef File) const;
+  // Returns the mapping include for the given physical header file.
----------------
At first glance, this signature doesn't look right: `keep` applies to directives, not to files.
e.g.
```
#include "foo.h" // keep
#include "foo.h"
```
the second should be removed

(it's a silly example, but in the past we've confused ourselves into doing IO or complicated string matching code for things that are about spelling).

I'd suggest some representation like the line number (where the `#` appears, not where the comment is), though the details probably matter less than the concept.


================
Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Hooks.h:47
+  // Returns "" if there is no mapping.
+  llvm::StringRef getMapping(FileEntryRef File) const;
+
----------------
FileEntry* rather than FileEntryRef - we don't have different mappings when we open the same file using different names.

(This seems like a nit here but `Header` will be something like `variant<stdlib::Header, FileEntry*, ...>` and inconsistency would be confusing)


================
Comment at: clang-tools-extra/include-cleaner/include/clang-include-cleaner/Hooks.h:53
+  // Main file headers that should be kept, decorated by "IWYU pragma: keep".
+  llvm::DenseSet<llvm::sys::fs::UniqueID> ShouldKeep; // FIXME: implement
+
----------------
Why are these UniqueID rather than FileEntry*? Is the idea to try to use this in clangd?

We hadn't planned on doing this, because pushing clangd's weird requirements around not using FileEntry's to the library, ability to clone the data structure etc seemed pretty intrusive. Since we switched from name-based to UniqueID it might be doable...


================
Comment at: clang-tools-extra/include-cleaner/lib/Hooks.cpp:24
+static llvm::Optional<StringRef> parseIWYUPragma(const char *Text) {
+  constexpr llvm::StringLiteral IWYUPragma = "// IWYU pragma: ";
+  if (strncmp(Text, IWYUPragma.data(), IWYUPragma.size()))
----------------
while we're adding this, we should probably support `/*` as well as `//`, it's allowed per spec (I didn't add it to clangd just to keep NFC)


================
Comment at: clang-tools-extra/include-cleaner/lib/Hooks.cpp:42
+        PP.getSourceManager().getCharacterData(Range.getBegin(), &Invalid));
+    if (!Pragma || Invalid)
+      return false;
----------------
nit: there's no need to check Invalid, getCharacterData always returns a valid pointer and it will never have a spurious IWYU pragma


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136071/new/

https://reviews.llvm.org/D136071



More information about the cfe-commits mailing list