[PATCH] D116085: [clang-tidy] Performance improvements for NOLINTBEGIN/END blocks

Kadir Cetinkaya via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Jan 21 02:13:15 PST 2022


kadircet added a comment.

here are some more comments, can't say I've been able to go through all of the patch yet, unfortunately it's considerably big in size. it would be great to get rid of some of those extra code by just dropping accessors/classes when not needed.



================
Comment at: clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp:161
     std::unique_ptr<ClangTidyOptionsProvider> OptionsProvider,
-    bool AllowEnablingAnalyzerAlphaCheckers)
+    bool AllowEnablingAnalyzerAlphaCheckers, bool AllowIO,
+    bool EnableNoLintBlocks)
----------------
i think having an `AllowIO` parameter is quite confusing here. It doesn't, and probably won't, control all the IO that might be done directly or transitively through `ClangTidyContext` as one would naturally expect.

what about just making it a parameter to `shouldSuppressDiagnostic` call ?
when IO is allowed, the suppression logic will be able to fetch the contents and work as expected so no problems there.
when it is not, we'll either fail to get the buffer and bail out early, or we'll get the buffer from memory and again everything will work as intended.
In either case the cache is never poisoned, so we can call `shouldSuppressDiagnostic` with different values of `AllowIO` on a single instance of `ClangTidyContext`.
The same applies to `EnableNoLintBlocks` too. So lets move both back into the parameters of `shouldSuppressDiagnostics`.


================
Comment at: clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.h:84
+                   bool AllowEnablingAnalyzerAlphaCheckers = false,
+                   bool AllowIO = true, bool EnableNoLintBlocks = true);
+
----------------
why are we moving these two parameters here?


================
Comment at: clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.h:107
 
-  DiagnosticBuilder diag(const ClangTidyError &Error);
+  DiagnosticBuilder diag(const NoLintError &Error);
 
----------------
can we not introduce a new error type in this patch, as it is already doing a good amount of changes?


================
Comment at: clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.h:234
 
-/// Check whether a given diagnostic should be suppressed due to the presence
-/// of a "NOLINT" suppression comment.
-/// This is exposed so that other tools that present clang-tidy diagnostics
-/// (such as clangd) can respect the same suppression rules as clang-tidy.
-/// This does not handle suppression of notes following a suppressed diagnostic;
-/// that is left to the caller as it requires maintaining state in between calls
-/// to this function.
-/// If `AllowIO` is false, the function does not attempt to read source files
-/// from disk which are not already mapped into memory; such files are treated
-/// as not containing a suppression comment.
-/// \param EnableNolintBlocks controls whether to honor NOLINTBEGIN/NOLINTEND
-/// blocks; if false, only considers line-level disabling.
-/// If suppression is not possible due to improper use of "NOLINT" comments -
-/// for example, the use of a "NOLINTBEGIN" comment that is not followed by a
-/// "NOLINTEND" comment - a diagnostic regarding the improper use is returned
-/// via the output argument `SuppressionErrors`.
-bool shouldSuppressDiagnostic(
-    DiagnosticsEngine::Level DiagLevel, const Diagnostic &Info,
-    ClangTidyContext &Context,
-    SmallVectorImpl<ClangTidyError> &SuppressionErrors, bool AllowIO = true,
-    bool EnableNolintBlocks = true);
+  std::unique_ptr<NoLintPragmaHandler> NoLintHandler;
+};
----------------
no need to make this dependency weak, let's just depend on the public header and make this `NoLintPragmaHandler NoLintHandler`.


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:37
+namespace tidy {
+
+//===----------------------------------------------------------------------===//
----------------
place class/struct definitions in this file inside an anonymous namespace.


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:63
+// as parsed from the file's character contents.
+class NoLintToken {
+public:
----------------
why not just a basic struct ? i don't think all of these accessors/constructors carry their weight (as mentioned below once the `ChecksGlob` is not lazily computed, there should be no need for any of these).


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:63
+// as parsed from the file's character contents.
+class NoLintToken {
+public:
----------------
kadircet wrote:
> why not just a basic struct ? i don't think all of these accessors/constructors carry their weight (as mentioned below once the `ChecksGlob` is not lazily computed, there should be no need for any of these).
can we please inline the definitions (if there's a good reason to stick with the class)?


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:90
+  size_t Pos;
+  Optional<std::string> Checks;
+  mutable std::unique_ptr<CachedGlobList> ChecksGlob;
----------------
this deserves a comment about the difference of None vs empty


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:97
+// Whitespace within a NOLINT's check list shall be ignored.
+// "NOLINT( check1, check2 )" is equivalent is "NOLINT(check1,check2)".
+// Return the check list with all extraneous whitespace removed.
----------------
s/equivalent is/equivalent to/


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:133
+
+  // Delay the construction of this heavy object until we actually need it.
+  if (!ChecksGlob)
----------------
i think it's fine to assume that this check will always be performed (otherwise why would user have the nolint comment in their code in the first place). so i don't think all the complexity around making this mutable and lazily computing is worth it.


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:213
+
+// Read an entire buffer and return all `NoLintToken`s that were found.
+class NoLintTokenizer {
----------------
`Tokenize the contents in \p Buffer between \p BeginPos and \p EndPos.` Not necessarily the entire buffer, right?


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:214
+// Read an entire buffer and return all `NoLintToken`s that were found.
+class NoLintTokenizer {
+public:
----------------
all of this seems a little extra, why not something like:
```
vector<NoLinttoken> getNoLintTokens(StringRef Buffer, size_t StartOffset, size_t EndOffset) {
vector<NoLinttoken> Results;
Buffer = Buffer.slice(StartOffset, EndOffset);
while(auto Token = getNextNoLintToken(Buffer)) { // Also trims Buffer, returns None when not found.
  Token.pos += StartOffset;
  Results.push_back(std::move(Token));
}
return Results;
}
```


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:350
+  bool HasNoLint = diagHasNoLintInMacro(Diag, DiagName);
+  NoLintErrors = std::move(Errors);
+  return HasNoLint;
----------------
i think it's better to propagate `NoLintErrors` all the way until error generation. as it looks quite confusing as the class state.


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:407
+// We will check for NOLINTs and NOLINTNEXTLINEs first. Checking for these is
+// not so exhaustive (just need to parse the current and previous lines). Only
+// if that fails do we look for NOLINT(BEGIN/END) blocks (which requires
----------------
s/exhaustive/expensive


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:420
+  // file if it is a <built-in>.
+  Optional<StringRef> FileName = SrcMgr.getNonBuiltinFilenameForID(File);
+  if (!FileName)
----------------
filenames in source manager could be misleading depending on how the file is accessed (there might be an override, symlinks, includemaps etc.) and different fileids can also refer to same file (e.g. when header is not include guarded). so both can result in reading the same file contents multiple times, but at least fileids are not strings so should be better keys for the cache && get rid of this step around fetching the filename from fileid.

Hence can we switch the cache from stringmap to a `densemap<fileid, vector<nolinttoken>>` instead.


================
Comment at: clang-tools-extra/clang-tidy/NoLintPragmaHandler.cpp:469
+      (NoLint.type() == NoLintType::NoLintBegin)
+          ? ("unmatched 'NOLINTBEGIN' comment without a subsequent 'NOLINT"
+             "END' comment")
----------------
maybe just `'NOLINTBEGIN' comment without a matching 'NOLINTEND' comment`, vice versa for the other case.


================
Comment at: clang-tools-extra/test/clang-tidy/infrastructure/Inputs/nolintbeginend/1st-translation-unit.cpp:1
+// NOLINTBEGIN
+class A { A(int i); };
----------------
no run lines or anything here (and the following file)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116085/new/

https://reviews.llvm.org/D116085



More information about the cfe-commits mailing list