[PATCH] D41326: [clang-tidy] Added diagnostics about incorrect usage of NOLINT comment

Sat Dec 23 15:01:23 PST 2017

xgsa marked 13 inline comments as done.
xgsa added a comment.

Aaron, thank you for your review and sorry for the coding convention mistakes -- I still cannot get used to the llvm coding convention, because it quite differs from the one I have been using in my projects.

================
Comment at: clang-tidy/ClangTidyDiagnosticConsumer.cpp:276
+
+  using NolintMap = std::unordered_map<NolintTargetInfo,
+                                       NolintCommentInfo,
----------------
aaron.ballman wrote:
> Is there a better LLVM ADT to use here?
This data structure provides the fast lookup by check name+line number and it's exactly what is necessary. What are the concerns about this data structure?

================
Comment at: clang-tidy/ClangTidyDiagnosticConsumer.cpp:316
+          Context->diag(NolintCheckName, NolintLoc,
+              "unknown check name '%0' is specified for %1 comment")
+                  << CheckName << CommentName;
----------------
aaron.ballman wrote:
> I'd reword this slightly to: "unknown check name '%0' specified in %1 directive"
I'd used the word "comment" instead of "directive" for consistency with documentation and the other messages. The rest is applied.

================
Comment at: clang-tidy/ClangTidyDiagnosticConsumer.cpp:424
+// This method is not const, because it fills cache on first demand.
+// It is possible to fill cache in constructor or make cache volatile
+// and make this method const, but they seem like worse solutions.
----------------
aaron.ballman wrote:
> Making the cache volatile will have no impact on this.
> 
> Any reason not to make the cache `mutable`, however? That's quite common for implementation details.
Sorry, certainly, instead of "volatile" I meant "mutable".

Actually, using of "mutable" violates a constancy contract, as the field is get modified in a const method. Thus I'd tend to avoid using `mutable`, if possible, because e.g. in multi-threaded applications these fields require additional protection/synchronization. Moreover, I see that using of  `mutable` is not very spread in clang-tidy. Thus as, currently, `hasCheck` is called from the non-constant context, I'd prefer leaving it non-const instead of making cache `mutable`. Please, let me know if you insist on the `mutable` option.

================
Comment at: clang-tidy/ClangTidyDiagnosticConsumer.cpp:839-840
+        case NolintCommentType::Nolint:
+          Message = "there is no diagnostics on this line, "
+                    "the NOLINT comment is redundant";
+          break;
----------------
aaron.ballman wrote:
> I don't think the user is going to care about the distinction between no diagnostics being triggered and the expected diagnostic not being triggered. Also, it's dangerous to claim the lint comment is redundant because it's possible the user has NOLINT(foo, bar) and while foo is not triggered, bar still is. The NOLINT comment itself isn't redundant, it's that the check specified doesn't occur.
> 
> I would consolidate those scenarios into a single diagnostic: "expected diagnostic '%0' not generated" and "expected diagnostic '%0' not generated for the following line".
> 
> One concern I have with this functionality is: how should users silence a lint diagnostic that's target sensitive? e.g., a diagnostic that triggers based on the underlying type of size_t or the signedness of plain char. In that case, the diagnostic may trigger for some targets but not others, but on the targets where the diagnostic is not triggered, they now get a diagnostic they cannot silence. There should be a way to silence the "bad NOLINT" diagnostics.
> I don't think the user is going to care about the distinction between no diagnostics being triggered and the expected diagnostic not being triggered. Also, it's dangerous to claim the lint comment is redundant because it's possible the user has NOLINT(foo, bar) and while foo is not triggered, bar still is. The NOLINT comment itself isn't redundant, it's that the check specified doesn't occur.
> 
> I would consolidate those scenarios into a single diagnostic: "expected diagnostic '%0' not generated" and "expected diagnostic '%0' not generated for the following line".

This branch of `if (NolintEntry.first.CheckName == NolintCommentsCollector::AnyCheck)` reports only about `NOLINT`/`NOLINTNEXTLINE` comments without check list, so I suppose it's fair to claim that this comment is redundant (we have already checked that no single check reported diagnostics on the line). The `else`-branch reports the diagnostics for the definite check in a list in case of `NOLINT(foo, bar)` (actually, if neither `foo` nor `bar` checks reported diagnostics for the line, there will be a few diagnostics from `nolint-usage` - not sure if it's good, but it seems acceptable). That is why, I suppose, it is necessary to distinct these cases.

> One concern I have with this functionality is: how should users silence a lint diagnostic that's target sensitive? e.g., a diagnostic that triggers based on the underlying type of size_t or the signedness of plain char. In that case, the diagnostic may trigger for some targets but not others, but on the targets where the diagnostic is not triggered, they now get a diagnostic they cannot silence. There should be a way to silence the "bad NOLINT" diagnostics.

There is such mechanism: it is possible to specify `// NOLINT(nolint-usage)` or `//NOLINT(check1, check2, nolint-usage) to silence the `nolint-usage`-mechanism. Please, see tests for details and more examples. 

================
Comment at: clang-tidy/ClangTidyDiagnosticConsumer.h:295
+  // separately to process NOLINT misuse.
+  std::unordered_map<unsigned, std::vector<unsigned>> DiagIDsToLineNumbers;
 };
----------------
aaron.ballman wrote:
> Perhaps this would make more sense as an `llvm::IndexedMap` or other non-STL datatype?
I think, the map of vectors fits well: it provides efficient adding, storing and iterating through elements. In spite items in vector can be duplicated, it's not a usual case (when there are a lot of diagnostics for the same check on the same line), so I don't think it worth using set. I don't see how `llvm::IndexedMap` could help here. What are the concerns about the current data structure?

https://reviews.llvm.org/D41326