[PATCH] D111208: [clang-tidy] Support globbing in NOLINT* expressions

Mon Oct 11 11:11:58 PDT 2021

aaron.ballman added reviewers: njames93, hokein, whisperity.
aaron.ballman added a comment.
Herald added a subscriber: rnkovacs.

In D111208#3054246 <https://reviews.llvm.org/D111208#3054246>, @carlosgalvezp wrote:

> Awesome, thanks a lot for the review, I really improved my understanding of the tool :)
>
> @aaron.ballman Could you review? Otherwise could you point me to who should review? I tagged @alexfh as owner of clang-tidy but haven't really seen him around... Shouldn't there be multiple owners, so that there's not a "single point of failure"?

I'm happy to review (I'll add a few more folks to the list though). Just an FYI, but we usually only ping when there's been about a week of no traffic on the review. Reviewers have a fairly heavy workload and we've settled on a week between pings as being the happy medium.

In D111208#3053370 <https://reviews.llvm.org/D111208#3053370>, @carlosgalvezp wrote:

> Thanks for the clarification, makes a lot of sense now! I'll see what I can do with that.
>
> By the way, is `NOLINTBEGIN/END` expected to work/give errors when the check name is something else than a real check name? See for example:
>
> https://godbolt.org/z/b6cbTeezs
>
> I would expect to get a warning that an `END` was found that didn't match a `BEGIN`.

If the tags don't match (so there's an open for a check but never a close for the check), then we should diagnose. However, we should *not* diagnose unknown check names (because the user may be suppressing checks that clang-tidy doesn't know about, such as ones in another static analysis tool run over the same code). However, given that we don't run the NOLINT pass unless there are diagnostics to emit, I think it's fine to not diagnose a mismatch between unknown tags.

================
Comment at: clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp:349
+      // Allow specifying a few checks with a glob expression.
+      GlobList Globs(ChecksStr);
+      if (!Globs.contains(CheckName))
----------------
carlosgalvezp wrote:
> salman-javed-nz wrote:
> > What happens when `CheckStr` is empty?
> > How has Clang-Tidy treated `// NOLINT()` in the past? Does this patch change the behaviour? What //should // be the "right" behaviour?
> Very good question! Currently on master `// NOLINT()` will *not* suppress warnings. However `// NOLINT(` will. My patch shouldn't change existing behavior - an empty list of globs will return false when calling `contains`.
> 
> I'll add a unit test to cover this case!
(All of this is a bit orthogonal to this patch, so no changes requested here.)

Naively, I would expect `NOLINT()` to be an error because it expects an argument and none was given. (I'd expect `NOLINT(` to also be an error because the syntax is malformed.)

================
Comment at: clang-tools-extra/docs/clang-tidy/index.rst:347
+    // No warnings are suppressed, due to double negation
+    Foo(bool param);  // NOLINT(-google*)
   };
----------------
salman-javed-nz wrote:
> carlosgalvezp wrote:
> > salman-javed-nz wrote:
> > > salman-javed-nz wrote:
> > > > carlosgalvezp wrote:
> > > > > salman-javed-nz wrote:
> > > > > > Would anyone do this on purpose, or is this a user error?
> > > > > I don't see any use case here, no. I just thought to document it to clarify the double negation that's happening, in case a user gets confused and doesn't see the warning being suppressed.
> > > > > 
> > > > > Do you think it brings value or does more harm than good?
> > > > 
> > > > I don't see any use case here, no. I just thought to document it to clarify the double negation that's happening, in case a user gets confused and doesn't see the warning being suppressed.
> > > > 
> > > > Do you think it brings value or does more harm than good?
> > > 
> > > I'd be happy with just the basic `+` glob functionality. The first thing I'd use it on is to suppress checks that are an alias of another check, e.g. `NOLINT(*no-malloc)` to cover both `hicpp-no-malloc` and `cppcoreguidelines-no-malloc`.
> > > 
> > > As for glob expressions that begin with a `-`... you get the functionality for free thanks to the `GlobList` class but it's not a feature I think I will need. I speak only for myself though. Maybe someone else here has a strong need for this?
> > > 
> > > Is `NOLINTBEGIN(-check)` equivalent to `NOLINTEND(check)`? It hursts my head even thinking about it.
> > > 
> > > Your unit test where you test `NOLINT(google*,-google*,google*)`, Clang-Tidy does the right thing in that situation, but I have to wonder why any user would want to add, remove, then add the same check group in the one expression in the first place? Should we even be entertaining this kind of use?
> > > Is NOLINTBEGIN(-check) equivalent to NOLINTEND(check)?
> > 
> > To me it's clear that it's not. Any NOLINTBEGIN(X) must have a matching NOLINTEND(X), regardless of whether X is a glob or not.
> > 
> > Totally agree that negative globs are probably not needed, the main use case is being able to suppress all aliases as you say.
> > 
> > I just thought it was neat to be able to reuse code in that way, the code is very clean and easy to read. Plus users are familiar with the syntax. If users want to abuse it that would be at their own maintenance expense - clang-tidy would just do what it was asked to do, without crashing or anything bad happening.
> > 
> > The same thing can be done when running the tool - you can run "--checks=google*,-google*" and I don't think clang-tidy has code to warn the user/prevent this from happening. Considering all these edge cases would perhaps complicate the design of the tool for no real use case? I.e. too defensive programming.
> > 
> > I added the unit test mostly to make sure clang-tidy doesn't crash or does something strange, it just does what the user instructed it to do. I thought such edge cases are encouraged in unit tests to increase coverage.
> > 
> > Would it be reasonable to keep negative globs in the implementation (to simplify maintenance, reuse code), but not document it in the public docs? Otherwise I'll need to refactor GlobList to be able to reuse the part of the code that consumes the positive globs. I don't think it makes sense to create a parallel implementation of that part (kind of what exists already now, before my patch).
> > > Is NOLINTBEGIN(-check) equivalent to NOLINTEND(check)?
> > 
> > To me it's clear that it's not. Any NOLINTBEGIN(X) must have a matching NOLINTEND(X), regardless of whether X is a glob or not.
> > 
> > Totally agree that negative globs are probably not needed, the main use case is being able to suppress all aliases as you say.
> > 
> > I just thought it was neat to be able to reuse code in that way, the code is very clean and easy to read. Plus users are familiar with the syntax. If users want to abuse it that would be at their own maintenance expense - clang-tidy would just do what it was asked to do, without crashing or anything bad happening.
> > 
> > The same thing can be done when running the tool - you can run "--checks=google*,-google*" and I don't think clang-tidy has code to warn the user/prevent this from happening. Considering all these edge cases would perhaps complicate the design of the tool for no real use case? I.e. too defensive programming.
> > 
> > I added the unit test mostly to make sure clang-tidy doesn't crash or does something strange, it just does what the user instructed it to do. I thought such edge cases are encouraged in unit tests to increase coverage.
> > 
> 
> I see all your points about reusing the glob functionality. As long as this functionality does not let a user shoot themselves in the foot, I don't see a problem with it. 
> A large proportion of the review of my `NOLINTBEGIN/NOLINTEND` patch was spent going over all the ways a user could misuse the functionality. The concern was mainly that we didn't want a user to unwittingly disable checks for the entire file because of a stray `NOLINTBEGIN` without an `NOLINTEND` marker.
> The worst I can see a confused user do with this new glob feature is use `NOLINT(-check*)` which as you have already explained is a double negative and shows the warnings anyway, so no harm done.
> 
> I find it funny that master allows `NOLINT(` with no closing bracket go though as a `NOLINT` when it's clearly a sign of user error. Clang-Tidy could be more proactive in alerting the user to these mistakes, but I get there's a balance between code complexity and harm reduction. This quirk is not directly related to your patch so that's a story to continue on another day.
> 
> > Would it be reasonable to keep negative globs in the implementation (to simplify maintenance, reuse code), but not document it in the public docs? Otherwise I'll need to refactor GlobList to be able to reuse the part of the code that consumes the positive globs. I don't think it makes sense to create a parallel implementation of that part (kind of what exists already now, before my patch).
> 
> I'm only here to provide some insight about code I recently wrote. Let's wait for the actual reviewers who have a better sense of the direction of this project to say how things should be.
> I see all your points about reusing the glob functionality. As long as this functionality does not let a user shoot themselves in the foot, I don't see a problem with it.

Likewise; code reuse is a good thing when it works in our favor.

> The worst I can see a confused user do with this new glob feature is use NOLINT(-check*) which as you have already explained is a double negative and shows the warnings anyway, so no harm done.

I have to think on this more, but my initial reaction is that it's a bit scary because the check names themselves have dashes within them already, but those dashes don't mean negation. So I worry about this being confusing:
```
// NOLINT(misc-whatever-awesome)
// Oops, we didn't notice that there's also bugprone-whatever-awesome, 
// so let's try to suppress them all.

// NOLINT(-whatever-awesome)
// Oops. that actual disables the check because we didn't know to add the star. But because it's in a NOLINT, that's a double negation, so we've reenabled the check. So the user will still get diagnostics, but likely not understand why.

// NOLINT(*-whatever-awesome)
// This was what we meant to write.
```
I don't think allowing users to do double-negations in NOLINT comments adds a lot of value. Would it be reasonable to pass in a flag to `GlobList` to disable negative glob handling so we can still reuse the logic but not have to enable a weird feature that users may accidentally run into?

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111208/new/

https://reviews.llvm.org/D111208