[PATCH] D111208: [clang-tidy] Support globbing in NOLINT* expressions

Salman Javed via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat Oct 9 03:46:46 PDT 2021


salman-javed-nz added a comment.

In D111208#3053061 <https://reviews.llvm.org/D111208#3053061>, @carlosgalvezp wrote:

> Looking at the code, I see that you use `SuppressionIsSpecific` in order to separate the errors into 2 error lists: `SpecificNolintBegins` and `GlobalNolintBegins`. However you don't do anything different with either list, so why do they need to be different lists?
>
> Here checking that a combined list is empty would be equivalent:
>
>   bool WithinNolintBegin =
>       !SpecificNolintBegins.empty() || !GlobalNolintBegins.empty();
>
> And here, you are running identical code for both lists:
>
>   for (const auto NolintBegin : SpecificNolintBegins) {
>     auto Error = createNolintError(Context, SM, NolintBegin, true);
>     SuppressionErrors.emplace_back(Error);
>   }
>   for (const auto NolintBegin : GlobalNolintBegins) {
>     auto Error = createNolintError(Context, SM, NolintBegin, true);
>     SuppressionErrors.emplace_back(Error);
>   }
>
> And then these lists are not used any further than the scope of the function where they are declared. So to me it feels like they could be combined, and this logic of `SuppressionIsSpecific` be removed. Let me know if I'm missing something obvious!

`tallyNolintBegins()` goes through every line, and whenever it sees a `NOLINTBEGIN`, it pushes its SourceLocation onto a stack. When it sees a `NOLINTEND`, it pops one entry off the stack. At the end of the file, all entries on the stack should have been popped off - if not, it's a unmatched `NOLINTBEGIN/END` expression error.

`NOLINTBEGIN(check-name)` expressions are pushed onto the `SpecificNolintBegins` list, while `NOLINTBEGIN` & `NOLINTBEGIN(*)` expressions are pushed onto the `GlobalNolintBegins` list. The reason for the two lists is so that `NOLINTBEGIN(check-name)` cannot be popped off by `NOLINTEND` or `NOLINTEND(*)`; and `NOLINTBEGIN` and `NOLINTBEGIN(*)` cannot be popped off by a `NOLINTEND(check-name)`.

See this line in `tallyNolintBegins()` where it figures out which list to use for the `NOLINTBEGIN/END` expression currently being evaluated...

  auto List = [&]() -> SmallVector<SourceLocation> * {
    return SuppressionIsSpecific ? &SpecificNolintBegins : &GlobalNolintBegins;
  };

I'm sure there is an alternate implementation which achieves the same thing with just one list object... however the list will probably need to store more information about the `NOLINTBEGIN` expression than just its SourceLocation to be able to achieve the same goals though... probably a list of `pair<SourceLocation Location, StringRef CheckName>`, otherwise you can't match a `NOLINTEND` to its corresponding `NOLINTBEGIN`. If you want to take a crack at simplifying the logic here, all improvements are appreciated.



================
Comment at: clang-tools-extra/clang-tidy/ClangTidyDiagnosticConsumer.cpp:353
+      if (SuppressionIsSpecific)
+        *SuppressionIsSpecific = true;
     }
----------------
carlosgalvezp wrote:
> salman-javed-nz wrote:
> > salman-javed-nz wrote:
> > > carlosgalvezp wrote:
> > > > salman-javed-nz wrote:
> > > > > So when this function is called with args `Line = "NOLINTBEGIN(google*)"` and `CheckName = "google-explicit-constructor"`, it will return true with `*SuppressionIsSpecific = true`.
> > > > > Should `*SuppressionIsSpecific = false` instead here?
> > > > Good point. I honestly don't understand what SuppressionIsSpecific means and how it's used downstream, could you clarify?
> > > `SuppressionIsSpecific = true` means when a check is suppressed by name, i.e. `NOLINTBEGIN(check-name)`.
> > > 
> > > `SuppressionIsSpecific = false` means when `NOLINTBEGIN` <zero args> or `NOLINTBEGIN(*)` is used. My gut feeling is that It should probably be false when a glob is used too.
> > > 
> > > The point of this variable is so that a check-specific `BEGIN` can only be `END`ed with a check-specific END.
> > I think the behaviour has changed in this patch.
> > 
> > It used to return `*SuppressionIsSpecific = false` when `ChecksStr = "*"`. 
> > `"*"` is disabling all checks, not a specific check.
> > 
> > Now it returns `*SuppressionIsSpecific = true`. `Globs.contains(CheckName)` is true regardless of whether `ChecksStr = "*"` or `ChecksStr = "check-name"`, so it continues running onto line 353 and sets `*SuppressionIsSpecific = true` no matter what.
> You are right! It's interesting however that this hasn't triggered any unit test error, so I wonder if this variable is needed at all then? I will have a deeper look at the code to get a better understanding and update the patch accordingly.
> You are right! It's interesting however that this hasn't triggered any unit test error

That's because the unit tests are missing a test case that checks this specific combination of BEGIN/END expressions. The tests predominantly use `NOLINTBEGIN` <zero args>, not `NOLINTBEGIN(*)`.
It would be great if this shortcoming in the test coverage could be plugged in this patch.

`nolintbeginend-begin-global-end-specific.cpp` checks:
```
// NOLINTBEGIN
class A { A(int i); };
// NOLINTEND(google-explicit-constructor)
```

`nolintbeginend-begin-specific-end-global.cpp` checks:
```
// NOLINTBEGIN(google-explicit-constructor)
class A { A(int i); };
// NOLINTEND
```

These 2 files could be used as the basis to create tests for `NOLINTBEGIN(*)` / `NOLINTEND(*)`.


================
Comment at: clang-tools-extra/docs/clang-tidy/index.rst:347
+    // No warnings are suppressed, due to double negation
+    Foo(bool param);  // NOLINT(-google*)
   };
----------------
carlosgalvezp wrote:
> salman-javed-nz wrote:
> > salman-javed-nz wrote:
> > > carlosgalvezp wrote:
> > > > salman-javed-nz wrote:
> > > > > Would anyone do this on purpose, or is this a user error?
> > > > I don't see any use case here, no. I just thought to document it to clarify the double negation that's happening, in case a user gets confused and doesn't see the warning being suppressed.
> > > > 
> > > > Do you think it brings value or does more harm than good?
> > > 
> > > I don't see any use case here, no. I just thought to document it to clarify the double negation that's happening, in case a user gets confused and doesn't see the warning being suppressed.
> > > 
> > > Do you think it brings value or does more harm than good?
> > 
> > I'd be happy with just the basic `+` glob functionality. The first thing I'd use it on is to suppress checks that are an alias of another check, e.g. `NOLINT(*no-malloc)` to cover both `hicpp-no-malloc` and `cppcoreguidelines-no-malloc`.
> > 
> > As for glob expressions that begin with a `-`... you get the functionality for free thanks to the `GlobList` class but it's not a feature I think I will need. I speak only for myself though. Maybe someone else here has a strong need for this?
> > 
> > Is `NOLINTBEGIN(-check)` equivalent to `NOLINTEND(check)`? It hursts my head even thinking about it.
> > 
> > Your unit test where you test `NOLINT(google*,-google*,google*)`, Clang-Tidy does the right thing in that situation, but I have to wonder why any user would want to add, remove, then add the same check group in the one expression in the first place? Should we even be entertaining this kind of use?
> > Is NOLINTBEGIN(-check) equivalent to NOLINTEND(check)?
> 
> To me it's clear that it's not. Any NOLINTBEGIN(X) must have a matching NOLINTEND(X), regardless of whether X is a glob or not.
> 
> Totally agree that negative globs are probably not needed, the main use case is being able to suppress all aliases as you say.
> 
> I just thought it was neat to be able to reuse code in that way, the code is very clean and easy to read. Plus users are familiar with the syntax. If users want to abuse it that would be at their own maintenance expense - clang-tidy would just do what it was asked to do, without crashing or anything bad happening.
> 
> The same thing can be done when running the tool - you can run "--checks=google*,-google*" and I don't think clang-tidy has code to warn the user/prevent this from happening. Considering all these edge cases would perhaps complicate the design of the tool for no real use case? I.e. too defensive programming.
> 
> I added the unit test mostly to make sure clang-tidy doesn't crash or does something strange, it just does what the user instructed it to do. I thought such edge cases are encouraged in unit tests to increase coverage.
> 
> Would it be reasonable to keep negative globs in the implementation (to simplify maintenance, reuse code), but not document it in the public docs? Otherwise I'll need to refactor GlobList to be able to reuse the part of the code that consumes the positive globs. I don't think it makes sense to create a parallel implementation of that part (kind of what exists already now, before my patch).
> > Is NOLINTBEGIN(-check) equivalent to NOLINTEND(check)?
> 
> To me it's clear that it's not. Any NOLINTBEGIN(X) must have a matching NOLINTEND(X), regardless of whether X is a glob or not.
> 
> Totally agree that negative globs are probably not needed, the main use case is being able to suppress all aliases as you say.
> 
> I just thought it was neat to be able to reuse code in that way, the code is very clean and easy to read. Plus users are familiar with the syntax. If users want to abuse it that would be at their own maintenance expense - clang-tidy would just do what it was asked to do, without crashing or anything bad happening.
> 
> The same thing can be done when running the tool - you can run "--checks=google*,-google*" and I don't think clang-tidy has code to warn the user/prevent this from happening. Considering all these edge cases would perhaps complicate the design of the tool for no real use case? I.e. too defensive programming.
> 
> I added the unit test mostly to make sure clang-tidy doesn't crash or does something strange, it just does what the user instructed it to do. I thought such edge cases are encouraged in unit tests to increase coverage.
> 

I see all your points about reusing the glob functionality. As long as this functionality does not let a user shoot themselves in the foot, I don't see a problem with it. 
A large proportion of the review of my `NOLINTBEGIN/NOLINTEND` patch was spent going over all the ways a user could misuse the functionality. The concern was mainly that we didn't want a user to unwittingly disable checks for the entire file because of a stray `NOLINTBEGIN` without an `NOLINTEND` marker.
The worst I can see a confused user do with this new glob feature is use `NOLINT(-check*)` which as you have already explained is a double negative and shows the warnings anyway, so no harm done.

I find it funny that master allows `NOLINT(` with no closing bracket go though as a `NOLINT` when it's clearly a sign of user error. Clang-Tidy could be more proactive in alerting the user to these mistakes, but I get there's a balance between code complexity and harm reduction. This quirk is not directly related to your patch so that's a story to continue on another day.

> Would it be reasonable to keep negative globs in the implementation (to simplify maintenance, reuse code), but not document it in the public docs? Otherwise I'll need to refactor GlobList to be able to reuse the part of the code that consumes the positive globs. I don't think it makes sense to create a parallel implementation of that part (kind of what exists already now, before my patch).

I'm only here to provide some insight about code I recently wrote. Let's wait for the actual reviewers who have a better sense of the direction of this project to say how things should be.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D111208/new/

https://reviews.llvm.org/D111208



More information about the cfe-commits mailing list