[PATCH] D109210: [clang-tidy] Attach fixit to warning, not note, in add_new_check.py example

Fri Sep 3 15:30:24 PDT 2021

mattbeardsley added a comment.

In D109210#2981912 <https://reviews.llvm.org/D109210#2981912>, @whisperity wrote:

> Uuuh.. I get the sentiment, but this change breaks the very essence of the joke that the default generated code wants to pass on. It also does not showcase that we can emit notes, not just warnings.
>
> How about `function %0 is insufficiently awesome, mark it as 'awesome_'!` in the warning text? (Note how the fix and the note's text diverged already.)
>
> And so we should come up with a joke for the Note tag.
>
> BTW, the test wasn't checking for the note's message at all?

I'm certainly not trying to be a buzzkill on the joke :)
Yes, I did notice that the printed diagnostic text changed from its original, but I'll add some background investigation that I should have written in the original review text, sorry about laying out quite an incomplete thought process. 
At least, I'm hoping I can show there's some extent of due diligence behind-the-scenes, and that I'm not just trying to change this willy-nilly!

I read/grepped/etc through several (but certainly not all) existing checks to try to identify what the "canonical" diagnostic-and-fix pattern looks like. 
I was thinking that a decent goal would be that, if someone takes the files generated by add_new_check.py as their starting point and runs with it, they'll end up with check behavior that's reasonably consistent with a "typical" existing check.

Here are my observations w.r.t "note" diagnostic usage and diagnostic message wording:

1. < ~15% of existing checks use "note" diagnostics

  $ cd clang-tools-extra/clang-tidy

  # 40 checks use notes
  $ find . -name '*Check.cpp' | xargs grep -nl 'DiagnosticIDs::Note' | wc -l
  40

  # out of 265 calling diag directly
  # (there are 287 total *Check.cpp - most of the remaining 22 are android/Cloexec*.cpp which call "CloexecCheck::replaceFunc", which also doesn't use DiagnosticIDs::Note)
  $ find . -name '*Check.cpp' | xargs grep -nl 'diag(' | wc -l
  265

2. Out of several messages that I skimmed through arbitrarily (subject to potentially being a misrepresentative sample, but hopefully not), the general pattern seems to be to state the problem, but not the fix in the handwritten diagnostic message text. Here are some examples from the readability module:
  - `readability-redundant-string-init` warns "redundant string initialization", but doesn't state "remove the initialization"
  - `readability-redundant-string-cstr` warns "redundant call to %0", but doesn't state "remove the call"
  - `readability-named-parameter` warns "all parameters should be named in a function", but doesn't state "insert the commented parameter name"
  - `readability-implicit-bool-conversion` warns "implicit conversion bool -> %0", but doesn't state the various insertions and removals it's going to do
  - `readability-braces-around-statements` warns "statement should be inside braces", but doesn't state "insert statements"

The typical pattern appeared to be to state the issue, then let the FixItHint display the suggested conversion, without writing text describing that conversion explicitly.
So I aimed to adjust the add_new_check.py script so that it would produce an example that's consistent with those observed best practices from the live code, so that a new check based on add_new_check.py would look reasonably similar to surrounding checks. 
My interpretation of that was to state the issue ("insufficiently awesome"), but not describe the insertion - leaving that to the FixItHint::CreateInsertion to emit as it sees fit.

Then, since a relatively small percentage of checks use Note diagnostics, my take on it was that including the Note example would lead a newcomer to unnecessarily start adding Note diagnostics. 
This is anecdotal evidence obviously, but after a couple years spent (on & off) writing about 40 clang-tidy checks in my organization, I only just learned yesterday from a git bisect that I'd been incorrectly using note diagnostics! ("why did all my tests start failing after pulling from upstream!"). 
And I'd been using Note diagnostics that way all along because my impression from the add_new_check.py example was that streaming the FixItHints into Note diagnostics was the right thing to do for the "standard" fix in a clang-tidy check.

Last - given relatively few (~15%) checks actually use Note diagnostics at all, it didn't seem like Note diagnostics had "earned" their place in the generated example code, and could be left as a slightly more "advanced maneuver" instead
Slightly more (42, still about 15%) Check.cpp files (using similar find/grep as above methodology) use "Lexer::getSourceText," but that's not part of the add_new_check.py example. 
So I guess I'm left wondering - what drives Note diagnostics to be a crucial piece to keep as a demo in add_new_check.py, as opposed to showcasing a different maneuver? Is there something else that would be more important to demonstrate than Note diagnostics, without taking away from the helpfulness of the basicexample?

Anyway - hopefully that shows a bit more thorough of a thought process than my original brief message might have made it seem. Sorry for the mountain of text though, just wanted to explain what I was thinking.
I thought this might help overall though because then we could work on identifying which key steps/conclusions that led to the diff that there might be issues with - rather than just sending the diff and leaving you to have to wonder what in the world I'm thinking!

- Does keeping the add_new_check.py script's generated example consistent with "typical" clang-tidy checks in the surrounding codebase seem like a suitable goal?
- Do my observations of the diagnostic reporting patterns (namely: message wording style & note usage frequency) in the codebase seem consistent with yours?
- Does the train of thought & conclusions seem reasonable/rational to you overall? (and of course, what differences in choices/conclusions you may have! I hear you on stating "mark it as 'awesome_', but just wanted to clarify why I'd left text like that out based on what I saw when reading through how a variety of existing checks word their messages, and get your take on it with that extra context - or if I've looked at a misrepresentative set of check implementations for their diagnostic statement patterns!)

Sorry about all the words I've just asked you read, but much appreciated if you did power through it all :). Just wanted to show some semblance of due diligence in trying to have add_new_check.py produce as useful a starter-example as possible for anyone else who's trying to get started in this wonderful codebase!

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109210/new/

https://reviews.llvm.org/D109210