[clang] Reland "[Utils] add update-verify-tests.py" (#108630)" (PR #108658)

Wed Sep 18 05:10:24 PDT 2024

AaronBallman wrote:

> But for clang -verify you have to add checkers for exactly the emitted diagnostics. No more or less. So the only real question is on which line to place them.

As someone who does *a lot* of review in Clang, I (strongly) disagree with this assertion.

You do not have to add checks for exactly the emitted diagnostics, there are times when it's more appropriate to drop some part of the message. In fact, we used to have guidance that only the very first instance of the diagnostic should be spelled out fully, and all subsequent ones should be limited to just the bare minimum needed to identify what the diagnostic is. That's why there are hundreds of instances of things like `// expected-note {{here}}`. What's more, there is a rich set of functionality provided by the diagnostic verifier that we often request patch authors use, but is stylistic. It's important to use regular expressions and bookmarks to help make the tests readable and reviewable, which is why they're used thousands of times across Clang's lit tests. Finally, as @Endilll pointed out, there are tests which have multiple (sometime numerous) RUN lines where the output is expected to differ between RUN lines; we typically ask authors to use custom verify prefixes for those tests, but crafting those can sometimes be an art form (and other times is very straightforward).

Writing tests is not a mechanical operation, it requires you to think about what you need to test as well as think about *how* you test it. I think the same is less true for things like testing LLVM IR because that tends to be more routine pattern matching, which typically requires less stylistic choices.

> That said my main goal is not to make creating tests easier (although it of course also does that, and I don't know that I agree that it's harmful to lower the threshold to create tests), but to make it easier to batch update many/large breaking tests. 

This is a serious problem we have, so working on ways to improve that is greatly appreciated! While there's some pretty significant push-back about your current approach, perhaps there is an alternative which makes folks feel more comfortable. For example, the script could explicitly refuse to add new diagnostic comments, only remove or update existing comments. That would make it require more developer effort when adding a new diagnostic and wanting to update tests, but *that's a feature not a bug* because we want the developer to think critically about whether the diagnostic actually makes sense everywhere it fires before they put up the PR. It doesn't solve all problems (like style choices), but it at least helps in the easier cases without the risk of people abusing it accidentally.

> This existing means I can no longer trust that a test is a reflection of what the author intended to write, as they might have just run this and "YOLO'ed" it away.

FWIW, I share this concern -- developers could do this anyway by just writing the comments by hand, but a script *encourages* the developer to trust the output the compiler gives because it's easy to assume problems will be caught by review. This slows down review throughput, increases the burden on reviewers, and given how often people update existing diagnostics, seems like a poor tradeoff (then again, maybe more people will update existing diagnostics if that wasn't so painful to do).

Given the amount of push-back, I think we should probably revert the changes here so we can discuss with a bit wider of an audience than weighed in on the original PR.

https://github.com/llvm/llvm-project/pull/108658