[PATCH] D124500: [clang-tidy] Support expressions of literals in modernize-macro-to-enum
Richard via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Fri Apr 29 12:40:19 PDT 2022
LegalizeAdulthood marked 3 inline comments as done.
LegalizeAdulthood added inline comments.
================
Comment at: clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:99
+
+ if (!Current->isLiteral() || isStringLiteral(Current->getKind()) ||
+ !isIntegralConstant(*Current)) {
----------------
aaron.ballman wrote:
> LegalizeAdulthood wrote:
> > aaron.ballman wrote:
> > > I know this is code moved from elsewhere, but I suppose we never considered the odd edge case where a user does something like `"foo"[0]` as a really awful integer constant. :-D
> > It's always possible to write crazy contorted code and have a check not recognize it. I don't think it's worthwhile to spend time trying to handle torture cases unless we can find data that shows it's prevalent in real world code.
> >
> > If I was doing a code review and saw this:
> > ```
> > enum {
> > FOO = "foo"[0]
> > };
> > ```
> > I'd flag that in a code review as bogus, whereas if I saw:
> > ```
> > enum {
> > FOO = 'f'
> > };
> > ```
> > That would be acceptable, which is why character literals are accepted as an integral literal initializer for an enum in this check.
> > I don't think it's worthwhile to spend time trying to handle torture cases unless we can find data that shows it's prevalent in real world code.
>
> I think I'm okay agreeing to that in this particular case, but this is more to point out that writing your own parser is a maintenance burden. Users will hit cases we've both forgotten about here, they'll file a bug, then someone has to deal with it. It's very hard to justify to users "we think you write silly code" because they often have creative ways in which their code is not actually so silly, especially when we support "most" valid expressions.
Writing your own parser is unavoidable here because we can't just assume that any old thing will be a valid initializer just by looking at the set of tokens present in the macro body. (There is a separate discussion going on about improving the preprocessor support and parsing things more deeply, but that isn't even to the point of a prototype yet.) The worst thing we can do is create "fixits" that produce invalid code.
The worst that happens if your expression isn't recognized is that your macro isn't matched as a candidate for an enum. You can always make it an enum manually and join it with adjacent macros that were recognized and converted.
As it stands, the check only recognizes a single literal with an optional unary operator.
This change expands the check to recognize a broad range of expressions, allowing those macros to be converted to enums. I opened the issue because running modernize-macro-to-enum on the [[ https://github.com/InsightSoftwareConsortium/ITK | ITK codebase ]] showed some simple expressions involving literals that weren't recognized and converted.
If an expression isn't recognized and an issue is opened, it will be an enhancement request to support a broader range of expression, not a bug that this check created invalid code.
IMO, the more useful thing that's missing from the grammar is recognizing `sizeof` expressions rather than indexing string literals with an integral literal subscript.
I had planned on doing another increment to recognize `sizeof` expressions.
================
Comment at: clang-tools-extra/clang-tidy/modernize/IntegralLiteralExpressionMatcher.cpp:204
+ return true;
+}
+
----------------
aaron.ballman wrote:
> LegalizeAdulthood wrote:
> > aaron.ballman wrote:
> > > Comma operator?
> > Remember that the use case here is identifying expressions that are initializers for an enum. If you were doing a code review and you saw this:
> > ```
> > enum {
> > FOO = (2, 3)
> > };
> > ```
> > Would you be OK with that? I wouldn't. Clang even warns about it: https://godbolt.org/z/Y641cb8Y9
> >
> > Therefore I deliberately left comma operator out of the grammar.
> This is another case where I think you're predicting that users won't be using the full expressivity of the language and we'll get bug reports later. Again, in insolation, I tend to agree that I wouldn't be happy seeing that code. However, users write some very creative code and there's no technical reason why we can't or shouldn't handle comma operators.
"Don't let the perfect be the enemy of the good."
My inclination is to simply explicitly state that comma operator is not recognized in the documentation. It's already implicit by it's absence from the list of recognized operators.
Again, the worst that happens is that your macro isn't converted.
I'm open to being convinced that it's important, but you haven't convinced me yet `:)`
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D124500/new/
https://reviews.llvm.org/D124500
More information about the cfe-commits
mailing list