[PATCH] D125622: [clang-tidy] Reject invalid enum initializers in C files

Richard via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri May 27 17:52:42 PDT 2022


LegalizeAdulthood marked 5 inline comments as done.
LegalizeAdulthood added inline comments.


================
Comment at: clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp:324
 {
-  IntegralLiteralExpressionMatcher Matcher(MacroTokens);
-  return Matcher.match();
+  IntegralLiteralExpressionMatcher Matcher(MacroTokens, LangOpts.C99 == 0);
+  bool Matched = Matcher.match();
----------------
aaron.ballman wrote:
> Huh? Comma wasn't allowed in C89 either... In fact, there's no language mode which allows top-level commas in either C or C++ -- the issue with the comma operator is a parsing one. You can't tell the difference between the comma being part of the initializer expression or the comma being the separator between enumerators.
The most recent diff handles the issue with `operator,` but the previous version was only handling the overly-big literal


================
Comment at: clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp:326
+  bool Matched = Matcher.match();
+  if (LangOpts.C99 &&
+      (Matcher.largestLiteralSize() != LiteralSize::Int &&
----------------
aaron.ballman wrote:
> And C89?
`C99` was the earliest dialect of C I could find in `LangOptions.def`.  If you can show me an earlier version for C, I'm happy to switch it.


================
Comment at: clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.c:3-7
+// C requires enum values to fit into an int.
+#define TOO_BIG1 1L
+#define TOO_BIG2 1UL
+#define TOO_BIG3 1LL
+#define TOO_BIG4 1ULL
----------------
aaron.ballman wrote:
> How do we want to handle the fact that Clang has an extension to allow this? Perhaps we want a config option for pedantic vs non-pedantic fixes?
I was trying to find a way to make it fail on gcc/clang on compiler explorer and I couldn't get it to fail in either compiler....

Where is the extension documented?  It appears to be on by default in both compilers.


================
Comment at: clang-tools-extra/test/clang-tidy/checkers/modernize-macro-to-enum.c:10
+// C forbids comma operator in initializing expressions.
+#define BAD_OP 1, 2
+
----------------
aaron.ballman wrote:
> LegalizeAdulthood wrote:
> > aaron.ballman wrote:
> > > Can you add a test:
> > > ```
> > > #define GOOD_OP (1, 2)
> > > ```
> > > to make sure it still gets converted to an enum?
> > Yeah, another one I was thinking of is when someone does [[ https://godbolt.org/z/c677je8s1 | something as disgusting as this. ]]
> > 
> > ```
> > #define ARGS 1, 2
> > #define OTHER_ARGS (1, 2)
> > 
> > int f(int x, int y) { return x + y; }
> > 
> > int main()
> > {
> >     return f(ARGS) + f OTHER_ARGS;
> > }
> > ```
> > 
> > However, the only real way to handle avoiding conversion of the 2nd case is to examine the context of macro expansion.  This is another edge case that will have to be handled subsequently.
> > 
> > This gets tricky because AFAIK there is no way to select expressions in the AST that result from macro expansion.  You have to match the macro expansion locations against AST nodes to identify the node(s) that match the expansion location yourself.
> That's a good test case (for some definition of good)! :-)
> 
> At this point, there's a few options:
> 
> 1) Go back to the way things were -- disallow comma expressions, even in parens. Document it as a limitation.
> 2) Accept comma expressions in parens, ignore the problem case like you described above. Document it as a limitation?
> 3) Try to solve every awful code construct we can ever imagine. Document the check is perfect in every way, but probably not land it for several years.
> 
> I think I'd be happy with #2 but I could also live with #1
In the most recent diff, I'm supporting `operator,` inside parens for C++ and never allowing it in C.

I plan to do a subsequent enhancement that examines the expansion locations of macros and rejects any macro whose expansion doesn't encompass a single expression.  In other words, if the tokens in the macro somehow get incorporated into some syntactic element that isn't completely encased in a single expression, then reject the macro.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125622/new/

https://reviews.llvm.org/D125622



More information about the cfe-commits mailing list