[PATCH] D117522: [clang-tidy] Add modernize-macro-to-enum check

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Fri Mar 25 06:12:16 PDT 2022


aaron.ballman accepted this revision.
aaron.ballman added a comment.
This revision is now accepted and ready to land.

Thanks for the discussion on this new check, it LGTM!



================
Comment at: clang-tools-extra/clang-tidy/modernize/MacroToEnumCheck.cpp:47
+    CRLF,
+    CRLFCR,
+  };
----------------
LegalizeAdulthood wrote:
> aaron.ballman wrote:
> > LegalizeAdulthood wrote:
> > > aaron.ballman wrote:
> > > > LegalizeAdulthood wrote:
> > > > > aaron.ballman wrote:
> > > > > > LegalizeAdulthood wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > I'm a bit confused by this one as this is not a valid line ending (it's either three valid line endings or two valid line endings, depending on how you look at it). Can you explain why this is needed?
> > > > > > > It's a state machine, where the states are named for what we've seen so far and we're looking for //two// consecutive line endings, not just one.  Does it make sense now?
> > > > > > Thanks, I understood it was a state machine, but it's a confused one to me. `\r` was the line ending on Mac Classic, I've not seen it used outside of that platform (and I've not seen anyone write code for that platform in a long time). So, to me, the only valid combinations of line endings to worry about are: `LF LF`; `CRLF CRLF`; `CRLF LF`; `LF CRLF`.
> > > > > > 
> > > > > > `LF LF` returns false (Nothing -> LF -> return false)
> > > > > > `CRLF CRLF` returns false (Nothing -> CR -> CRLF -> CRLFCR -> return false)
> > > > > > `CRLF LF` returns true (Nothing -> CR -> CRLF -> LF -> finish loop)
> > > > > > `LF CRLF` returns true (Nothing -> LF -> CR -> CRLF -> finish loop)
> > > > > > 
> > > > > > (If you also intend to support Mac Classic line endings for some reason, this gets even more complicated.)
> > > > > I was trying to follow "be liberal in what you accept as input and conservative in what you generate as output" maxim.  I can remove the `CR` as a line ending case if you think it's too obscure.
> > > > If Clang supports it as a line ending, we probably should too, but... how do we handle CRLF vs "I mixed a CR with an LF by accident" kind of inputs? (Maybe we just treat that as CRLF and if the behavior is bad, the user shouldn't mix their line endings that way; I think that's defensible.) That seems to be similar to the scenario that's confusing me above where the user mixed an LF and CRLF by accident.
> > > Well, as far as Clang is concerned it's all just "whitespace" that gets eaten up by the preprocessor.  Actually, that gives me a thought.  A preprocessing directive is considered to end at the physical line ending, so I should look to see what sort of characters it considers to "end the line".
> > > 
> > > For the accidental mix-up, I'm not going to worry about that here.  Your input files are assumed to be "well formed".  The worst that happens in this check is that two blocks of macros that //look// like they are separated by a blank line are considered as a single clump by this check.
> > > 
> > > In other words, the worst that can happen is:
> > >   - Two clumps of macros are considered together.
> > >   - One clump of macros that is discarded because it doesn't follow the constraints "taints" an adjacent clump of macros that do follow the constraints.
> > > 
> > > Either way, nothing harmful happens to your code.  It will still compile and be syntactically and semantically equivalent to what was there before.
> > > 
> > > Actually, that gives me a thought. A preprocessing directive is considered to end at the physical line ending, so I should look to see what sort of characters it considers to "end the line".
> > 
> > All of `\r`, `\n`, `\r\n` I believe (you can double-check in `Lexer::LexTokenInternal()`
> > 
> > > Either way, nothing harmful happens to your code. It will still compile and be syntactically and semantically equivalent to what was there before.
> > 
> > Oh, that's a very good point, thank you. I think that's reasonable fallback behavior for these weird edge cases.
> Well..... maybe.
> 
> If you look at `Lexer::ReadToEndOfLine` which is used to skip to the end of a preprocessor directive you'll see that it considers the first of `'\r'`, `'\n'` or `'\0'` (end of file) as the end of the "line".  This is around line 2835 of Lexer.cpp in my tree.
Yeah, I saw that as well (that's typically used for error recovery in the preprocessor). We also have `Lexer::SkipWhitespace()` which skips all vertical whitespace, but not `\0`.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D117522/new/

https://reviews.llvm.org/D117522



More information about the cfe-commits mailing list