[PATCH] D37079: [Preprocessor] Correct internal token parsing of newline characters in CRLF

Reid Kleckner via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Aug 24 10:47:04 PDT 2017


rnk added inline comments.


================
Comment at: lib/Lex/Lexer.cpp:3076-3077
   case '\r':
+    if (CurPtr[0] != Char && (CurPtr[0] == '\n' || CurPtr[0] == '\r'))
+      Char = getAndAdvanceChar(CurPtr, Result);
     // If we are inside a preprocessor directive and we see the end of line,
----------------
erichkeane wrote:
> rnk wrote:
> > Should we only do this in the `\r` case? If I understand correctly, we're basically saying, if this is a CR, and the next byte is an LF, advance one more and do the pre-processor stuff.
> That is exactly what we're doing.
> 
> I debated that personally, and am a bit on the fence.  It seems a number of places like to treat a '\r\n' and a '\n\r' as the same thing, though it seems  a little foolish to me.  If you fall toward that opinion, I'll definitely change it, just say the word :)
The bug probably doesn't happen in the \n\r case, since we count '\n's to compute our line numbers, don't we?

Anyway, yeah, I think we should make this specific to '\r'. In that case, we peek one ahead, and if we see a simple '\n' byte, we advance one more so that our line numbers stay correct.
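
To make the suggestion concrete, here is a minimal standalone sketch of the "peek one ahead after a CR" idea. It is not the Lexer.cpp code and does not use any Clang API; the function name `countLineEndings` and its signature are made up for illustration, and it only models how line terminators would be counted under the assumption that "\r\n" is one newline while "\n\r" is two.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of Clang: counts line terminators in Text.
// A "\r\n" pair counts once; a lone '\r' or a lone '\n' also counts once.
// "\n\r" is deliberately NOT merged, so it counts as two terminators.
std::size_t countLineEndings(const std::string &Text) {
  std::size_t Lines = 0;
  for (std::size_t I = 0; I < Text.size(); ++I) {
    const char C = Text[I];
    if (C == '\r') {
      // Peek one byte ahead; if it is '\n', consume it as part of this newline.
      if (I + 1 < Text.size() && Text[I + 1] == '\n')
        ++I;
      ++Lines;
    } else if (C == '\n') {
      ++Lines;
    }
  }
  return Lines;
}
```

With this shape the extra advance happens only on the '\r' path, so a CRLF file and an LF file with the same content yield the same line count.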


https://reviews.llvm.org/D37079




