[PATCH] Don't warn about Unicode characters in -E mode

Richard Smith richard at metafoo.co.uk
Tue Jan 29 15:06:25 PST 2013



================
Comment at: lib/Lex/Lexer.cpp:2836
@@ -2836,1 +2835,3 @@
+  if (!isASCII(*BufferPtr) && !isAllowedIDChar(C) &&
+      (isLexingRawMode() || !PP->isPreprocessedOutput())) {
     // Non-ASCII characters tend to creep into source code unintentionally.
----------------
Jordan Rose wrote:
> Richard Smith wrote:
> > Do you need the isLexingRawMode() check here? I guess it doesn't matter if we drop the non-ASCII characters in that case?
> It's about the right behavior when raw-lexing without a preprocessor. I'm not sure what "the right behavior" is, though.
I think if we're in -E mode we should keep all characters, and leave removing bad characters to whatever downstream process handles the file next. I think it also makes sense to preserve them if we're in raw mode, so perhaps the check should be:

    if (... &&
        !isLexingRawMode() && !PP->isPreprocessedOutput())
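To make the difference between the two conditions concrete, here is a minimal standalone sketch. It models only the two boolean inputs discussed above; `LexState`, `guardInPatch`, and `guardAsSuggested` are hypothetical names for illustration, not part of the Lexer, and the leading `!isASCII(...) && !isAllowedIDChar(...)` terms are left out because they are the same in both versions.

```cpp
// Hypothetical stand-in for the two pieces of lexer state in the check.
struct LexState {
  bool RawMode;            // models isLexingRawMode()
  bool PreprocessedOutput; // models PP->isPreprocessedOutput(), i.e. -E
};

// Trailing part of the condition as written in the patch: the guarded
// branch is taken in raw mode, or when not producing -E output.
bool guardInPatch(const LexState &S) {
  return S.RawMode || !S.PreprocessedOutput;
}

// Trailing part of the condition as suggested in this comment: the guarded
// branch is taken only when neither raw mode nor -E output is in effect.
bool guardAsSuggested(const LexState &S) {
  return !S.RawMode && !S.PreprocessedOutput;
}
```

Under this model the two versions agree whenever the lexer is not in raw mode and differ only in raw mode, where the patch takes the guarded branch and the suggested form skips it. Both forms test the raw-mode flag first, so when raw-lexing without a preprocessor the short-circuit means `PP` is never dereferenced.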


http://llvm-reviews.chandlerc.com/D346


