[cfe-commits] [PATCH] Support for universal character names in identifiers

Jordan Rose jordan_rose at apple.com
Mon Jan 21 10:46:39 PST 2013



================
Comment at: lib/Lex/Lexer.cpp:1598
@@ -1597,3 +1693,3 @@
   char PrevCh = 0;
-  while (isNumberBody(C)) { // FIXME: UCNs in ud-suffix.
     CurPtr = ConsumeChar(CurPtr, Size, Result);
----------------
Richard Smith wrote:
> This FIXME still needs to be addressed, right?
I'm not sure. Eli had this taken out in his initial patch, and certainly we now give the proper warning for using a UCN sans underscore in a ud-suffix. But I don't know if it actually works end-to-end yet. I'll double-check.

================
Comment at: lib/Lex/Lexer.cpp:2752
@@ +2751,3 @@
+  //   short identifier is less than 00A0 other than 0024 ($), 0040 (@), or
+  //   0060 (‘), nor one in the range D800 through DFFF inclusive.)
+  if (CodePoint < 0xA0) {
----------------
Richard Smith wrote:
> This ' should be a `, right?
> 
> It'd be nice to also reference C++'s equivalent "Additionally, if the hexadecimal value for a universal-character-name outside the c-char-sequence, s-char-sequence, or r-char-sequence of a character or string literal corresponds to a control character (in either of the ranges 0x00–0x1F or 0x7F–0x9F, both inclusive) or to a character in the basic source character set, the program is ill-formed."
Grr...darn OS X being "helpful". Good catch.

I'll add the C++ comment.
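
For reference, here is a minimal standalone sketch of the two rules as quoted above, written side by side. It is not the code from the patch: the function names (isAllowedByC99Rule, isInBasicSourceCharset, isAllowedByCXXRule, checkExamples) are hypothetical, and the C++ function implements only the single sentence Richard quoted, so it does not handle anything else the standard says about UCNs. If I have the basic source character set right, the two rules accept exactly the same code points below 0xA0 ($, @, and `), which is why one `CodePoint < 0xA0` branch can plausibly serve both languages.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: the C99 wording quoted in the patch comment.
// A UCN may not name a character below 00A0 other than 0024 ($),
// 0040 (@), or 0060 (`), nor one in the range D800 through DFFF.
bool isAllowedByC99Rule(uint32_t CodePoint) {
  if (CodePoint < 0xA0)
    return CodePoint == 0x24 || CodePoint == 0x40 || CodePoint == 0x60;
  return CodePoint < 0xD800 || CodePoint > 0xDFFF;
}

// Hypothetical helper: membership in the C++ basic source character set
// (space, the four whitespace control characters, and the graphic
// characters listed below).
bool isInBasicSourceCharset(uint32_t CodePoint) {
  static const char Graphic[] =
      "abcdefghijklmnopqrstuvwxyz"
      "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
      "0123456789"
      "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";
  if (CodePoint == ' ' || CodePoint == '\t' || CodePoint == '\v' ||
      CodePoint == '\f' || CodePoint == '\n')
    return true;
  // Guard against 0 so strchr does not match the string terminator.
  return CodePoint != 0 && CodePoint < 0x80 &&
         std::strchr(Graphic, static_cast<char>(CodePoint)) != nullptr;
}

// Hypothetical helper: only the quoted C++ sentence, for a UCN that
// appears outside a character or string literal.
bool isAllowedByCXXRule(uint32_t CodePoint) {
  bool IsControl =
      CodePoint <= 0x1F || (CodePoint >= 0x7F && CodePoint <= 0x9F);
  return !IsControl && !isInBasicSourceCharset(CodePoint);
}

// A few spot checks of the sketch above.
void checkExamples() {
  assert(isAllowedByC99Rule(0x60));   // \u0060 (`) is explicitly permitted
  assert(!isAllowedByC99Rule(0x41));  // \u0041 ('A') is below 00A0
  assert(!isAllowedByCXXRule(0x41));  // 'A' is in the basic source set
  assert(isAllowedByCXXRule(0xE9));   // U+00E9 passes the quoted sentence
}
```

Whether a permitted code point is then valid in an identifier is a separate question; this only covers the "may this UCN be written at all" check.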

================
Comment at: lib/Lex/Lexer.cpp:2818-2824
@@ +2817,9 @@
+    // whitespace.
+    if (!isLexingRawMode()) {
+      CharSourceRange CharRange =
+        CharSourceRange::getCharRange(getSourceLocation(),
+                                      getSourceLocation(CurPtr));
+      Diag(BufferPtr, diag::err_non_ascii)
+        << FixItHint::CreateRemoval(CharRange);
+    }
+
----------------
Richard Smith wrote:
> Do we diagnose such characters within #if 0 blocks?
Hm, we should, but this code does not. I was hitting a reentrancy problem before, though, where emitting the diagnostic required re-lexing. Is there a better way to distinguish "actual parsing" from "lexing for diagnostics" that doesn't include "skipping over #if 0 blocks"?


http://llvm-reviews.chandlerc.com/D312


