[cfe-commits] [PATCH] Support for universal character names in identifiers

Richard Smith richard at metafoo.co.uk
Fri Jan 18 17:55:13 PST 2013



================
Comment at: lib/Lex/Lexer.cpp:1598
@@ -1597,3 +1693,3 @@
   char PrevCh = 0;
-  while (isNumberBody(C)) { // FIXME: UCNs in ud-suffix.
     CurPtr = ConsumeChar(CurPtr, Size, Result);
----------------
This FIXME still needs to be addressed, right?

================
Comment at: lib/Lex/Lexer.cpp:2744-2745
@@ +2743,4 @@
+    Result->setFlag(Token::HasUCN);
+    while (StartPtr != CurPtr)
+      getAndAdvanceChar(StartPtr, *Result);
+  } else {
----------------
You can skip this loop in the common case where CurPtr - StartPtr == NumHexDigits + 2.
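
A minimal sketch of that check, assuming StartPtr points at the backslash and that the loop only exists to pick up flags for trigraphs or escaped newlines inside the UCN (both are my assumptions, not something stated in the patch). The helper name ucnSpelledPlainly is hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper, not part of the patch.
// Returns true when the bytes in [StartPtr, CurPtr) are exactly a backslash,
// a 'u' or 'U', and NumHexDigits hex digits, so the span cannot contain a
// trigraph or an escaped newline and does not need to be re-lexed.
bool ucnSpelledPlainly(const char *StartPtr, const char *CurPtr,
                       unsigned NumHexDigits) {
  return static_cast<std::size_t>(CurPtr - StartPtr) == NumHexDigits + 2u;
}
```

With something like this, the re-lexing loop would only run when the helper returns false.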

================
Comment at: lib/Lex/Lexer.cpp:2752
@@ +2751,3 @@
+  //   short identifier is less than 00A0 other than 0024 ($), 0040 (@), or
+  //   0060 (‘), nor one in the range D800 through DFFF inclusive.)
+  if (CodePoint < 0xA0) {
----------------
This ' should be a `, right?

It'd be nice to also reference C++'s equivalent rule: "Additionally, if the hexadecimal value for a universal-character-name outside the c-char-sequence, s-char-sequence, or r-char-sequence of a character or string literal corresponds to a control character (in either of the ranges 0x00–0x1F or 0x7F–0x9F, both inclusive) or to a character in the basic source character set, the program is ill-formed."
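
For what it's worth, here is a rough sketch of the two rules side by side. The function names are hypothetical and none of this is from the patch. I have also folded C++'s separate surrogate restriction (0xD800 through 0xDFFF) into the C++ predicate. As far as I can tell the two predicates accept the same code points outside literals, since the only characters below 0xA0 that are neither control characters nor in the basic source character set are $, @, and `.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: the C99 wording quoted in the patch comment.
bool allowedByC99Rule(uint32_t CP) {
  if (CP >= 0xD800 && CP <= 0xDFFF)
    return false;
  if (CP < 0xA0)
    return CP == 0x24 || CP == 0x40 || CP == 0x60; // $, @, `
  return true;
}

// Hypothetical helper: membership in the C++ basic source character set
// (space, the four whitespace control characters, and 91 graphical characters).
bool isBasicSourceChar(uint32_t CP) {
  if (CP == 0 || CP > 0x7E)
    return false; // Also keeps strchr from matching the terminating NUL.
  static const char Basic[] =
      " \t\v\f\n"
      "abcdefghijklmnopqrstuvwxyz"
      "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
      "0123456789"
      "_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";
  return std::strchr(Basic, static_cast<char>(CP)) != nullptr;
}

// Hypothetical helper: the C++ wording for UCNs outside character and string
// literals, plus the surrogate restriction.
bool allowedByCXXRule(uint32_t CP) {
  if (CP >= 0xD800 && CP <= 0xDFFF)
    return false;
  bool IsControl = CP <= 0x1F || (CP >= 0x7F && CP <= 0x9F);
  return !IsControl && !isBasicSourceChar(CP);
}
```

If that equivalence holds, a single check can serve both languages for identifiers, and the comment only needs to cite both standards.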

================
Comment at: lib/Lex/Lexer.cpp:2818-2824
@@ +2817,9 @@
+    // whitespace.
+    if (!isLexingRawMode()) {
+      CharSourceRange CharRange =
+        CharSourceRange::getCharRange(getSourceLocation(),
+                                      getSourceLocation(CurPtr));
+      Diag(BufferPtr, diag::err_non_ascii)
+        << FixItHint::CreateRemoval(CharRange);
+    }
+
----------------
Do we diagnose such characters within #if 0 blocks?


http://llvm-reviews.chandlerc.com/D312


