[PATCH] D104975: Implement P1949

Arthur O'Dwyer via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Sat Jun 26 06:40:26 PDT 2021


Quuxplusone added inline comments.


================
Comment at: clang/lib/Lex/Lexer.cpp:1462
+          static const llvm::sys::UnicodeCharSet XIDStartChars(XIDStartRanges);
+          return C == '_' || XIDStartChars.contains(C);
+  } else if (LangOpts.C11) {
----------------
This is overly indented (or your editor snuck in some hard tabs).
Why is `'_'` treated specially, here and on line 1444 above? Shouldn't it just be included in the `XIDStartChars` set?


================
Comment at: clang/lib/Lex/Lexer.cpp:1503-1511
-  // Check C++98 compatibility.
-  if (!Diags.isIgnored(diag::warn_cxx98_compat_unicode_id, Range.getBegin())) {
-    static const llvm::sys::UnicodeCharSet CXX03AllowedIDChars(
-        CXX03AllowedIDCharRanges);
-    if (!CXX03AllowedIDChars.contains(C)) {
-      Diags.Report(Range.getBegin(), diag::warn_cxx98_compat_unicode_id)
-        << Range;
----------------
Why remove this diagnostic? This looks unintentional.

But if it is intentional, then besides justifying it in the commit message (and splitting out this removal into a separate PR), you should remove the enumerator from `DiagnosticLexKinds.td`.


================
Comment at: clang/lib/Lex/UnicodeCharSets.h:229
+static const llvm::sys::UnicodeCharRange XIDContinueRanges[] = {
+    {0x0030, 0x0039},   {0x005F, 0x005F},   {0x00B7, 0x00B7},
+    {0x0300, 0x036F},   {0x0387, 0x0387},   {0x0483, 0x0487},
----------------
Ah, here we go. `0x005F` is underscore. It should be in the XIDStart table instead.


================
Comment at: clang/test/Lexer/unicode.c:31
+extern int 🌹; // expected-error {{character not allowed in identifier}} expected-warning {{declaration does not declare anything}}
+int v=[=](auto){return~x;}();;  // expected-error 12{{character not allowed in identifier}} expected-error {{expected ';'}}
+// expected-error at -1 {{expected unqualified-id}}
----------------
Why 12 errors? Are these `{` `(` etc. characters not the ASCII versions? If so, this test is needlessly confusing, and should be rewritten with one character per test case, as in lines 43-53 below. (Except line 46, which is also bad. :))

I don't understand why you're disabling the existing tests. In particular, line 43 tests the error message for GREEK QUESTION MARK, which is important to preserve.


================
Comment at: clang/test/Preprocessor/ucn-pp-identifier.c:31
 
-#define a\u0024
+#define a\u00c0
 
----------------
This test shouldn't need tweaking, should it?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104975/new/

https://reviews.llvm.org/D104975



More information about the cfe-commits mailing list