[PATCH] D130416: [Clang] Add support for Unicode identifiers (UAX31) in C2x mode.

Tom Honermann via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Thu Jul 28 12:11:32 PDT 2022


tahonermann added a comment.

Apologies for the much delayed review (I know this already landed).

I added a comment, but I don't know if there is actually an issue to be fixed. This looks good to me otherwise.



================
Comment at: clang/lib/Lex/Lexer.cpp:1490-1492
     static const llvm::sys::UnicodeCharSet XIDStartChars(XIDStartRanges);
     // '_' doesn't have the XID_Start property but is allowed in C++.
     return C == '_' || XIDStartChars.contains(C);
----------------
Perhaps this should permit `$` when `LangOpts.DollarIdents` is true. However, my attempts to produce a test case that illustrates such a need failed. I don't know why; I guess `isAllowedInitiallyIDChar()` isn't called for all identifiers? Or there is special handling of `LangOpts.DollarIdents` somewhere? I wonder if that might imply an issue elsewhere; such as a missing call to `isAllowedInitiallyIDChar()` somewhere?

I did verify that '$' (U+003F) is not in `XID_Start`: https://util.unicode.org/UnicodeJsps/list-unicodeset.jsp?a=[:XID_Start=Yes:]


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130416/new/

https://reviews.llvm.org/D130416



More information about the cfe-commits mailing list