[PATCH] D123064: [Clang][C++23] P2071 Named universal character escapes
Aaron Ballman via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Mon Jun 13 11:42:48 PDT 2022
aaron.ballman added inline comments.
================
Comment at: clang/lib/Lex/Lexer.cpp:3139
Diag(SlashLoc, diag::warn_ucn_not_valid_in_c89);
- return 0;
+ return {};
}
----------------
aaron.ballman wrote:
> FWIW, I think using `llvm::None` instead of `{}` is more clear to readers (I doubt we're consistent with this in the code base though).
>
> (Same comment applies other places that we're making an empty `Optional`.)
I think this comment was missed.
================
Comment at: clang/lib/Lex/LiteralSupport.cpp:509
+ Distance = Match.Distance;
+ if (Distance - Match.Distance > 3)
+ break;
----------------
Can the subtraction of two `unsigned` values cause a wrapping issue here? (If not, can we add an assertion?)
================
Comment at: clang/lib/Lex/LiteralSupport.cpp:537
+ const char *UcnBegin = ThisTokBuf;
+ assert(UcnBegin[1] == 'N');
+ ThisTokBuf += 2;
----------------
Should we assert `UcnBegin[0] == '\\'` as well?
================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:31-44
+struct node {
+ bool is_root = false;
+ char32_t value = 0xFFFFFFFF;
+ uint32_t children_offset = 0;
+ bool has_sibling = false;
+ uint32_t size = 0;
+ StringRef name = {};
----------------
aaron.ballman wrote:
> You should rename things to match the usual coding conventions.
It looks like this comment was missed (it applies to this whole file).
================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:49
+ std::string s;
+ s.reserve(64);
+ auto n = this;
----------------
aaron.ballman wrote:
> Any particular reason for 64?
Still wondering why 64 bytes specifically.
================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:257
+constexpr const char32_t SBase = 0xAC00;
+// constexpr const char32_t LBase = 0x1100;
+// constexpr const char32_t VBase = 0x1161;
----------------
aaron.ballman wrote:
> Plan to remove the commented out code?
Unanswered question here.
================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:17-18
+// List of generated names
+// Should be kept in sync with Unicode
+// "Name Derivation Rule Prefix String".
+static bool generated(char32_t c) {
----------------
aaron.ballman wrote:
> Do we have something more direct to point users towards?
Unanswered question here. May be a good place for a link like Tom had mentioned.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D123064/new/
https://reviews.llvm.org/D123064
More information about the llvm-commits
mailing list