[PATCH] D123064: [Clang][C++23] P2071 Named universal character escapes
Aaron Ballman via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Jun 15 05:39:53 PDT 2022
aaron.ballman added a comment.
Some likely final coding style nits. The only thing of substance I'm waiting on is an answer to whether we need to update a license file in order to comply with the Unicode license requirements. @tstellar, any chance you can help there?
================
Comment at: clang/lib/Lex/Lexer.cpp:3237-3239
+ if (C != '{') {
+ if (Diagnose)
+ Diag(StartPtr, diag::warn_ucn_escape_incomplete);
----------------
cor3ntin wrote:
> aaron.ballman wrote:
> > Is this a case where we might want a fixit because the user did `\N` when they meant to do `\n`?
> >
> > (Should we look for `\N(` and fixit that to `\N{`?)
> I think if we wanted to diagnose \n we should also diagnose \U, which we don't do, Maybe a follow up patch, what do you think?
> I can't imagine trying to be smart about `\N(` would be exercised by many users
Follow-up (if at all) is fine by me!
================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:41
+ constexpr bool isValid() const {
+ return Name.size() != 0 || Value == 0xFFFFFFFF;
+ }
----------------
================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:45
+
+ std::string FullName() const {
+ std::string s;
----------------
================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:120
+
+static bool StartsWith(StringRef Name, StringRef Needle, bool Strict,
+ std::size_t &Consummed, char &PreviousCharInName,
----------------
================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:131
+ }
+ if (Needle.size() == 0)
+ return true;
----------------
================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:183
+ BufferType &Buffer, const Node *Parent = nullptr) {
+ auto N = readNode(Offset, Parent);
+ std::size_t Consummed = 0;
----------------
================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:222
+// clang-format off
+constexpr const char *const hangul_syllables[][3] = {
+ { "G", "A", "" },
----------------
================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:303
+ Name = Name.substr(findSyllable(Name, Strict, NameStart, T, 2));
+ if (L != -1 && V != -1 && T != -1 && Name.size() == 0) {
+ if (!Strict) {
----------------
================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:1-3
+//===- utils/UnicodeData/UnicodeNameMappingGenerator.cpp - Unicode name data
+// generator -===//
+//-*- C++ -*-===//
----------------
Something looks amiss here -- no idea how we usually handle > 80 col file names, but maybe removing the `utils/UnicodeData/` prefix will suffice?
================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:31
+
+static const llvm::StringRef LETTERS =
+ " _-ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
----------------
================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:56
+
+ auto Name =
+ Line.substr(FirstSemiPos + 1, SecondSemiPos - FirstSemiPos - 1);
----------------
================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:59
+
+ if (Name.size() && Name[0] == '<') {
+ // Ignore ranges of characters, as their name is either absent or
----------------
================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:67
+ if (IsAliasFile) {
+ auto Kind = Line.substr(SecondSemiPos + 1);
+ if (Kind != "control" && Kind != "correction" && Kind != "alternate")
----------------
================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:206
+
+ assert(N->Name.size() != 0);
+ Offsets[N] = Offset;
----------------
================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:349
+
+ auto Out = fopen(argv[3], "w");
+ if (!Out) {
----------------
================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:360-361
+ for (const std::pair<const char32_t, std::string> &Entry : Entries) {
+ const auto &Codepoint = Entry.first;
+ const auto &Name = Entry.second;
+ // Ignore names which are not valid.
----------------
================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:379-380
+ std::pair<std::string, std::vector<uint8_t>> Data = T.serialize();
+ const auto &Dict = Data.first;
+ const auto &Tree = Data.second;
+
----------------
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D123064/new/
https://reviews.llvm.org/D123064
More information about the cfe-commits
mailing list