[PATCH] D123064: [Clang][C++23] P2071 Named universal character escapes

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Jun 13 12:42:04 PDT 2022


aaron.ballman added inline comments.


================
Comment at: llvm/lib/Support/UnicodeNameToCodepoint.cpp:49
+    std::string s;
+    s.reserve(64);
+    auto n = this;
----------------
cor3ntin wrote:
> aaron.ballman wrote:
> > aaron.ballman wrote:
> > > Any particular reason for 64?
> > Still wondering why 64 bytes specifically.
> It's semi arbitrary (aka a nice power of two that fits in a cacheline) - but it's large enough that it fits the 99% of names (the 99th percentile is actually around 46 byte)
Okay, 64 is fine by me then, but if you wanted to bump down to 46 I also wouldn't be sad, it's your call.


================
Comment at: llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp:17-18
+// List of generated names
+// Should be kept in sync with Unicode
+// "Name Derivation Rule Prefix String".
+static bool generated(char32_t c) {
----------------
cor3ntin wrote:
> aaron.ballman wrote:
> > aaron.ballman wrote:
> > > Do we have something more direct to point users towards?
> > Unanswered question here. May be a good place for a link like Tom had mentioned.
> This comment should no longer apply as I got rid of the `generated` method - instead only relying on info we find when parsing the file (see line 44).
> It doesn't mean that we don't need more reference to the UnicodeData.txt url though
Ah, okay -- basically, I'm hoping to have a comment somewhere near the top of this file with a link to the Unicode data.

Btw, I noticed this file is missing its license header, you should add that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123064/new/

https://reviews.llvm.org/D123064



More information about the cfe-commits mailing list