[PATCH] D133807: Update Unicode to 15.0

Aaron Ballman via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Sep 14 06:09:04 PDT 2022


aaron.ballman added a comment.

The changes LGTM, but I'd like to wait for @tahonermann to weigh in with the final acceptance.



================
Comment at: llvm/lib/Support/UnicodeCaseFold.cpp:713
+  // 8 characters
+  if (C <= 0xa7c2)
     return C | 1;
----------------
cor3ntin wrote:
> shafik wrote:
> > Maybe I am misunderstanding the comments but should this be `0xa7be`?
> Quirk of the script,  the comment for C|1 never make sense, the values seems correct though (this script is the only think i have not written myself)
> https://github.com/llvm/llvm-project/blob/main/llvm/utils/unicode-case-fold.py#L89
> So you have 8 even codepoints mapping to  C|1 + 7 odd codepoint mapping to C|1 which is C. If my math is correct.
> I'm a bit reluctant to modify that script
Heh, I thought it should have been `0xa7bc` based on the changed comment above, but after talking to Corentin off-list, it sounds like any time we see `return C | 1;`, the comment above it is specifying the wrong number of characters in the range. So the issue is that the comment says `8 characters` when it should say `14 characters`.

We could correct the comment manually, but the next time we run the script we'll get the incorrect comment again. So for right now, I think this code is actually correct. At some point, we should fix that script to output the correct comment though, as it's hard to review the generated changes when the comments are misleading.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133807/new/

https://reviews.llvm.org/D133807



More information about the cfe-commits mailing list