[PATCH] D133807: Update Unicode to 15.0

Corentin Jabot via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Tue Sep 13 13:21:07 PDT 2022


cor3ntin created this revision.
Herald added subscribers: hiraditya, dschuff.
Herald added a project: All.
cor3ntin requested review of this revision.
Herald added projects: clang, LLVM.
Herald added subscribers: llvm-commits, cfe-commits.

Unicode 15.0 adds 4,489 characters, for a total of 149,186 characters.
These additions include 2 new scripts along with 20 new emoji characters,
and 4,193 CJK ideographs.

This change modifies most of the existing tables, including:

- XID_Start/XID_Continue in Clang
- The character name database (used by \N{} in Clang)
- The list of formattable/printable codepoints
- The case folding algorithm (which we had not updated since Unicode 9)
- The list of nonspacing/enclosing marks used by the column width computation algorithm. The rest of the column width algorithm is not updated.
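
For readers unfamiliar with how such tables are typically consulted, here is a minimal sketch, assuming a property table is stored as a sorted array of non-overlapping, inclusive code point ranges and queried with a binary search. The names CodepointRange, ExampleRanges and containsCodepoint are hypothetical and are not taken from the patch, and the two ranges shown are only ASCII letters, not the actual Unicode 15.0 data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical range type for illustration; the tables touched by the
// patch have their own layout.
struct CodepointRange {
  std::uint32_t Lower; // first code point in the range (inclusive)
  std::uint32_t Upper; // last code point in the range (inclusive)
};

// Illustrative subset only (ASCII letters). The array must be sorted by
// Lower and the ranges must not overlap for the search below to work.
static const CodepointRange ExampleRanges[] = {
    {0x0041, 0x005A}, // A-Z
    {0x0061, 0x007A}, // a-z
};

// Binary search over the sorted ranges; returns true if CP falls inside
// any of them.
bool containsCodepoint(const CodepointRange *Ranges, std::size_t Count,
                       std::uint32_t CP) {
  std::size_t Lo = 0, Hi = Count;
  while (Lo < Hi) {
    std::size_t Mid = Lo + (Hi - Lo) / 2;
    if (CP < Ranges[Mid].Lower)
      Hi = Mid; // CP lies before this range
    else if (CP > Ranges[Mid].Upper)
      Lo = Mid + 1; // CP lies after this range
    else
      return true; // Lower <= CP <= Upper
  }
  return false;
}

// Small self-check of the sketch.
void checkExamples() {
  assert(containsCodepoint(ExampleRanges, 2, 0x0041));  // 'A'
  assert(containsCodepoint(ExampleRanges, 2, 0x007A));  // 'z'
  assert(!containsCodepoint(ExampleRanges, 2, 0x0030)); // '0'
}
```

Under this assumption, moving to a new Unicode version mostly means regenerating the range data; the lookup code itself stays the same.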


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D133807

Files:
  clang/lib/Lex/UnicodeCharSets.h
  llvm/lib/Support/Unicode.cpp
  llvm/lib/Support/UnicodeCaseFold.cpp
  llvm/lib/Support/UnicodeNameToCodepoint.cpp
  llvm/lib/Support/UnicodeNameToCodepointGenerated.cpp
  llvm/utils/UnicodeData/UnicodeNameMappingGenerator.cpp


