[libcxx-commits] [PATCH] D144346: [libc++][format] Improves Unicode decoders.
Tom Honermann via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Tue Feb 21 15:28:52 PST 2023
tahonermann added inline comments.
================
Comment at: libcxx/include/__format/unicode.h:139-147
+ // U+0000..U+007F     00..7F
+ // U+0080..U+07FF     *C2*..DF   80..BF                           U+0000..U0+007F   1 code unit range*
+ // U+0800..U+0FFF     E0         *A0*..BF   80..BF                U+0000..U+07FFF   1 and 2 code unit range
+ // U+1000..U+CFFF     E1..EC     80..BF     80..BF
+ // U+D000..U+D7FF     ED         80..*9F*   80..BF                U+D800..D+DFFFF   surrogate range
+ // U+E000..U+FFFF     EE..EF     80..BF     80..BF
+ // U+10000..U+3FFFF   F0         *90*..BF   80..BF     80..BF     U+0000..U+FFFF    1, 2, and 3 code unit range
----------------
I corrected several of the `U+XXXX` identifiers in the suggested edit. I also aligned the remarks with the rows that I think they better correspond to.
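As a cross-check on the rows above, the constraint on the second code unit can be sketched in a few lines. This is only a sketch under stated assumptions, not the code in the patch: `is_valid_second_unit` is a hypothetical name, and the `F1..F3` and `F4` rows are taken from Table 3-7 of the Unicode Standard because they fall outside the quoted lines.

```cpp
#include <cstdint>

// Hypothetical helper, not part of libc++: reports whether `second` is an
// allowed second code unit after the UTF-8 lead code unit `lead`.
// Needs C++14 or later for the relaxed constexpr rules.
constexpr bool is_valid_second_unit(std::uint8_t lead, std::uint8_t second) {
  // C0, C1 and F5..FF never start a well-formed sequence; 00..7F and
  // 80..BF are not multi-unit lead code units at all.
  if (lead < 0xC2 || lead > 0xF4)
    return false;

  // Default: the full continuation range 80..BF.
  std::uint8_t lo = 0x80;
  std::uint8_t hi = 0xBF;

  if (lead == 0xE0)
    lo = 0xA0; // below A0 would re-encode U+0000..U+07FF
  else if (lead == 0xED)
    hi = 0x9F; // above 9F would encode the surrogates U+D800..U+DFFF
  else if (lead == 0xF0)
    lo = 0x90; // below 90 would re-encode U+0000..U+FFFF
  else if (lead == 0xF4)
    hi = 0x8F; // above 8F would exceed U+10FFFF

  return second >= lo && second <= hi;
}

static_assert(is_valid_second_unit(0xE0, 0xA0), "first valid E0 sequence");
static_assert(!is_valid_second_unit(0xED, 0xA0), "surrogate range rejected");
```

Only four lead code units narrow the range of the second code unit; every other valid lead accepts the full `80..BF`.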
================
Comment at: libcxx/include/__format/unicode.h:150
+ // *Marked* entries are not the full range 80..BF.
+ // *) This entry is not marked in the Unicode standard, but this entry is also not the full range.
+ //
----------------
I don't understand this footnote. The full range of code points that are encodable in a single code unit is U+0000..U+007F.
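One possible reading, offered as an assumption rather than as the patch author's stated intent, is that "full range" refers to the lead code units `C0..DF` and not to code points. The arithmetic below shows why `C0` and `C1` would be excluded under that reading. The name `decode_two_units` is hypothetical, and the function deliberately performs no validation.

```cpp
#include <cstdint>

// Hypothetical helper, not part of libc++: extracts the payload bits of a
// two code unit UTF-8 sequence (110xxxxx 10yyyyyy) without any validation.
constexpr char32_t decode_two_units(std::uint8_t lead, std::uint8_t second) {
  return static_cast<char32_t>(((lead & 0x1F) << 6) | (second & 0x3F));
}

// The largest value reachable with lead C1 is still a 1 code unit code point.
static_assert(decode_two_units(0xC1, 0xBF) == 0x7F, "overlong encoding");
// C2 is the first lead that reaches the 2 code unit range.
static_assert(decode_two_units(0xC2, 0x80) == 0x80, "first 2 code unit value");
```

Under this reading, sequences starting with `C0` or `C1` only reproduce U+0000..U+007F, so the valid two code unit leads are `C2..DF`.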
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D144346/new/
https://reviews.llvm.org/D144346