[libcxx-commits] [PATCH] D103413: [libc++][format] Implement Unicode support.

Victor Zverovich via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Sun Jul 11 09:17:45 PDT 2021


vitaut added inline comments.


================
Comment at: libcxx/include/__format/parser_std_format_spec.h:687-688
+ * - The simple scanner @ref __estimate_column_width_fast. This scanner assumes
+ *   1 character is 1 column. This scanner stops when it can't be sure the
+ *   assumption is valid:
+ *   - UTF-8 when the code point is encoded in more than 1 character.
----------------
Mordante wrote:
> vitaut wrote:
> > I think more robust error handling would be to count each invalid code unit as contributing 1 to the width. Terminals normally display replacement characters for invalid code units, so this will give the correct result.
> Not entirely sure what you mean. This algorithm is fast but simple. The engine uses this algorithm first; when it "fails", the engine falls back to a more advanced algorithm, which implements the estimation as defined in the Standard.
> Basically, this algorithm's main purpose is to handle ASCII; when the input is not ASCII, the Standard's estimation is used.
I misinterpreted how the two scanners interact. Makes sense now.
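
For readers following the thread, here is a minimal sketch of how a fast scanner and a fallback scanner could interact. It is not the code from D103413: the names fast_ascii_prefix, is_wide, slow_width, and estimate_width are hypothetical, the double-width table is a small illustrative subset rather than the Standard's full list, the decoder does not reject overlong encodings or surrogates, and grapheme clusters are ignored. The fallback also counts each invalid code unit as width 1, as suggested earlier in the thread, which is an assumption of this sketch and not necessarily what the patch does.

```cpp
#include <cstddef>
#include <string_view>

// Fast path (hypothetical name): one column per ASCII byte. Returns the
// number of bytes consumed before the first non-ASCII byte.
std::size_t fast_ascii_prefix(std::string_view s) {
    std::size_t i = 0;
    while (i < s.size() && static_cast<unsigned char>(s[i]) < 0x80)
        ++i;
    return i;
}

// Illustrative subset of double-width ranges; NOT the Standard's full table.
bool is_wide(char32_t cp) {
    return (cp >= 0x1100 && cp <= 0x115F) || (cp >= 0x3040 && cp <= 0xA4CF) ||
           (cp >= 0xAC00 && cp <= 0xD7A3) || (cp >= 0x1F300 && cp <= 0x1F64F);
}

// Slow path (hypothetical name): decode UTF-8 one code point at a time.
// Assumption of this sketch: every invalid code unit contributes width 1.
std::size_t slow_width(std::string_view s) {
    std::size_t width = 0;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char b = static_cast<unsigned char>(s[i]);
        // Sequence length from the lead byte; 0 marks an invalid lead byte.
        const std::size_t len = b < 0x80          ? 1
                                : (b >> 5) == 0x06 ? 2
                                : (b >> 4) == 0x0E ? 3
                                : (b >> 3) == 0x1E ? 4
                                                   : 0;
        bool valid = len != 0 && i + len <= s.size();
        char32_t cp = len == 1 ? b : len == 2 ? (b & 0x1F) : len == 3 ? (b & 0x0F) : (b & 0x07);
        for (std::size_t k = 1; valid && k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            valid = (c & 0xC0) == 0x80; // must be a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (!valid) { // count the single bad code unit and resynchronize
            ++width;
            ++i;
            continue;
        }
        width += is_wide(cp) ? 2 : 1;
        i += len;
    }
    return width;
}

// Try the fast scanner first; hand the remainder to the slow scanner.
std::size_t estimate_width(std::string_view s) {
    const std::size_t n = fast_ascii_prefix(s);
    return n + slow_width(s.substr(n));
}
```

With this split, pure ASCII input never reaches the decoder, and the error-handling question only arises in the fallback path.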


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103413/new/

https://reviews.llvm.org/D103413


