[all-commits] [llvm/llvm-project] 359b96: [libc++][format] Improves escaping.

Mark de Wever via All-commits all-commits at lists.llvm.org
Wed Apr 24 11:37:21 PDT 2024


  Branch: refs/heads/users/mordante/improves_format_escaping
  Home:   https://github.com/llvm/llvm-project
  Commit: 359b961231510e31a4fdfc9d96abd48539f49e5b
      https://github.com/llvm/llvm-project/commit/359b961231510e31a4fdfc9d96abd48539f49e5b
  Author: Mark de Wever <koraq at xs4all.nl>
  Date:   2024-04-24 (Wed, 24 Apr 2024)

  Changed paths:
    M libcxx/docs/ReleaseNotes/19.rst
    M libcxx/docs/Status/Cxx23Papers.csv
    M libcxx/docs/Status/Cxx2cIssues.csv
    M libcxx/docs/Status/FormatIssues.csv
    M libcxx/include/__format/escaped_output_table.h
    M libcxx/include/__format/write_escaped.h
    M libcxx/test/std/utilities/format/format.functions/escaped_output.unicode.pass.cpp
    M libcxx/utils/generate_escaped_output_table.py

  Log Message:
  -----------
  [libc++][format] Improves escaping.

This change considerably increases the size of the lookup table. The table
has an "upper boundary" check. Removing the code points with the
property Grapheme_Extend=Yes removes the range E0100..E01EF, which splits
the large contiguous trailing section into two parts. This will be improved
in a follow-up patch.
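
As a rough illustration of a range table with an "upper boundary" check, here
is a minimal sketch. It is not the libc++ implementation: the names
EscapeRange, kEscapeRanges, kUpperBoundary and needs_escape are hypothetical,
the table is shortened to a handful of entries, and the boundary value is
chosen only for the example.

```cpp
#include <algorithm>
#include <array>
#include <iterator>

// Inclusive range of code points that must be escaped.
struct EscapeRange {
    char32_t first;
    char32_t last;
};

// Hypothetical, heavily shortened table, sorted by `first`.
// A real generated table contains many more entries.
constexpr std::array<EscapeRange, 4> kEscapeRanges{{
    {0x0000, 0x001F}, // C0 control characters
    {0x007F, 0x009F}, // DEL and C1 control characters
    {0x2028, 0x2029}, // line and paragraph separators
    {0xE000, 0xF8FF}, // private use area
}};

// Illustrative boundary (an assumption for this sketch): planes 15 and 16
// contain only private-use and noncharacter code points, so everything from
// here upwards is escaped without consulting the table.
constexpr char32_t kUpperBoundary = 0xF0000;

inline bool needs_escape(char32_t cp) {
    if (cp >= kUpperBoundary)
        return true; // the "upper boundary" check: no search required

    // Find the first range that starts after cp; the candidate range is the
    // one just before it.
    auto it = std::upper_bound(
        kEscapeRanges.begin(), kEscapeRanges.end(), cp,
        [](char32_t value, const EscapeRange& range) { return value < range.first; });
    if (it == kEscapeRanges.begin())
        return false;
    return cp <= std::prev(it)->last;
}
```

With a layout like this, a range that no longer needs escaping in the middle
of the trailing section forces either a lower-value table entry plus a higher
boundary, or a less effective boundary check, which is the kind of split the
message describes.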

Implements:
- P2713R1 Escaping improvements in std::format
- LWG3965 Incorrect example in [format.string.escaped] p3 for formatting of combining characters
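
The combining-character rule behind these two items can be sketched roughly
as follows, on the understanding that a Grapheme_Extend=Yes code point is
escaped only when it is not immediately preceded by a character that was
written unescaped. This is a simplified model, not the libc++ code:
escape_sketch, append_utf8 and the two predicates are hypothetical helpers,
and the predicates cover only a few code points.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-ins for the Unicode property lookups. They cover only
// the combining diacritical marks and the C0/C1 control characters.
inline bool is_grapheme_extend(char32_t cp) { return cp >= 0x0300 && cp <= 0x036F; }
inline bool is_separator_or_other(char32_t cp) {
    return cp < 0x20 || (cp >= 0x7F && cp <= 0x9F);
}

// Appends cp to out encoded as UTF-8 (assumes cp is a valid scalar value).
inline void append_utf8(std::string& out, char32_t cp) {
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// Returns the quoted, escaped form of `in` as UTF-8.
inline std::string escape_sketch(const std::u32string& in) {
    std::string out = "\"";
    bool prev_unescaped = false; // was the previous character written as-is?
    for (char32_t cp : in) {
        bool escaped = true;
        switch (cp) {
        case U'\t': out += "\\t"; break;
        case U'\n': out += "\\n"; break;
        case U'\r': out += "\\r"; break;
        case U'"':  out += "\\\""; break;
        case U'\\': out += "\\\\"; break;
        default:
            if (is_separator_or_other(cp) ||
                (is_grapheme_extend(cp) && !prev_unescaped)) {
                char buffer[16];
                std::snprintf(buffer, sizeof buffer, "\\u{%x}",
                              static_cast<unsigned>(cp));
                out += buffer;
            } else {
                append_utf8(out, cp);
                escaped = false;
            }
        }
        prev_unescaped = !escaped;
    }
    out += '"';
    return out;
}
```

In this model a lone U+0301 is written as \u{301}, a U+0301 that follows an
escaped backslash is also written as \u{301}, and the combining marks in
"e" U+0301 U+0323 are kept as-is because each follows an unescaped character.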

Before
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_ascii_escaped<char>            95696 ns        95459 ns         7341
BM_unicode_escaped<char>          89311 ns        89088 ns         7835
BM_cyrillic_escaped<char>         58633 ns        58494 ns        11964
BM_japanese_escaped<char>         44500 ns        44382 ns        15780
BM_emoji_escaped<char>            99156 ns        98911 ns         7075
BM_ascii_escaped<wchar_t>         92245 ns        92017 ns         7592
BM_unicode_escaped<wchar_t>       80970 ns        80747 ns         8651
BM_cyrillic_escaped<wchar_t>      51253 ns        51112 ns        13729
BM_japanese_escaped<wchar_t>      37252 ns        37156 ns        18758
BM_emoji_escaped<wchar_t>         96226 ns        95961 ns         7270

After
-----------------------------------------------------------------------
Benchmark                             Time             CPU   Iterations
-----------------------------------------------------------------------
BM_ascii_escaped<char>           110704 ns       110696 ns         6206
BM_unicode_escaped<char>         101371 ns       101374 ns         6862
BM_cyrillic_escaped<char>         63329 ns        63327 ns        11013
BM_japanese_escaped<char>         41223 ns        41225 ns        16938
BM_emoji_escaped<char>           111022 ns       111021 ns         6304
BM_ascii_escaped<wchar_t>        112441 ns       112443 ns         6231
BM_unicode_escaped<wchar_t>      102776 ns       102779 ns         6813
BM_cyrillic_escaped<wchar_t>      58977 ns        58975 ns        11868
BM_japanese_escaped<wchar_t>      36885 ns        36886 ns        18975
BM_emoji_escaped<wchar_t>        115885 ns       115881 ns         6051


