[llvm-branch-commits] [libcxx] [libc++][format] Improves escaping performance. (PR #88533)
Louis Dionne via llvm-branch-commits
llvm-branch-commits at lists.llvm.org
Tue Apr 23 10:21:06 PDT 2024
================
@@ -305,23 +316,28 @@ def generate_data_tables() -> str:
data = compactPropertyRanges(sorted(properties, key=lambda x: x.lower))
- # The last entry is large. In Unicode 14 it contains the entries
- # 3134B..0FFFF 912564 elements
- # This are 446 entries of 1325 entries in the table.
- # Based on the nature of these entries it is expected they remain for the
- # forseeable future. Therefore we only store the lower bound of this section.
- #
- # When this region becomes substantially smaller we need to investigate
- # this design.
- #
- # Due to P2713R1 Escaping improvements in std::format the range
+ # The output table has two large entries at the end, with a small "gap"
# E0100..E01EF ; Grapheme_Extend # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
- # is no longer part of these entries. This causes an increase in the size
- # of the table.
- assert data[-1].upper == 0x10FFFF
- # assert data[-1].upper - data[-1].lower > 900000
-
- return "\n".join([generate_cpp_data(data[:-1], data[-1].lower)])
+ # Based on Unicode 15.1.0:
+ # - Encoding all these entries in the table requires 1173 entries.
+ # - Manually handling these last two blocks reduces the size to 729 entries.
+ # This not only reduces the binary size, but also improves the performance
+ # by having less elements to search.
+ # The exact entrires may differ between Unicode versions. When these numbers
----------------
ldionne wrote:
```suggestion
# The exact entries may differ between Unicode versions. When these numbers
```
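
For context on the comment block being edited above, here is a minimal sketch of the idea it describes: keep the few very large trailing ranges out of the searchable table and test them with plain comparisons, so the binary search runs over fewer entries. This is not the libc++ generator or its emitted C++ code. The names `make_lookup` and `contains` are hypothetical, and the example ranges are made-up values, apart from the E0100..E01EF gap, which is taken from the hunk.

```python
from bisect import bisect_right
from typing import Callable, List, Tuple

Range = Tuple[int, int]  # inclusive (lower, upper) code point range


def make_lookup(ranges: List[Range], tail_count: int = 2) -> Callable[[int], bool]:
    """Build a membership test from sorted, non-overlapping ranges.

    The last `tail_count` ranges are assumed to be the large ones and are
    checked by direct comparison instead of being stored in the table.
    """
    table, tail = ranges[:-tail_count], ranges[-tail_count:]
    lowers = [lower for lower, _ in table]  # search keys for the small table

    def contains(code_point: int) -> bool:
        # Large trailing blocks: a couple of comparisons, no table lookup.
        for lower, upper in tail:
            if lower <= code_point <= upper:
                return True
        # Everything else: binary search over the reduced table.
        index = bisect_right(lowers, code_point) - 1
        return index >= 0 and code_point <= table[index][1]

    return contains


# Illustrative data only (hypothetical values, not the generated table).
# The two trailing ranges leave the gap E0100..E01EF between them.
ranges = [
    (0x00, 0x1F),
    (0x7F, 0x9F),
    (0x40000, 0xE00FF),
    (0xE01F0, 0x10FFFF),
]
contains = make_lookup(ranges)
```

With this split, `contains(0xE0100)` falls into the gap and is reported as not in the set, while code points in either trailing block never reach the binary search.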
https://github.com/llvm/llvm-project/pull/88533