[llvm-branch-commits] [libcxx] [libc++][format][3/7] Improves std::format performance. (PR #101817)

Mark de Wever via llvm-branch-commits llvm-branch-commits at lists.llvm.org
Sat Aug 3 05:06:54 PDT 2024


https://github.com/mordante updated https://github.com/llvm/llvm-project/pull/101817

From a3acb85e3fd8dd9bb4320bf028d9773d018f27b4 Mon Sep 17 00:00:00 2001
From: Mark de Wever <koraq at xs4all.nl>
Date: Sat, 30 Mar 2024 17:35:56 +0100
Subject: [PATCH] [libc++][format][3/7] Improves std::format performance.

This changes the __output_buffer to a new structure. Since the other
formatting functions (std::format_to, std::format_to_n, and
std::formatted_size) still use the old code paths, the class is in a
transitional state. At the end of the series the class should be in its
final state.

write_double_comparison:

Before
----------------------------------------------------------------------------------------
Benchmark                                              Time             CPU   Iterations
----------------------------------------------------------------------------------------
BM_sprintf                                           197 ns          196 ns      3550000
BM_to_string                                         218 ns          218 ns      3214000
BM_to_chars                                         42.4 ns         42.3 ns     16575000
BM_to_chars_as_string                               48.2 ns         48.1 ns     14542000
BM_format                                            175 ns          175 ns      4000000
BM_format_to_back_inserter<std::string>              175 ns          175 ns      3995000
BM_format_to_back_inserter<std::vector<char>>        207 ns          206 ns      3393000
BM_format_to_back_inserter<std::list<char>>          752 ns          750 ns       931000
BM_format_to_iterator/<std::array>                   161 ns          161 ns      4345000
BM_format_to_iterator/<std::string>                  161 ns          161 ns      4344000
BM_format_to_iterator/<std::vector>                  162 ns          161 ns      4344000

After
----------------------------------------------------------------------------------------
Benchmark                                              Time             CPU   Iterations
----------------------------------------------------------------------------------------
BM_sprintf                                           197 ns          197 ns      3550000
BM_to_string                                         219 ns          219 ns      3199000
BM_to_chars                                         42.4 ns         42.4 ns     16554000
BM_to_chars_as_string                               48.1 ns         48.1 ns     14569000
BM_format                                            167 ns          167 ns      4203000
BM_format_to_back_inserter<std::string>              179 ns          179 ns      3920000
BM_format_to_back_inserter<std::vector<char>>        214 ns          214 ns      3274000
BM_format_to_back_inserter<std::list<char>>          751 ns          751 ns       930000
BM_format_to_iterator/<std::array>                   164 ns          164 ns      4258000
BM_format_to_iterator/<std::string>                  165 ns          164 ns      4247000
BM_format_to_iterator/<std::vector>                  165 ns          165 ns      4248000

Comparison
Benchmark                                                       Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------------------------------
BM_sprintf                                                   +0.0013         +0.0028           197           197           196           197
BM_to_string                                                 +0.0023         +0.0038           218           219           218           219
BM_to_chars                                                  +0.0014         +0.0030            42            42            42            42
BM_to_chars_as_string                                        -0.0025         -0.0010            48            48            48            48
BM_format                                                    -0.0476         -0.0462           175           167           175           167
BM_format_to_back_inserter<std::string>                      +0.0190         +0.0205           175           179           175           179
BM_format_to_back_inserter<std::vector<char>>                +0.0348         +0.0363           207           214           206           214
BM_format_to_back_inserter<std::list<char>>                  -0.0013         +0.0005           752           751           750           751
BM_format_to_iterator/<std::array>                           +0.0188         +0.0203           161           164           161           164
BM_format_to_iterator/<std::string>                          +0.0207         +0.0226           161           165           161           164
BM_format_to_iterator/<std::vector>                          +0.0197         +0.0212           162           165           161           165
OVERALL_GEOMEAN                                              +0.0058         +0.0074             0             0             0             0
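
The Time and CPU columns of the comparison are relative differences between the old and new runs. As a minimal sketch, assuming the usual (new - old) / |old| convention, the helper below (the name `relative_change` is hypothetical) recomputes one entry from the printed timings. For BM_format, 175 ns and 167 ns give roughly -0.0457; the table prints -0.0476, presumably because the tool works from the unrounded measurements.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: relative difference between two timings.
// Negative values mean the new run is faster than the old one.
double relative_change(double old_time, double new_time) {
  assert(old_time != 0.0);  // the ratio is undefined for a zero baseline
  return (new_time - old_time) / std::fabs(old_time);
}
```

For example, `relative_change(175.0, 167.0)` corresponds to the BM_format row and `relative_change(175.0, 179.0)` to the BM_format_to_back_inserter<std::string> row.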

write_int_comparison:

Before
----------------------------------------------------------------------------------------
Benchmark                                              Time             CPU   Iterations
----------------------------------------------------------------------------------------
BM_sprintf                                          79.6 ns         79.5 ns      8739000
BM_to_string                                        14.9 ns         14.9 ns     46713000
BM_to_chars                                         5.68 ns         5.67 ns    120614000
BM_to_chars_as_string                               14.2 ns         14.1 ns     49513000
BM_format                                           69.3 ns         69.2 ns     10105000
BM_format_to_back_inserter<std::string>             69.2 ns         69.1 ns     10138000
BM_format_to_back_inserter<std::vector<char>>       90.6 ns         90.5 ns      7728000
BM_format_to_back_inserter<std::list<char>>          234 ns          234 ns      2986000
BM_format_to_iterator/<std::array>                  59.3 ns         59.3 ns     11805000
BM_format_to_iterator/<std::string>                 58.7 ns         58.6 ns     11943000
BM_format_to_iterator/<std::vector>                 60.1 ns         60.1 ns     11670000

After
----------------------------------------------------------------------------------------
Benchmark                                              Time             CPU   Iterations
----------------------------------------------------------------------------------------
BM_sprintf                                          80.2 ns         80.2 ns      8670000
BM_to_string                                        15.0 ns         15.0 ns     46559000
BM_to_chars                                         4.93 ns         4.93 ns    138016000
BM_to_chars_as_string                               15.4 ns         15.4 ns     45415000
BM_format                                           62.1 ns         62.0 ns     11316000
BM_format_to_back_inserter<std::string>             70.2 ns         70.2 ns      9962000
BM_format_to_back_inserter<std::vector<char>>       92.8 ns         92.8 ns      7544000
BM_format_to_back_inserter<std::list<char>>          240 ns          240 ns      2917000
BM_format_to_iterator/<std::array>                  60.5 ns         60.5 ns     11572000
BM_format_to_iterator/<std::string>                 60.2 ns         60.2 ns     11653000
BM_format_to_iterator/<std::vector>                 60.1 ns         60.1 ns     11659000

Comparison
Benchmark                                                       Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------------------------------
BM_sprintf                                                   +0.0072         +0.0081            80            80            80            80
BM_to_string                                                 +0.0043         +0.0053            15            15            15            15
BM_to_chars                                                  -0.1324         -0.1316             6             5             6             5
BM_to_chars_as_string                                        +0.0895         +0.0906            14            15            14            15
BM_format                                                    -0.1047         -0.1040            69            62            69            62
BM_format_to_back_inserter<std::string>                      +0.0148         +0.0157            69            70            69            70
BM_format_to_back_inserter<std::vector<char>>                +0.0241         +0.0251            91            93            90            93
BM_format_to_back_inserter<std::list<char>>                  +0.0222         +0.0232           234           240           234           240
BM_format_to_iterator/<std::array>                           +0.0196         +0.0205            59            60            59            60
BM_format_to_iterator/<std::string>                          +0.0266         +0.0275            59            60            59            60
BM_format_to_iterator/<std::vector>                          -0.0004         +0.0005            60            60            60            60
OVERALL_GEOMEAN                                              -0.0045         -0.0036             0             0             0             0
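
The OVERALL_GEOMEAN row can be cross-checked from the per-benchmark rows. The sketch below assumes the row is the geometric mean of the new/old ratios, expressed again as a relative change; the function names are hypothetical. Applied to the Time column of the write_int comparison above, it yields a value that rounds to the printed -0.0045.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: geometric mean of the new/old ratios, returned as a
// relative change. Each input d is a relative difference, so the ratio is 1 + d.
double overall_geomean_change(const std::vector<double>& changes) {
  assert(!changes.empty());
  double log_sum = 0.0;
  for (double d : changes)
    log_sum += std::log1p(d);  // log(1 + d) == log(new / old)
  return std::expm1(log_sum / static_cast<double>(changes.size()));
}

// Time column of the write_int comparison, copied from the table above.
std::vector<double> write_int_time_changes() {
  return {+0.0072, +0.0043, -0.1324, +0.0895, -0.1047, +0.0148,
          +0.0241, +0.0222, +0.0196, +0.0266, -0.0004};
}
```

Working in log space avoids multiplying many ratios together, which keeps the computation stable for long benchmark lists.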

write_string_comparison:

Before
---------------------------------------------------------------------------------------------------------------
Benchmark                                                                     Time             CPU   Iterations
---------------------------------------------------------------------------------------------------------------
BM_sprintf/C string len = 6                                                4.82 ns         4.82 ns    145548481
BM_format/C string len = 6                                                 55.1 ns         55.1 ns     12713693
BM_format_to_back_inserter<std::string>/C string len = 6                   55.1 ns         55.1 ns     12690821
BM_format_to_back_inserter<std::vector<char>>/C string len = 6             71.5 ns         71.5 ns      9805550
BM_format_to_back_inserter<std::deque<char>>/C string len = 6               154 ns          154 ns      4536256
BM_format_to_back_inserter<std::list<char>>/C string len = 6                130 ns          130 ns      5348845
BM_format_to_iterator/<std::array> C string len = 6                        44.9 ns         44.9 ns     15556175
BM_format_to_iterator/<std::string> C string len = 6                       45.8 ns         45.8 ns     15290662
BM_format_to_iterator/<std::vector> C string len = 6                       44.4 ns         44.4 ns     15807704
BM_format_to_iterator/<std::deque> C string len = 6                        50.0 ns         50.0 ns     13973893
BM_format/string len = 6                                                   54.7 ns         54.7 ns     12793406
BM_format_to_back_inserter<std::string>/string len = 6                     55.5 ns         55.5 ns     12620370
BM_format_to_back_inserter<std::vector<char>>/string len = 6               70.4 ns         70.4 ns      9936490
BM_format_to_back_inserter<std::deque<char>>/string len = 6                 155 ns          155 ns      4521357
BM_format_to_back_inserter<std::list<char>>/string len = 6                  135 ns          135 ns      5201519
BM_format_to_iterator/<std::array> string len = 6                          44.6 ns         44.6 ns     15703872
BM_format_to_iterator/<std::string> string len = 6                         45.0 ns         45.0 ns     15545182
BM_format_to_iterator/<std::vector> string len = 6                         45.0 ns         45.0 ns     15539130
BM_format_to_iterator/<std::deque> string len = 6                          50.5 ns         50.5 ns     13846916
BM_format/string_view len = 6                                              54.6 ns         54.6 ns     12821301
BM_format_to_back_inserter<std::string>/string_view len = 6                54.6 ns         54.6 ns     12827673
BM_format_to_back_inserter<std::vector<char>>/string_view len = 6          69.9 ns         69.9 ns      9958365
BM_format_to_back_inserter<std::deque<char>>/string_view len = 6            157 ns          157 ns      4462445
BM_format_to_back_inserter<std::list<char>>/string_view len = 6             134 ns          134 ns      5232155
BM_format_to_iterator/<std::array> string_view len = 6                     44.2 ns         44.2 ns     15871362
BM_format_to_iterator/<std::string> string_view len = 6                    45.0 ns         45.0 ns     15582582
BM_format_to_iterator/<std::vector> string_view len = 6                    45.1 ns         45.1 ns     15539906
BM_format_to_iterator/<std::deque> string_view len = 6                     50.5 ns         50.5 ns     13896524
BM_sprintf/C string len = 60                                               4.15 ns         4.15 ns    168165776
BM_format/C string len = 60                                                73.8 ns         73.8 ns      9498291
BM_format_to_back_inserter<std::string>/C string len = 60                  73.6 ns         73.6 ns      9535890
BM_format_to_back_inserter<std::vector<char>>/C string len = 60            83.1 ns         83.1 ns      8428154
BM_format_to_back_inserter<std::deque<char>>/C string len = 60              281 ns          281 ns      2490093
BM_format_to_back_inserter<std::list<char>>/C string len = 60              1157 ns         1157 ns       605227
BM_format_to_iterator/<std::array> C string len = 60                       44.9 ns         44.9 ns     15604442
BM_format_to_iterator/<std::string> C string len = 60                      45.8 ns         45.8 ns     15272196
BM_format_to_iterator/<std::vector> C string len = 60                      44.6 ns         44.7 ns     15683193
BM_format_to_iterator/<std::deque> C string len = 60                       50.6 ns         50.6 ns     13698382
BM_format/string len = 60                                                  72.3 ns         72.3 ns      9648955
BM_format_to_back_inserter<std::string>/string len = 60                    72.0 ns         72.0 ns      9738373
BM_format_to_back_inserter<std::vector<char>>/string len = 60              82.3 ns         82.3 ns      8517896
BM_format_to_back_inserter<std::deque<char>>/string len = 60                280 ns          280 ns      2496054
BM_format_to_back_inserter<std::list<char>>/string len = 60                1162 ns         1162 ns       602383
BM_format_to_iterator/<std::array> string len = 60                         44.5 ns         44.5 ns     15727799
BM_format_to_iterator/<std::string> string len = 60                        49.6 ns         49.6 ns     14096012
BM_format_to_iterator/<std::vector> string len = 60                        49.8 ns         49.8 ns     14053734
BM_format_to_iterator/<std::deque> string len = 60                         50.8 ns         50.8 ns     13801448
BM_format/string_view len = 60                                             72.5 ns         72.5 ns      9653638
BM_format_to_back_inserter<std::string>/string_view len = 60               72.7 ns         72.7 ns      9598203
BM_format_to_back_inserter<std::vector<char>>/string_view len = 60         81.9 ns         81.9 ns      8522306
BM_format_to_back_inserter<std::deque<char>>/string_view len = 60           283 ns          283 ns      2475014
BM_format_to_back_inserter<std::list<char>>/string_view len = 60           1162 ns         1162 ns       600924
BM_format_to_iterator/<std::array> string_view len = 60                    44.1 ns         44.1 ns     15858951
BM_format_to_iterator/<std::string> string_view len = 60                   44.9 ns         44.9 ns     15579340
BM_format_to_iterator/<std::vector> string_view len = 60                   44.9 ns         44.9 ns     15586711
BM_format_to_iterator/<std::deque> string_view len = 60                    50.0 ns         50.0 ns     13980804
BM_sprintf/C string len = 6000                                              116 ns          116 ns      6051541
BM_format/C string len = 6000                                              1000 ns         1000 ns       698647
BM_format_to_back_inserter<std::string>/C string len = 6000                1002 ns         1002 ns       701440
BM_format_to_back_inserter<std::vector<char>>/C string len = 6000           956 ns          956 ns       727585
BM_format_to_back_inserter<std::deque<char>>/C string len = 6000          14898 ns        14898 ns        46994
BM_format_to_back_inserter<std::list<char>>/C string len = 6000          114860 ns       114859 ns         6106
BM_format_to_iterator/<std::array> C string len = 6000                      158 ns          158 ns      4425133
BM_format_to_iterator/<std::string> C string len = 6000                     161 ns          161 ns      4335471
BM_format_to_iterator/<std::vector> C string len = 6000                     157 ns          157 ns      4444174
BM_format_to_iterator/<std::deque> C string len = 6000                      445 ns          445 ns      1574120
BM_format/string len = 6000                                                 929 ns          929 ns       753630
BM_format_to_back_inserter<std::string>/string len = 6000                   930 ns          930 ns       752888
BM_format_to_back_inserter<std::vector<char>>/string len = 6000             910 ns          910 ns       771111
BM_format_to_back_inserter<std::deque<char>>/string len = 6000            14875 ns        14876 ns        47221
BM_format_to_back_inserter<std::list<char>>/string len = 6000            114937 ns       114936 ns         6092
BM_format_to_iterator/<std::array> string len = 6000                        118 ns          118 ns      5963643
BM_format_to_iterator/<std::string> string len = 6000                       106 ns          106 ns      6584711
BM_format_to_iterator/<std::vector> string len = 6000                       106 ns          106 ns      6583118
BM_format_to_iterator/<std::deque> string len = 6000                        391 ns          391 ns      1790538
BM_format/string_view len = 6000                                            935 ns          935 ns       744348
BM_format_to_back_inserter<std::string>/string_view len = 6000              934 ns          934 ns       742039
BM_format_to_back_inserter<std::vector<char>>/string_view len = 6000        895 ns          895 ns       783527
BM_format_to_back_inserter<std::deque<char>>/string_view len = 6000       14864 ns        14865 ns        47122
BM_format_to_back_inserter<std::list<char>>/string_view len = 6000       115042 ns       115044 ns         6091
BM_format_to_iterator/<std::array> string_view len = 6000                   115 ns          115 ns      6070197
BM_format_to_iterator/<std::string> string_view len = 6000                  116 ns          116 ns      6035109
BM_format_to_iterator/<std::vector> string_view len = 6000                  115 ns          115 ns      6067683
BM_format_to_iterator/<std::deque> string_view len = 6000                   387 ns          387 ns      1803466

After
---------------------------------------------------------------------------------------------------------------
Benchmark                                                                     Time             CPU   Iterations
---------------------------------------------------------------------------------------------------------------
BM_sprintf/C string len = 6                                                3.56 ns         3.56 ns    196337806
BM_format/C string len = 6                                                 49.1 ns         49.1 ns     14241174
BM_format_to_back_inserter<std::string>/C string len = 6                   56.8 ns         56.8 ns     12341483
BM_format_to_back_inserter<std::vector<char>>/C string len = 6             72.9 ns         72.9 ns      9610864
BM_format_to_back_inserter<std::deque<char>>/C string len = 6               155 ns          155 ns      4528719
BM_format_to_back_inserter<std::list<char>>/C string len = 6                137 ns          137 ns      5103340
BM_format_to_iterator/<std::array> C string len = 6                        46.4 ns         46.4 ns     15081626
BM_format_to_iterator/<std::string> C string len = 6                       47.0 ns         47.0 ns     14893458
BM_format_to_iterator/<std::vector> C string len = 6                       45.9 ns         45.9 ns     15243762
BM_format_to_iterator/<std::deque> C string len = 6                        52.6 ns         52.6 ns     13323560
BM_format/string len = 6                                                   49.3 ns         49.3 ns     14181485
BM_format_to_back_inserter<std::string>/string len = 6                     55.4 ns         55.4 ns     12644078
BM_format_to_back_inserter<std::vector<char>>/string len = 6               72.7 ns         72.7 ns      9618696
BM_format_to_back_inserter<std::deque<char>>/string len = 6                 154 ns          154 ns      4540873
BM_format_to_back_inserter<std::list<char>>/string len = 6                  134 ns          134 ns      5220153
BM_format_to_iterator/<std::array> string len = 6                          46.5 ns         46.5 ns     15064445
BM_format_to_iterator/<std::string> string len = 6                         47.3 ns         47.3 ns     14786851
BM_format_to_iterator/<std::vector> string len = 6                         46.5 ns         46.5 ns     15069381
BM_format_to_iterator/<std::deque> string len = 6                          52.9 ns         52.9 ns     13207437
BM_format/string_view len = 6                                              47.6 ns         47.6 ns     14688449
BM_format_to_back_inserter<std::string>/string_view len = 6                56.1 ns         56.1 ns     12514239
BM_format_to_back_inserter<std::vector<char>>/string_view len = 6          72.0 ns         72.0 ns      9705591
BM_format_to_back_inserter<std::deque<char>>/string_view len = 6            152 ns          152 ns      4607470
BM_format_to_back_inserter<std::list<char>>/string_view len = 6             134 ns          134 ns      5233005
BM_format_to_iterator/<std::array> string_view len = 6                     46.0 ns         46.0 ns     15205542
BM_format_to_iterator/<std::string> string_view len = 6                    46.5 ns         46.5 ns     15067775
BM_format_to_iterator/<std::vector> string_view len = 6                    46.5 ns         46.5 ns     15057288
BM_format_to_iterator/<std::deque> string_view len = 6                     51.8 ns         51.8 ns     13564649
BM_sprintf/C string len = 60                                               4.83 ns         4.83 ns    145020116
BM_format/C string len = 60                                                72.3 ns         72.3 ns      9665555
BM_format_to_back_inserter<std::string>/C string len = 60                  85.0 ns         85.0 ns      8249050
BM_format_to_back_inserter<std::vector<char>>/C string len = 60            94.7 ns         94.7 ns      7398143
BM_format_to_back_inserter<std::deque<char>>/C string len = 60              286 ns          286 ns      2452225
BM_format_to_back_inserter<std::list<char>>/C string len = 60              1179 ns         1179 ns       595011
BM_format_to_iterator/<std::array> C string len = 60                       46.1 ns         46.1 ns     15157525
BM_format_to_iterator/<std::string> C string len = 60                      47.0 ns         47.0 ns     14899863
BM_format_to_iterator/<std::vector> C string len = 60                      45.8 ns         45.8 ns     15272542
BM_format_to_iterator/<std::deque> C string len = 60                       58.7 ns         58.7 ns     11958910
BM_format/string len = 60                                                  61.7 ns         61.7 ns     11308865
BM_format_to_back_inserter<std::string>/string len = 60                    74.2 ns         74.2 ns      9401855
BM_format_to_back_inserter<std::vector<char>>/string len = 60              86.6 ns         86.6 ns      8079197
BM_format_to_back_inserter<std::deque<char>>/string len = 60                278 ns          278 ns      2519254
BM_format_to_back_inserter<std::list<char>>/string len = 60                1169 ns         1170 ns       597703
BM_format_to_iterator/<std::array> string len = 60                         46.4 ns         46.4 ns     15074759
BM_format_to_iterator/<std::string> string len = 60                        47.3 ns         47.3 ns     14801107
BM_format_to_iterator/<std::vector> string len = 60                        46.3 ns         46.4 ns     15085548
BM_format_to_iterator/<std::deque> string len = 60                         52.2 ns         52.2 ns     13331015
BM_format/string_view len = 60                                             65.0 ns         65.0 ns     10756847
BM_format_to_back_inserter<std::string>/string_view len = 60               78.5 ns         78.5 ns      9051370
BM_format_to_back_inserter<std::vector<char>>/string_view len = 60         87.9 ns         87.9 ns      7946825
BM_format_to_back_inserter<std::deque<char>>/string_view len = 60           280 ns          280 ns      2505999
BM_format_to_back_inserter<std::list<char>>/string_view len = 60           1174 ns         1174 ns       594829
BM_format_to_iterator/<std::array> string_view len = 60                    46.3 ns         46.3 ns     15149785
BM_format_to_iterator/<std::string> string_view len = 60                   46.7 ns         46.7 ns     15002678
BM_format_to_iterator/<std::vector> string_view len = 60                   46.7 ns         46.7 ns     14996445
BM_format_to_iterator/<std::deque> string_view len = 60                    52.6 ns         52.6 ns     13255470
BM_sprintf/C string len = 6000                                             77.1 ns         77.1 ns      9099121
BM_format/C string len = 6000                                               350 ns          350 ns      2013049
BM_format_to_back_inserter<std::string>/C string len = 6000                 992 ns          992 ns       709093
BM_format_to_back_inserter<std::vector<char>>/C string len = 6000          1016 ns         1016 ns       694784
BM_format_to_back_inserter<std::deque<char>>/C string len = 6000          15158 ns        15159 ns        46125
BM_format_to_back_inserter<std::list<char>>/C string len = 6000          115703 ns       115705 ns         6055
BM_format_to_iterator/<std::array> C string len = 6000                      166 ns          166 ns      4224749
BM_format_to_iterator/<std::string> C string len = 6000                     153 ns          153 ns      4573034
BM_format_to_iterator/<std::vector> C string len = 6000                     150 ns          150 ns      4678898
BM_format_to_iterator/<std::deque> C string len = 6000                      465 ns          465 ns      1506323
BM_format/string len = 6000                                                 281 ns          281 ns      2462572
BM_format_to_back_inserter<std::string>/string len = 6000                   935 ns          935 ns       745376
BM_format_to_back_inserter<std::vector<char>>/string len = 6000             939 ns          939 ns       747498
BM_format_to_back_inserter<std::deque<char>>/string len = 6000            15069 ns        15069 ns        46429
BM_format_to_back_inserter<std::list<char>>/string len = 6000            115537 ns       115539 ns         6063
BM_format_to_iterator/<std::array> string len = 6000                        120 ns          120 ns      5849159
BM_format_to_iterator/<std::string> string len = 6000                       108 ns          108 ns      6482306
BM_format_to_iterator/<std::vector> string len = 6000                       107 ns          107 ns      6547915
BM_format_to_iterator/<std::deque> string len = 6000                        397 ns          397 ns      1763729
BM_format/string_view len = 6000                                            282 ns          282 ns      2490133
BM_format_to_back_inserter<std::string>/string_view len = 6000              927 ns          927 ns       750931
BM_format_to_back_inserter<std::vector<char>>/string_view len = 6000        946 ns          946 ns       735544
BM_format_to_back_inserter<std::deque<char>>/string_view len = 6000       15028 ns        15029 ns        46612
BM_format_to_back_inserter<std::list<char>>/string_view len = 6000       115153 ns       115153 ns         6057
BM_format_to_iterator/<std::array> string_view len = 6000                   122 ns          122 ns      5753940
BM_format_to_iterator/<std::string> string_view len = 6000                  108 ns          108 ns      6507555
BM_format_to_iterator/<std::vector> string_view len = 6000                  107 ns          107 ns      6510450
BM_format_to_iterator/<std::deque> string_view len = 6000                   398 ns          398 ns      1762517

Comparison
Benchmark                                                                              Time             CPU      Time Old      Time New       CPU Old       CPU New
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
BM_sprintf/C string len = 6                                                         -0.2599         -0.2599             5             4             5             4
BM_format/C string len = 6                                                          -0.1094         -0.1095            55            49            55            49
BM_format_to_back_inserter<std::string>/C string len = 6                            +0.0316         +0.0316            55            57            55            57
BM_format_to_back_inserter<std::vector<char>>/C string len = 6                      +0.0204         +0.0204            71            73            71            73
BM_format_to_back_inserter<std::deque<char>>/C string len = 6                       +0.0023         +0.0023           154           155           154           155
BM_format_to_back_inserter<std::list<char>>/C string len = 6                        +0.0521         +0.0520           130           137           130           137
BM_format_to_iterator/<std::array> C string len = 6                                 +0.0317         +0.0317            45            46            45            46
BM_format_to_iterator/<std::string> C string len = 6                                +0.0265         +0.0265            46            47            46            47
BM_format_to_iterator/<std::vector> C string len = 6                                +0.0345         +0.0345            44            46            44            46
BM_format_to_iterator/<std::deque> C string len = 6                                 +0.0511         +0.0511            50            53            50            53
BM_format/string len = 6                                                            -0.0981         -0.0981            55            49            55            49
BM_format_to_back_inserter<std::string>/string len = 6                              -0.0028         -0.0028            56            55            56            55
BM_format_to_back_inserter<std::vector<char>>/string len = 6                        +0.0331         +0.0331            70            73            70            73
BM_format_to_back_inserter<std::deque<char>>/string len = 6                         -0.0040         -0.0040           155           154           155           154
BM_format_to_back_inserter<std::list<char>>/string len = 6                          -0.0019         -0.0019           135           134           135           134
BM_format_to_iterator/<std::array> string len = 6                                   +0.0427         +0.0427            45            46            45            46
BM_format_to_iterator/<std::string> string len = 6                                  +0.0515         +0.0515            45            47            45            47
BM_format_to_iterator/<std::vector> string len = 6                                  +0.0316         +0.0316            45            46            45            46
BM_format_to_iterator/<std::deque> string len = 6                                   +0.0475         +0.0475            51            53            51            53
BM_format/string_view len = 6                                                       -0.1267         -0.1267            55            48            55            48
BM_format_to_back_inserter<std::string>/string_view len = 6                         +0.0283         +0.0283            55            56            55            56
BM_format_to_back_inserter<std::vector<char>>/string_view len = 6                   +0.0291         +0.0291            70            72            70            72
BM_format_to_back_inserter<std::deque<char>>/string_view len = 6                    -0.0305         -0.0306           157           152           157           152
BM_format_to_back_inserter<std::list<char>>/string_view len = 6                     -0.0012         -0.0012           134           134           134           134
BM_format_to_iterator/<std::array> string_view len = 6                              +0.0426         +0.0426            44            46            44            46
BM_format_to_iterator/<std::string> string_view len = 6                             +0.0322         +0.0322            45            46            45            46
BM_format_to_iterator/<std::vector> string_view len = 6                             +0.0309         +0.0309            45            46            45            46
BM_format_to_iterator/<std::deque> string_view len = 6                              +0.0248         +0.0248            51            52            51            52
BM_sprintf/C string len = 60                                                        +0.1633         +0.1633             4             5             4             5
BM_format/C string len = 60                                                         -0.0203         -0.0203            74            72            74            72
BM_format_to_back_inserter<std::string>/C string len = 60                           +0.1548         +0.1548            74            85            74            85
BM_format_to_back_inserter<std::vector<char>>/C string len = 60                     +0.1395         +0.1395            83            95            83            95
BM_format_to_back_inserter<std::deque<char>>/C string len = 60                      +0.0157         +0.0157           281           286           281           286
BM_format_to_back_inserter<std::list<char>>/C string len = 60                       +0.0191         +0.0191          1157          1179          1157          1179
BM_format_to_iterator/<std::array> C string len = 60                                +0.0283         +0.0283            45            46            45            46
BM_format_to_iterator/<std::string> C string len = 60                               +0.0251         +0.0252            46            47            46            47
BM_format_to_iterator/<std::vector> C string len = 60                               +0.0263         +0.0263            45            46            45            46
BM_format_to_iterator/<std::deque> C string len = 60                                +0.1592         +0.1591            51            59            51            59
BM_format/string len = 60                                                           -0.1466         -0.1466            72            62            72            62
BM_format_to_back_inserter<std::string>/string len = 60                             +0.0299         +0.0299            72            74            72            74
BM_format_to_back_inserter<std::vector<char>>/string len = 60                       +0.0522         +0.0522            82            87            82            87
BM_format_to_back_inserter<std::deque<char>>/string len = 60                        -0.0099         -0.0099           280           278           280           278
BM_format_to_back_inserter<std::list<char>>/string len = 60                         +0.0062         +0.0062          1162          1169          1162          1170
BM_format_to_iterator/<std::array> string len = 60                                  +0.0430         +0.0430            45            46            45            46
BM_format_to_iterator/<std::string> string len = 60                                 -0.0466         -0.0466            50            47            50            47
BM_format_to_iterator/<std::vector> string len = 60                                 -0.0693         -0.0693            50            46            50            46
BM_format_to_iterator/<std::deque> string len = 60                                  +0.0275         +0.0275            51            52            51            52
BM_format/string_view len = 60                                                      -0.1034         -0.1034            73            65            73            65
BM_format_to_back_inserter<std::string>/string_view len = 60                        +0.0790         +0.0790            73            78            73            78
BM_format_to_back_inserter<std::vector<char>>/string_view len = 60                  +0.0735         +0.0735            82            88            82            88
BM_format_to_back_inserter<std::deque<char>>/string_view len = 60                   -0.0103         -0.0104           283           280           283           280
BM_format_to_back_inserter<std::list<char>>/string_view len = 60                    +0.0101         +0.0101          1162          1174          1162          1174
BM_format_to_iterator/<std::array> string_view len = 60                             +0.0484         +0.0484            44            46            44            46
BM_format_to_iterator/<std::string> string_view len = 60                            +0.0387         +0.0387            45            47            45            47
BM_format_to_iterator/<std::vector> string_view len = 60                            +0.0402         +0.0402            45            47            45            47
BM_format_to_iterator/<std::deque> string_view len = 60                             +0.0508         +0.0508            50            53            50            53
BM_sprintf/C string len = 6000                                                      -0.3337         -0.3337           116            77           116            77
BM_format/C string len = 6000                                                       -0.6500         -0.6500          1000           350          1000           350
BM_format_to_back_inserter<std::string>/C string len = 6000                         -0.0104         -0.0105          1002           992          1002           992
BM_format_to_back_inserter<std::vector<char>>/C string len = 6000                   +0.0630         +0.0630           956          1016           956          1016
BM_format_to_back_inserter<std::deque<char>>/C string len = 6000                    +0.0175         +0.0175         14898         15158         14898         15159
BM_format_to_back_inserter<std::list<char>>/C string len = 6000                     +0.0073         +0.0074        114860        115703        114859        115705
BM_format_to_iterator/<std::array> C string len = 6000                              +0.0504         +0.0504           158           166           158           166
BM_format_to_iterator/<std::string> C string len = 6000                             -0.0486         -0.0486           161           153           161           153
BM_format_to_iterator/<std::vector> C string len = 6000                             -0.0483         -0.0483           157           150           157           150
BM_format_to_iterator/<std::deque> C string len = 6000                              +0.0459         +0.0459           445           465           445           465
BM_format/string len = 6000                                                         -0.6975         -0.6975           929           281           929           281
BM_format_to_back_inserter<std::string>/string len = 6000                           +0.0050         +0.0050           930           935           930           935
BM_format_to_back_inserter<std::vector<char>>/string len = 6000                     +0.0321         +0.0322           910           939           910           939
BM_format_to_back_inserter<std::deque<char>>/string len = 6000                      +0.0130         +0.0130         14875         15069         14876         15069
BM_format_to_back_inserter<std::list<char>>/string len = 6000                       +0.0052         +0.0052        114937        115537        114936        115539
BM_format_to_iterator/<std::array> string len = 6000                                +0.0211         +0.0211           118           120           118           120
BM_format_to_iterator/<std::string> string len = 6000                               +0.0146         +0.0146           106           108           106           108
BM_format_to_iterator/<std::vector> string len = 6000                               +0.0048         +0.0048           106           107           106           107
BM_format_to_iterator/<std::deque> string len = 6000                                +0.0150         +0.0150           391           397           391           397
BM_format/string_view len = 6000                                                    -0.6989         -0.6989           935           282           935           282
BM_format_to_back_inserter<std::string>/string_view len = 6000                      -0.0083         -0.0083           934           927           934           927
BM_format_to_back_inserter<std::vector<char>>/string_view len = 6000                +0.0566         +0.0566           895           946           895           946
BM_format_to_back_inserter<std::deque<char>>/string_view len = 6000                 +0.0111         +0.0110         14864         15028         14865         15029
BM_format_to_back_inserter<std::list<char>>/string_view len = 6000                  +0.0010         +0.0009        115042        115153        115044        115153
BM_format_to_iterator/<std::array> string_view len = 6000                           +0.0560         +0.0560           115           122           115           122
BM_format_to_iterator/<std::string> string_view len = 6000                          -0.0693         -0.0693           116           108           116           108
BM_format_to_iterator/<std::vector> string_view len = 6000                          -0.0703         -0.0703           115           107           115           107
BM_format_to_iterator/<std::deque> string_view len = 6000                           +0.0271         +0.0271           387           398           387           398
OVERALL_GEOMEAN                                                                     -0.0350         -0.0350             0             0             0             0

format:
Before
--------------------------------------------------------------------------------------------
Benchmark                                  Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------
BM_format_string<char>/1                55.5 ns         55.5 ns     12463288 bytes_per_second=17.187Mi/s
BM_format_string<char>/2                27.6 ns         27.6 ns     25421994 bytes_per_second=69.1874Mi/s
BM_format_string<char>/4                14.0 ns         14.0 ns     49785656 bytes_per_second=271.524Mi/s
BM_format_string<char>/8                7.07 ns         7.07 ns     99247048 bytes_per_second=1.05444Gi/s
BM_format_string<char>/16               3.53 ns         3.53 ns    198072224 bytes_per_second=4.21726Gi/s
BM_format_string<char>/32               2.31 ns         2.31 ns    302771136 bytes_per_second=12.9138Gi/s
BM_format_string<char>/64               1.15 ns         1.15 ns    606646976 bytes_per_second=51.7527Gi/s
BM_format_string<char>/128             0.597 ns        0.597 ns   1172263936 bytes_per_second=199.688Gi/s
BM_format_string<char>/256             0.327 ns        0.327 ns   2148927744 bytes_per_second=728.678Gi/s
BM_format_string<char>/512             0.248 ns        0.248 ns   2821635584 bytes_per_second=1.8779Ti/s
BM_format_string<char>/1024            0.203 ns        0.203 ns   3433579520 bytes_per_second=4.57798Ti/s
BM_format_string<char>/2048            0.164 ns        0.164 ns   4277524480 bytes_per_second=11.3793Ti/s
BM_format_string<char>/4096            0.137 ns        0.137 ns   5122269184 bytes_per_second=27.2589Ti/s
BM_format_string<char>/8192            0.126 ns        0.126 ns   5564243968 bytes_per_second=59.2812Ti/s
BM_format_string<char>/16384           0.136 ns        0.136 ns   5153013760 bytes_per_second=109.492Ti/s
BM_format_string<char>/32768           0.135 ns        0.135 ns   5165088768 bytes_per_second=219.985Ti/s
BM_format_string<char>/65536           0.243 ns        0.242 ns   2930180096 bytes_per_second=246.57Ti/s
BM_format_string<char>/131072          0.490 ns        0.489 ns   1437990912 bytes_per_second=243.75Ti/s
BM_format_string<char>/262144          0.593 ns        0.592 ns   1183055872 bytes_per_second=402.931Ti/s
BM_format_string<char>/524288          0.643 ns        0.641 ns   1092616192 bytes_per_second=743.445Ti/s
BM_format_string<char>/1048576         0.669 ns        0.668 ns   1045430272 bytes_per_second=1.39478Pi/s
BM_format_string<wchar_t>/1             56.0 ns         55.9 ns     12511543 bytes_per_second=68.2628Mi/s
BM_format_string<wchar_t>/2             28.0 ns         27.9 ns     25062366 bytes_per_second=273.519Mi/s
BM_format_string<wchar_t>/4             14.0 ns         14.0 ns     50257068 bytes_per_second=1.06742Gi/s
BM_format_string<wchar_t>/8             9.24 ns         9.21 ns     76118616 bytes_per_second=3.23473Gi/s
BM_format_string<wchar_t>/16            4.66 ns         4.65 ns    151420352 bytes_per_second=12.8261Gi/s
BM_format_string<wchar_t>/32            2.35 ns         2.35 ns    298417600 bytes_per_second=50.7972Gi/s
BM_format_string<wchar_t>/64            1.35 ns         1.34 ns    521608704 bytes_per_second=177.502Gi/s
BM_format_string<wchar_t>/128           1.03 ns         1.03 ns    680946304 bytes_per_second=463.91Gi/s
BM_format_string<wchar_t>/256          0.849 ns        0.847 ns    825871104 bytes_per_second=1.09901Ti/s
BM_format_string<wchar_t>/512          0.681 ns        0.679 ns   1033245696 bytes_per_second=2.74383Ti/s
BM_format_string<wchar_t>/1024         0.576 ns        0.575 ns   1219777536 bytes_per_second=6.48343Ti/s
BM_format_string<wchar_t>/2048         0.515 ns        0.514 ns   1361629184 bytes_per_second=14.4881Ti/s
BM_format_string<wchar_t>/4096         0.546 ns        0.545 ns   1285427200 bytes_per_second=27.3342Ti/s
BM_format_string<wchar_t>/8192         0.550 ns        0.548 ns   1277042688 bytes_per_second=54.3343Ti/s
BM_format_string<wchar_t>/16384        0.583 ns        0.581 ns   1203879936 bytes_per_second=102.544Ti/s
BM_format_string<wchar_t>/32768        0.640 ns        0.638 ns   1095139328 bytes_per_second=186.82Ti/s
BM_format_string<wchar_t>/65536        0.642 ns        0.640 ns   1093337088 bytes_per_second=372.283Ti/s
BM_format_string<wchar_t>/131072       0.655 ns        0.654 ns   1070596096 bytes_per_second=729.428Ti/s
BM_format_string<wchar_t>/262144        2.68 ns         2.67 ns    262406144 bytes_per_second=357.446Ti/s
BM_format_string<wchar_t>/524288        2.13 ns         2.13 ns    330301440 bytes_per_second=897.574Ti/s
BM_format_string<wchar_t>/1048576       2.44 ns         2.43 ns    288358400 bytes_per_second=1.53149Pi/s

After
--------------------------------------------------------------------------------------------
Benchmark                                  Time             CPU   Iterations UserCounters...
--------------------------------------------------------------------------------------------
BM_format_string<char>/1                49.3 ns         49.1 ns     14230742 bytes_per_second=19.4054Mi/s
BM_format_string<char>/2                24.8 ns         24.8 ns     28253114 bytes_per_second=77.0465Mi/s
BM_format_string<char>/4                12.4 ns         12.4 ns     56381440 bytes_per_second=307.462Mi/s
BM_format_string<char>/8                6.23 ns         6.21 ns    112951232 bytes_per_second=1.19924Gi/s
BM_format_string<char>/16               3.10 ns         3.09 ns    225822496 bytes_per_second=4.81566Gi/s
BM_format_string<char>/32               1.98 ns         1.98 ns    354208192 bytes_per_second=15.0825Gi/s
BM_format_string<char>/64              0.990 ns        0.987 ns    714296384 bytes_per_second=60.3689Gi/s
BM_format_string<char>/128             0.504 ns        0.503 ns   1399988480 bytes_per_second=237.144Gi/s
BM_format_string<char>/256             0.335 ns        0.334 ns   2084828928 bytes_per_second=712.859Gi/s
BM_format_string<char>/512             0.187 ns        0.186 ns   3760082432 bytes_per_second=2.50102Ti/s
BM_format_string<char>/1024            0.109 ns        0.108 ns   6455339008 bytes_per_second=8.59552Ti/s
BM_format_string<char>/2048            0.080 ns        0.080 ns   8754006016 bytes_per_second=23.2731Ti/s
BM_format_string<char>/4096            0.051 ns        0.051 ns   13786701824 bytes_per_second=73.4088Ti/s
BM_format_string<char>/8192            0.042 ns        0.042 ns   16851435520 bytes_per_second=178.737Ti/s
BM_format_string<char>/16384           0.122 ns        0.122 ns   5746589696 bytes_per_second=122.029Ti/s
BM_format_string<char>/32768           0.107 ns        0.106 ns   6571687936 bytes_per_second=280.122Ti/s
BM_format_string<char>/65536           0.102 ns        0.102 ns   6876626944 bytes_per_second=584.381Ti/s
BM_format_string<char>/131072          0.106 ns        0.105 ns   6643122176 bytes_per_second=1.10413Pi/s
BM_format_string<char>/262144          0.589 ns        0.587 ns   1189609472 bytes_per_second=406.438Ti/s
BM_format_string<char>/524288          0.644 ns        0.642 ns   1088946176 bytes_per_second=743.064Ti/s
BM_format_string<char>/1048576         0.672 ns        0.670 ns   1039138816 bytes_per_second=1.38968Pi/s
BM_format_string<wchar_t>/1             48.7 ns         48.6 ns     14423178 bytes_per_second=78.5263Mi/s
BM_format_string<wchar_t>/2             24.4 ns         24.3 ns     28831748 bytes_per_second=313.869Mi/s
BM_format_string<wchar_t>/4             12.2 ns         12.2 ns     57661220 bytes_per_second=1.2253Gi/s
BM_format_string<wchar_t>/8             7.81 ns         7.79 ns     89887592 bytes_per_second=3.82675Gi/s
BM_format_string<wchar_t>/16            3.88 ns         3.87 ns    180450176 bytes_per_second=15.418Gi/s
BM_format_string<wchar_t>/32            1.98 ns         1.98 ns    354046112 bytes_per_second=60.3262Gi/s
BM_format_string<wchar_t>/64            1.02 ns         1.01 ns    689511680 bytes_per_second=234.906Gi/s
BM_format_string<wchar_t>/128          0.577 ns        0.576 ns   1215361408 bytes_per_second=828.297Gi/s
BM_format_string<wchar_t>/256          0.804 ns        0.802 ns    872249088 bytes_per_second=1.16111Ti/s
BM_format_string<wchar_t>/512          0.642 ns        0.641 ns   1093858304 bytes_per_second=2.90766Ti/s
BM_format_string<wchar_t>/1024         0.517 ns        0.516 ns   1354798080 bytes_per_second=7.21845Ti/s
BM_format_string<wchar_t>/2048         0.469 ns        0.467 ns   1491910656 bytes_per_second=15.9386Ti/s
BM_format_string<wchar_t>/4096         0.794 ns        0.792 ns    883822592 bytes_per_second=18.8156Ti/s
BM_format_string<wchar_t>/8192         0.722 ns        0.720 ns    971718656 bytes_per_second=41.3996Ti/s
BM_format_string<wchar_t>/16384        0.703 ns        0.701 ns    998031360 bytes_per_second=85.0369Ti/s
BM_format_string<wchar_t>/32768        0.724 ns        0.720 ns    971538432 bytes_per_second=165.467Ti/s
BM_format_string<wchar_t>/65536        0.745 ns        0.744 ns    941228032 bytes_per_second=320.665Ti/s
BM_format_string<wchar_t>/131072       0.742 ns        0.740 ns    945422336 bytes_per_second=644.032Ti/s
BM_format_string<wchar_t>/262144        2.99 ns         2.98 ns    234881024 bytes_per_second=319.75Ti/s
BM_format_string<wchar_t>/524288        3.31 ns         3.30 ns    212860928 bytes_per_second=578.085Ti/s
BM_format_string<wchar_t>/1048576       2.69 ns         2.68 ns    260046848 bytes_per_second=1.39081Pi/s

Comparison
Benchmark                                           Time             CPU      Time Old      Time New       CPU Old       CPU New
--------------------------------------------------------------------------------------------------------------------------------
BM_format_string<char>/1                         -0.1122         -0.1143            55            49            55            49
BM_format_string<char>/2                         -0.0999         -0.1020            28            25            28            25
BM_format_string<char>/4                         -0.1148         -0.1169            14            12            14            12
BM_format_string<char>/8                         -0.1186         -0.1207             7             6             7             6
BM_format_string<char>/16                        -0.1222         -0.1243             4             3             4             3
BM_format_string<char>/32                        -0.1417         -0.1438             2             2             2             2
BM_format_string<char>/64                        -0.1407         -0.1427             1             1             1             1
BM_format_string<char>/128                       -0.1559         -0.1579             1             1             1             1
BM_format_string<char>/256                       +0.0249         +0.0222             0             0             0             0
BM_format_string<char>/512                       -0.2473         -0.2491             0             0             0             0
BM_format_string<char>/1024                      -0.4661         -0.4674             0             0             0             0
BM_format_string<char>/2048                      -0.5096         -0.5111             0             0             0             0
BM_format_string<char>/4096                      -0.6278         -0.6287             0             0             0             0
BM_format_string<char>/8192                      -0.6674         -0.6683             0             0             0             0
BM_format_string<char>/16384                     -0.1005         -0.1027             0             0             0             0
BM_format_string<char>/32768                     -0.2127         -0.2147             0             0             0             0
BM_format_string<char>/65536                     -0.5796         -0.5781             0             0             0             0
BM_format_string<char>/131072                    -0.7844         -0.7844             0             0             0             0
BM_format_string<char>/262144                    -0.0073         -0.0086             1             1             1             1
BM_format_string<char>/524288                    +0.0017         +0.0005             1             1             1             1
BM_format_string<char>/1048576                   +0.0039         +0.0037             1             1             1             1
BM_format_string<wchar_t>/1                      -0.1304         -0.1307            56            49            56            49
BM_format_string<wchar_t>/2                      -0.1285         -0.1286            28            24            28            24
BM_format_string<wchar_t>/4                      -0.1288         -0.1288            14            12            14            12
BM_format_string<wchar_t>/8                      -0.1547         -0.1547             9             8             9             8
BM_format_string<wchar_t>/16                     -0.1681         -0.1681             5             4             5             4
BM_format_string<wchar_t>/32                     -0.1579         -0.1580             2             2             2             2
BM_format_string<wchar_t>/64                     -0.2443         -0.2444             1             1             1             1
BM_format_string<wchar_t>/128                    -0.4400         -0.4399             1             1             1             1
BM_format_string<wchar_t>/256                    -0.0535         -0.0535             1             1             1             1
BM_format_string<wchar_t>/512                    -0.0563         -0.0563             1             1             1             1
BM_format_string<wchar_t>/1024                   -0.1018         -0.1018             1             1             1             1
BM_format_string<wchar_t>/2048                   -0.0910         -0.0910             1             0             1             0
BM_format_string<wchar_t>/4096                   +0.4528         +0.4527             1             1             1             1
BM_format_string<wchar_t>/8192                   +0.3121         +0.3124             1             1             1             1
BM_format_string<wchar_t>/16384                  +0.2059         +0.2059             1             1             1             1
BM_format_string<wchar_t>/32768                  +0.1316         +0.1290             1             1             1             1
BM_format_string<wchar_t>/65536                  +0.1604         +0.1610             1             1             1             1
BM_format_string<wchar_t>/131072                 +0.1326         +0.1326             1             1             1             1
BM_format_string<wchar_t>/262144                 +0.1181         +0.1179             3             3             3             3
BM_format_string<wchar_t>/524288                 +0.5515         +0.5527             2             3             2             3
BM_format_string<wchar_t>/1048576                +0.1007         +0.1011             2             3             2             3
OVERALL_GEOMEAN                                  -0.1685         -0.1693             0             0             0             0
---
 libcxx/include/__format/buffer.h              | 256 ++++++++++++++++--
 libcxx/include/__format/format_functions.h    |  24 +-
 .../test/libcxx/transitive_includes/cxx03.csv |  18 --
 .../test/libcxx/transitive_includes/cxx11.csv |  18 --
 .../test/libcxx/transitive_includes/cxx14.csv |  18 --
 .../test/libcxx/transitive_includes/cxx17.csv |   8 -
 6 files changed, 250 insertions(+), 92 deletions(-)

diff --git a/libcxx/include/__format/buffer.h b/libcxx/include/__format/buffer.h
index 8598f0a1c0395..bfe26bfb01cb5 100644
--- a/libcxx/include/__format/buffer.h
+++ b/libcxx/include/__format/buffer.h
@@ -29,6 +29,7 @@
 #include <__iterator/wrap_iter.h>
 #include <__memory/addressof.h>
 #include <__memory/allocate_at_least.h>
+#include <__memory/allocator.h>
 #include <__memory/allocator_traits.h>
 #include <__memory/construct_at.h>
 #include <__memory/ranges_construct_at.h>
@@ -58,23 +59,156 @@ namespace __format {
 /// This helper is used together with the @ref back_insert_iterator to offer
 /// type-erasure for the formatting functions. This reduces the number to
 /// template instantiations.
+///
+/// The design of the class is being changed to improve performance and do some
+/// code cleanups.
+/// The original design (as shipped up to LLVM-19) uses the following design:
+/// - There is an external object that connects the buffer to the output.
+/// - The class constructor stores a function pointer to a grow function and a
+///   type-erased pointer to the object that does the grow.
+/// - When writing data to the buffer would exceed the external buffer's
+///   capacity it requests the external buffer to flush its contents.
+///
+/// The new design tries to solve some issues with the current design:
+/// - The buffer used is a fixed-size buffer; benchmarking shows that using a
+///   dynamically allocated buffer has performance benefits.
+/// - Implementing P3107R5 "Permit an efficient implementation of std::print"
+///   is not trivial with the current buffers. Using the code from this series
+///   makes it trivial.
+///
+/// This class is ABI-tagged; still, the new design does not change the size of
+/// objects of this class.
+///
+/// The new design contains information regarding format_to_n changes, these
+/// will be implemented in a follow-up patch.
+///
+/// The new design is the following.
+/// - There is an external object that connects the buffer to the output.
+/// - This buffer object:
+///   - inherits publicly from this class.
+///   - has a static or dynamic buffer.
+///   - has a static member function to make space in its buffer for write
+///     operations. This can be done by increasing the size of the internal
+///     buffer or by writing the contents of the buffer to the output iterator.
+///
+///     This member function is a constructor argument, so its name is not
+///     fixed. The code uses the name __prepare_write.
+/// - The number of output code units can be limited by a __max_output_size
+///   object. This is used in format_to_n. This object:
+///   - Contains the maximum number of code units to be written.
+///   - Contains the number of code units that are requested to be written.
+///     This number is returned to the user of format_to_n.
+///   - The write functions call the object's __request_write member function.
+///     This function:
+///     - Updates the number of code units that are requested to be written.
+///     - Returns the number of code units that can be written without
+///       exceeding the maximum number of code units to be written.
+///
+/// Documentation for the buffer usage members:
+/// - __ptr_ the start of the buffer.
+/// - __capacity_ the number of code units that can be written.
+///   This means [__ptr_, __ptr_ + __capacity_) is a valid range to write to.
+/// - __size_ the number of code units written in the buffer. The next code
+///   unit will be written at __ptr_ + __size_. This __size_ may NOT contain
+///   the total number of code units written by the __output_buffer. Whether or
+///   not it does depends on the sub-class used. Typically the total number of
+///   code units written is not interesting. It is interesting for format_to_n
+///   which has its own way to track this number.
+///
+/// Documentation for the buffer changes function:
+/// The subclasses have a function with the following signature:
+///
+///   static void __prepare_write(
+///     __output_buffer<_CharT>& __buffer, size_t __code_units);
+///
+/// This function is called when a write function writes more code units than
+/// the buffer's available space. When a __max_output_size object is provided
+/// the number of code units is the number of code units returned from
+/// __max_output_size::__request_write function.
+///
+/// - The __buffer contains *this. Since the class containing this function
+///   inherits from __output_buffer it's safe to cast it to the subclass being
+///   used.
+/// - The __code_units is the number of code units the caller will write + 1.
+///   - This value does not take the available space of the buffer into account.
+///   - The push_back function is more efficient when writing before resizing,
+///     this means the buffer should always have room for one code unit. Hence
+///     the + 1 is the size.
+/// - When the function returns there is room for at least one code unit. There
+///   is no requirement there is room for __code_units code units:
+///   - The class has some "bulk" operations. For example, __copy which copies
+///     the contents of a basic_string_view to the output. If the sub-class has
+///     a fixed size buffer the size of the basic_string_view may be larger
+///     than the buffer. In that case it's impossible to honor the requested
+///     size.
+///   - The at least one code unit makes sure the entire output can be written.
+///     (Obviously making room one code unit at a time is slow and
+///     it's recommended to return a larger available space.)
+///   - When the buffer has room for at least one code unit the function may be
+///     a no-op.
+/// - When the function makes space for more code units it uses one of these
+///   functions to signal the change:
+///   - __buffer_flushed()
+///     - This function is typically used for a fixed sized buffer.
+///     - The current contents of [__ptr_, __ptr_ + __size_) have been
+///       processed.
+///     - __ptr_ remains unchanged.
+///     - __capacity_ remains unchanged.
+///     - __size_ will be set to 0.
+///   - __buffer_moved(_CharT* __ptr, size_t __capacity)
+///     - This function is typically used for a dynamic sized buffer. There the
+///       location of the buffer changes due to reallocations.
+///     - __ptr_ will be set to __ptr. (This value may be the old value of
+///       __ptr_).
+///     - __capacity_ will be set to __capacity. (This value may be the old
+///       value of __capacity_).
+///     - __size_ remains unchanged.
+///     - The range [__ptr, __ptr + __size_) contains the original data of the
+///       range [__ptr_, __ptr_ + __size_).
+///
+/// The push_back function expects a valid buffer and a capacity of at least 1.
+/// This means one of the following holds:
+/// - the class is constructed with a valid buffer,
+/// - __buffer_moved is called with a valid buffer before the first write
+///   operation,
+/// - no write function is ever called, or
+/// - the class is constructed with a __max_output_size object with __max_size 0.
+///
+/// The last option allows formatted_size to use the output buffer without
+/// ever writing anything to the buffer.
 template <__fmt_char_type _CharT>
 class _LIBCPP_TEMPLATE_VIS __output_buffer {
 public:
-  using value_type = _CharT;
+  using value_type           = _CharT;
+  using __prepare_write_type = void (*)(__output_buffer<_CharT>&, size_t);
 
-  template <class _Tp>
+  template <class _Tp> // Deprecated LLVM-19 function.
   _LIBCPP_HIDE_FROM_ABI explicit __output_buffer(_CharT* __ptr, size_t __capacity, _Tp* __obj)
       : __ptr_(__ptr),
         __capacity_(__capacity),
         __flush_([](_CharT* __p, size_t __n, void* __o) { static_cast<_Tp*>(__o)->__flush(__p, __n); }),
         __obj_(__obj) {}
 
+  // New LLVM-20 function.
+  [[nodiscard]]
+  _LIBCPP_HIDE_FROM_ABI explicit __output_buffer(_CharT* __ptr, size_t __capacity, __prepare_write_type __prepare_write)
+      : __ptr_(__ptr), __capacity_(__capacity), __prepare_write_(__prepare_write), __obj_(nullptr) {}
+
+  // Deprecated LLVM-19 function.
   _LIBCPP_HIDE_FROM_ABI void __reset(_CharT* __ptr, size_t __capacity) {
     __ptr_      = __ptr;
     __capacity_ = __capacity;
   }
 
+  // New LLVM-20 function.
+  _LIBCPP_HIDE_FROM_ABI void __buffer_flushed() { __size_ = 0; }
+
+  // New LLVM-20 function.
+  _LIBCPP_HIDE_FROM_ABI void __buffer_moved(_CharT* __ptr, size_t __capacity) {
+    __ptr_      = __ptr;
+    __capacity_ = __capacity;
+  }
+
   _LIBCPP_HIDE_FROM_ABI auto __make_output_iterator() { return std::back_insert_iterator{*this}; }
 
   // Used in std::back_insert_iterator.
@@ -84,7 +218,7 @@ class _LIBCPP_TEMPLATE_VIS __output_buffer {
     // Profiling showed flushing after adding is more efficient than flushing
     // when entering the function.
     if (__size_ == __capacity_)
-      __flush();
+      __flush(0);
   }
 
   /// Copies the input __str to the buffer.
@@ -107,7 +241,7 @@ class _LIBCPP_TEMPLATE_VIS __output_buffer {
     size_t __n = __str.size();
 
     __flush_on_overflow(__n);
-    if (__n < __capacity_) { //  push_back requires the buffer to have room for at least one character (so use <).
+    if (__n < __capacity_) { // push_back requires the buffer to have room for at least one character (so use <).
       std::copy_n(__str.data(), __n, std::addressof(__ptr_[__size_]));
       __size_ += __n;
       return;
@@ -118,12 +252,12 @@ class _LIBCPP_TEMPLATE_VIS __output_buffer {
     _LIBCPP_ASSERT_INTERNAL(__size_ == 0, "the buffer should be flushed by __flush_on_overflow");
     const _InCharT* __first = __str.data();
     do {
-      size_t __chunk = std::min(__n, __capacity_);
+      size_t __chunk = std::min(__n, __capacity_ - __size_);
       std::copy_n(__first, __chunk, std::addressof(__ptr_[__size_]));
       __size_ = __chunk;
       __first += __chunk;
       __n -= __chunk;
-      __flush();
+      __flush(__n);
     } while (__n);
   }
 
@@ -148,12 +282,12 @@ class _LIBCPP_TEMPLATE_VIS __output_buffer {
     // Transform the data in "__capacity_" sized chunks.
     _LIBCPP_ASSERT_INTERNAL(__size_ == 0, "the buffer should be flushed by __flush_on_overflow");
     do {
-      size_t __chunk = std::min(__n, __capacity_);
+      size_t __chunk = std::min(__n, __capacity_ - __size_);
       std::transform(__first, __first + __chunk, std::addressof(__ptr_[__size_]), __operation);
       __size_ = __chunk;
       __first += __chunk;
       __n -= __chunk;
-      __flush();
+      __flush(__n);
     } while (__n);
   }
 
@@ -174,20 +308,36 @@ class _LIBCPP_TEMPLATE_VIS __output_buffer {
       std::fill_n(std::addressof(__ptr_[__size_]), __chunk, __value);
       __size_ = __chunk;
       __n -= __chunk;
-      __flush();
+      __flush(__n);
     } while (__n);
   }
 
-  _LIBCPP_HIDE_FROM_ABI void __flush() {
-    __flush_(__ptr_, __size_, __obj_);
-    __size_ = 0;
+  _LIBCPP_HIDE_FROM_ABI void __flush(size_t __size_hint) {
+    if (__obj_) {
+      // LLVM-19 code path
+      __flush_(__ptr_, __size_, __obj_);
+      __size_ = 0;
+    } else {
+      // LLVM-20 code path
+      __prepare_write_(*this, __size_hint + 1); // + 1 to always have space for the next time
+    }
   }
 
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI size_t __capacity() const { return __capacity_; }
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI size_t __size() const { return __size_; }
+
 private:
   _CharT* __ptr_;
   size_t __capacity_;
   size_t __size_{0};
-  void (*__flush_)(_CharT*, size_t, void*);
+  union {
+    // LLVM-19 member
+    void (*__flush_)(_CharT*, size_t, void*);
+    // LLVM-20 member
+    void (*__prepare_write_)(__output_buffer<_CharT>&, size_t);
+  };
+  static_assert(sizeof(__flush_) == sizeof(__prepare_write_), "The union is an ABI break.");
+  static_assert(alignof(decltype(__flush_)) == alignof(decltype(__prepare_write_)), "The union is an ABI break.");
   void* __obj_;
 
   /// Flushes the buffer when the output operation would overflow the buffer.
@@ -224,11 +374,14 @@ class _LIBCPP_TEMPLATE_VIS __output_buffer {
   /// the buffers. This would make the code more complex and \ref format_to_n is
   /// not the most common use case. Therefore the optimization isn't done.
   _LIBCPP_HIDE_FROM_ABI void __flush_on_overflow(size_t __n) {
-    if (__size_ + __n >= __capacity_)
-      __flush();
+    __n += __size_;
+    if (__n >= __capacity_)
+      __flush(__n - __capacity_ + 1);
   }
 };
 
+// ***** ***** ***** LLVM-19 classes ***** ***** *****
+
 /// A storage using an internal buffer.
 ///
 /// This storage is used when writing a single element to the output iterator
@@ -373,7 +526,7 @@ class _LIBCPP_TEMPLATE_VIS __format_buffer {
   _LIBCPP_HIDE_FROM_ABI void __flush(_CharT* __ptr, size_t __n) { __writer_.__flush(__ptr, __n); }
 
   _LIBCPP_HIDE_FROM_ABI _OutIt __out_it() && {
-    __output_.__flush();
+    __output_.__flush(0);
     return std::move(__writer_).__out_it();
   }
 
@@ -395,7 +548,7 @@ class _LIBCPP_TEMPLATE_VIS __formatted_size_buffer {
   _LIBCPP_HIDE_FROM_ABI void __flush(const _CharT*, size_t __n) { __size_ += __n; }
 
   _LIBCPP_HIDE_FROM_ABI size_t __result() && {
-    __output_.__flush();
+    __output_.__flush(0);
     return __size_;
   }
 
@@ -499,11 +652,78 @@ struct _LIBCPP_TEMPLATE_VIS __format_to_n_buffer final
   _LIBCPP_HIDE_FROM_ABI auto __make_output_iterator() { return this->__output_.__make_output_iterator(); }
 
   _LIBCPP_HIDE_FROM_ABI format_to_n_result<_OutIt> __result() && {
-    this->__output_.__flush();
+    this->__output_.__flush(0);
     return {std::move(this->__writer_).__out_it(), this->__size_};
   }
 };
 
+// ***** ***** ***** LLVM-20 classes ***** ***** *****
+
+// A dynamically growing buffer.
+template <__fmt_char_type _CharT>
+class _LIBCPP_TEMPLATE_VIS __allocating_buffer : public __output_buffer<_CharT> {
+public:
+  __allocating_buffer(const __allocating_buffer&)            = delete;
+  __allocating_buffer& operator=(const __allocating_buffer&) = delete;
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI __allocating_buffer()
+      : __output_buffer<_CharT>{__buffer_, __buffer_size_, __prepare_write} {}
+
+  _LIBCPP_HIDE_FROM_ABI ~__allocating_buffer() {
+    if (__ptr_ != __buffer_) {
+      ranges::destroy_n(__ptr_, this->__size());
+      allocator_traits<_Alloc>::deallocate(__alloc_, __ptr_, this->__capacity());
+    }
+  }
+
+  [[nodiscard]] _LIBCPP_HIDE_FROM_ABI basic_string_view<_CharT> __view() { return {__ptr_, this->__size()}; }
+
+private:
+  // At the moment the allocator is hard-coded. There might be reasons to have
+  // an allocator trait in the future. This ensures forward compatibility.
+  using _Alloc = allocator<_CharT>;
+  _LIBCPP_NO_UNIQUE_ADDRESS _Alloc __alloc_;
+
+  // Since allocating is expensive the class has a small internal buffer. When
+  // its capacity is exceeded a dynamic buffer will be allocated.
+  static constexpr size_t __buffer_size_ = 256;
+  _CharT __buffer_[__buffer_size_];
+  _CharT* __ptr_{__buffer_};
+
+  _LIBCPP_HIDE_FROM_ABI void __grow_buffer(size_t __capacity) {
+    if (__capacity < __buffer_size_)
+      return;
+
+    _LIBCPP_ASSERT_INTERNAL(__capacity > this->__capacity(), "the buffer must grow");
+    auto __result = std::__allocate_at_least(__alloc_, __capacity);
+    auto __guard  = std::__make_exception_guard([&] {
+      allocator_traits<_Alloc>::deallocate(__alloc_, __result.ptr, __result.count);
+    });
+    // This shouldn't throw, but just to be safe. Note that at -O1 this
+    // guard is optimized away so there is no runtime overhead.
+    new (__result.ptr) _CharT[__result.count];
+    std::copy_n(__ptr_, this->__size(), __result.ptr);
+    __guard.__complete();
+    if (__ptr_ != __buffer_) {
+      ranges::destroy_n(__ptr_, this->__capacity());
+      allocator_traits<_Alloc>::deallocate(__alloc_, __ptr_, this->__capacity());
+    }
+
+    __ptr_ = __result.ptr;
+    this->__buffer_moved(__ptr_, __result.count);
+  }
+
+  _LIBCPP_HIDE_FROM_ABI void __prepare_write(size_t __size_hint) {
+    __grow_buffer(std::max<size_t>(this->__capacity() + __size_hint, this->__capacity() * 1.6));
+  }
+
+  _LIBCPP_HIDE_FROM_ABI static void __prepare_write(__output_buffer<_CharT>& __buffer, size_t __size_hint) {
+    static_cast<__allocating_buffer<_CharT>&>(__buffer).__prepare_write(__size_hint);
+  }
+};
+
+// ***** ***** ***** LLVM-19 and LLVM-20 class ***** ***** *****
+
 // A dynamically growing buffer intended to be used for retargeting a context.
 //
 // P2286 Formatting ranges adds range formatting support. It allows the user to
diff --git a/libcxx/include/__format/format_functions.h b/libcxx/include/__format/format_functions.h
index d14b49aff1495..a5c63bd4db70f 100644
--- a/libcxx/include/__format/format_functions.h
+++ b/libcxx/include/__format/format_functions.h
@@ -452,9 +452,9 @@ format_to(_OutIt __out_it, wformat_string<_Args...> __fmt, _Args&&... __args) {
 // fires too eagerly, see http://llvm.org/PR61563.
 template <class = void>
 [[nodiscard]] _LIBCPP_ALWAYS_INLINE inline _LIBCPP_HIDE_FROM_ABI string vformat(string_view __fmt, format_args __args) {
-  string __res;
-  std::vformat_to(std::back_inserter(__res), __fmt, __args);
-  return __res;
+  __format::__allocating_buffer<char> __buffer;
+  std::vformat_to(__buffer.__make_output_iterator(), __fmt, __args);
+  return string{__buffer.__view()};
 }
 
 #  ifndef _LIBCPP_HAS_NO_WIDE_CHARACTERS
@@ -463,9 +463,9 @@ template <class = void>
 template <class = void>
 [[nodiscard]] _LIBCPP_ALWAYS_INLINE inline _LIBCPP_HIDE_FROM_ABI wstring
 vformat(wstring_view __fmt, wformat_args __args) {
-  wstring __res;
-  std::vformat_to(std::back_inserter(__res), __fmt, __args);
-  return __res;
+  __format::__allocating_buffer<wchar_t> __buffer;
+  std::vformat_to(__buffer.__make_output_iterator(), __fmt, __args);
+  return wstring{__buffer.__view()};
 }
 #  endif
 
@@ -585,9 +585,9 @@ format_to(_OutIt __out_it, locale __loc, wformat_string<_Args...> __fmt, _Args&&
 template <class = void>
 [[nodiscard]] _LIBCPP_ALWAYS_INLINE inline _LIBCPP_HIDE_FROM_ABI string
 vformat(locale __loc, string_view __fmt, format_args __args) {
-  string __res;
-  std::vformat_to(std::back_inserter(__res), std::move(__loc), __fmt, __args);
-  return __res;
+  __format::__allocating_buffer<char> __buffer;
+  std::vformat_to(__buffer.__make_output_iterator(), std::move(__loc), __fmt, __args);
+  return string{__buffer.__view()};
 }
 
 #    ifndef _LIBCPP_HAS_NO_WIDE_CHARACTERS
@@ -596,9 +596,9 @@ vformat(locale __loc, string_view __fmt, format_args __args) {
 template <class = void>
 [[nodiscard]] _LIBCPP_ALWAYS_INLINE inline _LIBCPP_HIDE_FROM_ABI wstring
 vformat(locale __loc, wstring_view __fmt, wformat_args __args) {
-  wstring __res;
-  std::vformat_to(std::back_inserter(__res), std::move(__loc), __fmt, __args);
-  return __res;
+  __format::__allocating_buffer<wchar_t> __buffer;
+  std::vformat_to(__buffer.__make_output_iterator(), std::move(__loc), __fmt, __args);
+  return wstring{__buffer.__view()};
 }
 #    endif
 
diff --git a/libcxx/test/libcxx/transitive_includes/cxx03.csv b/libcxx/test/libcxx/transitive_includes/cxx03.csv
index 51e659f52000b..622fced5ffa40 100644
--- a/libcxx/test/libcxx/transitive_includes/cxx03.csv
+++ b/libcxx/test/libcxx/transitive_includes/cxx03.csv
@@ -245,21 +245,11 @@ experimental/simd limits
 experimental/utility utility
 filesystem compare
 filesystem concepts
-filesystem cstddef
-filesystem cstdint
 filesystem cstdlib
 filesystem cstring
-filesystem ctime
-filesystem iomanip
 filesystem iosfwd
-filesystem limits
-filesystem locale
 filesystem new
-filesystem ratio
-filesystem string
-filesystem string_view
 filesystem system_error
-filesystem type_traits
 filesystem version
 format array
 format cctype
@@ -787,15 +777,7 @@ stack version
 stdexcept cstdlib
 stdexcept exception
 stdexcept iosfwd
-stop_token atomic
-stop_token cstddef
-stop_token cstdint
-stop_token cstring
-stop_token ctime
 stop_token iosfwd
-stop_token limits
-stop_token ratio
-stop_token type_traits
 stop_token version
 streambuf cctype
 streambuf climits
diff --git a/libcxx/test/libcxx/transitive_includes/cxx11.csv b/libcxx/test/libcxx/transitive_includes/cxx11.csv
index 17e85e982729c..11c3c1322c406 100644
--- a/libcxx/test/libcxx/transitive_includes/cxx11.csv
+++ b/libcxx/test/libcxx/transitive_includes/cxx11.csv
@@ -246,21 +246,11 @@ experimental/simd limits
 experimental/utility utility
 filesystem compare
 filesystem concepts
-filesystem cstddef
-filesystem cstdint
 filesystem cstdlib
 filesystem cstring
-filesystem ctime
-filesystem iomanip
 filesystem iosfwd
-filesystem limits
-filesystem locale
 filesystem new
-filesystem ratio
-filesystem string
-filesystem string_view
 filesystem system_error
-filesystem type_traits
 filesystem version
 format array
 format cctype
@@ -793,15 +783,7 @@ stack version
 stdexcept cstdlib
 stdexcept exception
 stdexcept iosfwd
-stop_token atomic
-stop_token cstddef
-stop_token cstdint
-stop_token cstring
-stop_token ctime
 stop_token iosfwd
-stop_token limits
-stop_token ratio
-stop_token type_traits
 stop_token version
 streambuf cctype
 streambuf climits
diff --git a/libcxx/test/libcxx/transitive_includes/cxx14.csv b/libcxx/test/libcxx/transitive_includes/cxx14.csv
index 8aed93da9e6cc..666d5c3896467 100644
--- a/libcxx/test/libcxx/transitive_includes/cxx14.csv
+++ b/libcxx/test/libcxx/transitive_includes/cxx14.csv
@@ -249,21 +249,11 @@ experimental/type_traits type_traits
 experimental/utility utility
 filesystem compare
 filesystem concepts
-filesystem cstddef
-filesystem cstdint
 filesystem cstdlib
 filesystem cstring
-filesystem ctime
-filesystem iomanip
 filesystem iosfwd
-filesystem limits
-filesystem locale
 filesystem new
-filesystem ratio
-filesystem string
-filesystem string_view
 filesystem system_error
-filesystem type_traits
 filesystem version
 format array
 format cctype
@@ -796,15 +786,7 @@ stack version
 stdexcept cstdlib
 stdexcept exception
 stdexcept iosfwd
-stop_token atomic
-stop_token cstddef
-stop_token cstdint
-stop_token cstring
-stop_token ctime
 stop_token iosfwd
-stop_token limits
-stop_token ratio
-stop_token type_traits
 stop_token version
 streambuf cctype
 streambuf climits
diff --git a/libcxx/test/libcxx/transitive_includes/cxx17.csv b/libcxx/test/libcxx/transitive_includes/cxx17.csv
index 2c028462144ee..3a3aa5a894473 100644
--- a/libcxx/test/libcxx/transitive_includes/cxx17.csv
+++ b/libcxx/test/libcxx/transitive_includes/cxx17.csv
@@ -797,15 +797,7 @@ stack version
 stdexcept cstdlib
 stdexcept exception
 stdexcept iosfwd
-stop_token atomic
-stop_token cstddef
-stop_token cstdint
-stop_token cstring
-stop_token ctime
 stop_token iosfwd
-stop_token limits
-stop_token ratio
-stop_token type_traits
 stop_token version
 streambuf cctype
 streambuf climits
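
For readers of the patch, the __prepare_write contract documented in buffer.h can be illustrated with a small stand-alone sketch. This is not the libc++ implementation: the names output_buffer, flushing_buffer and growing_buffer are hypothetical, only char is supported, the bulk copy is simplified, and std::vector stands in for the allocator handling. It only aims to show the two signalling paths, buffer_flushed() for a fixed-size buffer and buffer_moved() for a growing one, and the "write first, then make room" rule that keeps space for one code unit.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical, simplified stand-in for __output_buffer (char only).
class output_buffer {
public:
  using prepare_write_type = void (*)(output_buffer&, std::size_t);

  output_buffer(char* ptr, std::size_t capacity, prepare_write_type fn)
      : ptr_(ptr), capacity_(capacity), prepare_write_(fn) {}

  // Writes first and makes room afterwards, so one code unit always fits.
  void push_back(char c) {
    ptr_[size_++] = c;
    if (size_ == capacity_)
      prepare_write_(*this, 1);
  }

  // Bulk write in chunks: prepare_write only promises room for one code
  // unit, not for the whole hint, so the loop re-checks the free space.
  void copy(std::string_view str) {
    while (!str.empty()) {
      std::size_t chunk = std::min(str.size(), capacity_ - size_);
      std::copy_n(str.data(), chunk, ptr_ + size_);
      size_ += chunk;
      str.remove_prefix(chunk);
      if (size_ == capacity_)
        prepare_write_(*this, str.size() + 1); // remaining units + 1
    }
  }

  std::size_t size() const { return size_; }
  std::size_t capacity() const { return capacity_; }
  const char* data() const { return ptr_; }

protected:
  // Contents were processed elsewhere; pointer and capacity stay the same.
  void buffer_flushed() { size_ = 0; }
  // Storage was (possibly) reallocated; size stays the same.
  void buffer_moved(char* ptr, std::size_t capacity) {
    ptr_      = ptr;
    capacity_ = capacity;
  }

private:
  char* ptr_;
  std::size_t capacity_;
  std::size_t size_ = 0;
  prepare_write_type prepare_write_;
};

// Fixed-size buffer: hands its contents to a sink, then signals a flush.
class flushing_buffer : public output_buffer {
public:
  explicit flushing_buffer(std::string& sink)
      : output_buffer(storage_, sizeof(storage_), prepare_write), sink_(sink) {}
  void finish() { flush(); }

private:
  static void prepare_write(output_buffer& buffer, std::size_t) {
    static_cast<flushing_buffer&>(buffer).flush();
  }
  void flush() {
    sink_.append(data(), size());
    buffer_flushed();
  }
  char storage_[8];
  std::string& sink_;
};

// Growing buffer: reallocates and signals the new location and capacity.
class growing_buffer : public output_buffer {
public:
  growing_buffer() : output_buffer(nullptr, 0, prepare_write), storage_(16) {
    // A valid buffer is installed before the first write operation.
    buffer_moved(storage_.data(), storage_.size());
  }
  std::string_view view() const { return {data(), size()}; }

private:
  static void prepare_write(output_buffer& buffer, std::size_t size_hint) {
    auto& self = static_cast<growing_buffer&>(buffer);
    // Same idea as the patch: at least the hint, otherwise grow by ~1.6x.
    std::size_t wanted = std::max(self.capacity() + size_hint, self.capacity() * 8 / 5);
    self.storage_.resize(wanted); // resize preserves the existing contents
    self.buffer_moved(self.storage_.data(), self.storage_.size());
  }
  std::vector<char> storage_;
};
```

In this sketch a flushing_buffer constructed over a std::string sink forwards everything passed to copy() and push_back() to the sink in chunks of at most 8 code units once finish() is called, while a growing_buffer keeps all output in one contiguous range that view() exposes, similar in spirit to how vformat builds the result string from __allocating_buffer::__view().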


