[libcxx-commits] [libcxx] f7c0df0 - [libc++][format] Improve format buffer.
Mark de Wever via libcxx-commits
libcxx-commits at lists.llvm.org
Tue Aug 16 09:54:16 PDT 2022
Author: Mark de Wever
Date: 2022-08-16T18:54:10+02:00
New Revision: f7c0df002a083bcca8ac4972330b8198474a355b
URL: https://github.com/llvm/llvm-project/commit/f7c0df002a083bcca8ac4972330b8198474a355b
DIFF: https://github.com/llvm/llvm-project/commit/f7c0df002a083bcca8ac4972330b8198474a355b.diff
LOG: [libc++][format] Improve format buffer.
Allow bulk output operations on the buffer instead of adding one
code unit at a time. This has a large performance benefit (up to about
93% less time in the integer benchmarks below) at the cost of a larger
binary (about 6% more text for format.libcxx.out). This doesn't
implement @vitaut's earlier suggestion to avoid buffering for
std::string when writing strings. That can be done in a follow-up
patch.
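As an illustration of what "bulk output" means here, the following is a
minimal sketch and not the libc++ implementation. The class name
ChunkedBuffer, its tiny 8-element storage, and the std::string sink are
all hypothetical. It contrasts a per-code-unit push_back with a bulk copy
that flushes when the write would overflow and writes oversized input in
capacity-sized chunks. The strict `<` before the in-buffer copy is a
choice of this sketch, so that its own push_back always has a free slot.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for the format buffer; not the libc++ class.
class ChunkedBuffer {
public:
  explicit ChunkedBuffer(std::string& sink) : sink_(sink) {}

  // One code unit per call, flushing once the buffer is full.
  void push_back(char c) {
    data_[size_++] = c;
    if (size_ == data_.size())
      flush();
  }

  // Bulk write: copies the whole string with as few operations as possible.
  void copy(std::string_view str) {
    std::size_t n = str.size();
    // Flush first when the new data would not fit behind the buffered data.
    if (size_ + n >= data_.size())
      flush();
    // Strict '<' keeps one free slot, so push_back always has room.
    if (n < data_.size()) {
      std::copy_n(str.data(), n, data_.data() + size_);
      size_ += n;
      return;
    }
    // The input is larger than the buffer: write it in capacity-sized chunks.
    const char* first = str.data();
    do {
      std::size_t chunk = std::min(n, data_.size());
      std::copy_n(first, chunk, data_.data());
      size_ = chunk;
      first += chunk;
      n -= chunk;
      flush();
    } while (n != 0);
  }

  // Hands the buffered code units to the sink and empties the buffer.
  void flush() {
    sink_.append(data_.data(), size_);
    size_ = 0;
  }

private:
  std::array<char, 8> data_{}; // tiny on purpose, to exercise the chunking
  std::size_t size_ = 0;
  std::string& sink_;
};
```

Mixing copy and push_back calls on one ChunkedBuffer and ending with
flush() leaves the concatenated input in the sink, regardless of how the
pieces relate to the buffer size.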
There are some minor complications for the non-buffered format_to_n.
When writing one character at a time it's easy to detect when the
limit n is reached; with bulk writes it isn't. This is solved by adding
a small overhead to format_to_n: when the next write would overflow, it
stores the data in the internal buffer and copies from there only the
code units that still fit within n. The overhead hasn't been measured,
but it's expected to be an issue only for small values of n; for larger
values the general improvements will outweigh the new overhead.
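The truncation step for a bounded write can be sketched as follows. This
is a simplified illustration under stated assumptions, not the libc++
classes: BoundedWriter and its members are hypothetical names. Each bulk
write copies only the part that still fits within the limit, while the
running size keeps counting every code unit, which matches how
format_to_n reports the total size that would have been written.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>

// Hypothetical bounded writer; mirrors the idea, not the libc++ classes.
struct BoundedWriter {
  char* out;            // next position in the caller's output range
  std::size_t max_size; // the limit n of a format_to_n-like call
  std::size_t size = 0; // code units produced so far, including dropped ones

  // Bulk write that never stores more than max_size code units in total.
  void write(std::string_view chunk) {
    if (size < max_size) {
      // Only the part that still fits is copied to the output.
      std::size_t fits = std::min(chunk.size(), max_size - size);
      out = std::copy_n(chunk.data(), fits, out);
    }
    // Everything is counted, whether it was stored or dropped.
    size += chunk.size();
  }
};
```

With a limit of 6, writing "abcd", "efgh" and "ij" stores "abcdef" and
leaves size at 10; the second write is the one that gets split.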
```
   text    data     bss     dec     hex filename
 349081    6096     440  355617   56d21 format.libcxx.out-baseline
 344442    6088     440  350970   55afa formatted_size.libcxx.out-baseline
4567980   57272     424 4625676  46950c formatter_float.libcxx.out-baseline
 718800   12472     488  731760   b2a70 formatter_int.libcxx.out-baseline
 376341    6096     552  382989   5d80d format_to.libcxx.out-beaseline
 370169    6096     440  376705   5bf81 format.libcxx.out
 365530    6088     440  372058   5ad5a formatted_size.libcxx.out
4575116   57272     424 4632812  46b0ec formatter_float.libcxx.out
 725936   12472     488  738896   b4650 formatter_int.libcxx.out
 397429    6096     552  404077   62a6d format_to.libcxx.out
```
For very small strings the new method is slightly slower (about 1-3%
for 1 and 2 characters); from 4 characters onward there's already a
small gain.
```
Comparing ./format.libcxx.out-baseline to ./format.libcxx.out
Benchmark                         Time       CPU  Time Old  Time New   CPU Old   CPU New
----------------------------------------------------------------------------------------
BM_format_string<char>/1       +0.0268   +0.0268        43        44        43        44
BM_format_string<char>/2       +0.0133   +0.0133        22        22        22        22
BM_format_string<char>/4       -0.0248   -0.0248        12        11        12        11
BM_format_string<char>/8       -0.0831   -0.0831         6         6         6         6
BM_format_string<char>/16      -0.2976   -0.2976         4         3         4         3
BM_format_string<char>/32      -0.4369   -0.4369         3         2         3         2
BM_format_string<char>/64      -0.6375   -0.6375         3         1         3         1
BM_format_string<char>/128     -0.7685   -0.7685         2         1         2         1
```
The int benchmark shows gains of roughly 20-35% for the simple
formatting, but shines for the complex formatting with alignment or
zero padding, where the time drops by more than 90%:
```
Comparing ./formatter_int.libcxx.out-baseline to ./formatter_int.libcxx.out
Benchmark                                                    Time       CPU  Time Old  Time New   CPU Old   CPU New
-------------------------------------------------------------------------------------------------------------------
BM_Basic<uint32_t>                                        -0.2307   -0.2307        60        46        60        46
BM_Basic<int32_t>                                         -0.1985   -0.1985        61        49        61        49
BM_Basic<uint64_t>                                        -0.3478   -0.3479        81        53        81        53
BM_Basic<int64_t>                                         -0.3475   -0.3475        81        53        81        53
BM_BasicLow<__uint128_t>                                  -0.3388   -0.3388        86        57        86        57
BM_BasicLow<__int128_t>                                   -0.3431   -0.3431        86        57        86        57
BM_Basic<__uint128_t>                                     -0.2822   -0.2822       236       170       236       170
BM_Basic<__int128_t>                                      -0.3107   -0.3107       219       151       219       151
Integral_LocFalse_BaseBin_AlignNone_Int64                 -0.5781   -0.5781       178        75       178        75
Integral_LocFalse_BaseBin_AlignmentLeft_Int64             -0.9231   -0.9231      1156        89      1156        89
Integral_LocFalse_BaseBin_AlignmentCenter_Int64           -0.9179   -0.9179      1107        91      1107        91
Integral_LocFalse_BaseBin_AlignmentRight_Int64            -0.9238   -0.9238      1147        87      1147        87
Integral_LocFalse_BaseBin_ZeroPadding_Int64               -0.9170   -0.9170      1137        94      1137        94
Integral_LocFalse_BaseBin_AlignNone_Uint64                -0.5923   -0.5923       175        71       175        71
Integral_LocFalse_BaseBin_AlignmentLeft_Uint64            -0.9251   -0.9251      1154        86      1154        86
Integral_LocFalse_BaseBin_AlignmentCenter_Uint64          -0.9204   -0.9204      1105        88      1105        88
Integral_LocFalse_BaseBin_AlignmentRight_Uint64           -0.9242   -0.9242      1125        85      1125        85
Integral_LocFalse_BaseBin_ZeroPadding_Uint64              -0.9232   -0.9232      1139        88      1139        88
Integral_LocFalse_BaseOct_AlignNone_Int64                 -0.3241   -0.3241       100        67       100        67
Integral_LocFalse_BaseOct_AlignmentLeft_Int64             -0.9322   -0.9322      1166        79      1166        79
Integral_LocFalse_BaseOct_AlignmentCenter_Int64           -0.9251   -0.9251      1108        83      1108        83
Integral_LocFalse_BaseOct_AlignmentRight_Int64            -0.9303   -0.9303      1136        79      1136        79
Integral_LocFalse_BaseOct_ZeroPadding_Int64               -0.9264   -0.9264      1156        85      1156        85
Integral_LocFalse_BaseOct_AlignNone_Uint64                -0.3116   -0.3116        96        66        96        66
Integral_LocFalse_BaseOct_AlignmentLeft_Uint64            -0.9310   -0.9310      1168        81      1168        81
Integral_LocFalse_BaseOct_AlignmentCenter_Uint64          -0.9281   -0.9281      1128        81      1128        81
Integral_LocFalse_BaseOct_AlignmentRight_Uint64           -0.9299   -0.9299      1148        80      1148        80
Integral_LocFalse_BaseOct_ZeroPadding_Uint64              -0.9288   -0.9288      1153        82      1153        82
Integral_LocFalse_BaseDec_AlignNone_Int64                 -0.3342   -0.3342        95        63        95        63
Integral_LocFalse_BaseDec_AlignmentLeft_Int64             -0.9360   -0.9360      1157        74      1157        74
Integral_LocFalse_BaseDec_AlignmentCenter_Int64           -0.9303   -0.9303      1128        79      1128        79
Integral_LocFalse_BaseDec_AlignmentRight_Int64            -0.9369   -0.9369      1164        73      1164        73
Integral_LocFalse_BaseDec_ZeroPadding_Int64               -0.9323   -0.9323      1157        78      1157        78
Integral_LocFalse_BaseDec_AlignNone_Uint64                -0.3198   -0.3198        93        63        93        63
Integral_LocFalse_BaseDec_AlignmentLeft_Uint64            -0.9351   -0.9351      1158        75      1158        75
Integral_LocFalse_BaseDec_AlignmentCenter_Uint64          -0.9298   -0.9298      1128        79      1128        79
Integral_LocFalse_BaseDec_AlignmentRight_Uint64           -0.9361   -0.9361      1157        74      1157        74
Integral_LocFalse_BaseDec_ZeroPadding_Uint64              -0.9333   -0.9333      1151        77      1151        77
Integral_LocFalse_BaseHex_AlignNone_Int64                 -0.3020   -0.3020        89        62        89        62
Integral_LocFalse_BaseHex_AlignmentLeft_Int64             -0.9357   -0.9357      1174        75      1174        75
Integral_LocFalse_BaseHex_AlignmentCenter_Int64           -0.9319   -0.9319      1129        77      1129        77
Integral_LocFalse_BaseHex_AlignmentRight_Int64            -0.9350   -0.9350      1161        75      1161        75
Integral_LocFalse_BaseHex_ZeroPadding_Int64               -0.9293   -0.9293      1150        81      1150        81
Integral_LocFalse_BaseHex_AlignNone_Uint64                -0.3056   -0.3057        86        59        86        59
Integral_LocFalse_BaseHex_AlignmentLeft_Uint64            -0.9378   -0.9378      1174        73      1174        73
Integral_LocFalse_BaseHex_AlignmentCenter_Uint64          -0.9341   -0.9341      1129        74      1130        74
Integral_LocFalse_BaseHex_AlignmentRight_Uint64           -0.9361   -0.9361      1157        74      1157        74
Integral_LocFalse_BaseHex_ZeroPadding_Uint64              -0.9315   -0.9315      1147        79      1147        79
Integral_LocFalse_BaseHexUpper_AlignNone_Int64            -0.0019   -0.0019        91        90        91        90
Integral_LocFalse_BaseHexUpper_AlignmentLeft_Int64        -0.9099   -0.9099      1162       105      1162       105
Integral_LocFalse_BaseHexUpper_AlignmentCenter_Int64      -0.9041   -0.9041      1121       108      1121       108
Integral_LocFalse_BaseHexUpper_AlignmentRight_Int64       -0.9086   -0.9086      1162       106      1162       106
Integral_LocFalse_BaseHexUpper_ZeroPadding_Int64          -0.9057   -0.9057      1164       110      1164       110
Integral_LocFalse_BaseHexUpper_AlignNone_Uint64           +0.0110   +0.0110        86        87        86        87
Integral_LocFalse_BaseHexUpper_AlignmentLeft_Uint64       -0.9136   -0.9136      1161       100      1161       100
Integral_LocFalse_BaseHexUpper_AlignmentCenter_Uint64     -0.9078   -0.9078      1133       104      1133       104
Integral_LocFalse_BaseHexUpper_AlignmentRight_Uint64      -0.9132   -0.9132      1177       102      1177       102
Integral_LocFalse_BaseHexUpper_ZeroPadding_Uint64         -0.9091   -0.9091      1160       105      1160       105
```
Other benchmarks give similar results.
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D129964
Added:
Modified:
libcxx/include/__format/buffer.h
libcxx/include/__format/formatter_floating_point.h
libcxx/include/__format/formatter_integral.h
libcxx/include/__format/formatter_output.h
libcxx/test/std/utilities/format/format.formatter/format.formatter.spec/formatter.unsigned_integral.pass.cpp
libcxx/test/std/utilities/format/format.functions/format_tests.h
Removed:
################################################################################
diff --git a/libcxx/include/__format/buffer.h b/libcxx/include/__format/buffer.h
index 1837ad06ba18e..4f972264e481f 100644
--- a/libcxx/include/__format/buffer.h
+++ b/libcxx/include/__format/buffer.h
@@ -11,8 +11,10 @@
#define _LIBCPP___FORMAT_BUFFER_H
#include <__algorithm/copy_n.h>
+#include <__algorithm/fill_n.h>
#include <__algorithm/max.h>
#include <__algorithm/min.h>
+#include <__algorithm/transform.h>
#include <__algorithm/unwrap_iter.h>
#include <__config>
#include <__format/enable_insertable.h>
@@ -26,6 +28,7 @@
#include <__utility/move.h>
#include <concepts>
#include <cstddef>
+#include <string_view>
#include <type_traits>
#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
@@ -69,8 +72,6 @@ class _LIBCPP_TEMPLATE_VIS __output_buffer {
return back_insert_iterator{*this};
}
- // TODO FMT It would be nice to have an overload taking a
- // basic_string_view<_CharT> and append it directly.
_LIBCPP_HIDE_FROM_ABI void push_back(_CharT __c) {
__ptr_[__size_++] = __c;
@@ -80,6 +81,95 @@ class _LIBCPP_TEMPLATE_VIS __output_buffer {
flush();
}
+ /// Copies the input __str to the buffer.
+ ///
+ /// Since some of the input is generated by std::to_chars, there needs to be a
+ /// conversion when _CharT is wchar_t.
+ template <__formatter::__char_type _InCharT>
+ _LIBCPP_HIDE_FROM_ABI void __copy(basic_string_view<_InCharT> __str) {
+ // When the underlying iterator is a simple iterator the __capacity_ is
+ // infinite. For a string or container back_inserter it isn't. This means
+ // adding a large string the the buffer can cause some overhead. In that
+ // case a better approach could be:
+ // - flush the buffer
+ // - container.append(__str.begin(), __str.end());
+ // The same holds true for the fill.
+ // For transform it might be slightly harder, however the use case for
+ // transform is slightly less common; it converts hexadecimal values to
+ // upper case. For integral these strings are short.
+ // TODO FMT Look at the improvements above.
+ size_t __n = __str.size();
+
+ __flush_on_overflow(__n);
+ if (__n <= __capacity_) {
+ _VSTD::copy_n(__str.data(), __n, _VSTD::addressof(__ptr_[__size_]));
+ __size_ += __n;
+ return;
+ }
+
+ // The output doesn't fit in the internal buffer.
+ // Copy the data in "__capacity_" sized chunks.
+ _LIBCPP_ASSERT(__size_ == 0, "the buffer should be flushed by __flush_on_overflow");
+ const _InCharT* __first = __str.data();
+ do {
+ size_t __chunk = _VSTD::min(__n, __capacity_);
+ _VSTD::copy_n(__first, __chunk, _VSTD::addressof(__ptr_[__size_]));
+ __size_ = __chunk;
+ __first += __chunk;
+ __n -= __chunk;
+ flush();
+ } while (__n);
+ }
+
+ /// A std::transform wrapper.
+ ///
+ /// Like @ref __copy it may need to do type conversion.
+ template <__formatter::__char_type _InCharT, class _UnaryOperation>
+ _LIBCPP_HIDE_FROM_ABI void __transform(const _InCharT* __first, const _InCharT* __last, _UnaryOperation __operation) {
+ _LIBCPP_ASSERT(__first <= __last, "not a valid range");
+
+ size_t __n = static_cast<size_t>(__last - __first);
+ __flush_on_overflow(__n);
+ if (__n <= __capacity_) {
+ _VSTD::transform(__first, __last, _VSTD::addressof(__ptr_[__size_]), _VSTD::move(__operation));
+ __size_ += __n;
+ return;
+ }
+
+ // The output doesn't fit in the internal buffer.
+ // Transform the data in "__capacity_" sized chunks.
+ _LIBCPP_ASSERT(__size_ == 0, "the buffer should be flushed by __flush_on_overflow");
+ do {
+ size_t __chunk = _VSTD::min(__n, __capacity_);
+ _VSTD::transform(__first, __first + __chunk, _VSTD::addressof(__ptr_[__size_]), __operation);
+ __size_ = __chunk;
+ __first += __chunk;
+ __n -= __chunk;
+ flush();
+ } while (__n);
+ }
+
+ /// A \c fill_n wrapper.
+ _LIBCPP_HIDE_FROM_ABI void __fill(size_t __n, _CharT __value) {
+ __flush_on_overflow(__n);
+ if (__n <= __capacity_) {
+ _VSTD::fill_n(_VSTD::addressof(__ptr_[__size_]), __n, __value);
+ __size_ += __n;
+ return;
+ }
+
+ // The output doesn't fit in the internal buffer.
+ // Fill the buffer in "__capacity_" sized chunks.
+ _LIBCPP_ASSERT(__size_ == 0, "the buffer should be flushed by __flush_on_overflow");
+ do {
+ size_t __chunk = _VSTD::min(__n, __capacity_);
+ _VSTD::fill_n(_VSTD::addressof(__ptr_[__size_]), __chunk, __value);
+ __size_ = __chunk;
+ __n -= __chunk;
+ flush();
+ } while (__n);
+ }
+
_LIBCPP_HIDE_FROM_ABI void flush() {
__flush_(__ptr_, __size_, __obj_);
__size_ = 0;
@@ -91,6 +181,44 @@ class _LIBCPP_TEMPLATE_VIS __output_buffer {
size_t __size_{0};
void (*__flush_)(_CharT*, size_t, void*);
void* __obj_;
+
+ /// Flushes the buffer when the output operation would overflow the buffer.
+ ///
+ /// A simple approach for the overflow detection would be something along the
+ /// lines:
+ /// \code
+ /// // The internal buffer is large enough.
+ /// if (__n <= __capacity_) {
+ /// // Flush when we really would overflow.
+ /// if (__size_ + __n >= __capacity_)
+ /// flush();
+ /// ...
+ /// }
+ /// \endcode
+ ///
+ /// This approach works for all cases but one:
+ /// A __format_to_n_buffer_base where \ref __enable_direct_output is true.
+ /// In that case the \ref __capacity_ of the buffer changes during the first
+ /// \ref flush. During that operation the output buffer switches from its
+ /// __writer_ to its __storage_. The \ref __capacity_ of the former depends
+ /// on the value of n, of the latter is a fixed size. For example:
+ /// - a format_to_n call with a 10'000 char buffer,
+ /// - the buffer is filled with 9'500 chars,
+ /// - adding 1'000 elements would overflow the buffer so the buffer gets
+ /// changed and the \ref __capacity_ decreases from 10'000 to
+ /// __buffer_size (256 at the time of writing).
+ ///
+ /// This means that the \ref flush for this class may need to copy a part of
+ /// the internal buffer to the proper output. In this example there will be
+ /// 500 characters that need this copy operation.
+ ///
+ /// Note it would be more efficient to write 500 chars directly and then swap
+ /// the buffers. This would make the code more complex and \ref format_to_n is
+ /// not the most common use case. Therefore the optimization isn't done.
+ _LIBCPP_HIDE_FROM_ABI void __flush_on_overflow(size_t __n) {
+ if (__size_ + __n >= __capacity_)
+ flush();
+ }
};
/// A storage using an internal buffer.
@@ -280,12 +408,12 @@ struct _LIBCPP_TEMPLATE_VIS __format_to_n_buffer_base {
using _Size = iter_difference_t<_OutIt>;
public:
- _LIBCPP_HIDE_FROM_ABI explicit __format_to_n_buffer_base(_OutIt __out_it, _Size __n)
- : __writer_(_VSTD::move(__out_it)), __n_(_VSTD::max(_Size(0), __n)) {}
+ _LIBCPP_HIDE_FROM_ABI explicit __format_to_n_buffer_base(_OutIt __out_it, _Size __max_size)
+ : __writer_(_VSTD::move(__out_it)), __max_size_(_VSTD::max(_Size(0), __max_size)) {}
_LIBCPP_HIDE_FROM_ABI void flush(_CharT* __ptr, size_t __size) {
- if (_Size(__size_) <= __n_)
- __writer_.flush(__ptr, _VSTD::min(_Size(__size), __n_ - __size_));
+ if (_Size(__size_) <= __max_size_)
+ __writer_.flush(__ptr, _VSTD::min(_Size(__size), __max_size_ - __size_));
__size_ += __size;
}
@@ -294,7 +422,7 @@ struct _LIBCPP_TEMPLATE_VIS __format_to_n_buffer_base {
__output_buffer<_CharT> __output_{__storage_.begin(), __storage_.__buffer_size, this};
typename __writer_selector<_OutIt, _CharT>::type __writer_;
- _Size __n_;
+ _Size __max_size_;
_Size __size_{0};
};
@@ -310,24 +438,35 @@ class _LIBCPP_TEMPLATE_VIS __format_to_n_buffer_base<_OutIt, _CharT, true> {
using _Size = iter_difference_t<_OutIt>;
public:
- _LIBCPP_HIDE_FROM_ABI explicit __format_to_n_buffer_base(_OutIt __out_it, _Size __n)
- : __output_(_VSTD::__unwrap_iter(__out_it), __n, this), __writer_(_VSTD::move(__out_it)) {
- if (__n <= 0) [[unlikely]]
+ _LIBCPP_HIDE_FROM_ABI explicit __format_to_n_buffer_base(_OutIt __out_it, _Size __max_size)
+ : __output_(_VSTD::__unwrap_iter(__out_it), __max_size, this),
+ __writer_(_VSTD::move(__out_it)),
+ __max_size_(__max_size) {
+ if (__max_size <= 0) [[unlikely]]
__output_.reset(__storage_.begin(), __storage_.__buffer_size);
}
_LIBCPP_HIDE_FROM_ABI void flush(_CharT* __ptr, size_t __size) {
- // A flush to the direct writer happens in two occasions:
+ // A flush to the direct writer happens in the following occasions:
// - The format function has written the maximum number of allowed code
// units. At this point it's no longer valid to write to this writer. So
// switch to the internal storage. This internal storage doesn't need to
// be written anywhere so the flush for that storage writes no output.
+ // - Like above, but the next "mass write" operation would overflow the
+ // buffer. In that case the buffer is pre-emptively switched. The still
+ // valid code units will be written separately.
// - The format_to_n function is finished. In this case there's no need to
// switch the buffer, but for simplicity the buffers are still switched.
- // When the __n <= 0 the constructor already switched the buffers.
+ // When the __max_size <= 0 the constructor already switched the buffers.
if (__size_ == 0 && __ptr != __storage_.begin()) {
__writer_.flush(__ptr, __size);
__output_.reset(__storage_.begin(), __storage_.__buffer_size);
+ } else if (__size_ < __max_size_) {
+ // Copies a part of the internal buffer to the output up to n characters.
+ // See __output_buffer<_CharT>::__flush_on_overflow for more information.
+ _Size __s = _VSTD::min(_Size(__size), __max_size_ - __size_);
+ std::copy_n(__ptr, __s, __writer_.out());
+ __writer_.flush(__ptr, __s);
}
__size_ += __size;
@@ -338,6 +477,7 @@ class _LIBCPP_TEMPLATE_VIS __format_to_n_buffer_base<_OutIt, _CharT, true> {
__output_buffer<_CharT> __output_;
__writer_direct<_OutIt, _CharT> __writer_;
+ _Size __max_size_;
_Size __size_{0};
};
@@ -350,7 +490,8 @@ struct _LIBCPP_TEMPLATE_VIS __format_to_n_buffer final
using _Size = iter_difference_t<_OutIt>;
public:
- _LIBCPP_HIDE_FROM_ABI explicit __format_to_n_buffer(_OutIt __out_it, _Size __n) : _Base(_VSTD::move(__out_it), __n) {}
+ _LIBCPP_HIDE_FROM_ABI explicit __format_to_n_buffer(_OutIt __out_it, _Size __max_size)
+ : _Base(_VSTD::move(__out_it), __max_size) {}
_LIBCPP_HIDE_FROM_ABI auto make_output_iterator() { return this->__output_.make_output_iterator(); }
_LIBCPP_HIDE_FROM_ABI format_to_n_result<_OutIt> result() && {
diff --git a/libcxx/include/__format/formatter_floating_point.h b/libcxx/include/__format/formatter_floating_point.h
index 16be89347b8ac..3a65ed436defc 100644
--- a/libcxx/include/__format/formatter_floating_point.h
+++ b/libcxx/include/__format/formatter_floating_point.h
@@ -10,9 +10,7 @@
#ifndef _LIBCPP___FORMAT_FORMATTER_FLOATING_POINT_H
#define _LIBCPP___FORMAT_FORMATTER_FLOATING_POINT_H
-#include <__algorithm/copy.h>
#include <__algorithm/copy_n.h>
-#include <__algorithm/fill_n.h>
#include <__algorithm/find.h>
#include <__algorithm/min.h>
#include <__algorithm/rotate.h>
@@ -528,13 +526,13 @@ _LIBCPP_HIDE_FROM_ABI _OutIt __format_locale_specific_form(
// sign and (zero padding or alignment)
if (__zero_padding && __first != __buffer.begin())
*__out_it++ = *__buffer.begin();
- __out_it = _VSTD::fill_n(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
+ __out_it = __formatter::__fill(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
if (!__zero_padding && __first != __buffer.begin())
*__out_it++ = *__buffer.begin();
// integral part
if (__grouping.empty()) {
- __out_it = _VSTD::copy_n(__first, __digits, _VSTD::move(__out_it));
+ __out_it = __formatter::__copy(__first, __digits, _VSTD::move(__out_it));
} else {
auto __r = __grouping.rbegin();
auto __e = __grouping.rend() - 1;
@@ -546,7 +544,7 @@ _LIBCPP_HIDE_FROM_ABI _OutIt __format_locale_specific_form(
// This loop achieves that process by testing the termination condition
// midway in the loop.
while (true) {
- __out_it = _VSTD::copy_n(__first, *__r, _VSTD::move(__out_it));
+ __out_it = __formatter::__copy(__first, *__r, _VSTD::move(__out_it));
__first += *__r;
if (__r == __e)
@@ -560,16 +558,16 @@ _LIBCPP_HIDE_FROM_ABI _OutIt __format_locale_specific_form(
// fractional part
if (__result.__radix_point != __result.__last) {
*__out_it++ = __np.decimal_point();
- __out_it = _VSTD::copy(__result.__radix_point + 1, __result.__exponent, _VSTD::move(__out_it));
- __out_it = _VSTD::fill_n(_VSTD::move(__out_it), __buffer.__num_trailing_zeros(), _CharT('0'));
+ __out_it = __formatter::__copy(__result.__radix_point + 1, __result.__exponent, _VSTD::move(__out_it));
+ __out_it = __formatter::__fill(_VSTD::move(__out_it), __buffer.__num_trailing_zeros(), _CharT('0'));
}
// exponent
if (__result.__exponent != __result.__last)
- __out_it = _VSTD::copy(__result.__exponent, __result.__last, _VSTD::move(__out_it));
+ __out_it = __formatter::__copy(__result.__exponent, __result.__last, _VSTD::move(__out_it));
// alignment
- return _VSTD::fill_n(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
+ return __formatter::__fill(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
}
# endif // _LIBCPP_HAS_NO_LOCALIZATION
@@ -651,14 +649,15 @@ __format_floating_point(_Tp __value, auto& __ctx, __format_spec::__parsed_specif
if (__size + __num_trailing_zeros >= __specs.__width_) {
if (__num_trailing_zeros && __result.__exponent != __result.__last)
// Insert trailing zeros before exponent character.
- return _VSTD::copy(
+ return __formatter::__copy(
__result.__exponent,
__result.__last,
- _VSTD::fill_n(
- _VSTD::copy(__buffer.begin(), __result.__exponent, __ctx.out()), __num_trailing_zeros, _CharT('0')));
+ __formatter::__fill(__formatter::__copy(__buffer.begin(), __result.__exponent, __ctx.out()),
+ __num_trailing_zeros,
+ _CharT('0')));
- return _VSTD::fill_n(
- _VSTD::copy(__buffer.begin(), __result.__last, __ctx.out()), __num_trailing_zeros, _CharT('0'));
+ return __formatter::__fill(
+ __formatter::__copy(__buffer.begin(), __result.__last, __ctx.out()), __num_trailing_zeros, _CharT('0'));
}
auto __out_it = __ctx.out();
diff --git a/libcxx/include/__format/formatter_integral.h b/libcxx/include/__format/formatter_integral.h
index b9ed5fe80f7f3..834a402081aa6 100644
--- a/libcxx/include/__format/formatter_integral.h
+++ b/libcxx/include/__format/formatter_integral.h
@@ -243,7 +243,7 @@ _LIBCPP_HIDE_FROM_ABI auto __format_integer(
// The zero padding is done like:
// - Write [sign][prefix]
// - Write data right aligned with '0' as fill character.
- __out_it = _VSTD::copy(__begin, __first, _VSTD::move(__out_it));
+ __out_it = __formatter::__copy(__begin, __first, _VSTD::move(__out_it));
__specs.__alignment_ = __format_spec::__alignment::__right;
__specs.__fill_ = _CharT('0');
int32_t __size = __first - __begin;
diff --git a/libcxx/include/__format/formatter_output.h b/libcxx/include/__format/formatter_output.h
index e09534c41dff0..1852c88ea4fb2 100644
--- a/libcxx/include/__format/formatter_output.h
+++ b/libcxx/include/__format/formatter_output.h
@@ -14,10 +14,13 @@
#include <__algorithm/copy_n.h>
#include <__algorithm/fill_n.h>
#include <__algorithm/transform.h>
+#include <__concepts/same_as.h>
#include <__config>
+#include <__format/buffer.h>
#include <__format/formatter.h>
#include <__format/parser_std_format_spec.h>
#include <__format/unicode.h>
+#include <__iterator/back_insert_iterator.h>
#include <__utility/move.h>
#include <__utility/unreachable.h>
#include <cstddef>
@@ -86,6 +89,63 @@ __padding_size(size_t __size, size_t __width, __format_spec::__alignment __align
__libcpp_unreachable();
}
+/// Copy wrapper.
+///
+/// This uses a "mass output function" of __format::__output_buffer when possible.
+template <__formatter::__char_type _CharT, __formatter::__char_type _OutCharT = _CharT>
+_LIBCPP_HIDE_FROM_ABI auto __copy(basic_string_view<_CharT> __str, output_iterator<const _OutCharT&> auto __out_it)
+ -> decltype(__out_it) {
+ if constexpr (_VSTD::same_as<decltype(__out_it), _VSTD::back_insert_iterator<__format::__output_buffer<_OutCharT>>>) {
+ __out_it.__get_container()->__copy(__str);
+ return __out_it;
+ } else {
+ return std::copy_n(__str.data(), __str.size(), _VSTD::move(__out_it));
+ }
+}
+
+template <__formatter::__char_type _CharT, __formatter::__char_type _OutCharT = _CharT>
+_LIBCPP_HIDE_FROM_ABI auto
+__copy(const _CharT* __first, const _CharT* __last, output_iterator<const _OutCharT&> auto __out_it)
+ -> decltype(__out_it) {
+ return __formatter::__copy(basic_string_view{__first, __last}, _VSTD::move(__out_it));
+}
+
+template <__formatter::__char_type _CharT, __formatter::__char_type _OutCharT = _CharT>
+_LIBCPP_HIDE_FROM_ABI auto __copy(const _CharT* __first, size_t __n, output_iterator<const _OutCharT&> auto __out_it)
+ -> decltype(__out_it) {
+ return __formatter::__copy(basic_string_view{__first, __n}, _VSTD::move(__out_it));
+}
+
+/// Transform wrapper.
+///
+/// This uses a "mass output function" of __format::__output_buffer when possible.
+template <__formatter::__char_type _CharT, __formatter::__char_type _OutCharT = _CharT, class _UnaryOperation>
+_LIBCPP_HIDE_FROM_ABI auto
+__transform(const _CharT* __first,
+ const _CharT* __last,
+ output_iterator<const _OutCharT&> auto __out_it,
+ _UnaryOperation __operation) -> decltype(__out_it) {
+ if constexpr (_VSTD::same_as<decltype(__out_it), _VSTD::back_insert_iterator<__format::__output_buffer<_OutCharT>>>) {
+ __out_it.__get_container()->__transform(__first, __last, _VSTD::move(__operation));
+ return __out_it;
+ } else {
+ return std::transform(__first, __last, _VSTD::move(__out_it), __operation);
+ }
+}
+
+/// Fill wrapper.
+///
+/// This uses a "mass output function" of __format::__output_buffer when possible.
+template <__formatter::__char_type _CharT, output_iterator<const _CharT&> _OutIt>
+_LIBCPP_HIDE_FROM_ABI _OutIt __fill(_OutIt __out_it, size_t __n, _CharT __value) {
+ if constexpr (_VSTD::same_as<decltype(__out_it), _VSTD::back_insert_iterator<__format::__output_buffer<_CharT>>>) {
+ __out_it.__get_container()->__fill(__n, __value);
+ return __out_it;
+ } else {
+ return std::fill_n(_VSTD::move(__out_it), __n, __value);
+ }
+}
+
template <class _OutIt, class _CharT>
_LIBCPP_HIDE_FROM_ABI _OutIt __write_using_decimal_separators(_OutIt __out_it, const char* __begin, const char* __first,
const char* __last, string&& __grouping, _CharT __sep,
@@ -97,22 +157,22 @@ _LIBCPP_HIDE_FROM_ABI _OutIt __write_using_decimal_separators(_OutIt __out_it, c
__padding_size_result __padding = {0, 0};
if (__specs.__alignment_ == __format_spec::__alignment::__zero_padding) {
// Write [sign][prefix].
- __out_it = _VSTD::copy(__begin, __first, _VSTD::move(__out_it));
+ __out_it = __formatter::__copy(__begin, __first, _VSTD::move(__out_it));
if (__specs.__width_ > __size) {
// Write zero padding.
__padding.__before_ = __specs.__width_ - __size;
- __out_it = _VSTD::fill_n(_VSTD::move(__out_it), __specs.__width_ - __size, _CharT('0'));
+ __out_it = __formatter::__fill(_VSTD::move(__out_it), __specs.__width_ - __size, _CharT('0'));
}
} else {
if (__specs.__width_ > __size) {
// Determine padding and write padding.
__padding = __padding_size(__size, __specs.__width_, __specs.__alignment_);
- __out_it = _VSTD::fill_n(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
+ __out_it = __formatter::__fill(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
}
// Write [sign][prefix].
- __out_it = _VSTD::copy(__begin, __first, _VSTD::move(__out_it));
+ __out_it = __formatter::__copy(__begin, __first, _VSTD::move(__out_it));
}
auto __r = __grouping.rbegin();
@@ -133,10 +193,10 @@ _LIBCPP_HIDE_FROM_ABI _OutIt __write_using_decimal_separators(_OutIt __out_it, c
while (true) {
if (__specs.__std_.__type_ == __format_spec::__type::__hexadecimal_upper_case) {
__last = __first + *__r;
- __out_it = _VSTD::transform(__first, __last, _VSTD::move(__out_it), __hex_to_upper);
+ __out_it = __formatter::__transform(__first, __last, _VSTD::move(__out_it), __hex_to_upper);
__first = __last;
} else {
- __out_it = _VSTD::copy_n(__first, *__r, _VSTD::move(__out_it));
+ __out_it = __formatter::__copy(__first, *__r, _VSTD::move(__out_it));
__first += *__r;
}
@@ -147,7 +207,7 @@ _LIBCPP_HIDE_FROM_ABI _OutIt __write_using_decimal_separators(_OutIt __out_it, c
*__out_it++ = __sep;
}
- return _VSTD::fill_n(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
+ return __formatter::__fill(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
}
/// Writes the input to the output with the required padding.
@@ -155,12 +215,10 @@ _LIBCPP_HIDE_FROM_ABI _OutIt __write_using_decimal_separators(_OutIt __out_it, c
/// Since the output column width is specified the function can be used for
/// ASCII and Unicode output.
///
-/// \pre [\a __first, \a __last) is a valid range.
/// \pre \a __size <= \a __width. Using this function when this pre-condition
/// doesn't hold incurs an unwanted overhead.
///
-/// \param __first Pointer to the first element to write.
-/// \param __last Pointer beyond the last element to write.
+/// \param __str The string to write.
/// \param __out_it The output iterator to write to.
/// \param __specs The parsed formatting specifications.
/// \param __size The (estimated) output column width. When the elements
@@ -174,31 +232,42 @@ _LIBCPP_HIDE_FROM_ABI _OutIt __write_using_decimal_separators(_OutIt __out_it, c
/// conversion, which means the [\a __first, \a __last) always contains elements
/// of the type \c char.
template <class _CharT, class _ParserCharT>
-_LIBCPP_HIDE_FROM_ABI auto __write(
- const _CharT* __first,
- const _CharT* __last,
- output_iterator<const _CharT&> auto __out_it,
- __format_spec::__parsed_specifications<_ParserCharT> __specs,
- ptrdiff_t __size) -> decltype(__out_it) {
- _LIBCPP_ASSERT(__first <= __last, "Not a valid range");
-
+_LIBCPP_HIDE_FROM_ABI auto
+__write(basic_string_view<_CharT> __str,
+ output_iterator<const _CharT&> auto __out_it,
+ __format_spec::__parsed_specifications<_ParserCharT> __specs,
+ ptrdiff_t __size) -> decltype(__out_it) {
if (__size >= __specs.__width_)
- return _VSTD::copy(__first, __last, _VSTD::move(__out_it));
+ return __formatter::__copy(__str, _VSTD::move(__out_it));
__padding_size_result __padding = __formatter::__padding_size(__size, __specs.__width_, __specs.__std_.__alignment_);
- __out_it = _VSTD::fill_n(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
- __out_it = _VSTD::copy(__first, __last, _VSTD::move(__out_it));
- return _VSTD::fill_n(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
+ __out_it = __formatter::__fill(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
+ __out_it = __formatter::__copy(__str, _VSTD::move(__out_it));
+ return __formatter::__fill(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
+}
+
+template <class _CharT, class _ParserCharT>
+_LIBCPP_HIDE_FROM_ABI auto
+__write(const _CharT* __first,
+ const _CharT* __last,
+ output_iterator<const _CharT&> auto __out_it,
+ __format_spec::__parsed_specifications<_ParserCharT> __specs,
+ ptrdiff_t __size) -> decltype(__out_it) {
+ _LIBCPP_ASSERT(__first <= __last, "Not a valid range");
+ return __formatter::__write(basic_string_view{__first, __last}, _VSTD::move(__out_it), __specs, __size);
}
/// \overload
///
/// Calls the function above where \a __size = \a __last - \a __first.
template <class _CharT, class _ParserCharT>
-_LIBCPP_HIDE_FROM_ABI auto __write(const _CharT* __first, const _CharT* __last,
- output_iterator<const _CharT&> auto __out_it,
- __format_spec::__parsed_specifications<_ParserCharT> __specs) -> decltype(__out_it) {
- return __write(__first, __last, _VSTD::move(__out_it), __specs, __last - __first);
+_LIBCPP_HIDE_FROM_ABI auto
+__write(const _CharT* __first,
+ const _CharT* __last,
+ output_iterator<const _CharT&> auto __out_it,
+ __format_spec::__parsed_specifications<_ParserCharT> __specs) -> decltype(__out_it) {
+ _LIBCPP_ASSERT(__first <= __last, "Not a valid range");
+ return __formatter::__write(__first, __last, _VSTD::move(__out_it), __specs, __last - __first);
}
template <class _CharT, class _ParserCharT, class _UnaryOperation>
@@ -210,12 +279,12 @@ _LIBCPP_HIDE_FROM_ABI auto __write_transformed(const _CharT* __first, const _Cha
ptrdiff_t __size = __last - __first;
if (__size >= __specs.__width_)
- return _VSTD::transform(__first, __last, _VSTD::move(__out_it), __op);
+ return __formatter::__transform(__first, __last, _VSTD::move(__out_it), __op);
__padding_size_result __padding = __padding_size(__size, __specs.__width_, __specs.__alignment_);
- __out_it = _VSTD::fill_n(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
- __out_it = _VSTD::transform(__first, __last, _VSTD::move(__out_it), __op);
- return _VSTD::fill_n(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
+ __out_it = __formatter::__fill(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
+ __out_it = __formatter::__transform(__first, __last, _VSTD::move(__out_it), __op);
+ return __formatter::__fill(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
}
/// Writes additional zero's for the precision before the exponent.
@@ -240,11 +309,11 @@ _LIBCPP_HIDE_FROM_ABI auto __write_using_trailing_zeros(
__padding_size_result __padding =
__padding_size(__size + __num_trailing_zeros, __specs.__width_, __specs.__alignment_);
- __out_it = _VSTD::fill_n(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
- __out_it = _VSTD::copy(__first, __exponent, _VSTD::move(__out_it));
- __out_it = _VSTD::fill_n(_VSTD::move(__out_it), __num_trailing_zeros, _CharT('0'));
- __out_it = _VSTD::copy(__exponent, __last, _VSTD::move(__out_it));
- return _VSTD::fill_n(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
+ __out_it = __formatter::__fill(_VSTD::move(__out_it), __padding.__before_, __specs.__fill_);
+ __out_it = __formatter::__copy(__first, __exponent, _VSTD::move(__out_it));
+ __out_it = __formatter::__fill(_VSTD::move(__out_it), __num_trailing_zeros, _CharT('0'));
+ __out_it = __formatter::__copy(__exponent, __last, _VSTD::move(__out_it));
+ return __formatter::__fill(_VSTD::move(__out_it), __padding.__after_, __specs.__fill_);
}
/// Writes a string using format's width estimation algorithm.
@@ -262,7 +331,7 @@ _LIBCPP_HIDE_FROM_ABI auto __write_string_no_precision(
// No padding -> copy the string
if (!__specs.__has_width())
- return _VSTD::copy(__str.begin(), __str.end(), _VSTD::move(__out_it));
+ return __formatter::__copy(__str, _VSTD::move(__out_it));
// Note when the estimated width is larger than size there's no padding. So
// there's no reason to get the real size when the estimate is larger than or
@@ -270,8 +339,7 @@ _LIBCPP_HIDE_FROM_ABI auto __write_string_no_precision(
size_t __size =
__format_spec::__estimate_column_width(__str, __specs.__width_, __format_spec::__column_width_rounding::__up)
.__width_;
-
- return __formatter::__write(__str.begin(), __str.end(), _VSTD::move(__out_it), __specs, __size);
+ return __formatter::__write(__str, _VSTD::move(__out_it), __specs, __size);
}
template <class _CharT>
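
Editorial illustration (not part of the commit): the hunks above replace the
generic std::fill_n / std::copy / std::transform calls with the
__formatter::__fill, __copy and __transform helpers, so that padding and
payload reach the buffer in bulk rather than one code unit at a time. The
sketch below shows one way such chunked bulk operations can work. Every name
in it (chunked_buffer, write_padded, write_upper) and the 8-character storage
size are hypothetical, chosen for this example; it does not reproduce
libc++'s __output_buffer.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical fixed-size buffer that flushes to a std::string sink.
// It only illustrates the chunking idea; it is not libc++'s class.
class chunked_buffer {
public:
  explicit chunked_buffer(std::string& sink) : sink_(sink) {}
  chunked_buffer(const chunked_buffer&) = delete;
  chunked_buffer& operator=(const chunked_buffer&) = delete;
  ~chunked_buffer() { flush(); }

  // Bulk copy: write as much as fits, flush when full, repeat.
  void copy(std::string_view str) {
    while (!str.empty()) {
      std::size_t chunk = std::min(str.size(), storage_.size() - size_);
      std::copy_n(str.data(), chunk, storage_.data() + size_);
      advance(chunk);
      str.remove_prefix(chunk);
    }
  }

  // Bulk fill: same chunking, but with a repeated value.
  void fill(std::size_t n, char value) {
    while (n != 0) {
      std::size_t chunk = std::min(n, storage_.size() - size_);
      std::fill_n(storage_.data() + size_, chunk, value);
      advance(chunk);
      n -= chunk;
    }
  }

  // Bulk transform: applies op to each input character while copying.
  template <class UnaryOperation>
  void transform(std::string_view str, UnaryOperation op) {
    while (!str.empty()) {
      std::size_t chunk = std::min(str.size(), storage_.size() - size_);
      std::transform(str.data(), str.data() + chunk, storage_.data() + size_, op);
      advance(chunk);
      str.remove_prefix(chunk);
    }
  }

  void flush() {
    sink_.append(storage_.data(), size_);
    size_ = 0;
  }

private:
  // Flushing as soon as the storage is full keeps at least one free slot
  // available at the start of every loop iteration above.
  void advance(std::size_t chunk) {
    size_ += chunk;
    if (size_ == storage_.size())
      flush();
  }

  std::string& sink_;
  std::array<char, 8> storage_{}; // deliberately tiny to force chunking
  std::size_t size_ = 0;
};

// Mirrors the fill / copy / fill sequence used in the hunks above.
std::string write_padded(std::string_view str, std::size_t before, std::size_t after, char fill) {
  std::string out;
  {
    chunked_buffer buffer(out);
    buffer.fill(before, fill);
    buffer.copy(str);
    buffer.fill(after, fill);
  } // the destructor flushes the remaining characters
  return out;
}

// Upper-cases its input through the bulk transform path.
std::string write_upper(std::string_view str) {
  std::string out;
  {
    chunked_buffer buffer(out);
    buffer.transform(str, [](char c) {
      return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    });
  }
  return out;
}
```

With the 8-character storage, write_padded("The quick brown fox", 10, 3, '*')
crosses the buffer boundary several times. This is the same situation that the
format_test_buffer_optimizations test further down provokes with its large
input.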
diff --git a/libcxx/test/std/utilities/format/format.formatter/format.formatter.spec/formatter.unsigned_integral.pass.cpp b/libcxx/test/std/utilities/format/format.formatter/format.formatter.spec/formatter.unsigned_integral.pass.cpp
index c3e426fcba1cc..36f2dbd4b8b48 100644
--- a/libcxx/test/std/utilities/format/format.formatter/format.formatter.spec/formatter.unsigned_integral.pass.cpp
+++ b/libcxx/test/std/utilities/format/format.formatter/format.formatter.spec/formatter.unsigned_integral.pass.cpp
@@ -88,6 +88,8 @@ void test_unsigned_integral_type() {
test_termination_condition(
STR("340282366920938463463374607431768211455"), STR("}"), A(std::numeric_limits<__uint128_t>::max()));
#endif
+ // Test __formatter::__transform (libc++ specific).
+ test_termination_condition(STR("FF"), STR("X}"), A(255));
}
template <class CharT>
diff --git a/libcxx/test/std/utilities/format/format.functions/format_tests.h b/libcxx/test/std/utilities/format/format.functions/format_tests.h
index ba6f67b0e24fd..551e1dd066a99 100644
--- a/libcxx/test/std/utilities/format/format.functions/format_tests.h
+++ b/libcxx/test/std/utilities/format/format.functions/format_tests.h
@@ -2557,6 +2557,68 @@ void format_test_pointer(TestFunction check, ExceptionTest check_exception) {
format_test_pointer<const void*, CharT>(check, check_exception);
}
+/// Tests special buffer functions with a "large" input.
+///
+/// This is a test specific for libc++, however the code should behave the same
+/// on all implementations.
+/// In \c __format::__output_buffer there are some special functions to optimize
+/// outputting multiple characters, \c __copy, \c __transform, \c __fill. This
+/// test validates whether the functions behave properly when the output size
+/// doesn't fit in its internal buffer.
+template <class CharT, class TestFunction>
+void format_test_buffer_optimizations(TestFunction check) {
+#ifdef _LIBCPP_VERSION
+ // Used to validate our test sets are the proper size.
+ // To test the chunked operations it needs to be larger than the internal
+ // buffer. Picked a nice looking number.
+ constexpr int minimum = 3 * std::__format::__internal_storage<CharT>::__buffer_size;
+#else
+ constexpr int minimum = 1;
+#endif
+
+ // Copy
+ std::basic_string<CharT> str = STR(
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog."
+ "The quick brown fox jumps over the lazy dog.");
+ assert(str.size() > minimum);
+ check.template operator()<"{}">(std::basic_string_view<CharT>{str}, str);
+
+ // Fill
+ std::basic_string<CharT> fill(minimum, CharT('*'));
+ check.template operator()<"{:*<{}}">(std::basic_string_view<CharT>{str + fill}, str, str.size() + minimum);
+ check.template operator()<"{:*^{}}">(
+ std::basic_string_view<CharT>{fill + str + fill}, str, minimum + str.size() + minimum);
+ check.template operator()<"{:*>{}}">(std::basic_string_view<CharT>{fill + str}, str, minimum + str.size());
+}
+
template <class CharT, class TestFunction, class ExceptionTest>
void format_tests(TestFunction check, ExceptionTest check_exception) {
// *** Test escaping ***
@@ -2671,6 +2733,9 @@ void format_tests(TestFunction check, ExceptionTest check_exception) {
// *** Test handle formatter argument ***
format_test_handle<CharT>(check, check_exception);
+
+ // *** Test the internal buffer optimizations ***
+ format_test_buffer_optimizations<CharT>(check);
}
#ifndef TEST_HAS_NO_WIDE_CHARACTERS
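
Editorial illustration (not part of the commit): the "Fill" checks in
format_test_buffer_optimizations expect str + fill for '<', fill + str + fill
for '^' and fill + str for '>'. That follows from how the padding is split
between the two sides of the output. The sketch below reproduces the split
under the rule in the C++ standard's format specification, where centring
puts floor(n/2) fill characters before the value and the rest after it. The
names padding_size, padding, alignment and pad are hypothetical stand-ins
rather than libc++'s internals, and the sketch uses the string's size as its
width, which only matches the estimated column width for plain ASCII input.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical counterparts of the alignment and padding result types.
enum class alignment { left, center, right };

struct padding {
  std::size_t before;
  std::size_t after;
};

// Splits the fill characters needed to reach `width` between both sides.
padding padding_size(std::size_t size, std::size_t width, alignment align) {
  if (size >= width)
    return {0, 0}; // nothing to pad
  std::size_t fill = width - size;
  switch (align) {
  case alignment::left:
    return {0, fill};
  case alignment::center:
    return {fill / 2, fill - fill / 2}; // the odd character goes after
  case alignment::right:
    return {fill, 0};
  }
  return {0, 0};
}

// Builds the padded output with three bulk appends: fill, copy, fill.
std::string pad(std::string_view str, std::size_t width, alignment align, char fill) {
  padding p = padding_size(str.size(), width, align);
  std::string out;
  out.reserve(p.before + str.size() + p.after);
  out.append(p.before, fill);
  out.append(str);
  out.append(p.after, fill);
  return out;
}
```

For the centred check in the test the requested width is
minimum + str.size() + minimum, so the fill count is 2 * minimum and both
sides receive exactly minimum characters, which gives fill + str + fill.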