[libcxx-commits] [PATCH] D109510: [libc++] string: Allocate exactly when growing from short to long string.

Clement Courbet via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Sep 9 07:31:56 PDT 2021


courbet created this revision.
courbet requested review of this revision.
Herald added a project: libc++.
Herald added a subscriber: libcxx-commits.
Herald added a reviewer: libc++.

When a string grows from short (`size < __min_cap`) to large, the
minimum capacity is currently `2*__min_cap`.

While growing this way allows amortized operations in general, I don't think
there is a good reason to force the minimal allocation to be `48` bytes. This
is wasteful for strings that are small but do not fit in the SSO buffer
(22 <= size < 32), as they use `48` rather than `32` bytes (note that
all allocation sizes are aligned to 16 bytes).

We have measured this overhead to be significant on our (Google) internal
workloads, as 64% of global `string` allocations (and 26% of allocated bytes)
are 48 bytes.

As an example of publicly available code, protobuf descriptors <https://developers.google.com/protocol-buffers/docs/reference/cpp/google.protobuf.descriptor>, which
use `string`s to store field names, have 95% of allocated buffers that
have a capacity of 48 bytes in our datacenters.

This patch changes the min capacity to `0` when a string moves from short to
long, which has the effect that, in that case, a string now always allocates
exactly instead of only when `size >= 32`. This does not change anything
for sizes outside of in `[22,32)`.

Adding the branch is neutral on speed on the string benchmarks (see numbers
below) and the branch is constant-folded when called from `__assign_no_alias`).
We have seen no runtime regressions on our production benchmarks.

libcxx string benchmarks:

  BM_StringAssignStr_Empty_Opaque                              1.64ns ± 5%
  1.64ns ± 5%    ~           (p=0.457 n=47+43)
  BM_StringAssignStr_Empty_Transparent                         1.35ns ± 5%
  1.34ns ± 6%    ~           (p=0.089 n=47+43)
  BM_StringAssignStr_Small_Opaque                              1.36ns ± 3%
  1.35ns ± 4%    ~           (p=0.248 n=44+44)
  BM_StringAssignStr_Small_Transparent                         1.33ns ± 5%
  1.33ns ± 6%    ~           (p=0.822 n=43+45)
  BM_StringAssignStr_Large_Opaque                              17.6ns ± 5%
  17.6ns ± 8%    ~           (p=0.909 n=44+43)
  BM_StringAssignStr_Large_Transparent                         16.4ns ± 4%
  16.4ns ± 4%    ~           (p=0.699 n=43+42)
  BM_StringAssignStr_Huge_Opaque                                233ns ±13%
  231ns ±10%    ~           (p=0.119 n=49+48)
  BM_StringAssignStr_Huge_Transparent                           232ns ±13%
  228ns ±10%    ~           (p=0.064 n=50+43)
  BM_StringAssignAsciiz_Empty_Opaque                           6.54ns ± 4%
  6.53ns ± 5%    ~           (p=0.822 n=44+42)
  BM_StringAssignAsciiz_Empty_Transparent                      0.82ns ± 5%
  0.82ns ± 4%    ~           (p=0.750 n=44+40)
  BM_StringAssignAsciiz_Small_Opaque                           7.81ns ± 5%
  7.82ns ± 8%    ~           (p=0.966 n=47+40)
  BM_StringAssignAsciiz_Small_Transparent                      1.33ns ± 6%
  1.33ns ± 6%    ~           (p=0.509 n=47+40)
  BM_StringAssignAsciiz_Large_Opaque                           23.3ns ± 4%
  23.3ns ± 4%    ~           (p=0.952 n=47+40)
  BM_StringAssignAsciiz_Large_Transparent                      18.5ns ± 3%
  18.4ns ± 5%  -0.66%        (p=0.029 n=47+40)
  BM_StringAssignAsciiz_Huge_Opaque                             269ns ± 6%
  278ns ±13%    ~           (p=0.078 n=44+50)
  BM_StringAssignAsciiz_Huge_Transparent                        227ns ± 8%
  232ns ±10%    ~           (p=0.099 n=44+50)
  BM_StringAssignAsciizMix_Opaque                              11.3ns ± 4%
  11.4ns ± 8%    ~           (p=0.386 n=42+40)
  BM_StringAssignAsciizMix_Transparent                         5.02ns ± 5%
  5.01ns ± 5%    ~           (p=0.301 n=41+44)


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D109510

Files:
  libcxx/include/string


Index: libcxx/include/string
===================================================================
--- libcxx/include/string
+++ libcxx/include/string
@@ -2278,8 +2278,11 @@
     if (__delta_cap > __ms - __old_cap - 1)
         this->__throw_length_error();
     pointer __old_p = __get_pointer();
+    const bool __was_short = __old_cap + 1 == __min_cap;
     size_type __cap = __old_cap < __ms / 2 - __alignment ?
-                          __recommend(_VSTD::max(__old_cap + __delta_cap, 2 * __old_cap)) :
+                          __recommend(
+                              _VSTD::max(__old_cap + __delta_cap,
+                                         __was_short ? 0 : 2 * __old_cap)) :
                           __ms - 1;
     pointer __p = __alloc_traits::allocate(__alloc(), __cap+1);
     __invalidate_all_iterators();
@@ -2292,7 +2295,7 @@
     if (__sec_cp_sz != 0)
         traits_type::copy(_VSTD::__to_address(__p) + __n_copy + __n_add,
                           _VSTD::__to_address(__old_p) + __n_copy + __n_del, __sec_cp_sz);
-    if (__old_cap+1 != __min_cap)
+    if (!__was_short)
         __alloc_traits::deallocate(__alloc(), __old_p, __old_cap+1);
     __set_long_pointer(__p);
     __set_long_cap(__cap+1);
@@ -2310,8 +2313,11 @@
     if (__delta_cap > __ms - __old_cap)
         this->__throw_length_error();
     pointer __old_p = __get_pointer();
+    const bool __was_short = __old_cap + 1 == __min_cap;
     size_type __cap = __old_cap < __ms / 2 - __alignment ?
-                          __recommend(_VSTD::max(__old_cap + __delta_cap, 2 * __old_cap)) :
+                          __recommend(
+                              _VSTD::max(__old_cap + __delta_cap,
+                                         __was_short ? 0 : 2 * __old_cap)) :
                           __ms - 1;
     pointer __p = __alloc_traits::allocate(__alloc(), __cap+1);
     __invalidate_all_iterators();
@@ -2323,7 +2329,7 @@
         traits_type::copy(_VSTD::__to_address(__p) + __n_copy + __n_add,
                           _VSTD::__to_address(__old_p) + __n_copy + __n_del,
                           __sec_cp_sz);
-    if (__old_cap+1 != __min_cap)
+    if (!__was_short)
         __alloc_traits::deallocate(__alloc(), __old_p, __old_cap+1);
     __set_long_pointer(__p);
     __set_long_cap(__cap+1);


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D109510.371591.patch
Type: text/x-patch
Size: 2311 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/libcxx-commits/attachments/20210909/59ea0d37/attachment-0001.bin>


More information about the libcxx-commits mailing list