[cfe-dev] Papers comparing SSO and COW strings

Alexey Salmin via cfe-dev cfe-dev at lists.llvm.org
Mon May 15 07:46:53 PDT 2017


The "Why" section of the libcxx documentation states that "it is
generally accepted that building std::string using the "short string
optimization" instead of using Copy On Write (COW) is a superior
approach for multicore machines". [1a] Similar considerations lie at
the core of N2668, which effectively banned COW implementations in
C++11 [2].

The thing is that N2668 doesn't reference any particular research on
the speed and downsides of COW string implementations, and I'm having a
hard time finding any. So far I've seen the well-known article by Herb
Sutter [3] and one more paper [4], but both are built around a few
synthetic benchmarks and are 10+ years old. Unfortunately, I can't find
any benchmarks that feature real-world applications and were measured
on modern hardware, which has changed a lot since then. For instance,
atomics have in some sense become both cheaper (with improvements in
SMP systems) and more expensive (with the wider spread of NUMA and a
constantly growing number of cores, which increases contention).

In theory I see two different kinds of speed-up that may come from
non-COW strings:
1) Improvements that make the existing code run faster. Possible reasons are:
    a) No need for atomic reference counters
    b) Improved data locality on NUMA systems for threads that
maintain their own copies of their strings
    c) Short string optimization (which could technically co-exist
with COW but normally doesn't. A notable exception is fbstring [5])
2) Improvements that allow writing better code. By limiting the
number of cases where pointers and iterators may be invalidated, the
C++11 standard allows a wider use of non-owning references to strings.
This goes well with string_view in C++17.
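
To make point 2 concrete, here is a minimal sketch of the kind of code
I have in mind. The function name is made up for illustration. It
assumes a C++11-conforming std::string, where a non-const operator[]
may not invalidate pointers; with a COW implementation I would expect
the write to unshare the buffer, so the function would return false.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns true if a pointer taken before a copy and
// a non-const element access still points into s afterwards.
bool pointer_survives_copy_and_write() {
    std::string s(64, 'a');          // long enough to live on the heap
    const char* before = s.data();   // non-owning pointer into s

    std::string copy = s;            // a COW string would share the buffer here
    s[0] = 'b';                      // a COW string would reallocate s here

    assert(copy[0] == 'a');          // the copy must never observe the write
    return before == s.data() && *before == 'b';
}
```

With a non-COW string the copy owns a separate buffer from the start,
so the later write leaves s.data() unchanged and "before" stays usable.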

At the same time, code that relies heavily on the COW-ness of
strings may face a performance degradation with a non-COW
implementation. I wonder if anyone has reported seeing this in
practice.
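
As an illustration of what "relying on COW-ness" could look like, and
of one possible way to port such code, here is a sketch that makes the
sharing explicit with std::shared_ptr. The names SharedText and fan_out
are hypothetical, and this is only one option I can think of, not
something taken from the papers above.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical alias: an immutable string shared through a reference count.
using SharedText = std::shared_ptr<const std::string>;

// Stores n handles to the same text. Copying a handle bumps a reference
// count instead of copying the character data, which is roughly what a
// COW std::string did implicitly on every copy.
std::vector<SharedText> fan_out(const std::string& text, std::size_t n) {
    SharedText shared = std::make_shared<const std::string>(text);  // one O(size) copy
    std::vector<SharedText> handles(n, shared);
    // All handles alias one buffer: n in the vector plus the local one.
    assert(shared.use_count() == static_cast<long>(n) + 1);
    return handles;
}
```

With a non-COW std::string, a plain std::vector<std::string>(n, text)
would copy the characters n times, which is the kind of degradation I
am asking about.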

I'm looking for papers and articles that cover these topics: anything
from a documented and analyzed speed-up of a given application that
switched to libc++ (from e.g. pre-5.1 libstdc++) to a comprehensive
study. Regarding the hardware, I'm primarily interested in x86_64,
but data on other architectures would also be useful.

Does anyone have relevant links?

Alexey

[1a] http://libcxx.llvm.org/
[2] http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2668.htm
[3] http://www.gotw.ca/publications/optimizations.htm
[4] http://complement.sourceforge.net/compare.pdf
[5] https://github.com/facebook/folly/blob/master/folly/docs/FBString.md


