[libcxx-commits] [PATCH] D101752: Speedup to_string for integers using zero-copy.

Noel Grandin via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Mon May 3 06:12:20 PDT 2021

grandinj added inline comments.

Comment at: libcxx/src/string.cpp:442
 //  so we need +1 here.
     constexpr size_t bufsize = numeric_limits<V>::digits10 + 2;  // +1 for minus, +1 for digits10
     char buf[bufsize];
Quuxplusone wrote:
> grandinj wrote:
> > just a drive-by commentator - why is the wstring version so different from the string version? Surely they should be practically the same?
> I infer it's because `string().capacity() > bufsize`, but `wstring().capacity() < bufsize`. (A default-initialized `basic_string<Ch>` has non-zero capacity because of the Small String Optimization.)
> At the very least, line 429 should add `_LIBCPP_ASSERT(buf.capacity() >= bufsize);` in order to catch a failure of that invariant as early and predictably as possible.
> I also wondered if the `wstring` version should use the same trick — `buf.resize(bufsize)` followed by `buf.resize(res.ptr - buf.data())`. But:
> - `bufsize` varies from `11` (for int) to `20` (for long long)
> - `string().capacity()` is `22`
> - `wstring().capacity()` is `4` (i.e., 16 over sizeof(wchar_t))
> So if hypothetically we did `wstring buf(bufsize)`, that'd allocate between 44 and 80 bytes of heap every time, versus the current approach allocating between 0 and 80. (And in the case someone's stringifying a small number like `"0"` or `"42"`, the current approach allocates 0 bytes, which is great.) So, having them look at least a bit different is good IMO.
wchar is 2 bytes not 4, 
but yes, I agree with the rest of your analysis, this approach means that we only need to allocate zero or once.
Thanks for explaining.

  rG LLVM Github Monorepo



More information about the libcxx-commits mailing list