[libcxx-commits] [PATCH] D101752: Speedup to_string for integers using zero-copy.

Mon May 3 06:01:12 PDT 2021

Quuxplusone added inline comments.

================
Comment at: libcxx/src/string.cpp:442
 //  so we need +1 here.
     constexpr size_t bufsize = numeric_limits<V>::digits10 + 2;  // +1 for minus, +1 for digits10
     char buf[bufsize];
----------------
grandinj wrote:
> just a drive-by commentator - why is the wstring version so different from the string version? Surely they should be practically the same?
I infer it's because `string().capacity() > bufsize`, but `wstring().capacity() < bufsize`. (A default-initialized `basic_string<Ch>` has non-zero capacity because of the Small String Optimization.)
At the very least, line 429 should add `_LIBCPP_ASSERT(buf.capacity() >= bufsize);` in order to catch a failure of that invariant as early and predictably as possible.

I also wondered if the `wstring` version should use the same trick — `buf.resize(bufsize)` followed by `buf.resize(res.ptr - buf.data())`. But:

- `bufsize` varies from `11` (for int) to `20` (for long long)
- `string().capacity()` is `22`
- `wstring().capacity()` is `4` (i.e., 16 over sizeof(wchar_t))

So if hypothetically we did `wstring buf(bufsize)`, that'd allocate between 44 and 80 bytes of heap every time, versus the current approach allocating between 0 and 80. (And in the case someone's stringifying a small number like `"0"` or `"42"`, the current approach allocates 0 bytes, which is great.) So, having them look at least a bit different is good IMO.

================
Comment at: libcxx/src/string.cpp:445
     const auto res = to_chars(buf, buf + bufsize, v);
     _LIBCPP_ASSERT(res.ec == errc(), "bufsize must be large enough to accomodate the value");
+    return wstring(buf, res.ptr);
----------------
/accomodate/accommodate/ while you're here (or, look out for a merge-conflict if I fix it in main first ;))

================
Comment at: libcxx/src/string.cpp:463
+wstring to_wstring(unsigned long val)      { return i_to_wstring(val); }
+wstring to_wstring(unsigned long long val) { return i_to_wstring(val); }

----------------
Do you have a benchmark showing the performance impact? I think you //should// try to quantify the impact somehow — after all, all you're saving is a very predictable branch and a single stack-to-stack memcpy, right? (OTOH, I don't see how this could possibly //hurt// performance.)

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101752/new/

https://reviews.llvm.org/D101752