[libcxx-commits] [PATCH] D59178: [libc++] Speedup to_string and to_wstring for integers using stack buffer and SSO

Afanasyev Ivan via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Sun Mar 17 10:19:22 PDT 2019


ivafanas marked 2 inline comments as done.
ivafanas added inline comments.


================
Comment at: libcxx/src/string.cpp:358-359
+    typename S::value_type buf[size] = {};
+    const int len = sprintf_like(buf, size, fmt, a);
+    return S(buf, buf + len);
+}
----------------
ivafanas wrote:
> mclow.lists wrote:
> > ivafanas wrote:
> > > lebedev.ri wrote:
> > > > Missing error checking.
> > > > ```
> > > > RETURN VALUE
> > > >        Upon successful return, these functions return the number of characters printed (excluding the null byte used to end output to strings).
> > > > 
> > > > ```
> > > `size` is chosen in the way that there should be enough space for integer:
> > > `4 * sizeof(V)`: 2^8 = 256 - 3 digits + 1 digit for minus.
> > > 
> > > GCC implementation ignores negative return values so as they knows that space is enough:
> > > https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/ext/string_conversions.h#L99
> > > 
> > > In case of unexpected `sprintf` failure I will add fallback to the previous algorithm.
> > Because of licensing issues, we aren't able to refer to GCC's source code.
> > Please help us do so by not posting links, snippets, etc from GCC here.
> > 
> Ooooops, sorry, didn't know about that.
> 
> Also I didnn't found how to edit / delete comment with link to gcc repo. If there is a way, please help me.
Hi, Roman.

`to_string` implementation was rewritten to naive algorithm according to Ed proposal.
There is no need to check  `sprintf` return value anymore.


================
Comment at: libcxx/src/string.cpp:422
 {
-    return as_string(snprintf, initial_string<string, int>()(), "%d", val);
+    return i_to_string<string>(snprintf, "%d", val);
 }
----------------
ed wrote:
> ivafanas wrote:
> > ivafanas wrote:
> > > ed wrote:
> > > > Considering that performance is of the essence and that we only do simple conversions (without leading whitespace, zeroes, etc.), would it make sense to handroll this for the integer cases? This is likely a lot faster than using `snprintf()`.
> > > I will try to implement that and estimate performance gain.
> > Ed, I have locally implemented simple conversion and it gives significant performance boost (about 20x times when SSO plays).
> > 
> > There is one more question I'm looking for answer for days.
> > 
> > According to `std::to_string` documentation result for integer types must be equal to `sprintf` call with `%i` parameter (`%l` for long etc etc)
> > https://en.cppreference.com/w/cpp/string/basic_string/to_string
> > 
> > According to `to_chars` documentation `to_chars` is the only locale-independent formatter
> > https://en.cppreference.com/w/cpp/utility/to_chars
> > 
> > According to this note clang aims to work with the system C library
> > http://clang-developers.42468.n3.nabble.com/C-Standard-Library-td2127533.html
> > 
> > I've checked `glibc`, `ditelibc` and `musl` standard C lib implementations, and they provides the expected result for `sprintf(s, "%i", x)` calls without locale checks for thousands separator: `sprintf(s, "%i", 100500) --> 100500`
> > I've tested msvc `sprintf(s, "%i", x)` calls with the different locales, it also prints integers without thousands separator.
> > 
> > So, on the one side, performance boost is really impressive, on the other side there is no guarantee that `sprintf(s, "%i", x)` will always ignore thousands separator.
> > 
> > I want to ask your opinion as much more experienced compiler developer.
> > Should we use fast simple conversion or leave slow `sprintf` calls?
> > So, on the one side, performance boost is really impressive, on the other side there is no guarantee that `sprintf(s, "%i", x)` will always ignore thousands separator.
> 
> Even though this is POSIX and not ISO C, it seems that thousands separators are only added to integers when `%'d` is used (notice the apostrophe). See this page:
> 
> http://pubs.opengroup.org/onlinepubs/9699919799/functions/fprintf.html <- search for 'apostrophe'
> 
> ISO C also says the following about `printf()`'s 'd' conversion specifier:
> 
> > The int argument is converted to signed decimal in the style [−]dddd. The precision specifies the minimum number of digits to appear; if the value being converted can be represented in fewer digits, it is expanded with leading zeros. The default precision is 1. The result of converting a zero value with a precision of zero is no characters.
> 
> Notice that the description doesn't mention anything about thousand separators.
> 
> In other words, I think it's fine to bypass sprintf() for integer literals. For floating point literals it is a different story, because formatting those is locale dependent.
Thank you a lot!
Updated implementation has been uploaded to review.

It gives 5x - 18x speedup:
http://quick-bench.com/fnW_oJWH9D8BYNw-Y9mQaIYvD24


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D59178/new/

https://reviews.llvm.org/D59178





More information about the libcxx-commits mailing list