[libcxx-commits] [PATCH] D97705: [RFC][libc++] Improve std::to_chars for base != 10.

Mark de Wever via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Tue Mar 8 08:48:35 PST 2022


Mordante marked 5 inline comments as done.
Mordante added a comment.
Herald added a project: All.

In D97705#3349302 <https://reviews.llvm.org/D97705#3349302>, @ldionne wrote:

> In D97705#3262177 <https://reviews.llvm.org/D97705#3262177>, @Mordante wrote:
>
>> 
>
> Just to make sure I understand (the review summary is difficult to read for me due to formatting issues), we have a base implementation using `__to_chars_itoa`, and then we also have implementations specific to other bases that use `__to_chars_integral<2>` and friends. When those base-specific implementations are enabled, we get roughly 3x performance improvement on those bases, at the cost of < 256 bytes of size for a lookup table. That's right?

The size of the lookup table differs, 64-bytes for base 2, 128-bytes for base 8, and 512-bytes for base 16.
The speedup is 14 times for base 2, 3 times for base 8, and 2 times for base 16.



================
Comment at: libcxx/include/charconv:328
+struct _LIBCPP_HIDDEN __integral<2> {
+  enum : unsigned { _Base = 2 };
+
----------------
Quuxplusone wrote:
> The red flag here was the `: unsigned` (I was like, "why does the signedness matter? better study closer") but actually, why do we need an enumerator for this at all? It's used only twice, and in contexts like
> ```
>       _VSTD::memcpy(__p, &__base_2_lut[4 * __c], 4);
> ~~~
>       unsigned __c = __value % _Base;
>       __value /= _Base;
>       *--__p = "01"[__c];
> ```
> The number `2` is hard-coded so many places here, two more won't make a difference (and will save two lines and eliminate a red flag). Ditto in the `8` and `16` cases.
> 
> (If this was to appease clang-tidy: Don't appease it. :))
No it had nothing to do with clang-tidy ;-) This was from an earlier experiment it had a generic form for other integral values. However there the size/speed benefit wasn't great so most of that was removed. Now this is also gone.


================
Comment at: libcxx/include/charconv:357
+      __value /= _Base;
+      *--__p = "01"[__c];
+    } while (__value != 0);
----------------
Quuxplusone wrote:
> Is Clang smart enough to optimize this into `'0' + __c`? Would the latter be more readable anyway? (Note there is no issue with EBCDIC here; all charsets must have `'0'+1 == '1'` according to the standard.)
> Ditto line 413.
The entire file uses this style, including places are _Base > 10. So I prefer to keep it as is.


================
Comment at: libcxx/include/charconv:362
+
+  static _LIBCPP_CONSTEXPR char __base_2_lut[64] = {
+    '0', '0', '0', '0',
----------------
Quuxplusone wrote:
> Either `constexpr` if this is C++11-and-later, or `_LIBCPP_CONSTEXPR const`, surely?
This was copy-pasted from another part of this file. D121223 addresses these places.



================
Comment at: libcxx/include/charconv:489
+template <unsigned _Base, typename _Tp,
+          typename enable_if<(sizeof(_Tp) >= sizeof(unsigned)), int>::type = 0>
+_LIBCPP_AVAILABILITY_TO_CHARS _LIBCPP_INLINE_VISIBILITY int
----------------
Quuxplusone wrote:
> `__enable_if_t<sizeof(_Tp) >= sizeof(unsigned)>* = nullptr` is the more traditional libc++ism. (`enable_if_t` if this is C++14-and-later.)
This is the same as other places in this file, so I prefer to keep it.
Note the `enable_if` specifies the type as `int` so `nullptr` isn't an option.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D97705/new/

https://reviews.llvm.org/D97705



More information about the libcxx-commits mailing list