[libcxx-commits] [PATCH] D70631: Microsoft's floating-point to_chars powered by Ryu and Ryu Printf

Mon Nov 25 18:01:52 PST 2019

STL_MSFT marked 20 inline comments as done.
STL_MSFT added inline comments.

================
Comment at: libcxx/CREDITS.TXT:16
+N: Ulf Adams
+D: Invented the Ryu and Ryu Printf algorithms used in floating-point to_chars.
+
----------------
ulfjack wrote:
> Some of my comments are still alive, including some that don't make sense in this context. Maybe mention that I wrote the initial code?
My apologies for the incomplete credit! I've added that, using your phrasing, and avoiding wrapping to the next line.

Regarding the comments - in microsoft/STL I generally preserved your comments whenever possible, adjusting only identifier names. In a few places, I chopped out both code and comments simultaneously (e.g. the RYU_OPTIMIZE_SIZE codepath). I erred on the side of preserving too many comments, instead of too few. I can definitely drop/alter comments that don't make sense here, just point them out.

================
Comment at: libcxx/include/charconv:626
+template <class _Ty>
+constexpr const _Ty& _Min_value(const _Ty& _Left, const _Ty& _Right) noexcept {
+  return _Right < _Left ? _Right : _Left;
----------------
zoecarver wrote:
> Why not `std::min`?
In MSVC's STL, `std::min` is defined in `<algorithm>`, a heavyweight header that isn't otherwise needed by `<charconv>`. This appears to be the case for libcxx as well.

If you want, I can include `<algorithm>` and use `std::min`, although I recommend against doing so due to the throughput cost, if `<algorithm>` isn't already being dragged in.

In the general precision codepath, I use `std::lower_bound()` and `std::find_if()`, as in MSVC's STL they're defined in a central internal header. In libcxx, they appear to be defined in `<algorithm>`, so I'm not sure how this compiled for Jorg.

================
Comment at: libcxx/include/charconv:631
+_LIBCPP_END_NAMESPACE_STD
+#include <xcharconv_ryu.h>
+_LIBCPP_BEGIN_NAMESPACE_STD
----------------
zoecarver wrote:
> Can this go at the top of the file?
It requires `chars_format` and `to_chars_result` to be defined. (In MSVC's STL, a fourth header `xcharconv.h` defines just those types, so both `charconv` and `xcharconv_ryu.h` can include it, but Jorg eliminated that complexity as you already had definitions of these types in `charconv`.)

If you want, I can move this almost to the top of the file, after the enum and return type definitions.

================
Comment at: libcxx/include/charconv:638
+        0>
+_LIBCPP_NODISCARD_AFTER_CXX17 /*constexpr*/ _To _Bit_cast(const _From& _From_obj) noexcept {
+    _To _To_obj; // assumes default-init
----------------
zoecarver wrote:
> Not a problem with this patch, but, it might be good to implement `bit_cast` and `__libcpp_bitcast` while we wait for the builtin constexpr one. I can work on a patch.
Thanks! I'll mark your comment as Done; please let me know if your patch lands before `to_chars` does, and I can eliminate `_Bit_cast` accordingly. (I assume that `__libcpp_bitcast` will be the internal, callable-from-any-Standard-mode counterpart of C++20 `bit_cast`; microsoft/STL has similar conventions.)

================
Comment at: libcxx/include/charconv:710
+    // double explicitly stores 52 fraction bits. 52 / 4 == 13, which is 13 hexits.
+    constexpr int _Full_precision         = is_same_v<_Floating, float> ? 6 : 13;
+    constexpr int _Adjusted_explicit_bits = _Full_precision * 4;
----------------
zoecarver wrote:
> Can you use `_IsSame` so we can benefit from the builtin?
Done; I changed all occurrences in `charconv` and `xcharconv_ryu.h`. (MSVC's compiler just directly recognizes `is_same_v` for improved throughput.)

================
Comment at: libcxx/include/xcharconv_ryu.h:74
+
+// clang-format off
+// vvvvvvvvvv DERIVED FROM common.h vvvvvvvvvv
----------------
zoecarver wrote:
> I don't think we need this (no one should run clang-format on all of libc++).
Dropped `// clang-format off`, `// clang-format on` in both `xcharconv_ryu.h` and `xcharconv_ryu_tables.h`. (We clang-format all of microsoft/STL and it is an amazing time-saver. We usually suppress it only when clang-format really damages the code. In this case, we suppressed it to avoid adding yet another massive set of diffs with upstream Ryu.)

================
Comment at: libcxx/include/xcharconv_ryu.h:193
+  // The casts here help MSVC to avoid calls to the __allmul library function.
+  const uint32_t __aLo = static_cast<uint32_t>(__a);
+  const uint32_t __aHi = static_cast<uint32_t>(__a >> 32);
----------------
jorgbrown wrote:
> zoecarver wrote:
> > Is it well-defined behavior to cast from a larger integer to a smaller one? 
> Yes.  See http://eel.is/c++draft/conv#integral which says:
> 
> > A prvalue of an integer type can be converted to a prvalue of another integer type. A prvalue of an unscoped enumeration type can be converted to a prvalue of an integer type.
> >
> > If the destination type is bool, see [conv.bool]. If the source type is bool, the value false is converted to zero and the value true is converted to one.
> >
> > Otherwise, the result is the unique value of the destination type that is congruent to the source integer modulo N, where N is the width of the destination type.
Additional notes: the C++ Standard recently made this well-defined for signed destinations, but it has always been well-defined for unsigned destinations. There's an MSVC compiler option (/RTCc) that asserts when such conversions are value-modifying, even when performed with casts, but because this breaks Standard-conformant code, MSVC's STL explicitly does not support that compiler option.

================
Comment at: libcxx/include/xcharconv_ryu.h:224
+  _LIBCPP_ASSERT(__dist < 64, "");
+#ifdef _WIN64
+  _LIBCPP_ASSERT(__dist > 0, "");
----------------
jorgbrown wrote:
> zoecarver wrote:
> > Is there a way this could be generalized to more 64 bit machines? 
> This is inside of the third-part of a large 3-way #if, where the first of the 3 forks (line 146) handles _M_X64 (MSVC support for Intel x64), and the second of the 3 forks (line 165) handles the availability of __int128 in the compiler.  I'm not certain of the existence of 64-bit machines that wouldn't be handled by the those other two forks.
I suspect that `_WIN64` isn't the right macro to test, though. For microsoft/STL, `_WIN64` is the macro defined for all 64-bit architectures (i.e. x64 and ARM64).

================
Comment at: libcxx/include/xcharconv_ryu.h:973
+
+// This table is generated by PrintFloatLookupTable.
+inline constexpr int __FLOAT_POW5_INV_BITCOUNT = 59;
----------------
ulfjack wrote:
> This should probably be adjusted?
Good catch; your Java implementation isn't present here, so this is a dangling reference. I'll simply drop this comment; anyone who needs to delve into the tables should be looking at your codebase (which they can find given the URL and `DERIVED FROM f2s.c` comments).

================
Comment at: libcxx/include/xcharconv_ryu.h:975
+inline constexpr int __FLOAT_POW5_INV_BITCOUNT = 59;
+inline constexpr uint64_t __FLOAT_POW5_INV_SPLIT[31] = {
+  576460752303423489u, 461168601842738791u, 368934881474191033u, 295147905179352826u,
----------------
ulfjack wrote:
> I have changed upstream to (optionally) make the 32-bit float code path to use the 64-bit double tables. Since everything is in one file here, that would allow you to remove these.
Ah, excellent! I think I found the relevant commit. I haven't taken an update from Ryu upstream since 2019-04-02, so I'll file a microsoft/STL issue to do that. It's a labor-intensive process and I haven't figured out a better way to do it yet (all of the PRs you've accepted have been //enormously// helpful, but the need to uglify literally every identifier makes rebasing difficult). Keeping libcxx in sync will be additional work, but hopefully not too much, if we can keep the divergence between microsoft/STL and libcxx small.

Would libcxx prefer to go ahead with porting the current implementation (and figure out how to port enhancements later), or should we prioritize updating microsoft/STL and then resume this port? As the float lookup tables are small, I believe that it should be OK to ship them like MSVC has been doing, for the near future.

================
Comment at: libcxx/include/xcharconv_ryu.h:1025
+  _LIBCPP_ASSERT(__p < 32, "");
+  // return __builtin_ctz(__value) >= __p;
+  return (__value & ((1u << __p) - 1)) == 0;
----------------
ulfjack wrote:
> Maybe remove this line? We can probably also remove it upstream or replace it with a comment that using builtin_ctz isn't faster.
Changed to `// __builtin_ctz doesn't appear to be faster here.` and similarly for 64-bit. I'll submit this to ulfjack/ryu too.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D70631/new/

https://reviews.llvm.org/D70631