[libc-commits] [PATCH] D136799: [libc] Implement a high-precision floating point class.

Thu Oct 27 11:08:36 PDT 2022

orex added inline comments.

================
Comment at: libc/src/__support/FPUtil/big_float.h:32
+template <size_t MANTISSA_LENGTH> struct BigFloat {
+  static_assert(MANTISSA_LENGTH >= 64 && MANTISSA_LENGTH % 8 == 0,
+                "Unsupported mantissa length.");
----------------
static_assert in UInt is stricter than this one (https://github.com/llvm/llvm-project/blob/7f93ae808634e33e4dc9bce753c909aa5f9a6eb4/libc/src/__support/UInt.h#L25) Probably you should remove or adjust this?

================
Comment at: libc/src/__support/FPUtil/big_float.h:55
+  template <size_t M_LENGTH, typename = typename cpp::enable_if_t<
+                                 (M_LENGTH > MANTISSA_LENGTH), void>>
+  explicit operator BigFloat<M_LENGTH>() const {
----------------
Will it be better to put `M_LENGTH >= MANTISSA_LENGTH`? For "highly templated" cases it can be easy to generalize such conversion.

================
Comment at: libc/src/__support/FPUtil/big_float.h:75
+  // to be normals.
+  double to_double() const {
+    if (is_zero())
----------------
michaelrj wrote:
> same as above, I think you can make this either an operator or a templated `to_float<T>`
General idea. Probably it is good to have conversion to double double? It can be precise for 64 bits mantissa?

================
Comment at: libc/src/__support/FPUtil/big_float.h:84
+    // Move leading 1 (if non-zero) to the most significant bit of mantissa.
+    if (shift) {
+      e -= shift;
----------------
The if is not needed. The operations in the if statement probably can be even faster than check and conditional jump? 

================
Comment at: libc/src/__support/FPUtil/big_float.h:90
+
+    mantissa_type m_hi(m >> (MANTISSA_LENGTH - 53));
+    auto d_hi = FPBits<double>::create_value(
----------------
Nit: Probably it is good to have 53 54 and 55 like some constexpr?

================
Comment at: libc/src/__support/FPUtil/big_float.h:120
+template <size_t LENGTH>
+constexpr inline pair<BigFloat<LENGTH / 2>> split(const BigFloat<LENGTH> &a) {
+  constexpr size_t SHIFT_LENGTH = LENGTH / 2;
----------------
I'm not an expert in C++ standard, but pair with one template argument looks suspicious for me. Does C++ standard saying how it should be expanded? 

================
Comment at: libc/src/__support/FPUtil/big_float.h:123
+  using mtype = typename BigFloat<SHIFT_LENGTH>::mantissa_type;
+  return {{a.sign, a.exponent, static_cast<mtype>(a.mantissa)},
+          {a.sign, a.exponent + static_cast<int>(SHIFT_LENGTH),
----------------
Is this static cast by standard drop higher part of the variable?

================
Comment at: libc/src/__support/FPUtil/big_float.h:129
+template <size_t LENGTH>
+constexpr BigFloat<LENGTH> quick_add(BigFloat<LENGTH> a, BigFloat<LENGTH> b) {
+  if (unlikely(a.is_zero()))
----------------
Am I understand right, that rounding and sign of zero in this function is not important?

================
Comment at: libc/src/__support/FPUtil/big_float.h:136
+  // Move the leading 1's to the most significant bits.
+  int shift_a = static_cast<int>(a.mantissa.clz());
+  if (shift_a)
----------------
Probably to put this normalization to dedicated function?
1) It is quite general.
2) Remove the code duplication.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136799/new/

https://reviews.llvm.org/D136799