[libc-commits] [PATCH] D136799: [libc] Implement a high-precision floating point class.

Mon Dec 12 16:00:03 PST 2022

sivachandra added a comment.

Mostly LGTM. I am not yet accepting as I have asked for definitions.

================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:1
+//===-- A class to store high precision floating point numbers --*- C++ -*-===//
+//
----------------
Will this class be ever used from outside of the `math` directory? If no, then it probably should live in `src/math`.

================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:21
+
+// A class for computations of high precision floating point.  We store the
+// value in dyadic format: The exponent field will be the exponent value of the
----------------
This sentence feels incomplete.

================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:24
+// least significant bit of the mantissa.  So the real value that is stored is:
+//   real value = (-1)^sign * 2^exponent * (mantissa as unsigned integer)
+template <size_t Bits> struct DyadicFloat {
----------------
One can read the code but couple of things should be described here I think:

1. That bits is the number of bits of mantissa and that it should be a multiple of 64.
2. Define the meaning of normalization.
3. Add a note next to the constructors that the object created is normalized as per the above definition.

================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:26
+template <size_t Bits> struct DyadicFloat {
+  using mantissa_type = __llvm_libc::cpp::UInt<Bits>;
+
----------------
Type name should be `MantissaType`.

================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:34
+
+  template <typename T, cpp::enable_if_t<cpp::is_floating_point_v<T>, int> = 0>
+  DyadicFloat(T x) {
----------------
This should like have additional constraints. For example, should this allow a quad-precision number to be converted to `DyadicFloat<64>`?

================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:73
+
+  // Assume that it is already normalized and output is also normal.
+  // Output is rounded correctly with respect to the current rounding mode.
----------------
Why is this assumption required?

================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:108
+
+// Quick add.
+// Assume inputs are normalized.
----------------
Add definition of "Quick Add".

================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:109
+// Quick add.
+// Assume inputs are normalized.
+// Output will be normalized.
----------------
Again, why?

================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:159
+// Output will be normalized.
+// Error bound: 2 * errors of quick_mul_hi.
+template <size_t Bits>
----------------
Same comments as those for `quick_add`.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136799/new/

https://reviews.llvm.org/D136799