[libc-commits] [PATCH] D136799: [libc] Implement a high-precision floating point class.
Siva Chandra via Phabricator via libc-commits
libc-commits at lists.llvm.org
Tue Dec 13 12:41:19 PST 2022
sivachandra accepted this revision.
sivachandra added inline comments.
This revision is now accepted and ready to land.
================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:47
+ exponent = x_bits.get_exponent() - FloatProperties<T>::MANTISSA_WIDTH;
+ mantissa = MantissaType(x_bits.get_explicit_mantissa());
+ normalize();
----------------
This can lead to truncation if `Bits` is less than `MANTISSA_WIDTH`. Should we add a static check here, maybe by extending the above `enable_if`:
```
template <typename T, cpp::enable_if_t<cpp::is_floating_point_v<T> && FloatProperties<T>::MANTISSA_WIDTH <= Bits, int> = 0>
```
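A minimal, standalone sketch of how such a constraint behaves, using `std::` equivalents instead of the `cpp::` utilities. `DyadicSketch` and `kMantissaWidth` are hypothetical names made up for this illustration, and `std::numeric_limits<T>::digits - 1` is only an assumed stand-in for `FloatProperties<T>::MANTISSA_WIDTH`:
```cpp
#include <cstddef>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for FloatProperties<T>::MANTISSA_WIDTH: the number of
// explicitly stored mantissa bits (23 for float, 52 for double on IEEE 754
// platforms).
template <typename T>
inline constexpr std::size_t kMantissaWidth =
    std::numeric_limits<T>::digits - 1;

// Hypothetical class; only the constructor constraint matters here.
template <std::size_t Bits>
struct DyadicSketch {
  // The constructor takes part in overload resolution only when T is a
  // floating point type whose mantissa fits in Bits bits.
  template <typename T,
            std::enable_if_t<std::is_floating_point_v<T> &&
                                 (kMantissaWidth<T> <= Bits),
                             int> = 0>
  explicit DyadicSketch(T) {}
};

// A 64-bit mantissa can hold a double; a 32-bit one cannot.
static_assert(std::is_constructible_v<DyadicSketch<64>, double>);
static_assert(!std::is_constructible_v<DyadicSketch<32>, double>);
static_assert(std::is_constructible_v<DyadicSketch<32>, float>);
```
With a constraint like this, a too-narrow `Bits` is rejected at compile time instead of silently dropping mantissa bits.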
================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:121
+// output. The absolute errors compared to the mathematical sum is bounded by:
+// | quick_add(a, b) - (a + b) | < MSB(a + b) * 2^(-Bits + 2),
+// i.e., errors are up to 2 ULPs.
----------------
Add a mathematical expression that illustrates what `quick_add` is actually doing. Something like:
```
quick_add.exponent = max(...)
// aligning exponents - explain why
quick_add.mantissa = a.mantissa + b.mantissa;
```
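To make the request concrete, here is a simplified sketch of the idea, not the code in the patch. It assumes a single 64-bit mantissa, non-negative values only, and the convention value = mantissa * 2^exponent. `Dyadic64`, `normalize`, `quick_add_sketch`, and `to_double` are hypothetical names:
```cpp
#include <cmath>
#include <cstdint>
#include <utility>

// Hypothetical simplified type: value = mantissa * 2^exponent, non-negative.
struct Dyadic64 {
  int exponent;
  std::uint64_t mantissa;
};

// Shift left until the top mantissa bit is set (zero is left unchanged).
Dyadic64 normalize(Dyadic64 x) {
  if (x.mantissa == 0)
    return x;
  while ((x.mantissa >> 63) == 0) {
    x.mantissa <<= 1;
    --x.exponent;
  }
  return x;
}

Dyadic64 quick_add_sketch(Dyadic64 a, Dyadic64 b) {
  if (a.mantissa == 0)
    return normalize(b);
  if (b.mantissa == 0)
    return normalize(a);
  a = normalize(a);
  b = normalize(b);
  if (a.exponent < b.exponent)
    std::swap(a, b);
  // result.exponent = max(a.exponent, b.exponent). Aligning to the larger
  // exponent keeps the high bits; only low bits of the smaller operand are
  // dropped, and that truncation is the source of the rounding error.
  int shift = a.exponent - b.exponent;
  std::uint64_t b_aligned = shift >= 64 ? 0 : (b.mantissa >> shift);
  std::uint64_t sum = a.mantissa + b_aligned;
  Dyadic64 result{a.exponent, sum};
  if (sum < a.mantissa) {
    // Carry out of bit 63: shift right by one and restore the top bit.
    result.mantissa = (sum >> 1) | (std::uint64_t{1} << 63);
    result.exponent += 1;
  }
  return result;
}

// Helper for inspection; exact only when the value fits in a double.
double to_double(Dyadic64 x) {
  return std::ldexp(static_cast<double>(x.mantissa), x.exponent);
}
```
For example, `quick_add_sketch({0, 3}, {1, 1})` represents 3 + 2, while adding `{-70, 1}` to `{0, 1}` returns 1 unchanged because the smaller operand is shifted out entirely.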
================
Comment at: libc/src/__support/FPUtil/dyadic_float.h:174
+// compared to the mathematical product is bounded by:
+// 2 * errors of quick_mul_hi = 2 * (UInt<Bits>::WordCount - 1) in ULPs.
+// Assume inputs are normalized (by constructors or other functions) so that we
----------------
Please add the same kind of mathematical explanation here as for `quick_add`.
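As with `quick_add`, a simplified sketch may help show the kind of explanation meant here. It assumes a single 64-bit mantissa, non-negative values, and value = mantissa * 2^exponent, and it computes the high half of the product exactly, so it does not model the multi-word approximation error of `quick_mul_hi`. All names (`Dyadic64`, `mul_full`, `quick_mul_sketch`, `to_double`) are hypothetical:
```cpp
#include <cmath>
#include <cstdint>

// Hypothetical simplified type: value = mantissa * 2^exponent, non-negative.
struct Dyadic64 {
  int exponent;
  std::uint64_t mantissa;
};

// Shift left until the top mantissa bit is set (zero is left unchanged).
Dyadic64 normalize(Dyadic64 x) {
  if (x.mantissa == 0)
    return x;
  while ((x.mantissa >> 63) == 0) {
    x.mantissa <<= 1;
    --x.exponent;
  }
  return x;
}

// Full 64x64 -> 128-bit product from 32-bit halves.
void mul_full(std::uint64_t a, std::uint64_t b, std::uint64_t &hi,
              std::uint64_t &lo) {
  const std::uint64_t mask = 0xffffffffu;
  std::uint64_t p0 = (a & mask) * (b & mask);
  std::uint64_t p1 = (a & mask) * (b >> 32);
  std::uint64_t p2 = (a >> 32) * (b & mask);
  std::uint64_t p3 = (a >> 32) * (b >> 32);
  std::uint64_t mid = (p0 >> 32) + (p1 & mask) + (p2 & mask);
  lo = (mid << 32) | (p0 & mask);
  hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

Dyadic64 quick_mul_sketch(Dyadic64 a, Dyadic64 b) {
  if (a.mantissa == 0 || b.mantissa == 0)
    return Dyadic64{0, 0};
  a = normalize(a);
  b = normalize(b);
  // (ma * 2^ea) * (mb * 2^eb) = (ma * mb) * 2^(ea + eb).
  // Keeping only the high 64 bits of ma * mb adds 64 to the exponent.
  std::uint64_t hi, lo;
  mul_full(a.mantissa, b.mantissa, hi, lo);
  Dyadic64 result{a.exponent + b.exponent + 64, hi};
  // With normalized inputs, ma * mb lies in [2^126, 2^128), so at most one
  // left shift is needed to set the top bit again.
  if ((hi >> 63) == 0) {
    result.mantissa = (hi << 1) | (lo >> 63);
    result.exponent -= 1;
  }
  return result;
}

// Helper for inspection; exact only when the value fits in a double.
double to_double(Dyadic64 x) {
  return std::ldexp(static_cast<double>(x.mantissa), x.exponent);
}
```
The comment in the patch could state the same two facts in its own terms: the exponents add, and only the high part of the mantissa product is kept.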
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136799/new/
https://reviews.llvm.org/D136799