[libc-commits] [PATCH] D127046: [libc][math] fmod/fmodf implementation.
Tue Ly via Phabricator via libc-commits
libc-commits at lists.llvm.org
Thu Jun 23 10:08:34 PDT 2022
lntue added a comment.
Thanks for your patient and sorry it took a bit too long for reviewing! Just a few nits and it's good to go.
Also feel free to sync and add `fmod` to the math status page at `libc/docs/math.rst`.
================
Comment at: libc/src/__support/FPUtil/generic/FMod.h:25
+// Some common cases, like abs(x) < abs(y) or abs(x) < 1000 * abs(y) are
+// treated specially to increase performance. The part of checking special
+// cases, numbers NaN, INF etc. is inspired by GCC libm implementation.
----------------
orex wrote:
> lntue wrote:
> > nit: treated *separately*
> Sure.
Nit: you don't need to wrap the word separately in *'s, I just use it to highlight my suggestion.
================
Comment at: libc/src/__support/FPUtil/generic/FMod.h:31
+// Input x,y in the algorithm is represented (mathematically) like hx * 2^ix
+// and hy * 2^iy. This is an unambiguous number representation. For example:
+// h * 2^i = (2 * h) * 2^(i-1)
----------------
Nit: actually without extra conditions on `hx, hy, ix, iy`, the representation `x = hx * 2^ix` is ambiguous (i.e., not unique), just from your example. It is only unambiguous if `hx, hy, ix, iy` are integers and `hx, hy` are odd, i.e., not divisible by 2.
================
Comment at: libc/src/__support/FPUtil/generic/FMod.h:32-68
+// h * 2^i = (2 * h) * 2^(i-1)
+// The algorithm uses the fact that
+// r = a % b = (a % (N * b)) % b,
+// where N is positive integer number. a and b - positive. Let's adopt the
+// formula for representation above.
+// a = hx * 2^ix, b = hy * 2^iy, N = 2^k
+// r(k) = a % b = (hx * 2^ix) % (2 ^k * hy * 2^iy)
----------------
So here is my understanding of the algorithm and the math behind it, please correct me if I'm wrong:
The two main properties about the (integer) modulus operation, denoted by `mod` or `%`, that we will use are:
For any positive integers `a, b, c`:
```
1) a mod b = (a mod (b * c)) mod b
2) (a * c) mod (b * c) = (a mod b) * c
```
First, let write `x = hx * 2^ix` and `y = hy * 2^iy` with `hx, hy, ix, iy` are integers (assumed to be positive for simplification).
Then the naive implementation of the `fmod` function with a simple `for/while` loop:
```
while (ix > iy) {
hx = (hx * 2) % hy;
--ix;
}
```
is mathematically equivalent to:
```
x mod y = (hx * 2^ix) mod (hy * 2^iy)
= ((hx * 2^(ix - iy)) mod hy) * 2^iy (apply property 2)
= (( ... (((hx * 2^(ix - iy)) mod (hy * 2^(ix - iy - 1))) mod (hy * 2^(ix - iy - 2)) ... ) mod hy) * 2^iy (apply property 1 repeatedly)
= (( ... (( (hx * 2) mod hy) * 2^(ix - iy - 1)) mod (hy * 2^(ix - iy - 2)) ... ) mod hy) * 2^iy (apply property 2)
= (( ... (( (hx * 2) mod hy) * 2 ) mod hy ) ... ) * 2) mod hy) * 2^iy (apply property 2 repeatedly)
```
And the total number of iterations is `ix - iy`.
On the other hand, your algorithm exploits the fact that hx, hy are the mantissas of floating point numbers, which use less bits than the storage integers: 24 / 32 for floats and 53 / 64 for doubles, so if in each step of the iteration, we can left shift `hx` as many bits as the storage integer type can hold, the exponent reduction per step will be at least 32 - 24 = 8 for floats and 64 - 53 = 9 for doubles:
```
x mod y = (hx * 2^ix) mod (hy * 2^iy)
= ((hx * 2^(ix - iy) mod hy) * 2^iy
= (( ... (( (( hx * 2^r1) mod hy ) * 2^r2) mod hy ) ... ) * 2^rk) mod hy) * 2^iy
```
where `r1 + r2 + ... + rk = ix - iy` and `ri >= sizeof(UInt) - hy length` for `i = 1..k`.
And so the number of iterations is at most by: `(ix - iy) / (sizeof(UInt) - mantissa_length)`.
Feel free to use this to update the comments if it makes sense.
================
Comment at: libc/src/__support/FPUtil/generic/FMod.h:116
+public:
+ inline constexpr static intU_t execute(int n, int hyltzeroes, intU_t hx,
+ intU_t hy) {
----------------
`exp_diff` for `n` and `max_shift` for `hyltzeroes` or other more descriptive names.
================
Comment at: libc/src/__support/FPUtil/generic/FMod.h:178
+
+ int ix = sx.get_unbiased_exponent();
+ int iy = sy.get_unbiased_exponent();
----------------
Maybe using `mx, ex` or `m_x, e_x` which are closer to mantissa and exponent of `x`?
================
Comment at: libc/src/__support/FPUtil/generic/FMod.h:209
+ hy = sy.get_mantissa();
+ lzhy = clz(hy);
+ }
----------------
Use more descriptive names such as `lead_zeros_hy`, `tail_zeros_hy`?
================
Comment at: libc/src/__support/FPUtil/generic/FMod.h:214
+ int tzhy = ctz(hy);
+ int hyltzeroes = lzhy + tzhy;
+ // n > 0 byt conditions above
----------------
`max_shift` / `max_scale_factor` maybe more descriptive?
================
Comment at: libc/test/src/math/exhaustive/CMakeLists.txt:189
+add_fp_unittest(
+ FMod_test
+ NO_RUN_POSTBUILD
----------------
Use lower case for the target name `fmod_test`.
================
Comment at: libc/test/src/math/exhaustive/CMakeLists.txt:195
+ SRCS
+ FMod_test.cpp
+ DEPENDS
----------------
Use lower case for the test file.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D127046/new/
https://reviews.llvm.org/D127046
More information about the libc-commits
mailing list