[libc-commits] [libc] [libc][math] Implement atan2f correctly rounded to all rounding modes. (PR #86716)

Fri Mar 29 08:35:14 PDT 2024

lntue wrote:

> about split_d, I wonder why you use C=0x1p33+1, which makes hi fit into 53-33=20 bits, thus hi * s * k fits into 20+24+4=48 bits. You could make hi fit into 5 more bits with C=0x1p28+1. Also note that Veltkamp's algorithm also works for directed roundings, see https://inria.hal.science/hal-04480440

Hi Paul, sorry, the comment was not entirely correct and needs to be updated. The correct requirement for the splitting is that the following product be exact:
```
   hi * den_r = hi * (k_d * num_d + den_d)
```
where
```
  num_d, den_d are single precision
  1/16 <= k_d <= 1
```
So the worst-case precision of `den_r` is:
```
  prec(den_r) <= (1 + log2(msb(den_d)) - log2(lsb(k_d * num_d))) + 1                    (the last +1 is for the carry out of the addition)
              <= (1 + log2(msb(num_d)) + 4 - log2(lsb(k_d)) - log2(lsb(num_d))) + 1     (num_d / den_d >= 1/16)
              <= (1 + 23 + log2(lsb(num_d)) + 4 - (-4) - log2(lsb(num_d))) + 1          (num_d has 24 bits, lsb(k_d) >= 1/16)
               = 33
```
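The bit counting above can be checked numerically. The following is a minimal sketch, not code from the patch: `significant_bits` and `den_r_precision_bound` are hypothetical helpers, and the parameter names `p`, `r`, `s` are assumptions introduced only for illustration.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: number of bits spanned from the most significant
   to the least significant set bit of a finite, nonzero double. */
static int significant_bits(double x) {
  assert(x != 0.0 && isfinite(x));
  int exp;
  /* frexp returns m in [0.5, 1); m * 2^53 is then an exact integer. */
  double m = frexp(fabs(x), &exp);
  uint64_t bits = (uint64_t)ldexp(m, 53);
  int n = 53;
  /* Drop trailing zero bits; what remains is the significand width. */
  while ((bits & 1) == 0) {
    bits >>= 1;
    --n;
  }
  return n;
}

/* Hypothetical helper mirroring the derivation: inputs have p significant
   bits, den_d / num_d <= 2^r, and lsb(k_d) >= 2^-s. */
static int den_r_precision_bound(int p, int r, int s) {
  return (1 + (p - 1) + r + s) + 1;
}
```

With `p = 24`, `r = 4`, `s = 4` the bound evaluates to 33. For the operands `num_d = 0x1.dcb3cap+23` and `den_d = 0x1.781fcp+28`, taking `k_d = 0x1p-4` (a value inferred from `den_r - den_d`, not stated explicitly in this thread), `k_d * num_d + den_d` is computed exactly in double precision and occupies 33 bits.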
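The following example demonstrates that 33 is sharp. For `atan2f( 0x1.781fcp+28f, 0x1.dcb3cap+23f )`, if we use `C = 0x1.0p32 + 1.0` in Veltkamp's algorithm, `hi` has 21 bits while `den_r` has 33 bits, so `hi * den_r` needs 54 bits and is no longer exact in double precision: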
```
  num_d = 0x1.dcb3cap+23
  den_d = 0x1.781fcp+28
  num_r = -0x1.138bb6p+23
  den_r = 0x1.790e19e5p+28
  q = -0x1.76294b4d835fap-6
  hi = -0x1.76295p-6
                hi * den_r = -0x1.138bb9758de248p+23
  round(hi * den_r, D, RN) = -0x1.138bb9758de24p+23
```

https://github.com/llvm/llvm-project/pull/86716