[all-commits] [llvm/llvm-project] 82d6e7: [libc] Implement tanf function correctly rounded f...
lntue via All-commits
all-commits at lists.llvm.org
Fri Aug 12 06:21:27 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 82d6e7704899ee4f9265680e38465f6e4da9c223
https://github.com/llvm/llvm-project/commit/82d6e7704899ee4f9265680e38465f6e4da9c223
Author: Tue Ly <lntue at google.com>
Date: 2022-08-12 (Fri, 12 Aug 2022)
Changed paths:
M libc/config/darwin/arm/entrypoints.txt
M libc/config/linux/aarch64/entrypoints.txt
M libc/config/linux/x86_64/entrypoints.txt
M libc/docs/math.rst
M libc/src/math/CMakeLists.txt
M libc/src/math/generic/CMakeLists.txt
M libc/src/math/generic/sincosf_utils.h
A libc/src/math/generic/tanf.cpp
A libc/src/math/tanf.h
M libc/test/src/math/CMakeLists.txt
M libc/test/src/math/exhaustive/CMakeLists.txt
A libc/test/src/math/exhaustive/tanf_test.cpp
A libc/test/src/math/tanf_test.cpp
Log Message:
-----------
[libc] Implement tanf function correctly rounded for all rounding modes.
Implement tanf function correctly rounded for all rounding modes.
We use the range reduction that is shared with `sinf`, `cosf`, and `sincosf`:
```
k = round(x * 32/pi) and y = x * (32/pi) - k.
```
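The reduction step above can be sketched as follows. This is only an illustration, not the code in `tanf.cpp` or `sincosf_utils.h`: the names `Reduced` and `reduce` are hypothetical, and multiplying by a double-precision `32/pi` is an assumption that is only reasonable for moderate `|x|`.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: x * 32/pi = k + y with |y| <= 0.5.
struct Reduced {
    long long k;
    double y;
};

// Sketch of k = round(x * 32/pi), y = x * (32/pi) - k.
// Assumption: plain double arithmetic, so large |x| loses accuracy.
Reduced reduce(float x) {
    assert(std::isfinite(x));
    const double THIRTY_TWO_OVER_PI = 32.0 / 3.14159265358979323846;
    double xd = static_cast<double>(x) * THIRTY_TWO_OVER_PI;
    double kd = std::round(xd);  // nearest integer, independent of rounding mode
    return {static_cast<long long>(kd), xd - kd};
}
```

For example, `reduce(1.0f)` gives `k = 10`, since `32/pi` is about 10.19, and a remainder `y` of about 0.19.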
Then we apply the tangent addition formula:
```
tan(x) = tan((k + y)* pi/32) = tan((k mod 32) * pi / 32 + y * pi/32)
= (tan((k mod 32) * pi/32) + tan(y * pi/32)) / (1 - tan((k mod 32) * pi/32) * tan(y * pi/32))
```
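As a quick numerical illustration of the identity above, the sketch below combines two tangent values into the tangent of the sum. The helper name `tan_sum` is hypothetical, and here both inputs would come from `std::tan` rather than from a table and a polynomial.

```cpp
#include <cassert>
#include <cmath>

// tan(a + b) computed from tan(a) and tan(b) with the addition formula.
// The denominator vanishes exactly when a + b sits on a pole of tan.
double tan_sum(double tan_a, double tan_b) {
    double denom = 1.0 - tan_a * tan_b;
    assert(denom != 0.0);
    return (tan_a + tan_b) / denom;
}
```

With `a = 5 * pi/32` and `b = 0.3 * pi/32`, `tan_sum(std::tan(a), std::tan(b))` should agree with `std::tan(a + b)` up to rounding error.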
Because `tan(x)` has a pole at `pi/2`, we need one further reduction step when `k mod 32 >= 16`:
```
if (k mod 32 >= 16): k = k - 31, y = y - 1.0
```
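One way to read this step: since `(k - 31) + (y - 1.0) = k + y - 32` and `tan` has period `pi`, the shift leaves `tan((k + y) * pi/32)` unchanged while moving the table index into `-15..15` and `y` into `[-1.5, 1.5]`. A minimal sketch, with the hypothetical names `Shifted` and `shift_from_pole`:

```cpp
#include <cassert>

// Hypothetical result type: table index in -15..15 and |y| <= 1.5.
struct Shifted {
    int k;
    double y;
};

// Sketch of "if (k mod 32 >= 16): k = k - 31, y = y - 1.0".
// Assumes |y| <= 0.5 on entry, as produced by the range reduction.
Shifted shift_from_pole(long long k, double y) {
    int k32 = static_cast<int>(((k % 32) + 32) % 32);  // 0..31, also for negative k
    Shifted s = (k32 >= 16) ? Shifted{k32 - 31, y - 1.0} : Shifted{k32, y};
    assert(s.k >= -15 && s.k <= 15);
    return s;
}
```

For instance, `k = 16, y = 0.25` becomes `k = -15, y = -0.75`, and `16.25 - 32 = -15.75` confirms that the argument only moved by one full period.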
To compute the final result, we store `tan(k * pi/32)` for `k = -15..15` in a table of 32 double values,
and evaluate `tan(y * pi/32)` with a degree-11 odd minimax polynomial generated by Sollya with:
```
> P = fpminimax(tan(y * pi/32)/y, [|0, 2, 4, 6, 8, 10|], [|D...|], [0, 1.5]);
```
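The shape of that evaluation, `y` times an even polynomial in `y`, can be sketched with Horner's scheme. The Sollya coefficients are not reproduced in this message, so the sketch below substitutes the Taylor coefficients of `tan(t)` with `t = y * pi/32`; that substitution is an assumption made purely for illustration and is not correctly rounded. The function name is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Approximates tan(y * pi/32) for |y| <= 1.5 as t * P(t^2), t = y * pi/32.
// Stand-in coefficients: Taylor series of tan, NOT the Sollya minimax output.
double tan_y_pi_over_32(double y) {
    assert(std::fabs(y) <= 1.5);
    const double PI_OVER_32 = 3.14159265358979323846 / 32.0;
    static const double C[6] = {1.0,          1.0 / 3.0,     2.0 / 15.0,
                                17.0 / 315.0, 62.0 / 2835.0, 1382.0 / 155925.0};
    double t = y * PI_OVER_32;
    double t2 = t * t;
    double p = C[5];
    for (int i = 4; i >= 0; --i) {
        p = p * t2 + C[i];  // Horner step on the even polynomial
    }
    return t * p;  // odd polynomial of degree 11 in t
}
```

On `[-1.5, 1.5]` the argument `t` stays below about 0.15 in magnitude, which is why a short odd polynomial is enough here.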
Performance benchmark using the `perf` tool from the CORE-MATH project on a Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf
CORE-MATH reciprocal throughput : 18.586
System LIBC reciprocal throughput : 50.068
LIBC reciprocal throughput : 33.823
LIBC reciprocal throughput : 25.161 (with `-msse4.2` flag)
LIBC reciprocal throughput : 19.157 (with `-mfma` flag)
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanf --latency
GNU libc version: 2.31
GNU libc release: stable
CORE-MATH latency : 55.630
System LIBC latency : 106.264
LIBC latency : 96.060
LIBC latency : 90.727 (with `-msse4.2` flag)
LIBC latency : 82.361 (with `-mfma` flag)
```
Reviewed By: orex
Differential Revision: https://reviews.llvm.org/D131715