[all-commits] [llvm/llvm-project] 4973ee: [libc][math] Improve tanhf performance.
lntue via All-commits
all-commits at lists.llvm.org
Mon Sep 19 05:43:24 PDT 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 4973eee1228674c80f9441a36019c8a83ee3458a
https://github.com/llvm/llvm-project/commit/4973eee1228674c80f9441a36019c8a83ee3458a
Author: Tue Ly <lntue at google.com>
Date: 2022-09-19 (Mon, 19 Sep 2022)
Changed paths:
M libc/docs/math.rst
M libc/src/math/generic/exp2f.cpp
M libc/src/math/generic/explogxf.cpp
M libc/src/math/generic/explogxf.h
M libc/src/math/generic/tanhf.cpp
M libc/test/src/math/explogxf_test.cpp
Log Message:
-----------
[libc][math] Improve tanhf performance.
Optimize the core part of the `tanhf` implementation, which is the computation
of `e^x`, in the same way as https://reviews.llvm.org/D133870. Factor out the
constants and the polynomial approximation so that they can be reused for `exp10f`.
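As a rough illustration of why `e^x` is the core cost, `tanh` can be written with a single exponential as `tanh(x) = (e^(2x) - 1) / (e^(2x) + 1)`. The sketch below is not the code from `tanhf.cpp`: the function name `tanhf_via_exp` is hypothetical, `std::expm1` stands in for the library's internal range reduction and polynomial evaluation, and the cutoff of 10 is an assumption chosen so that the result already rounds to +/-1 in single precision.
```cpp
#include <cmath>

// Hypothetical helper for illustration only; NOT the LLVM libc implementation.
// Uses tanh(x) = (e^(2x) - 1) / (e^(2x) + 1), rewritten with
// t = expm1(2x) = e^(2x) - 1 so that tanh(x) = t / (t + 2).
inline float tanhf_via_exp(float x) {
  // NaN propagates unchanged.
  if (std::isnan(x))
    return x;
  // Assumed cutoff: for |x| >= 10 the exact value is within half an ulp
  // of +/-1 in single precision, so return the saturated value directly.
  if (std::fabs(x) >= 10.0f)
    return std::copysign(1.0f, x);
  // Evaluate in double precision; expm1 avoids cancellation for small |x|.
  double t = std::expm1(2.0 * static_cast<double>(x));
  return static_cast<float>(t / (t + 2.0));
}
```
Calling `tanhf_via_exp(0.5f)` should agree with `std::tanh(0.5f)` to within float rounding. The one transcendental call dominates the cost, which is consistent with the commit targeting the `e^x` step.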
Performance benchmark using the perf tool from the CORE-MATH project on a Ryzen 1700:
```
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanhf
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH reciprocal throughput : 13.377
System LIBC reciprocal throughput : 55.046
BEFORE:
LIBC reciprocal throughput : 75.674
LIBC reciprocal throughput : 33.242 (with `-msse4.2` flag)
LIBC reciprocal throughput : 25.927 (with `-mfma` flag)
AFTER:
LIBC reciprocal throughput : 26.359
LIBC reciprocal throughput : 18.888 (with `-msse4.2` flag)
LIBC reciprocal throughput : 14.243 (with `-mfma` flag)
$ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh tanhf --latency
GNU libc version: 2.35
GNU libc release: stable
CORE-MATH latency : 43.365
System LIBC latency : 123.499
BEFORE:
LIBC latency : 112.968
LIBC latency : 104.908 (with `-msse4.2` flag)
LIBC latency : 92.310 (with `-mfma` flag)
AFTER:
LIBC latency : 69.828
LIBC latency : 63.874 (with `-msse4.2` flag)
LIBC latency : 57.427 (with `-mfma` flag)
```
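To read the figures above as speedup factors, the BEFORE value can be divided by the AFTER value for each build configuration, since lower is better for both reciprocal throughput and latency. The sketch below does only that arithmetic; the `Row` type and the function names are hypothetical, and the numbers are copied from the benchmark output above.
```cpp
#include <cstdio>

// Hypothetical helper types for illustration; values are copied verbatim
// from the benchmark output in this commit message.
struct Row {
  const char *config;
  double before;
  double after;
};

// Speedup factor: BEFORE divided by AFTER (lower raw numbers are better).
constexpr double speedup(const Row &r) { return r.before / r.after; }

constexpr Row kThroughput[] = {
    {"default", 75.674, 26.359},
    {"-msse4.2", 33.242, 18.888},
    {"-mfma", 25.927, 14.243},
};

constexpr Row kLatency[] = {
    {"default", 112.968, 69.828},
    {"-msse4.2", 104.908, 63.874},
    {"-mfma", 92.310, 57.427},
};

// Print one line per configuration with its speedup factor.
inline void print_speedups(const char *title, const Row *rows, int n) {
  std::printf("%s\n", title);
  for (int i = 0; i < n; ++i)
    std::printf("  %-10s %.2fx\n", rows[i].config, speedup(rows[i]));
}
```
Calling `print_speedups("reciprocal throughput", kThroughput, 3)` and `print_speedups("latency", kLatency, 3)` gives roughly 2.9x, 1.8x, and 1.8x for throughput, and about 1.6x for latency in all three configurations.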
Reviewed By: orex, zimmermann6
Differential Revision: https://reviews.llvm.org/D134002