[all-commits] [llvm/llvm-project] a68bbf: [libc][math] Implement double precision log functi...
lntue via All-commits
all-commits at lists.llvm.org
Tue May 23 07:35:41 PDT 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: a68bbf42fa680c7159165be8915d73e012c50f96
https://github.com/llvm/llvm-project/commit/a68bbf42fa680c7159165be8915d73e012c50f96
Author: Tue Ly <lntue at google.com>
Date: 2023-05-23 (Tue, 23 May 2023)
Changed paths:
M libc/config/darwin/arm/entrypoints.txt
M libc/config/linux/aarch64/entrypoints.txt
M libc/config/linux/x86_64/entrypoints.txt
M libc/config/windows/entrypoints.txt
M libc/spec/stdc.td
M libc/src/math/CMakeLists.txt
M libc/src/math/generic/CMakeLists.txt
M libc/src/math/generic/common_constants.cpp
M libc/src/math/generic/common_constants.h
A libc/src/math/generic/log.cpp
M libc/src/math/generic/log10.cpp
A libc/src/math/generic/log_range_reduction.h
A libc/src/math/log.h
M libc/test/src/math/CMakeLists.txt
A libc/test/src/math/log_test.cpp
Log Message:
-----------
[libc][math] Implement double precision log function correctly rounded to all rounding modes.
Implement double precision log function correctly rounded to all
rounding modes.
See https://reviews.llvm.org/D150014 for a more detail description of the algorithm.
**Performance**
- For `0.5 <= x <= 2`, the fast pass hitting rate is about 99.93%.
- Reciprocal throughput from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 17.465 + 0.596 clc/call; Median-Min = 0.602 clc/call; Max = 18.389 clc/call;
-- CORE-MATH reciprocal throughput -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 54.961 + 2.606 clc/call; Median-Min = 2.180 clc/call; Max = 59.583 clc/call;
-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 12.608 + 0.276 clc/call; Median-Min = 0.359 clc/call; Max = 13.147 clc/call;
-- LIBC reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 20.952 + 0.468 clc/call; Median-Min = 0.602 clc/call; Max = 21.881 clc/call;
-- LIBC reciprocal throughput -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 18.569 + 0.552 clc/call; Median-Min = 0.601 clc/call; Max = 19.259 clc/call;
```
- Latency from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log --latency
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 48.431 + 0.699 clc/call; Median-Min = 0.073 clc/call; Max = 51.269 clc/call;
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 64.865 + 3.235 clc/call; Median-Min = 3.475 clc/call; Max = 71.788 clc/call;
-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 42.151 + 2.090 clc/call; Median-Min = 2.270 clc/call; Max = 44.773 clc/call;
-- LIBC latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 35.266 + 0.479 clc/call; Median-Min = 0.373 clc/call; Max = 36.798 clc/call;
-- LIBC latency -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 48.518 + 0.484 clc/call; Median-Min = 0.500 clc/call; Max = 49.896 clc/call;
```
- Accurate pass latency:
```
$ ./perf.sh log --latency --simple_stat
GNU libc version: 2.35
GNU libc release: stable
-- CORE-MATH latency -- with FMA
598.306
-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
632.925
-- LIBC latency -- with FMA
455.632
-- LIBC latency -- without FMA
488.564
```
Reviewed By: zimmermann6
Differential Revision: https://reviews.llvm.org/D150131
More information about the All-commits
mailing list