[all-commits] [llvm/llvm-project] a68bbf: [libc][math] Implement double precision log functi...

lntue via All-commits all-commits at lists.llvm.org
Tue May 23 07:35:41 PDT 2023


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: a68bbf42fa680c7159165be8915d73e012c50f96
      https://github.com/llvm/llvm-project/commit/a68bbf42fa680c7159165be8915d73e012c50f96
  Author: Tue Ly <lntue at google.com>
  Date:   2023-05-23 (Tue, 23 May 2023)

  Changed paths:
    M libc/config/darwin/arm/entrypoints.txt
    M libc/config/linux/aarch64/entrypoints.txt
    M libc/config/linux/x86_64/entrypoints.txt
    M libc/config/windows/entrypoints.txt
    M libc/spec/stdc.td
    M libc/src/math/CMakeLists.txt
    M libc/src/math/generic/CMakeLists.txt
    M libc/src/math/generic/common_constants.cpp
    M libc/src/math/generic/common_constants.h
    A libc/src/math/generic/log.cpp
    M libc/src/math/generic/log10.cpp
    A libc/src/math/generic/log_range_reduction.h
    A libc/src/math/log.h
    M libc/test/src/math/CMakeLists.txt
    A libc/test/src/math/log_test.cpp

  Log Message:
  -----------
  [libc][math] Implement double precision log function correctly rounded to all rounding modes.

Implement double precision log function correctly rounded to all
rounding modes.

See https://reviews.llvm.org/D150014 for a more detail description of the algorithm.

**Performance**

  - For `0.5 <= x <= 2`, the fast pass hitting rate is about 99.93%.

  - Reciprocal throughput from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log
GNU libc version: 2.35
GNU libc release: stable

-- CORE-MATH reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 17.465 + 0.596 clc/call; Median-Min = 0.602 clc/call; Max = 18.389 clc/call;

-- CORE-MATH reciprocal throughput -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 54.961 + 2.606 clc/call; Median-Min = 2.180 clc/call; Max = 59.583 clc/call;

-- System LIBC reciprocal throughput --
[####################] 100 %
Ntrial = 20 ; Min = 12.608 + 0.276 clc/call; Median-Min = 0.359 clc/call; Max = 13.147 clc/call;

-- LIBC reciprocal throughput -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 20.952 + 0.468 clc/call; Median-Min = 0.602 clc/call; Max = 21.881 clc/call;

-- LIBC reciprocal throughput -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 18.569 + 0.552 clc/call; Median-Min = 0.601 clc/call; Max = 19.259 clc/call;

```
  - Latency from CORE-MATH's perf tool on Ryzen 5900X:
```
$ ./perf.sh log --latency
GNU libc version: 2.35
GNU libc release: stable

-- CORE-MATH latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 48.431 + 0.699 clc/call; Median-Min = 0.073 clc/call; Max = 51.269 clc/call;

-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
[####################] 100 %
Ntrial = 20 ; Min = 64.865 + 3.235 clc/call; Median-Min = 3.475 clc/call; Max = 71.788 clc/call;

-- System LIBC latency --
[####################] 100 %
Ntrial = 20 ; Min = 42.151 + 2.090 clc/call; Median-Min = 2.270 clc/call; Max = 44.773 clc/call;

-- LIBC latency -- with FMA
[####################] 100 %
Ntrial = 20 ; Min = 35.266 + 0.479 clc/call; Median-Min = 0.373 clc/call; Max = 36.798 clc/call;

-- LIBC latency -- without FMA
[####################] 100 %
Ntrial = 20 ; Min = 48.518 + 0.484 clc/call; Median-Min = 0.500 clc/call; Max = 49.896 clc/call;
```
  - Accurate pass latency:
```
$ ./perf.sh log --latency --simple_stat
GNU libc version: 2.35
GNU libc release: stable

-- CORE-MATH latency -- with FMA
598.306

-- CORE-MATH latency -- without FMA (-march=x86-64-v2)
632.925

-- LIBC latency -- with FMA
455.632

-- LIBC latency -- without FMA
488.564
```

Reviewed By: zimmermann6

Differential Revision: https://reviews.llvm.org/D150131




More information about the All-commits mailing list