[libc-commits] [libc] [libc][math] Implement atan2f correctly rounded to all rounding modes. (PR #86716)

via libc-commits libc-commits at lists.llvm.org
Thu Mar 28 10:56:18 PDT 2024


lntue wrote:

> On the performance side, on an Intel(R) Xeon(R) Silver 4214, I get:
> 
> ```
> zimmerma at croissant:~/svn/core-math$ LIBM=$L CORE_MATH_PERF_MODE=rdtsc ./perf.sh atan2f
> GNU libc version: 2.37
> GNU libc release: stable
> [####################] 100 %
> Ntrial = 20 ; Min = 21.403 + 0.730 clc/call; Median-Min = 0.526 clc/call; Max = 23.978 clc/call;
> [####################] 100 %
> Ntrial = 20 ; Min = 72.942 + 1.122 clc/call; Median-Min = 1.153 clc/call; Max = 75.873 clc/call;
> [####################] 100 %
> Ntrial = 20 ; Min = 25.242 + 0.662 clc/call; Median-Min = 0.613 clc/call; Max = 26.245 clc/call;
> zimmerma at croissant:~/svn/core-math$ PERF_ARGS=--latency LIBM=$L CORE_MATH_PERF_MODE=rdtsc ./perf.sh atan2f
> GNU libc version: 2.37
> GNU libc release: stable
> [####################] 100 %
> Ntrial = 20 ; Min = 59.164 + 1.289 clc/call; Median-Min = 1.221 clc/call; Max = 61.282 clc/call;
> [####################] 100 %
> Ntrial = 20 ; Min = 99.246 + 2.177 clc/call; Median-Min = 2.411 clc/call; Max = 103.292 clc/call;
> [####################] 100 %
> Ntrial = 20 ; Min = 63.599 + 0.823 clc/call; Median-Min = 0.804 clc/call; Max = 65.413 clc/call;
> ```
> 
> which means a reciprocal throughput of 25.2 clc/call for LLVM (against 21.4 for CORE-MATH and 72.9 for the GNU libc), and a latency of 63.6 clc/call (against 59.2 and 99.2, respectively).
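
For anyone comparing the two sets of numbers above: reciprocal throughput is the cost per call when calls are independent, while latency is the cost per call when each call has to wait for the previous result. Below is a minimal sketch of that distinction. It assumes `std::chrono` wall-clock timing instead of rdtsc, the helper names are hypothetical, and it is not the CORE-MATH `perf.sh` harness.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: average ns per call over independent inputs.
// No call needs an earlier result, so the CPU is free to overlap them;
// this approximates reciprocal throughput.
double ns_per_call_independent(const std::vector<float> &ys,
                               const std::vector<float> &xs) {
  assert(!ys.empty() && ys.size() == xs.size());
  volatile float sink = 0.0f; // keeps the calls from being optimized away
  auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < ys.size(); ++i)
    sink = std::atan2(ys[i], xs[i]);
  auto stop = std::chrono::steady_clock::now();
  (void)sink;
  std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(ys.size());
}

// Hypothetical helper: average ns per call over a dependent chain.
// Each input includes the previous output, so calls cannot overlap;
// this approximates latency.
double ns_per_call_dependent(const std::vector<float> &ys,
                             const std::vector<float> &xs) {
  assert(!ys.empty() && ys.size() == xs.size());
  float acc = 0.0f;
  auto start = std::chrono::steady_clock::now();
  for (std::size_t i = 0; i < ys.size(); ++i)
    acc = std::atan2(ys[i] + acc, xs[i]); // depends on the previous call
  auto stop = std::chrono::steady_clock::now();
  volatile float sink = acc; // keeps the chain from being optimized away
  (void)sink;
  std::chrono::duration<double, std::nano> elapsed = stop - start;
  return elapsed.count() / static_cast<double>(ys.size());
}
```

Calling both helpers on the same input vectors gives two per-call costs. The dependent one is normally the larger, which matches the gap between the two `perf.sh` runs above.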

Thanks, Paul, for checking it! I've updated the patch to use a different polynomial for small quotients, so that a separate branch is not needed. This slightly improves performance.
@zimmermann6 
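
To illustrate the general idea of handling small quotients without a dedicated branch, here is a rough sketch of a table-driven reduction in which the small-quotient case is simply index 0 of the same lookup. It is not the code in the PR: the function names, the 1/16 table step, and the plain Taylor coefficients are assumptions made for illustration, and the result is not correctly rounded. For brevity the sketch uses one polynomial for every row; a tuned implementation could store different coefficients per row.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical table: entry k holds atan(k/16), filled with std::atan for
// brevity (a real implementation would store precomputed constants).
static std::array<double, 17> make_atan_table() {
  std::array<double, 17> t{};
  for (int k = 0; k <= 16; ++k)
    t[static_cast<std::size_t>(k)] = std::atan(k / 16.0);
  return t;
}
static const std::array<double, 17> ATAN_K = make_atan_table();

// atan(q) for 0 <= q <= 1.
double atan_unit_sketch(double q) {
  assert(q >= 0.0 && q <= 1.0);
  long k = std::lround(q * 16.0); // nearest multiple of 1/16, k in 0..16
  double c = static_cast<double>(k) / 16.0;
  // atan(q) = atan(c) + atan(d), and |d| <= 1/32 after this reduction.
  // For small q we get k == 0, c == 0, d == q: same path, no extra branch.
  double d = (q - c) / (1.0 + q * c);
  double d2 = d * d;
  // Odd Taylor polynomial for atan(d), evaluated with Horner's scheme.
  double p = d * (1.0 + d2 * (-1.0 / 3 + d2 * (1.0 / 5 +
                  d2 * (-1.0 / 7 + d2 * (1.0 / 9)))));
  return ATAN_K[static_cast<std::size_t>(k)] + p;
}

// Sketch for finite x > 0 and y >= 0 only (first quadrant).
float atan2f_q1_sketch(float y, float x) {
  assert(x > 0.0f && y >= 0.0f);
  double xd = x, yd = y;
  double lo = std::min(xd, yd), hi = std::max(xd, yd);
  double a = atan_unit_sketch(lo / hi); // quotient is in [0, 1]
  const double HALF_PI = 1.5707963267948966;
  // If y > x the quotient was x/y, so use atan(y/x) = pi/2 - atan(x/y).
  return static_cast<float>(yd > xd ? HALF_PI - a : a);
}
```

With the reduced argument bounded by 1/32, the truncated Taylor series is accurate to far below float precision. Signs, the other quadrants, and special values (zeros, infinities, NaN) are deliberately left out.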

https://github.com/llvm/llvm-project/pull/86716

