[libc-commits] [libc] [llvm] [libc][math] Implement single precision lgamma function (PR #205490)
via libc-commits
libc-commits at lists.llvm.org
Tue Jun 23 23:22:59 PDT 2026
llvmorg-github-actions[bot] wrote:
<!--LLVM PR SUMMARY COMMENT-->
@llvm/pr-subscribers-backend-risc-v
Author: Anonmiraj (AnonMiraj)
<details>
<summary>Changes</summary>
The implementation reduces to `|x|` and selects a path
- **Tiny** ($|x| < 2^{-23}$): truncated Laurent series $\text{lgamma}(x) = -\log|x| - \gamma\,|x|$.
- **Small** ($|x| < 0.66$): degree-22 Chebyshev for $h(t) = \frac{\text{lgamma}(t) + \log(t)}{t}$, recovering $\text{lgamma}(t) = t \cdot h(t) - \log(t)$.
- **Medium** ($|x| \in [0.66, 3.37]$): degree-22 Chebyshev split into three sub-ranges that factor out the function's roots —
- $[0.66, 1)$: $\text{lgamma}(t) = (t-1)\,P(t)$
- $[1, 2)$: $\text{lgamma}(t) = (t-1)(t-2)\,P(t)$
- $[2, 3.37)$: $\text{lgamma}(t) = (t-2)\,P(t)$
- **Stirling** ($|x| > 3.37$): $(x-0.5)\log(x) - x + \frac{\log(2\pi)}{2}$ plus 1/2/4/8-term Bernoulli corrections selected by sub-range (the dominant term is evaluated in double-double for large $|x|$).
Negative $x$ uses the reflection formula $\text{lgamma}(x) = \log\pi - \log|\sin(\pi x)| - \text{lgamma}(|x|)$.
Exhaustive tests pass in all 4 rounding modes.
```
-- CORE-MATH reciprocal throughput --
Ntrial = 20 ; Min = 54.476 + 7.003 clc/call; Median-Min = 1.211 clc/call; Max = 67.906 clc/call;
-- System LIBC reciprocal throughput --
Ntrial = 20 ; Min = 70.125 + 8.204 clc/call; Median-Min = 4.416 clc/call; Max = 90.469 clc/call;
```
Depends on #<!-- -->192834 so lgammaf16 must be merged first.
---
Patch is 67.45 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/205490.diff
38 Files Affected:
- (modified) libc/config/baremetal/aarch64/entrypoints.txt (+2)
- (modified) libc/config/baremetal/arm/entrypoints.txt (+2)
- (modified) libc/config/baremetal/riscv/entrypoints.txt (+2)
- (modified) libc/config/darwin/aarch64/entrypoints.txt (+1)
- (modified) libc/config/gpu/amdgpu/entrypoints.txt (+1)
- (modified) libc/config/gpu/nvptx/entrypoints.txt (+1)
- (modified) libc/config/linux/aarch64/entrypoints.txt (+2)
- (modified) libc/config/linux/riscv/entrypoints.txt (+2)
- (modified) libc/config/linux/x86_64/entrypoints.txt (+2)
- (modified) libc/include/math.yaml (+7)
- (modified) libc/shared/math.h (+2)
- (added) libc/shared/math/lgammaf.h (+23)
- (added) libc/shared/math/lgammaf16.h (+29)
- (modified) libc/src/__support/math/CMakeLists.txt (+50)
- (added) libc/src/__support/math/gamma_util.h (+116)
- (added) libc/src/__support/math/lgammaf.h (+374)
- (added) libc/src/__support/math/lgammaf16.h (+220)
- (modified) libc/src/math/CMakeLists.txt (+3)
- (modified) libc/src/math/generic/CMakeLists.txt (+26)
- (added) libc/src/math/generic/lgammaf.cpp (+18)
- (added) libc/src/math/generic/lgammaf16.cpp (+20)
- (added) libc/src/math/lgammaf.h (+20)
- (added) libc/src/math/lgammaf16.h (+21)
- (modified) libc/test/shared/CMakeLists.txt (+1)
- (modified) libc/test/shared/shared_math_test.cpp (+1)
- (modified) libc/test/src/math/CMakeLists.txt (+23)
- (modified) libc/test/src/math/exhaustive/CMakeLists.txt (+16)
- (added) libc/test/src/math/exhaustive/lgammaf_test.cpp (+35)
- (added) libc/test/src/math/lgammaf16_test.cpp (+42)
- (added) libc/test/src/math/lgammaf_test.cpp (+100)
- (modified) libc/test/src/math/smoke/CMakeLists.txt (+22)
- (added) libc/test/src/math/smoke/lgammaf16_test.cpp (+61)
- (added) libc/test/src/math/smoke/lgammaf_test.cpp (+62)
- (modified) libc/utils/MPFRWrapper/MPCommon.cpp (+7)
- (modified) libc/utils/MPFRWrapper/MPCommon.h (+1)
- (modified) libc/utils/MPFRWrapper/MPFRUtils.cpp (+2)
- (modified) libc/utils/MPFRWrapper/MPFRUtils.h (+1)
- (modified) utils/bazel/llvm-project-overlay/libc/BUILD.bazel (+58)
``````````diff
diff --git a/libc/config/baremetal/aarch64/entrypoints.txt b/libc/config/baremetal/aarch64/entrypoints.txt
index dcb50135232e2..3f05a2f0ed004 100644
--- a/libc/config/baremetal/aarch64/entrypoints.txt
+++ b/libc/config/baremetal/aarch64/entrypoints.txt
@@ -478,6 +478,7 @@ set(TARGET_LIBM_ENTRYPOINTS
libc.src.math.ldexp
libc.src.math.ldexpf
libc.src.math.ldexpl
+ libc.src.math.lgammaf
libc.src.math.llogb
libc.src.math.llogbf
libc.src.math.llogbl
@@ -655,6 +656,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.iscanonicalf16
libc.src.math.issignalingf16
libc.src.math.ldexpf16
+ libc.src.math.lgammaf16
libc.src.math.llogbf16
libc.src.math.llrintf16
libc.src.math.llroundf16
diff --git a/libc/config/baremetal/arm/entrypoints.txt b/libc/config/baremetal/arm/entrypoints.txt
index fac62bac939cc..bc70b5797af84 100644
--- a/libc/config/baremetal/arm/entrypoints.txt
+++ b/libc/config/baremetal/arm/entrypoints.txt
@@ -490,6 +490,7 @@ set(TARGET_LIBM_ENTRYPOINTS
libc.src.math.ldexp
libc.src.math.ldexpf
libc.src.math.ldexpl
+ libc.src.math.lgammaf
libc.src.math.llogb
libc.src.math.llogbf
libc.src.math.llogbl
@@ -667,6 +668,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.iscanonicalf16
libc.src.math.issignalingf16
libc.src.math.ldexpf16
+ libc.src.math.lgammaf16
libc.src.math.llogbf16
libc.src.math.llrintf16
libc.src.math.llroundf16
diff --git a/libc/config/baremetal/riscv/entrypoints.txt b/libc/config/baremetal/riscv/entrypoints.txt
index a3b96225ff09d..33fc0591bc84c 100644
--- a/libc/config/baremetal/riscv/entrypoints.txt
+++ b/libc/config/baremetal/riscv/entrypoints.txt
@@ -486,6 +486,7 @@ set(TARGET_LIBM_ENTRYPOINTS
libc.src.math.ldexp
libc.src.math.ldexpf
libc.src.math.ldexpl
+ libc.src.math.lgammaf
libc.src.math.llogb
libc.src.math.llogbf
libc.src.math.llogbl
@@ -663,6 +664,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.iscanonicalf16
libc.src.math.issignalingf16
libc.src.math.ldexpf16
+ libc.src.math.lgammaf16
libc.src.math.llogbf16
libc.src.math.llrintf16
libc.src.math.llroundf16
diff --git a/libc/config/darwin/aarch64/entrypoints.txt b/libc/config/darwin/aarch64/entrypoints.txt
index 914d2b7918da1..cdd04d8bbb74f 100644
--- a/libc/config/darwin/aarch64/entrypoints.txt
+++ b/libc/config/darwin/aarch64/entrypoints.txt
@@ -476,6 +476,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.iscanonicalf16
libc.src.math.issignalingf16
libc.src.math.ldexpf16
+ libc.src.math.lgammaf16
libc.src.math.llogbf16
libc.src.math.llrintf16
libc.src.math.llroundf16
diff --git a/libc/config/gpu/amdgpu/entrypoints.txt b/libc/config/gpu/amdgpu/entrypoints.txt
index c76900a8371f7..c1b29ff893891 100644
--- a/libc/config/gpu/amdgpu/entrypoints.txt
+++ b/libc/config/gpu/amdgpu/entrypoints.txt
@@ -597,6 +597,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.iscanonicalf16
libc.src.math.issignalingf16
libc.src.math.ldexpf16
+ libc.src.math.lgammaf16
libc.src.math.llogbf16
libc.src.math.llrintf16
libc.src.math.llroundf16
diff --git a/libc/config/gpu/nvptx/entrypoints.txt b/libc/config/gpu/nvptx/entrypoints.txt
index 6538732f785f5..73e2e62c9f3fa 100644
--- a/libc/config/gpu/nvptx/entrypoints.txt
+++ b/libc/config/gpu/nvptx/entrypoints.txt
@@ -599,6 +599,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.iscanonicalf16
libc.src.math.issignalingf16
libc.src.math.ldexpf16
+ libc.src.math.lgammaf16
libc.src.math.llogbf16
libc.src.math.llrintf16
libc.src.math.llroundf16
diff --git a/libc/config/linux/aarch64/entrypoints.txt b/libc/config/linux/aarch64/entrypoints.txt
index c7458ef36c941..64d87df1dc32f 100644
--- a/libc/config/linux/aarch64/entrypoints.txt
+++ b/libc/config/linux/aarch64/entrypoints.txt
@@ -595,6 +595,7 @@ set(TARGET_LIBM_ENTRYPOINTS
libc.src.math.ldexp
libc.src.math.ldexpf
libc.src.math.ldexpl
+ libc.src.math.lgammaf
libc.src.math.llogb
libc.src.math.llogbf
libc.src.math.llogbl
@@ -769,6 +770,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.iscanonicalf16
libc.src.math.issignalingf16
libc.src.math.ldexpf16
+ libc.src.math.lgammaf16
libc.src.math.llogbf16
libc.src.math.llrintf16
libc.src.math.llroundf16
diff --git a/libc/config/linux/riscv/entrypoints.txt b/libc/config/linux/riscv/entrypoints.txt
index e3336420cbbcc..df2441dff5056 100644
--- a/libc/config/linux/riscv/entrypoints.txt
+++ b/libc/config/linux/riscv/entrypoints.txt
@@ -604,6 +604,7 @@ set(TARGET_LIBM_ENTRYPOINTS
libc.src.math.ldexp
libc.src.math.ldexpf
libc.src.math.ldexpl
+ libc.src.math.lgammaf
libc.src.math.llogb
libc.src.math.llogbf
libc.src.math.llogbl
@@ -785,6 +786,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.iscanonicalf16
libc.src.math.issignalingf16
libc.src.math.ldexpf16
+ libc.src.math.lgammaf16
libc.src.math.llogbf16
libc.src.math.llrintf16
libc.src.math.llroundf16
diff --git a/libc/config/linux/x86_64/entrypoints.txt b/libc/config/linux/x86_64/entrypoints.txt
index 3c3b0e835429f..8d49c2674de9c 100644
--- a/libc/config/linux/x86_64/entrypoints.txt
+++ b/libc/config/linux/x86_64/entrypoints.txt
@@ -664,6 +664,7 @@ set(TARGET_LIBM_ENTRYPOINTS
libc.src.math.ldexp
libc.src.math.ldexpf
libc.src.math.ldexpl
+ libc.src.math.lgammaf
libc.src.math.llogb
libc.src.math.llogbf
libc.src.math.llogbl
@@ -849,6 +850,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
libc.src.math.iscanonicalf16
libc.src.math.issignalingf16
libc.src.math.ldexpf16
+ libc.src.math.lgammaf16
libc.src.math.llogbf16
libc.src.math.llrintf16
libc.src.math.llroundf16
diff --git a/libc/include/math.yaml b/libc/include/math.yaml
index 3984a665b234b..67d19399b99ae 100644
--- a/libc/include/math.yaml
+++ b/libc/include/math.yaml
@@ -1682,6 +1682,13 @@ functions:
arguments:
- type: long double
- type: int *
+ - name: lgammaf16
+ standards:
+ - stdc
+ return_type: _Float16
+ arguments:
+ - type: _Float16
+ guard: LIBC_TYPES_HAS_FLOAT16
- name: llogb
standards:
- stdc
diff --git a/libc/shared/math.h b/libc/shared/math.h
index 2baeb07294a3b..cc95ab2b74234 100644
--- a/libc/shared/math.h
+++ b/libc/shared/math.h
@@ -305,6 +305,8 @@
#include "math/ldexpf128.h"
#include "math/ldexpf16.h"
#include "math/ldexpl.h"
+#include "math/lgammaf.h"
+#include "math/lgammaf16.h"
#include "math/llogb.h"
#include "math/llogbbf16.h"
#include "math/llogbf.h"
diff --git a/libc/shared/math/lgammaf.h b/libc/shared/math/lgammaf.h
new file mode 100644
index 0000000000000..6976109fe9832
--- /dev/null
+++ b/libc/shared/math/lgammaf.h
@@ -0,0 +1,23 @@
+//===-- Shared lgammaf function ---------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SHARED_MATH_LGAMMAF_H
+#define LLVM_LIBC_SHARED_MATH_LGAMMAF_H
+
+#include "shared/libc_common.h"
+#include "src/__support/math/lgammaf.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace shared {
+
+using math::lgammaf;
+
+} // namespace shared
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SHARED_MATH_LGAMMAF_H
diff --git a/libc/shared/math/lgammaf16.h b/libc/shared/math/lgammaf16.h
new file mode 100644
index 0000000000000..07de5ef6dd13d
--- /dev/null
+++ b/libc/shared/math/lgammaf16.h
@@ -0,0 +1,29 @@
+//===-- Shared lgammaf16 function -------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SHARED_MATH_LGAMMAF16_H
+#define LLVM_LIBC_SHARED_MATH_LGAMMAF16_H
+
+#include "include/llvm-libc-macros/float16-macros.h"
+
+#ifdef LIBC_TYPES_HAS_FLOAT16
+
+#include "shared/libc_common.h"
+#include "src/__support/math/lgammaf16.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace shared {
+
+using math::lgammaf16;
+
+} // namespace shared
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LIBC_TYPES_HAS_FLOAT16
+
+#endif // LLVM_LIBC_SHARED_MATH_LGAMMAF16_H
diff --git a/libc/src/__support/math/CMakeLists.txt b/libc/src/__support/math/CMakeLists.txt
index 9e3ec26cdc881..b568ef151f85e 100644
--- a/libc/src/__support/math/CMakeLists.txt
+++ b/libc/src/__support/math/CMakeLists.txt
@@ -3713,6 +3713,17 @@ add_header_library(
libc.src.__support.FPUtil.generic.sqrt
)
+add_header_library(
+ gamma_util
+ HDRS
+ gamma_util.h
+ DEPENDS
+ libc.src.__support.CPP.bit
+ libc.src.__support.FPUtil.fp_bits
+ libc.src.__support.FPUtil.multiply_add
+ libc.src.__support.macros.config
+)
+
add_header_library(
getpayload
HDRS
@@ -3881,6 +3892,45 @@ add_header_library(
libc.include.llvm-libc-types.float128
)
+add_header_library(
+ lgammaf
+ HDRS
+ lgammaf.h
+ DEPENDS
+ libc.hdr.errno_macros
+ libc.hdr.fenv_macros
+ libc.src.__support.FPUtil.cast
+ libc.src.__support.FPUtil.double_double
+ libc.src.__support.FPUtil.except_value_utils
+ libc.src.__support.FPUtil.fenv_impl
+ libc.src.__support.FPUtil.fp_bits
+ libc.src.__support.FPUtil.multiply_add
+ libc.src.__support.FPUtil.nearest_integer_operations
+ libc.src.__support.FPUtil.polyeval
+ libc.src.__support.macros.config
+ libc.src.__support.macros.optimization
+ libc.src.__support.math.gamma_util
+ libc.src.__support.math.log
+)
+
+add_header_library(
+ lgammaf16
+ HDRS
+ lgammaf16.h
+ DEPENDS
+ libc.include.llvm-libc-macros.float16_macros
+ libc.src.__support.FPUtil.cast
+ libc.src.__support.FPUtil.fenv_impl
+ libc.src.__support.FPUtil.fp_bits
+ libc.src.__support.FPUtil.multiply_add
+ libc.src.__support.FPUtil.nearest_integer_operations
+ libc.src.__support.FPUtil.polyeval
+ libc.src.__support.macros.config
+ libc.src.__support.macros.optimization
+ libc.src.__support.macros.properties.types
+ libc.src.__support.math.gamma_util
+)
+
add_header_library(
llogbf16
HDRS
diff --git a/libc/src/__support/math/gamma_util.h b/libc/src/__support/math/gamma_util.h
new file mode 100644
index 0000000000000..09024a521ff77
--- /dev/null
+++ b/libc/src/__support/math/gamma_util.h
@@ -0,0 +1,116 @@
+//===-- Shared helpers for gamma family math functions ----------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_MATH_GAMMA_UTIL_H
+#define LLVM_LIBC_SRC___SUPPORT_MATH_GAMMA_UTIL_H
+
+#include "src/__support/CPP/bit.h"
+#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/FPUtil/multiply_add.h"
+#include "src/__support/macros/attributes.h"
+#include "src/__support/macros/config.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+namespace math {
+
+namespace gamma_internal {
+
+template <typename T> LIBC_INLINE constexpr bool is_integer(T x) {
+ using FPBits = fputil::FPBits<T>;
+ using StorageType = typename FPBits::StorageType;
+ FPBits xbits(x);
+ StorageType x_u = xbits.uintval();
+ unsigned x_e = static_cast<unsigned>(xbits.get_biased_exponent());
+ unsigned lsb = static_cast<unsigned>(
+ cpp::countr_zero(static_cast<StorageType>(x_u | FPBits::EXP_MASK)));
+ constexpr unsigned UNIT_EXPONENT =
+ static_cast<unsigned>(FPBits::EXP_BIAS + FPBits::FRACTION_LEN);
+ return x_e + lsb >= UNIT_EXPONENT;
+}
+
+// sin(pi*x) for x in (0, 1), written as (0.25 - u^2) * P(u^2) where u = x-0.5.
+// Coefficients for P are a degree-7 polynomial in u^2
+// approximating sin(pi*(u+0.5)) / (pi*(0.25-u^2))
+// with max error ~2^{-55}, generated by Sollya with:
+// > P = fpminimax(sin(pi*(x+0.5))/(pi*(0.25-x^2)),
+// [|0,2,4,6,8,10,12,14|], [|D...|], [-0.5, 0.5]);
+LIBC_INLINE constexpr double lg_sinpi(double x) {
+ constexpr double COEFFS[8] = {0x000000000000001p+2, -0x1.de9e64df22ea4p+1,
+ 0x1.472be122401f8p+0, -0x1.d4fcd82df91bp-3,
+ 0x1.9f05c97e0aab2p-6, -0x1.f3091c427b611p-10,
+ 0x1.b22c9bfdca547p-14, -0x1.15484325ef569p-18};
+ double u = x - 0.5;
+ double u2 = u * u, u4 = u2 * u2, u8 = u4 * u4;
+ double p01 = fputil::multiply_add(u2, COEFFS[1], COEFFS[0]);
+ double p23 = fputil::multiply_add(u2, COEFFS[3], COEFFS[2]);
+ double p45 = fputil::multiply_add(u2, COEFFS[5], COEFFS[4]);
+ double p67 = fputil::multiply_add(u2, COEFFS[7], COEFFS[6]);
+ double p03 = fputil::multiply_add(u4, p23, p01);
+ double p47 = fputil::multiply_add(u4, p67, p45);
+ // Compute (0.25 - u^2) = (0.5 - u) * (0.5 + u) to avoid ~10-digit
+ // catastrophic cancellation when |u| ~ 0.5 (i.e. x near 0 or 1).
+ return (0.5 - u) * (0.5 + u) * fputil::multiply_add(u8, p47, p03);
+}
+
+// Natural logarithm of x (x > 0), using 16-entry table + degree-7 polynomial.
+// Range reduction: x = 2^e * m where m in [1, 2), decomposed as
+// m = (1+i/16)*(1+z) so log(x) = e*log(2) + log(1+i/16) + log(1+z)
+// = e*log(2) + IL[i] + z*P(z).
+// P approximates log(1+z)/z on z in [-1/16, 1/16] with max
+// error ~2^{-54}, generated by Sollya with:
+// > P = fpminimax(log(1+x)/x, [|0,1,2,3,4,5,6,7|], [|D...|], [-1/16, 1/16]);
+LIBC_INLINE double lg_ln(double x) {
+ using FPBits = fputil::FPBits<double>;
+ uint64_t u = FPBits(x).uintval();
+ int e = static_cast<int>(FPBits(x).get_biased_exponent()) - 0x3ff;
+
+ // Coefficients for log(1 + z)/z on z in [-1/16, 1/16]
+ constexpr double COEFFS[8] = {0x1.fffffffffff24p-1, -0x1.ffffffffd1d67p-2,
+ 0x1.55555537802dep-2, -0x1.ffffeca81b866p-3,
+ 0x1.999611761d772p-3, -0x1.54f3e581b61bfp-3,
+ 0x1.1e642b4cb5143p-3, -0x1.9115a5af1e1edp-4};
+ // IL[i] = log(1 + i/16) for i = 0..15
+ constexpr double IL[16] = {
+ 0x1.59caeec280116p-57, 0x1.f0a30c01162aap-5, 0x1.e27076e2af2ebp-4,
+ 0x1.5ff3070a793d6p-3, 0x1.c8ff7c79a9a2p-3, 0x1.1675cababa60fp-2,
+ 0x1.4618bc21c5ec2p-2, 0x1.739d7f6bbd007p-2, 0x1.9f323ecbf984dp-2,
+ 0x1.c8ff7c79a9a21p-2, 0x1.f128f5faf06ecp-2, 0x1.0be72e4252a83p-1,
+ 0x1.1e85f5e7040d1p-1, 0x1.307d7334f10bep-1, 0x1.41d8fe84672afp-1,
+ 0x1.52a2d265bc5abp-1};
+ // IX[i] = 1 / (1 + i/16) for i = 0..15
+ constexpr double IX[16] = {
+ 0x000000000000001p+0, 0x1.e1e1e1e1e1e1ep-1, 0x1.c71c71c71c71cp-1,
+ 0x1.af286bca1af28p-1, 0x1.999999999999ap-1, 0x1.8618618618618p-1,
+ 0x1.745d1745d1746p-1, 0x1.642c8590b2164p-1, 0x1.5555555555555p-1,
+ 0x1.47ae147ae147bp-1, 0x1.3b13b13b13b14p-1, 0x1.2f684bda12f68p-1,
+ 0x1.2492492492492p-1, 0x1.1a7b9611a7b96p-1, 0x1.1111111111111p-1,
+ 0x1.0842108421084p-1};
+
+ int i = static_cast<int>((u >> 48) & 0xf);
+ // Reduce to mantissa in [1, 2)
+ uint64_t mant_u = (u & (~uint64_t(0) >> 12)) | (uint64_t(0x3ff) << 52);
+ double mant = FPBits(mant_u).get_val();
+ double z = IX[i] * mant - 1.0, z2 = z * z, z4 = z2 * z2;
+ double q01 = fputil::multiply_add(z, COEFFS[1], COEFFS[0]);
+ double q23 = fputil::multiply_add(z, COEFFS[3], COEFFS[2]);
+ double q45 = fputil::multiply_add(z, COEFFS[5], COEFFS[4]);
+ double q67 = fputil::multiply_add(z, COEFFS[7], COEFFS[6]);
+ double q03 = fputil::multiply_add(z2, q23, q01);
+ double q47 = fputil::multiply_add(z2, q67, q45);
+ return e * 0x1.62e42fefa39efp-1 + IL[i] +
+ z * fputil::multiply_add(z4, q47, q03);
+}
+
+} // namespace gamma_internal
+
+} // namespace math
+
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SRC___SUPPORT_MATH_GAMMA_UTIL_H
diff --git a/libc/src/__support/math/lgammaf.h b/libc/src/__support/math/lgammaf.h
new file mode 100644
index 0000000000000..9e4f4825d15c0
--- /dev/null
+++ b/libc/src/__support/math/lgammaf.h
@@ -0,0 +1,374 @@
+//===-- Implementation header for lgammaf -----------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_MATH_LGAMMAF_H
+#define LLVM_LIBC_SRC___SUPPORT_MATH_LGAMMAF_H
+
+#include "src/__support/FPUtil/FEnvImpl.h"
+#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/FPUtil/NearestIntegerOperations.h"
+#include "src/__support/FPUtil/PolyEval.h"
+#include "src/__support/FPUtil/cast.h"
+#include "src/__support/FPUtil/double_double.h"
+#include "src/__support/FPUtil/except_value_utils.h"
+#include "src/__support/FPUtil/multiply_add.h"
+#include "src/__support/macros/config.h"
+#include "src/__support/macros/optimization.h"
+#include "src/__support/math/gamma_util.h"
+#include "src/__support/math/log.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+namespace math {
+
+LIBC_INLINE float lgammaf(float x) {
+ using namespace gamma_internal;
+ using FPBits = fputil::FPBits<float>;
+
+ FPBits xbits(x);
+ uint32_t x_abs = xbits.abs().uintval();
+
+ // NaN / Inf
+ if (LIBC_UNLIKELY(x_abs >= 0x7f800000u)) {
+ if (x_abs == 0x7f800000u)
+ return FPBits::inf().get_val();
+ if (xbits.is_signaling_nan()) {
+ fputil::raise_except_if_required(FE_INVALID);
+ return FPBits::quiet_nan().get_val();
+ }
+ return x;
+ }
+
+ // +/- 0 -> +Inf pole
+ if (LIBC_UNLIKELY(x_abs == 0)) {
+ fputil::raise_except_if_required(FE_DIVBYZERO);
+ fputil::set_errno_if_required(ERANGE);
+ return FPBits::inf().get_val();
+ }
+
+ // Negative integers and lgamma(1) = lgamma(2) = 0.
+ if (LIBC_UNLIKELY(is_integer(x))) {
+ if (xbits.is_neg()) {
+ fputil::raise_except_if_required(FE_DIVBYZERO);
+ fputil::set_errno_if_required(ERANGE);
+ return FPBits::inf().get_val();
+ }
+ if (x_abs == 0x3f800000u || x_abs == 0x40000000u)
+ return FPBits::zero().get_val();
+ }
+
+ constexpr fputil::ExceptValues<float, 11> LGAMMAF_EXCEPTS{{
+ // input, toward-zero result, RU, RD, RN
+ {0x3b7c53aau, 0x40b1d661u, 1, 0, 0},
+ {0x42468b59u, 0x430f25a7u, 1, 0, 0},
+ {0x50522f52u, 0x5292ee5du, 1, 0, 1},
+ {0x9b7679ffu, 0x4247c72cu, 1, 0, 1},
+ {0x9e88452du, 0x4236bd8bu, 1, 0, 1},
+ {0xa77a8e47u, 0x42052b94u, 1, 0, 1},
+ {0xb0e17820u, 0x41a1d37bu, 1, 0, 0},
+ {0xc02c060fu, 0xbdabc3e2u, 0, 1, 1},
+ {0xc134eb14u, 0xc1875615u, 0, 1, 0},
+ {0xc33139a3u, 0xc43991afu, 0, 1, 0},
+ {0xc6f7e151u, 0xc89116deu, 0, 1, 0},
+ }};
+ if (auto r = LGAMMAF_EXCEPTS.lookup(xbits.uintval());
+ LIBC_UNLIKELY(r.has_value()))
+ return r.value();
+
+ double xd = fputil::cast<double>(x);
+ double abs_xd = xd < 0.0 ? -xd : xd;
+ double lgamma_val;
+
+ // For very tiny |x| (< 2^-23), use the truncated Laurent series:
+ // lgamma(x) = -log(x) - gamma*x + O(x^2) for tiny x > 0
+ // lgamma(-y) = -log(y) + gamma*y + O(y^2) for tiny y > 0
+ // Even though gamma*|x| < 2^-25 is below 1 float ULP of |result|, it tips
+ // rounding at boundary cases. The x^2 term is at most gamma*2^-46 << 2^-50.
+ if (x_abs < 0x34000000u) { // |x| < 2^-23
+ constexpr double EULER_GAMMA = 0x1.2788cfc6fb619p-1;
+ double sign_corr = xbits.is_neg() ? EULER_GAMMA : -EULER_GAMMA;
+ return fputil::cast<float>(
+ fputil::multiply_add(sign_corr, abs_xd, -math::log(abs_xd)));
+ }
+
+ if (x_abs < 0x3f290000u) {
+ // Small: t = |x| < 0.66015625. Degree-22 Chebyshev (23 coeffs), Clenshaw.
+ // P(u) approximates h(t) = (lgamma(t) + log(t)) / t (smooth on [0, 0.66]).
+ constexpr double MID_S = 0x1.5200000000000p-2;
+ constexpr double HW_S = 0x1.5200000000000p-2;
+ constexpr double CHEB_S[23] = {
+ -0x1.6ab12743e97f5p-2, 0x1.ac5810962c7c9p-3,...
[truncated]
``````````
</details>
https://github.com/llvm/llvm-project/pull/205490
More information about the libc-commits
mailing list