[libc-commits] [libc] [llvm] [libc][math] Implement single precision lgamma function (PR #205490)

via libc-commits libc-commits at lists.llvm.org
Tue Jun 23 23:23:00 PDT 2026


llvmorg-github-actions[bot] wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-libc

Author: Anonmiraj (AnonMiraj)

<details>
<summary>Changes</summary>


The implementation reduces to `|x|` and selects a path

- **Tiny** ($|x| < 2^{-23}$): truncated Laurent series $\text{lgamma}(x) = -\log|x| - \gamma\,|x|$.
- **Small** ($|x| < 0.66$): degree-22 Chebyshev for $h(t) = \frac{\text{lgamma}(t) + \log(t)}{t}$, recovering $\text{lgamma}(t) = t \cdot h(t) - \log(t)$.
- **Medium** ($|x| \in [0.66, 3.37]$): degree-22 Chebyshev split into three sub-ranges that factor out the function's roots —
  - $[0.66, 1)$: $\text{lgamma}(t) = (t-1)\,P(t)$
  - $[1, 2)$: $\text{lgamma}(t) = (t-1)(t-2)\,P(t)$
  - $[2, 3.37)$: $\text{lgamma}(t) = (t-2)\,P(t)$
- **Stirling** ($|x| > 3.37$): $(x-0.5)\log(x) - x + \frac{\log(2\pi)}{2}$ plus 1/2/4/8-term Bernoulli corrections selected by sub-range (the dominant term is evaluated in double-double for large $|x|$).

Negative $x$ uses the reflection formula $\text{lgamma}(x) = \log\pi - \log|\sin(\pi x)| - \text{lgamma}(|x|)$. 

Exhaustive tests pass in all 4 rounding modes.

```
-- CORE-MATH reciprocal throughput --
Ntrial = 20 ; Min = 54.476 + 7.003 clc/call; Median-Min = 1.211 clc/call; Max = 67.906 clc/call;
-- System LIBC reciprocal throughput --
Ntrial = 20 ; Min = 70.125 + 8.204 clc/call; Median-Min = 4.416 clc/call; Max = 90.469 clc/call;
```

Depends on #<!-- -->192834 so lgammaf16 must be merged first.


---

Patch is 67.45 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/205490.diff


38 Files Affected:

- (modified) libc/config/baremetal/aarch64/entrypoints.txt (+2) 
- (modified) libc/config/baremetal/arm/entrypoints.txt (+2) 
- (modified) libc/config/baremetal/riscv/entrypoints.txt (+2) 
- (modified) libc/config/darwin/aarch64/entrypoints.txt (+1) 
- (modified) libc/config/gpu/amdgpu/entrypoints.txt (+1) 
- (modified) libc/config/gpu/nvptx/entrypoints.txt (+1) 
- (modified) libc/config/linux/aarch64/entrypoints.txt (+2) 
- (modified) libc/config/linux/riscv/entrypoints.txt (+2) 
- (modified) libc/config/linux/x86_64/entrypoints.txt (+2) 
- (modified) libc/include/math.yaml (+7) 
- (modified) libc/shared/math.h (+2) 
- (added) libc/shared/math/lgammaf.h (+23) 
- (added) libc/shared/math/lgammaf16.h (+29) 
- (modified) libc/src/__support/math/CMakeLists.txt (+50) 
- (added) libc/src/__support/math/gamma_util.h (+116) 
- (added) libc/src/__support/math/lgammaf.h (+374) 
- (added) libc/src/__support/math/lgammaf16.h (+220) 
- (modified) libc/src/math/CMakeLists.txt (+3) 
- (modified) libc/src/math/generic/CMakeLists.txt (+26) 
- (added) libc/src/math/generic/lgammaf.cpp (+18) 
- (added) libc/src/math/generic/lgammaf16.cpp (+20) 
- (added) libc/src/math/lgammaf.h (+20) 
- (added) libc/src/math/lgammaf16.h (+21) 
- (modified) libc/test/shared/CMakeLists.txt (+1) 
- (modified) libc/test/shared/shared_math_test.cpp (+1) 
- (modified) libc/test/src/math/CMakeLists.txt (+23) 
- (modified) libc/test/src/math/exhaustive/CMakeLists.txt (+16) 
- (added) libc/test/src/math/exhaustive/lgammaf_test.cpp (+35) 
- (added) libc/test/src/math/lgammaf16_test.cpp (+42) 
- (added) libc/test/src/math/lgammaf_test.cpp (+100) 
- (modified) libc/test/src/math/smoke/CMakeLists.txt (+22) 
- (added) libc/test/src/math/smoke/lgammaf16_test.cpp (+61) 
- (added) libc/test/src/math/smoke/lgammaf_test.cpp (+62) 
- (modified) libc/utils/MPFRWrapper/MPCommon.cpp (+7) 
- (modified) libc/utils/MPFRWrapper/MPCommon.h (+1) 
- (modified) libc/utils/MPFRWrapper/MPFRUtils.cpp (+2) 
- (modified) libc/utils/MPFRWrapper/MPFRUtils.h (+1) 
- (modified) utils/bazel/llvm-project-overlay/libc/BUILD.bazel (+58) 


``````````diff
diff --git a/libc/config/baremetal/aarch64/entrypoints.txt b/libc/config/baremetal/aarch64/entrypoints.txt
index dcb50135232e2..3f05a2f0ed004 100644
--- a/libc/config/baremetal/aarch64/entrypoints.txt
+++ b/libc/config/baremetal/aarch64/entrypoints.txt
@@ -478,6 +478,7 @@ set(TARGET_LIBM_ENTRYPOINTS
     libc.src.math.ldexp
     libc.src.math.ldexpf
     libc.src.math.ldexpl
+    libc.src.math.lgammaf
     libc.src.math.llogb
     libc.src.math.llogbf
     libc.src.math.llogbl
@@ -655,6 +656,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.iscanonicalf16
     libc.src.math.issignalingf16
     libc.src.math.ldexpf16
+    libc.src.math.lgammaf16
     libc.src.math.llogbf16
     libc.src.math.llrintf16
     libc.src.math.llroundf16
diff --git a/libc/config/baremetal/arm/entrypoints.txt b/libc/config/baremetal/arm/entrypoints.txt
index fac62bac939cc..bc70b5797af84 100644
--- a/libc/config/baremetal/arm/entrypoints.txt
+++ b/libc/config/baremetal/arm/entrypoints.txt
@@ -490,6 +490,7 @@ set(TARGET_LIBM_ENTRYPOINTS
     libc.src.math.ldexp
     libc.src.math.ldexpf
     libc.src.math.ldexpl
+    libc.src.math.lgammaf
     libc.src.math.llogb
     libc.src.math.llogbf
     libc.src.math.llogbl
@@ -667,6 +668,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.iscanonicalf16
     libc.src.math.issignalingf16
     libc.src.math.ldexpf16
+    libc.src.math.lgammaf16
     libc.src.math.llogbf16
     libc.src.math.llrintf16
     libc.src.math.llroundf16
diff --git a/libc/config/baremetal/riscv/entrypoints.txt b/libc/config/baremetal/riscv/entrypoints.txt
index a3b96225ff09d..33fc0591bc84c 100644
--- a/libc/config/baremetal/riscv/entrypoints.txt
+++ b/libc/config/baremetal/riscv/entrypoints.txt
@@ -486,6 +486,7 @@ set(TARGET_LIBM_ENTRYPOINTS
     libc.src.math.ldexp
     libc.src.math.ldexpf
     libc.src.math.ldexpl
+    libc.src.math.lgammaf
     libc.src.math.llogb
     libc.src.math.llogbf
     libc.src.math.llogbl
@@ -663,6 +664,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.iscanonicalf16
     libc.src.math.issignalingf16
     libc.src.math.ldexpf16
+    libc.src.math.lgammaf16
     libc.src.math.llogbf16
     libc.src.math.llrintf16
     libc.src.math.llroundf16
diff --git a/libc/config/darwin/aarch64/entrypoints.txt b/libc/config/darwin/aarch64/entrypoints.txt
index 914d2b7918da1..cdd04d8bbb74f 100644
--- a/libc/config/darwin/aarch64/entrypoints.txt
+++ b/libc/config/darwin/aarch64/entrypoints.txt
@@ -476,6 +476,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.iscanonicalf16
     libc.src.math.issignalingf16
     libc.src.math.ldexpf16
+    libc.src.math.lgammaf16
     libc.src.math.llogbf16
     libc.src.math.llrintf16
     libc.src.math.llroundf16
diff --git a/libc/config/gpu/amdgpu/entrypoints.txt b/libc/config/gpu/amdgpu/entrypoints.txt
index c76900a8371f7..c1b29ff893891 100644
--- a/libc/config/gpu/amdgpu/entrypoints.txt
+++ b/libc/config/gpu/amdgpu/entrypoints.txt
@@ -597,6 +597,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.iscanonicalf16
     libc.src.math.issignalingf16
     libc.src.math.ldexpf16
+    libc.src.math.lgammaf16
     libc.src.math.llogbf16
     libc.src.math.llrintf16
     libc.src.math.llroundf16
diff --git a/libc/config/gpu/nvptx/entrypoints.txt b/libc/config/gpu/nvptx/entrypoints.txt
index 6538732f785f5..73e2e62c9f3fa 100644
--- a/libc/config/gpu/nvptx/entrypoints.txt
+++ b/libc/config/gpu/nvptx/entrypoints.txt
@@ -599,6 +599,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.iscanonicalf16
     libc.src.math.issignalingf16
     libc.src.math.ldexpf16
+    libc.src.math.lgammaf16
     libc.src.math.llogbf16
     libc.src.math.llrintf16
     libc.src.math.llroundf16
diff --git a/libc/config/linux/aarch64/entrypoints.txt b/libc/config/linux/aarch64/entrypoints.txt
index c7458ef36c941..64d87df1dc32f 100644
--- a/libc/config/linux/aarch64/entrypoints.txt
+++ b/libc/config/linux/aarch64/entrypoints.txt
@@ -595,6 +595,7 @@ set(TARGET_LIBM_ENTRYPOINTS
     libc.src.math.ldexp
     libc.src.math.ldexpf
     libc.src.math.ldexpl
+    libc.src.math.lgammaf
     libc.src.math.llogb
     libc.src.math.llogbf
     libc.src.math.llogbl
@@ -769,6 +770,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.iscanonicalf16
     libc.src.math.issignalingf16
     libc.src.math.ldexpf16
+    libc.src.math.lgammaf16
     libc.src.math.llogbf16
     libc.src.math.llrintf16
     libc.src.math.llroundf16
diff --git a/libc/config/linux/riscv/entrypoints.txt b/libc/config/linux/riscv/entrypoints.txt
index e3336420cbbcc..df2441dff5056 100644
--- a/libc/config/linux/riscv/entrypoints.txt
+++ b/libc/config/linux/riscv/entrypoints.txt
@@ -604,6 +604,7 @@ set(TARGET_LIBM_ENTRYPOINTS
     libc.src.math.ldexp
     libc.src.math.ldexpf
     libc.src.math.ldexpl
+    libc.src.math.lgammaf
     libc.src.math.llogb
     libc.src.math.llogbf
     libc.src.math.llogbl
@@ -785,6 +786,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.iscanonicalf16
     libc.src.math.issignalingf16
     libc.src.math.ldexpf16
+    libc.src.math.lgammaf16
     libc.src.math.llogbf16
     libc.src.math.llrintf16
     libc.src.math.llroundf16
diff --git a/libc/config/linux/x86_64/entrypoints.txt b/libc/config/linux/x86_64/entrypoints.txt
index 3c3b0e835429f..8d49c2674de9c 100644
--- a/libc/config/linux/x86_64/entrypoints.txt
+++ b/libc/config/linux/x86_64/entrypoints.txt
@@ -664,6 +664,7 @@ set(TARGET_LIBM_ENTRYPOINTS
     libc.src.math.ldexp
     libc.src.math.ldexpf
     libc.src.math.ldexpl
+    libc.src.math.lgammaf
     libc.src.math.llogb
     libc.src.math.llogbf
     libc.src.math.llogbl
@@ -849,6 +850,7 @@ if(LIBC_TYPES_HAS_FLOAT16)
     libc.src.math.iscanonicalf16
     libc.src.math.issignalingf16
     libc.src.math.ldexpf16
+    libc.src.math.lgammaf16
     libc.src.math.llogbf16
     libc.src.math.llrintf16
     libc.src.math.llroundf16
diff --git a/libc/include/math.yaml b/libc/include/math.yaml
index 3984a665b234b..67d19399b99ae 100644
--- a/libc/include/math.yaml
+++ b/libc/include/math.yaml
@@ -1682,6 +1682,13 @@ functions:
     arguments:
       - type: long double
       - type: int *
+  - name: lgammaf16
+    standards:
+      - stdc
+    return_type: _Float16
+    arguments:
+      - type: _Float16
+    guard: LIBC_TYPES_HAS_FLOAT16
   - name: llogb
     standards:
       - stdc
diff --git a/libc/shared/math.h b/libc/shared/math.h
index 2baeb07294a3b..cc95ab2b74234 100644
--- a/libc/shared/math.h
+++ b/libc/shared/math.h
@@ -305,6 +305,8 @@
 #include "math/ldexpf128.h"
 #include "math/ldexpf16.h"
 #include "math/ldexpl.h"
+#include "math/lgammaf.h"
+#include "math/lgammaf16.h"
 #include "math/llogb.h"
 #include "math/llogbbf16.h"
 #include "math/llogbf.h"
diff --git a/libc/shared/math/lgammaf.h b/libc/shared/math/lgammaf.h
new file mode 100644
index 0000000000000..6976109fe9832
--- /dev/null
+++ b/libc/shared/math/lgammaf.h
@@ -0,0 +1,23 @@
+//===-- Shared lgammaf function ---------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SHARED_MATH_LGAMMAF_H
+#define LLVM_LIBC_SHARED_MATH_LGAMMAF_H
+
+#include "shared/libc_common.h"
+#include "src/__support/math/lgammaf.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace shared {
+
+using math::lgammaf;
+
+} // namespace shared
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SHARED_MATH_LGAMMAF_H
diff --git a/libc/shared/math/lgammaf16.h b/libc/shared/math/lgammaf16.h
new file mode 100644
index 0000000000000..07de5ef6dd13d
--- /dev/null
+++ b/libc/shared/math/lgammaf16.h
@@ -0,0 +1,29 @@
+//===-- Shared lgammaf16 function -------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SHARED_MATH_LGAMMAF16_H
+#define LLVM_LIBC_SHARED_MATH_LGAMMAF16_H
+
+#include "include/llvm-libc-macros/float16-macros.h"
+
+#ifdef LIBC_TYPES_HAS_FLOAT16
+
+#include "shared/libc_common.h"
+#include "src/__support/math/lgammaf16.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace shared {
+
+using math::lgammaf16;
+
+} // namespace shared
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LIBC_TYPES_HAS_FLOAT16
+
+#endif // LLVM_LIBC_SHARED_MATH_LGAMMAF16_H
diff --git a/libc/src/__support/math/CMakeLists.txt b/libc/src/__support/math/CMakeLists.txt
index 9e3ec26cdc881..b568ef151f85e 100644
--- a/libc/src/__support/math/CMakeLists.txt
+++ b/libc/src/__support/math/CMakeLists.txt
@@ -3713,6 +3713,17 @@ add_header_library(
     libc.src.__support.FPUtil.generic.sqrt
 )
 
+add_header_library(
+  gamma_util
+  HDRS
+    gamma_util.h
+  DEPENDS
+    libc.src.__support.CPP.bit
+    libc.src.__support.FPUtil.fp_bits
+    libc.src.__support.FPUtil.multiply_add
+    libc.src.__support.macros.config
+)
+
 add_header_library(
   getpayload
   HDRS
@@ -3881,6 +3892,45 @@ add_header_library(
     libc.include.llvm-libc-types.float128
 )
 
+add_header_library(
+  lgammaf
+  HDRS
+    lgammaf.h
+  DEPENDS
+    libc.hdr.errno_macros
+    libc.hdr.fenv_macros
+    libc.src.__support.FPUtil.cast
+    libc.src.__support.FPUtil.double_double
+    libc.src.__support.FPUtil.except_value_utils
+    libc.src.__support.FPUtil.fenv_impl
+    libc.src.__support.FPUtil.fp_bits
+    libc.src.__support.FPUtil.multiply_add
+    libc.src.__support.FPUtil.nearest_integer_operations
+    libc.src.__support.FPUtil.polyeval
+    libc.src.__support.macros.config
+    libc.src.__support.macros.optimization
+    libc.src.__support.math.gamma_util
+    libc.src.__support.math.log
+)
+
+add_header_library(
+  lgammaf16
+  HDRS
+    lgammaf16.h
+  DEPENDS
+    libc.include.llvm-libc-macros.float16_macros
+    libc.src.__support.FPUtil.cast
+    libc.src.__support.FPUtil.fenv_impl
+    libc.src.__support.FPUtil.fp_bits
+    libc.src.__support.FPUtil.multiply_add
+    libc.src.__support.FPUtil.nearest_integer_operations
+    libc.src.__support.FPUtil.polyeval
+    libc.src.__support.macros.config
+    libc.src.__support.macros.optimization
+    libc.src.__support.macros.properties.types
+    libc.src.__support.math.gamma_util
+)
+
 add_header_library(
   llogbf16
   HDRS
diff --git a/libc/src/__support/math/gamma_util.h b/libc/src/__support/math/gamma_util.h
new file mode 100644
index 0000000000000..09024a521ff77
--- /dev/null
+++ b/libc/src/__support/math/gamma_util.h
@@ -0,0 +1,116 @@
+//===-- Shared helpers for gamma family math functions ----------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_MATH_GAMMA_UTIL_H
+#define LLVM_LIBC_SRC___SUPPORT_MATH_GAMMA_UTIL_H
+
+#include "src/__support/CPP/bit.h"
+#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/FPUtil/multiply_add.h"
+#include "src/__support/macros/attributes.h"
+#include "src/__support/macros/config.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+namespace math {
+
+namespace gamma_internal {
+
+template <typename T> LIBC_INLINE constexpr bool is_integer(T x) {
+  using FPBits = fputil::FPBits<T>;
+  using StorageType = typename FPBits::StorageType;
+  FPBits xbits(x);
+  StorageType x_u = xbits.uintval();
+  unsigned x_e = static_cast<unsigned>(xbits.get_biased_exponent());
+  unsigned lsb = static_cast<unsigned>(
+      cpp::countr_zero(static_cast<StorageType>(x_u | FPBits::EXP_MASK)));
+  constexpr unsigned UNIT_EXPONENT =
+      static_cast<unsigned>(FPBits::EXP_BIAS + FPBits::FRACTION_LEN);
+  return x_e + lsb >= UNIT_EXPONENT;
+}
+
+// sin(pi*x) for x in (0, 1), written as (0.25 - u^2) * P(u^2) where u = x-0.5.
+// Coefficients for P are a degree-7 polynomial in u^2
+// approximating sin(pi*(u+0.5)) / (pi*(0.25-u^2))
+// with max error ~2^{-55}, generated by Sollya with:
+//   > P = fpminimax(sin(pi*(x+0.5))/(pi*(0.25-x^2)),
+//                   [|0,2,4,6,8,10,12,14|], [|D...|], [-0.5, 0.5]);
+LIBC_INLINE constexpr double lg_sinpi(double x) {
+  constexpr double COEFFS[8] = {0x000000000000001p+2,  -0x1.de9e64df22ea4p+1,
+                                0x1.472be122401f8p+0,  -0x1.d4fcd82df91bp-3,
+                                0x1.9f05c97e0aab2p-6,  -0x1.f3091c427b611p-10,
+                                0x1.b22c9bfdca547p-14, -0x1.15484325ef569p-18};
+  double u = x - 0.5;
+  double u2 = u * u, u4 = u2 * u2, u8 = u4 * u4;
+  double p01 = fputil::multiply_add(u2, COEFFS[1], COEFFS[0]);
+  double p23 = fputil::multiply_add(u2, COEFFS[3], COEFFS[2]);
+  double p45 = fputil::multiply_add(u2, COEFFS[5], COEFFS[4]);
+  double p67 = fputil::multiply_add(u2, COEFFS[7], COEFFS[6]);
+  double p03 = fputil::multiply_add(u4, p23, p01);
+  double p47 = fputil::multiply_add(u4, p67, p45);
+  // Compute (0.25 - u^2) = (0.5 - u) * (0.5 + u) to avoid ~10-digit
+  // catastrophic cancellation when |u| ~ 0.5 (i.e. x near 0 or 1).
+  return (0.5 - u) * (0.5 + u) * fputil::multiply_add(u8, p47, p03);
+}
+
+// Natural logarithm of x (x > 0), using 16-entry table + degree-7 polynomial.
+// Range reduction: x = 2^e * m where m in [1, 2), decomposed as
+// m = (1+i/16)*(1+z) so log(x) = e*log(2) + log(1+i/16) + log(1+z)
+// = e*log(2) + IL[i] + z*P(z).
+// P approximates log(1+z)/z on z in [-1/16, 1/16] with max
+// error ~2^{-54}, generated by Sollya with:
+//   > P = fpminimax(log(1+x)/x, [|0,1,2,3,4,5,6,7|], [|D...|], [-1/16, 1/16]);
+LIBC_INLINE double lg_ln(double x) {
+  using FPBits = fputil::FPBits<double>;
+  uint64_t u = FPBits(x).uintval();
+  int e = static_cast<int>(FPBits(x).get_biased_exponent()) - 0x3ff;
+
+  // Coefficients for log(1 + z)/z on z in [-1/16, 1/16]
+  constexpr double COEFFS[8] = {0x1.fffffffffff24p-1, -0x1.ffffffffd1d67p-2,
+                                0x1.55555537802dep-2, -0x1.ffffeca81b866p-3,
+                                0x1.999611761d772p-3, -0x1.54f3e581b61bfp-3,
+                                0x1.1e642b4cb5143p-3, -0x1.9115a5af1e1edp-4};
+  // IL[i] = log(1 + i/16) for i = 0..15
+  constexpr double IL[16] = {
+      0x1.59caeec280116p-57, 0x1.f0a30c01162aap-5, 0x1.e27076e2af2ebp-4,
+      0x1.5ff3070a793d6p-3,  0x1.c8ff7c79a9a2p-3,  0x1.1675cababa60fp-2,
+      0x1.4618bc21c5ec2p-2,  0x1.739d7f6bbd007p-2, 0x1.9f323ecbf984dp-2,
+      0x1.c8ff7c79a9a21p-2,  0x1.f128f5faf06ecp-2, 0x1.0be72e4252a83p-1,
+      0x1.1e85f5e7040d1p-1,  0x1.307d7334f10bep-1, 0x1.41d8fe84672afp-1,
+      0x1.52a2d265bc5abp-1};
+  // IX[i] = 1 / (1 + i/16) for i = 0..15
+  constexpr double IX[16] = {
+      0x000000000000001p+0, 0x1.e1e1e1e1e1e1ep-1, 0x1.c71c71c71c71cp-1,
+      0x1.af286bca1af28p-1, 0x1.999999999999ap-1, 0x1.8618618618618p-1,
+      0x1.745d1745d1746p-1, 0x1.642c8590b2164p-1, 0x1.5555555555555p-1,
+      0x1.47ae147ae147bp-1, 0x1.3b13b13b13b14p-1, 0x1.2f684bda12f68p-1,
+      0x1.2492492492492p-1, 0x1.1a7b9611a7b96p-1, 0x1.1111111111111p-1,
+      0x1.0842108421084p-1};
+
+  int i = static_cast<int>((u >> 48) & 0xf);
+  // Reduce to mantissa in [1, 2)
+  uint64_t mant_u = (u & (~uint64_t(0) >> 12)) | (uint64_t(0x3ff) << 52);
+  double mant = FPBits(mant_u).get_val();
+  double z = IX[i] * mant - 1.0, z2 = z * z, z4 = z2 * z2;
+  double q01 = fputil::multiply_add(z, COEFFS[1], COEFFS[0]);
+  double q23 = fputil::multiply_add(z, COEFFS[3], COEFFS[2]);
+  double q45 = fputil::multiply_add(z, COEFFS[5], COEFFS[4]);
+  double q67 = fputil::multiply_add(z, COEFFS[7], COEFFS[6]);
+  double q03 = fputil::multiply_add(z2, q23, q01);
+  double q47 = fputil::multiply_add(z2, q67, q45);
+  return e * 0x1.62e42fefa39efp-1 + IL[i] +
+         z * fputil::multiply_add(z4, q47, q03);
+}
+
+} // namespace gamma_internal
+
+} // namespace math
+
+} // namespace LIBC_NAMESPACE_DECL
+
+#endif // LLVM_LIBC_SRC___SUPPORT_MATH_GAMMA_UTIL_H
diff --git a/libc/src/__support/math/lgammaf.h b/libc/src/__support/math/lgammaf.h
new file mode 100644
index 0000000000000..9e4f4825d15c0
--- /dev/null
+++ b/libc/src/__support/math/lgammaf.h
@@ -0,0 +1,374 @@
+//===-- Implementation header for lgammaf -----------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_MATH_LGAMMAF_H
+#define LLVM_LIBC_SRC___SUPPORT_MATH_LGAMMAF_H
+
+#include "src/__support/FPUtil/FEnvImpl.h"
+#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/FPUtil/NearestIntegerOperations.h"
+#include "src/__support/FPUtil/PolyEval.h"
+#include "src/__support/FPUtil/cast.h"
+#include "src/__support/FPUtil/double_double.h"
+#include "src/__support/FPUtil/except_value_utils.h"
+#include "src/__support/FPUtil/multiply_add.h"
+#include "src/__support/macros/config.h"
+#include "src/__support/macros/optimization.h"
+#include "src/__support/math/gamma_util.h"
+#include "src/__support/math/log.h"
+
+namespace LIBC_NAMESPACE_DECL {
+
+namespace math {
+
+LIBC_INLINE float lgammaf(float x) {
+  using namespace gamma_internal;
+  using FPBits = fputil::FPBits<float>;
+
+  FPBits xbits(x);
+  uint32_t x_abs = xbits.abs().uintval();
+
+  // NaN / Inf
+  if (LIBC_UNLIKELY(x_abs >= 0x7f800000u)) {
+    if (x_abs == 0x7f800000u)
+      return FPBits::inf().get_val();
+    if (xbits.is_signaling_nan()) {
+      fputil::raise_except_if_required(FE_INVALID);
+      return FPBits::quiet_nan().get_val();
+    }
+    return x;
+  }
+
+  // +/- 0 -> +Inf pole
+  if (LIBC_UNLIKELY(x_abs == 0)) {
+    fputil::raise_except_if_required(FE_DIVBYZERO);
+    fputil::set_errno_if_required(ERANGE);
+    return FPBits::inf().get_val();
+  }
+
+  // Negative integers and lgamma(1) = lgamma(2) = 0.
+  if (LIBC_UNLIKELY(is_integer(x))) {
+    if (xbits.is_neg()) {
+      fputil::raise_except_if_required(FE_DIVBYZERO);
+      fputil::set_errno_if_required(ERANGE);
+      return FPBits::inf().get_val();
+    }
+    if (x_abs == 0x3f800000u || x_abs == 0x40000000u)
+      return FPBits::zero().get_val();
+  }
+
+  constexpr fputil::ExceptValues<float, 11> LGAMMAF_EXCEPTS{{
+      // input,      toward-zero result, RU, RD, RN
+      {0x3b7c53aau, 0x40b1d661u, 1, 0, 0},
+      {0x42468b59u, 0x430f25a7u, 1, 0, 0},
+      {0x50522f52u, 0x5292ee5du, 1, 0, 1},
+      {0x9b7679ffu, 0x4247c72cu, 1, 0, 1},
+      {0x9e88452du, 0x4236bd8bu, 1, 0, 1},
+      {0xa77a8e47u, 0x42052b94u, 1, 0, 1},
+      {0xb0e17820u, 0x41a1d37bu, 1, 0, 0},
+      {0xc02c060fu, 0xbdabc3e2u, 0, 1, 1},
+      {0xc134eb14u, 0xc1875615u, 0, 1, 0},
+      {0xc33139a3u, 0xc43991afu, 0, 1, 0},
+      {0xc6f7e151u, 0xc89116deu, 0, 1, 0},
+  }};
+  if (auto r = LGAMMAF_EXCEPTS.lookup(xbits.uintval());
+      LIBC_UNLIKELY(r.has_value()))
+    return r.value();
+
+  double xd = fputil::cast<double>(x);
+  double abs_xd = xd < 0.0 ? -xd : xd;
+  double lgamma_val;
+
+  // For very tiny |x| (< 2^-23), use the truncated Laurent series:
+  //   lgamma(x)  = -log(x) - gamma*x + O(x^2)  for tiny x > 0
+  //   lgamma(-y) = -log(y) + gamma*y + O(y^2)  for tiny y > 0
+  // Even though gamma*|x| < 2^-25 is below 1 float ULP of |result|, it tips
+  // rounding at boundary cases. The x^2 term is at most gamma*2^-46 << 2^-50.
+  if (x_abs < 0x34000000u) { // |x| < 2^-23
+    constexpr double EULER_GAMMA = 0x1.2788cfc6fb619p-1;
+    double sign_corr = xbits.is_neg() ? EULER_GAMMA : -EULER_GAMMA;
+    return fputil::cast<float>(
+        fputil::multiply_add(sign_corr, abs_xd, -math::log(abs_xd)));
+  }
+
+  if (x_abs < 0x3f290000u) {
+    // Small: t = |x| < 0.66015625. Degree-22 Chebyshev (23 coeffs), Clenshaw.
+    // P(u) approximates h(t) = (lgamma(t) + log(t)) / t (smooth on [0, 0.66]).
+    constexpr double MID_S = 0x1.5200000000000p-2;
+    constexpr double HW_S = 0x1.5200000000000p-2;
+    constexpr double CHEB_S[23] = {
+        -0x1.6ab12743e97f5p-2,  0x1.ac5810962c7c9p-3,...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/205490


More information about the libc-commits mailing list