[libc-commits] [libc] [llvm] [libc][math] Impl bfloat16 lgamma function. (PR #199312)
Sneaky Weasel via libc-commits
libc-commits at lists.llvm.org
Tue Jun 23 00:49:11 PDT 2026
================
@@ -0,0 +1,196 @@
+//===-- Implementation of lgammabf16 ----------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_SRC___SUPPORT_MATH_LGAMMABF16_H
+#define LLVM_LIBC_SRC___SUPPORT_MATH_LGAMMABF16_H
+
+#include "hdr/errno_macros.h"
+#include "hdr/fenv_macros.h"
+#include "src/__support/FPUtil/FEnvImpl.h"
+#include "src/__support/FPUtil/FPBits.h"
+#include "src/__support/FPUtil/PolyEval.h"
+#include "src/__support/FPUtil/bfloat16.h"
+#include "src/__support/FPUtil/cast.h"
+#include "src/__support/macros/config.h"
+#include "src/__support/macros/optimization.h"
+#include "src/__support/math/log.h"
+
+namespace LIBC_NAMESPACE_DECL {
+namespace math {
+
+// lgamma_positive_d: compute lgamma(x) for x > 0, returning double.
+//
+// Takes double so callers can pass (1.0 + ax) without float precision loss.
+//
+// For x < 8, applies the recurrence lgamma(x) = lgamma(x+1) - ln(x) until
+// x reaches [4, 8), then evaluates the polynomial. This is critical because
+// the [2,3) polynomial has max_rel_err=9.98e-6 which, near its edges (t near
+// +0.5 or -0.5), causes ~6e-6 absolute error. After subtracting ln(x), the
+// result near x=1 or x=2 can be as small as ~0.002, giving ~0.45 ULP error --
+// which fails the directed-rounding tolerance test. Polynomials for [4,5) and
+// above have max_rel_err <= 6.61e-7, keeping final error well under 0.1 ULP.
+LIBC_INLINE double lgamma_positive_d(double x) {
----------------
udaykiriti wrote:
The polynomial gives a float result. Float has "23" mantissa bits, `bfloat16 has only 8`. When we cast float tot bfloat16, the float value can land exactly on a bfloat16 midpnt and the tie-break picks the wrong direction because the float already threw away the bits that would have told you which side you were really on....
so if we return as double then the remainining bits will survive till end,,. then if we cast it to bfloat16 we can see the true value , not a pre rounded approxx.
https://github.com/llvm/llvm-project/pull/199312
More information about the libc-commits
mailing list