[libc-commits] [libc] [libc][docs] Add GPU math conformance test results to support page (PR #156263)
Leandro Lacerda via libc-commits
libc-commits at lists.llvm.org
Sun Aug 31 13:09:54 PDT 2025
https://github.com/leandrolcampos created https://github.com/llvm/llvm-project/pull/156263
This patch enhances the GPU support documentation page (`support.html`) by adding a new, detailed section for `math.h`. This new section presents the results of the GPU math conformance tests, providing quantitative data on the accuracy of the supported higher math functions.
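For readers unfamiliar with the metric, here is a minimal sketch of how a ULP distance between a computed single-precision result and its reference could be measured. It is an illustration only, not the code used by the conformance tests: the helper names (`ordered_bits_f32`, `ulp_distance_f32`, `within_tolerance`) are hypothetical, NaN handling is left out, and the pass criterion (distance less than or equal to the tolerance) is an assumption that merely agrees with the rows in the table.

```python
import struct


def ordered_bits_f32(x: float) -> int:
    """Map a float32 value to an integer that grows monotonically with x."""
    # Reinterpret the IEEE-754 binary32 encoding of x as an unsigned integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    # Negative values have the sign bit set; mirror them below zero so that
    # -0.0 and +0.0 both map to 0 and adjacent floats differ by exactly 1.
    if bits & 0x80000000:
        return -(bits & 0x7FFFFFFF)
    return bits


def ulp_distance_f32(actual: float, expected: float) -> int:
    """Count the float32 steps between two finite values (NaN not handled)."""
    return abs(ordered_bits_f32(actual) - ordered_bits_f32(expected))


def within_tolerance(max_ulp_distance: int, ulp_tolerance: int) -> bool:
    """Assumed pass criterion: the worst observed distance must not exceed the tolerance."""
    return max_ulp_distance <= ulp_tolerance


if __name__ == "__main__":
    print(ulp_distance_f32(1.0, 1.0 + 2**-23))  # adjacent float32 values
    print(within_tolerance(6, 4))               # the acos row in the table
```

Under this reading, the `acos` row (maximum distance 6 against a tolerance of 4) is reported as FAILED, while a row such as `tan` (2 against 5) passes.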
From 8b3ecee2692d4a8d50788600aa780492379430ca Mon Sep 17 00:00:00 2001
From: Leandro Augusto Lacerda Campos <leandrolcampos at yahoo.com.br>
Date: Sun, 31 Aug 2025 17:05:14 -0300
Subject: [PATCH] Add GPU math conformance test results to support page
---
libc/docs/gpu/support.rst | 166 ++++++++++++++++++++++++++++++++++++++
1 file changed, 166 insertions(+)
diff --git a/libc/docs/gpu/support.rst b/libc/docs/gpu/support.rst
index 3fb2df8e6f2ca..4243900fb81d4 100644
--- a/libc/docs/gpu/support.rst
+++ b/libc/docs/gpu/support.rst
@@ -281,3 +281,169 @@ Function Name Available RPC Required
assert |check| |check|
__assert_fail |check| |check|
============= ========= ============
+
+math.h
+------
+
+The following table presents the conformance test results for higher math functions on the GPU. The results show the maximum observed ULP (Units in the Last Place) distance when comparing the GPU implementation against a correctly rounded reference computed on the host CPU. In addition to the C standard math library (LLVM-libm), these tests are conducted against CUDA Math and HIP Math, for comparison only.
+
++------------------------+-------------+---------------+-----------------------------------------------------------------------------------+
+| Function | Test Method | ULP Tolerance | Max ULP Distance |
+| | | +--------------------+--------------------+--------------------+--------------------+
+| | | | llvm-libm | llvm-libm | cuda-math | hip-math |
+| | | | (AMDGPU) | (CUDA) | (CUDA) | (AMDGPU) |
++========================+=============+===============+====================+====================+====================+====================+
+| acos | Randomized | 4 | 6 (FAILED) | 6 (FAILED) | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| acosf | Exhaustive | 4 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| acosf16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| acoshf | Exhaustive | 4 | 1 | 1 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| acoshf16 | Exhaustive | 2 | 0 | 0 | | 0 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| acospif16 | Exhaustive | 2 | 0 | 0 | | |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| asin | Randomized | 4 | 6 (FAILED) | 6 (FAILED) | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| asinf | Exhaustive | 4 | 1 | 1 | 1 | 3 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| asinf16 | Exhaustive | 2 | 0 | 0 | | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| asinhf | Exhaustive | 4 | 1 | 1 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| asinhf16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| atanf | Exhaustive | 5 | 0 | 0 | 1 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| atanf16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| atan2f | Randomized | 6 | 1 | 1 | 2 | 3 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| atanhf | Exhaustive | 5 | 0 | 0 | 3 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| atanhf16 | Exhaustive | 2 | 0 | 0 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| cbrt | Randomized | 2 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| cbrtf | Exhaustive | 2 | 0 | 0 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| cos | Randomized | 4 | 1 | 1 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| cosf | Exhaustive | 4 | 1 | 1 | 2 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| cosf16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| coshf | Exhaustive | 4 | 0 | 0 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| coshf16 | Exhaustive | 2 | 1 | 0 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| cospif | Exhaustive | 4 | 0 | 0 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| cospif16 | Exhaustive | 2 | 0 | 0 | | |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| erff | Exhaustive | 16 | 0 | 0 | 1 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| exp | Randomized | 3 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| expf | Exhaustive | 3 | 0 | 0 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| expf16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| exp10 | Randomized | 3 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| exp10f | Exhaustive | 3 | 0 | 0 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| exp10f16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| exp2 | Randomized | 3 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| exp2f | Exhaustive | 3 | 1 | 1 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| exp2f16 | Exhaustive | 2 | 1 | 1 | | 0 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| expm1 | Randomized | 3 | 0 | 0 | 1 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| expm1f | Exhaustive | 3 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| expm1f16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| hypot | Randomized | 4 | 0 | 0 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| hypotf | Randomized | 4 | 0 | 0 | 1 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| hypotf16 | Exhaustive | 2 | 0 | 0 | | |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log | Randomized | 3 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| logf | Exhaustive | 3 | 1 | 1 | 1 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| logf16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log10 | Randomized | 3 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log10f | Exhaustive | 3 | 1 | 1 | 2 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log10f16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log1p | Randomized | 2 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log1pf | Exhaustive | 2 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log2 | Randomized | 3 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log2f | Exhaustive | 3 | 0 | 0 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| log2f16 | Exhaustive | 2 | 1 | 1 | | 0 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| powf (integer exp.) | Randomized | 16 | 0 | 0 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| powf (real exp.) | Randomized | 16 | 0 | 0 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sin | Randomized | 4 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinf | Exhaustive | 4 | 1 | 1 | 1 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinf16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sincos (cos part) | Randomized | 4 | 1 | 1 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sincos (sin part) | Randomized | 4 | 1 | 1 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sincosf (cos part) | Exhaustive | 4 | 1 | 1 | 2 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sincosf (sin part) | Exhaustive | 4 | 1 | 1 | 1 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinhf | Exhaustive | 4 | 1 | 1 | 3 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinhf16 | Exhaustive | 2 | 1 | 1 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinpif | Exhaustive | 4 | 0 | 0 | 1 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| sinpif16 | Exhaustive | 2 | 0 | 0 | | |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tan | Randomized | 5 | 2 | 2 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanf | Exhaustive | 5 | 0 | 0 | 3 | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanf16 | Exhaustive | 2 | 1 | 1 | | 2 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanhf | Exhaustive | 5 | 0 | 0 | 2 | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanhf16 | Exhaustive | 2 | 0 | 0 | | 1 |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanpif | Exhaustive | 6 | 0 | 0 | | |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+| tanpif16 | Exhaustive | 2 | 1 | 1 | | |
++------------------------+-------------+---------------+--------------------+--------------------+--------------------+--------------------+
+
+**Notes on Conformance Test Results:**
+
+* **Test Method**:
+  * **Exhaustive**: Every representable point in the input space is tested. This method is used for half-precision functions and single-precision univariate functions.
+  * **Randomized**: A large, deterministic subset of the input space is tested, typically using 2\ :sup:`32` samples. This method is used for functions with larger input spaces, such as single-precision bivariate and double-precision functions.
+* ULP tolerances are based on *The Khronos Group, The OpenCL C Specification v3.0.19, Sec. 7.4, Khronos Registry [July 10, 2025]*.
+* The AMD GPU used for testing is *gfx1030*.
+* The NVIDIA GPU used for testing is *NVIDIA RTX 4000 SFF Ada Generation*.
+* For more details on the tests, please refer to the `GPU Math Conformance Tests <https://github.com/llvm/llvm-project/tree/main/offload/unittests/Conformance>`_.
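
The two test methods described in the notes can be sketched roughly as follows. This is a host-only illustration under stated assumptions, not the conformance test harness: all function names are hypothetical, the "candidate" and "reference" are ordinary Python callables rather than GPU kernels, and a seeded pseudo-random generator stands in for whatever deterministic sampling scheme the real tests use.

```python
import math
import random
import struct


def f16_from_bits(bits: int) -> float:
    """Decode a 16-bit pattern as an IEEE-754 half-precision value."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def f16_ordered(x: float) -> int:
    """Round x to half precision and map it to a monotonically ordered integer."""
    bits = struct.unpack("<H", struct.pack("<e", x))[0]
    return -(bits & 0x7FFF) if bits & 0x8000 else bits


def exhaustive_max_ulp_f16(candidate, reference):
    """Exhaustive method: visit every half-precision input except NaNs.

    Returns (maximum ULP distance, number of inputs tested).
    """
    worst = 0
    tested = 0
    for bits in range(1 << 16):
        x = f16_from_bits(bits)
        if math.isnan(x):
            continue  # NaN inputs are skipped in this sketch
        distance = abs(f16_ordered(candidate(x)) - f16_ordered(reference(x)))
        worst = max(worst, distance)
        tested += 1
    return worst, tested


def sample_f32_bit_patterns(count: int, seed: int = 42):
    """Randomized method: a reproducible subset of float32 bit patterns.

    The same seed always yields the same inputs, which keeps runs comparable.
    """
    rng = random.Random(seed)
    return [rng.getrandbits(32) for _ in range(count)]


if __name__ == "__main__":
    # Comparing a function against itself must give a distance of zero.
    print(exhaustive_max_ulp_f16(math.atan, math.atan))
    print(sample_f32_bit_patterns(3) == sample_f32_bit_patterns(3))
```

The exhaustive sweep is practical here because half precision has only 2^16 bit patterns. For bivariate single-precision or double-precision functions the input space is far too large to enumerate, which is why a fixed, reproducible sample is used instead.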