[libc-commits] [libc] [libc] Use `rint` builtin for rounding on the GPU (PR #98345)

Wed Jul 10 09:26:28 PDT 2024

https://github.com/jhuber6 created https://github.com/llvm/llvm-project/pull/98345

Summary:
Previously this went through the generic bit-twiddling implementation
instead of using the dedicated GPU instruction. This patch adds this in
to the utility, mirroring the special-casing of the x64 and aarch
targets. This results in much nicer code. The following example shows
the opencl device libs implementation on the left and the LLVM libc on
the right, https://godbolt.org/z/3ch48ccf5. The libc version is
"branchier", but the results seem similar.


>From a95173a02f76b711e7200adf662ee512f9ecc9cf Mon Sep 17 00:00:00 2001
From: Joseph Huber <huberjn at outlook.com>
Date: Wed, 10 Jul 2024 11:22:31 -0500
Subject: [PATCH] [libc] Use `rint` builtin for rounding on the GPU

Summary:
Previously this went through the generic bit-twiddling implementation
instead of using the dedicated GPU instruction. This patch adds this in
to the utility, mirroring the special-casing of the x64 and aarch
targets. This results in much nicer code. The following example shows
the opencl device libs implementation on the left and the LLVM libc on
the right, https://godbolt.org/z/3ch48ccf5. The libc version is
"branchier", but the results seem similar.
---
 libc/src/__support/FPUtil/nearest_integer.h | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/libc/src/__support/FPUtil/nearest_integer.h b/libc/src/__support/FPUtil/nearest_integer.h
index bc98667c8a3b3..f9d952cfa163b 100644
--- a/libc/src/__support/FPUtil/nearest_integer.h
+++ b/libc/src/__support/FPUtil/nearest_integer.h
@@ -17,6 +17,18 @@
 #include "x86_64/nearest_integer.h"
 #elif defined(LIBC_TARGET_ARCH_IS_AARCH64)
 #include "aarch64/nearest_integer.h"
+#elif defined(LIBC_TARGET_ARCH_IS_GPU)
+
+namespace LIBC_NAMESPACE {
+namespace fputil {
+
+LIBC_INLINE float nearest_integer(float x) { return __builtin_rintf(x); }
+
+LIBC_INLINE double nearest_integer(double x) { return __builtin_rint(x); }
+
+} // namespace fputil
+} // namespace LIBC_NAMESPACE
+
 #else
 
 namespace LIBC_NAMESPACE {