[libc-commits] [libc] [libc][gpu] Add Atan2 Benchmarks (PR #104708)
Joseph Huber via libc-commits
libc-commits at lists.llvm.org
Sun Aug 18 07:14:14 PDT 2024
================
@@ -121,6 +121,35 @@ throughput(F f, const cpp::array<T, N> &inputs) {
// Return the time elapsed.
return stop - start;
}
+
+// Provides throughput benchmarking for 2 arguments (e.g. atan2())
+template <typename F, typename T, size_t N>
+[[gnu::noinline]] static LIBC_INLINE uint64_t throughput(
+ F f, const cpp::array<T, N> &inputs1, const cpp::array<T, N> &inputs2) {
+ asm("" ::"r"(&inputs1), "r"(&inputs2));
+
+ gpu::memory_fence();
+ uint64_t start = gpu::processor_clock();
+
+ asm("" ::"llr"(start));
+
+ uint64_t result;
+ for (size_t i = 0; i < inputs1.size(); i++) {
+ auto arg1 = inputs1[i];
+ auto arg2 = inputs2[i];
+ asm("" ::"r"(arg1), "r"(arg2));
----------------
jhuber6 wrote:
I don't think this is necessary. The asm below already marks the output as used, and that keeps the inputs it was computed from alive as well.
https://github.com/llvm/llvm-project/pull/104708