[libc-commits] [libc] [libc][gpu] Add Atan2 Benchmarks (PR #104708)

Joseph Huber via libc-commits libc-commits at lists.llvm.org
Sun Aug 18 07:14:14 PDT 2024


================
@@ -121,6 +121,35 @@ throughput(F f, const cpp::array<T, N> &inputs) {
   // Return the time elapsed.
   return stop - start;
 }
+
+// Provides throughput benchmarking for 2 arguments (e.g. atan2())
+template <typename F, typename T, size_t N>
+[[gnu::noinline]] static LIBC_INLINE uint64_t throughput(
+    F f, const cpp::array<T, N> &inputs1, const cpp::array<T, N> &inputs2) {
+  asm("" ::"r"(&inputs1), "r"(&inputs2));
+
+  gpu::memory_fence();
+  uint64_t start = gpu::processor_clock();
+
+  asm("" ::"llr"(start));
+
+  uint64_t result;
+  for (size_t i = 0; i < inputs1.size(); i++) {
+    auto arg1 = inputs1[i];
+    auto arg2 = inputs2[i];
+    asm("" ::"r"(arg1), "r"(arg2));
----------------
jhuber6 wrote:

I don't think this is necessary. The asm below already marks the output as used, and the inputs feed that output, so they are considered used as well.
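
To illustrate the point, here is a minimal host-side sketch, not the code from this PR. It assumes GCC or Clang inline asm, and it swaps in `std::array` and `std::chrono::steady_clock` for `cpp::array` and `gpu::processor_clock()`. The name `throughput2` and the `last` out-parameter are hypothetical, added only so the sketch can be checked. It also uses a `"+m"` constraint on the result instead of the `"r"` input constraint in the diff, so that it compiles for floating-point types on common hosts. Only the output is pinned inside the loop; because `result` depends on both arguments, the loads and the call cannot be discarded.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side analogue of the two-argument throughput helper.
// Returns elapsed nanoseconds; optionally reports the last computed result.
template <typename F, typename T, std::size_t N>
[[gnu::noinline]] std::uint64_t throughput2(F f,
                                            const std::array<T, N> &inputs1,
                                            const std::array<T, N> &inputs2,
                                            T *last = nullptr) {
  // Make the input arrays opaque so the calls cannot be constant-folded.
  asm volatile("" ::"r"(&inputs1), "r"(&inputs2) : "memory");

  auto start = std::chrono::steady_clock::now();

  T result{};
  for (std::size_t i = 0; i < N; i++) {
    result = f(inputs1[i], inputs2[i]);
    // Pin only the output. The inputs are live because result depends on
    // them, so no separate barrier on the arguments is needed.
    asm volatile("" : "+m"(result));
  }

  auto stop = std::chrono::steady_clock::now();

  if (last)
    *last = result;
  return static_cast<std::uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
          .count());
}
```

A caller would pass a two-argument callable and two equally sized arrays, for example `throughput2([](double y, double x) { return std::atan2(y, x); }, ys, xs)`.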

https://github.com/llvm/llvm-project/pull/104708

