[libc-commits] [libc] [libc] Add Timing Utils for AMDGPU (PR #96828)

via libc-commits libc-commits at lists.llvm.org
Wed Jun 26 20:31:56 PDT 2024


================
@@ -0,0 +1,73 @@
+//===------------- AMDGPU implementation of timing utils --------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_UTILS_GPU_TIMING_AMDGPU
+#define LLVM_LIBC_UTILS_GPU_TIMING_AMDGPU
+
+#include "src/__support/GPU/utils.h"
+#include "src/__support/common.h"
+#include "src/__support/macros/attributes.h"
+#include "src/__support/macros/config.h"
+
+#include <stdint.h>
+
+namespace LIBC_NAMESPACE {
+
+// Returns the overhead associated with calling the profiling region. This
+// allows us to subtract the constant-time overhead from the latency to
+// obtain a true result. This can vary with system load.
+[[gnu::noinline]] static LIBC_INLINE uint64_t overhead() {
+  gpu::memory_fence();
+  uint64_t start = gpu::processor_clock();
+  uint32_t result = 0.0;
+  asm("v_or_b32 %[v_reg], 0, %[v_reg]\n" ::[v_reg] "v"(result) :);
+  asm("" ::"s"(start));
+  uint64_t stop = gpu::processor_clock();
+  return stop - start;
+}
+
+// Profile a simple function and obtain its latency in clock cycles on the
+// system. This function cannot be inlined or else it will disturb the very
+// delicate balance of hard-coded dependencies.
+template <typename F, typename T>
+[[gnu::noinline]] static LIBC_INLINE uint64_t latency(F f, T t) {
+  // We need to store the input somewhere to guarantee that the compiler will
+  // not constant propagate it and remove the profiling region.
+  volatile uint32_t storage = t;
+  float arg = storage;
----------------
jameshu15869 wrote:

Are there problems with linking when `latency` is generic? I hadn't noticed this before, but when I switched from `uint64_t` to `T`, `lld` reported errors such as `error: couldn't allocate input reg for constraint 'r'` (and the same kind of error for the `s` asm input constraint).
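
For reference, here is a minimal host-side sketch of the generic shape being asked about: the input goes through a `volatile T` instead of a fixed `uint32_t`/`float` pair. It is an illustration, not the PR's code. The name `latency_sketch` is hypothetical, `std::chrono::steady_clock` stands in for `gpu::processor_clock()`, and the AMDGPU inline asm is left out, so it does not reproduce the register-constraint errors above.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical host-side analogue of latency(); not the PR's implementation.
// Assumption: T is a trivially copyable scalar type such as int or double.
template <typename F, typename T>
[[gnu::noinline]] static uint64_t latency_sketch(F f, T t) {
  // Round-trip the input through a volatile of the same type T so the
  // compiler cannot constant-propagate it into the timed region.
  volatile T storage = t;
  T arg = storage;

  auto start = std::chrono::steady_clock::now();
  // Store the result in a volatile so the call is not optimized away.
  volatile auto result = f(arg);
  auto stop = std::chrono::steady_clock::now();
  (void)result;

  // Elapsed wall-clock time in nanoseconds (not clock cycles).
  return static_cast<uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
          .count());
}
```

Called as `latency_sketch([](double x) { return x * 2.0; }, 3.0)`, this returns an elapsed time in nanoseconds. On the GPU, any value passed to an inline asm operand would still need a type that fits the register class named by the constraint.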

https://github.com/llvm/llvm-project/pull/96828

