[libc-commits] [libc] [libc] NVPTX Profiling Draft (PR #92009)

Joseph Huber via libc-commits libc-commits at lists.llvm.org
Tue May 14 08:58:58 PDT 2024


================
@@ -0,0 +1,108 @@
+//===------------- NVPTX implementation of timing utils ---------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_LIBC_UTILS_GPU_TIMING_NVPTX
+#define LLVM_LIBC_UTILS_GPU_TIMING_NVPTX
+
+#include "src/__support/GPU/utils.h"
+#include "src/__support/common.h"
+#include "src/__support/macros/attributes.h"
+#include "src/__support/macros/config.h"
+
+#include <stdint.h>
+
+namespace LIBC_NAMESPACE {
+
+// Returns the overhead associated with calling the profiling region. This
+// allows us to subtract the constant-time overhead from the latency to
+// obtain a true result. This can vary with system load.
+[[gnu::noinline]] static uint64_t overhead() {
+  volatile uint32_t x = 1;
+  uint32_t y = x;
+  gpu::sync_threads();
----------------
jhuber6 wrote:

```suggestion
  gpu::sync_threads();
  gpu::memory_barrier();
```
The same applies further down as well. Also capture the `y` store afterwards.
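
To make the suggestion concrete, here is a rough host-side sketch of the shape I have in mind. It is not the patch itself: everything in the `sketch` namespace is a hypothetical stand-in so the snippet compiles off-device (`sync_threads` is a no-op, `memory_barrier` is a plain `std::atomic_thread_fence`, and `clock_ns` uses `std::chrono::steady_clock` in place of a GPU cycle counter). Placing the final volatile store after the second barrier is just one reading of "capture the `y` store afterwards".

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical host-side stand-ins, NOT the LLVM libc GPU helpers.
namespace sketch {

// Stand-in for gpu::sync_threads(). With a single host thread there is
// nothing to wait for, so this is a no-op.
inline void sync_threads() {}

// Stand-in for gpu::memory_barrier(): a full sequentially consistent fence.
inline void memory_barrier() {
  std::atomic_thread_fence(std::memory_order_seq_cst);
}

// Stand-in for a cycle counter: monotonic host time in nanoseconds.
inline uint64_t clock_ns() {
  auto now = std::chrono::steady_clock::now().time_since_epoch();
  return static_cast<uint64_t>(
      std::chrono::duration_cast<std::chrono::nanoseconds>(now).count());
}

// Measures the cost of an empty timed region.
inline uint64_t overhead() {
  volatile uint32_t x = 1;
  uint32_t y = x; // volatile load, so y cannot be folded to a constant

  // Thread barrier followed by a memory barrier before the region starts.
  sync_threads();
  memory_barrier();

  uint64_t start = clock_ns();
  uint64_t stop = clock_ns();

  // Same pair after the region ("below as well").
  sync_threads();
  memory_barrier();

  // Store y through a volatile after the barriers so the store is kept.
  volatile uint32_t storage = y;
  (void)storage;

  assert(stop >= start); // steady_clock is monotonic
  return stop - start;
}

} // namespace sketch
```

A caller would subtract `sketch::overhead()` from a measured latency. On the device, the real `gpu::` helpers and a processor clock would replace the stand-ins.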

https://github.com/llvm/llvm-project/pull/92009

