[Openmp-commits] [compiler-rt] [llvm] [openmp] [compiler-rt] Define GPU specific handling of profiling functions (PR #185763)
Shilei Tian via Openmp-commits
openmp-commits at lists.llvm.org
Thu Mar 12 10:26:43 PDT 2026
================
@@ -0,0 +1,35 @@
+/*===- InstrProfilingPlatformGPU.c - GPU profiling support ----------------===*\
+|*
+|* Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+|* See https://llvm.org/LICENSE.txt for license information.
+|* SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+|*
+\*===----------------------------------------------------------------------===*/
+
+// GPU-specific profiling functions for AMDGPU and NVPTX targets. This file
+// provides:
+//
+// Platform plumbing (section boundaries, binary IDs, VNodes) are handled by
+// InstrProfilingPlatformLinux.c via the COMPILER_RT_PROFILE_BAREMETAL path.
+
+#if defined(__NVPTX__) || defined(__AMDGPU__)
+
+#include "InstrProfiling.h"
+#include <gpuintrin.h>
+
+// Wave-cooperative counter increment. The instrumentation pass emits calls to
+// this in place of the default non-atomic load/add/store or atomicrmw sequence.
+COMPILER_RT_VISIBILITY void __llvm_profile_instrument_gpu(uint64_t *counter,
----------------
shiltian wrote:
Hmm, why is an atomic add on the first lane called "instrument gpu"?
https://github.com/llvm/llvm-project/pull/185763
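
For context on the question above, here is a minimal host-side sketch of one possible reading of "wave-cooperative counter increment": the first active lane performs a single add of step times the number of active lanes, rather than each lane adding on its own. This is not the code from the PR. The names simulate_wave_increment, count_active_lanes and first_active_lane, the 64-lane wave width, and the step parameter are all assumptions made for illustration, and the plain add stands in for the atomic add a GPU would use.

```c
#include <stdint.h>

/* Number of set bits in the mask, i.e. lanes that reached the increment. */
static uint64_t count_active_lanes(uint64_t active) {
  uint64_t n = 0;
  for (; active != 0; active &= active - 1)
    ++n;
  return n;
}

/* Index of the lowest set bit. The caller guarantees active != 0. */
static unsigned first_active_lane(uint64_t active) {
  unsigned lane = 0;
  while (((active >> lane) & 1u) == 0)
    ++lane;
  return lane;
}

/* Hypothetical model of one 64-lane wave hitting the same counter.
 * Only the first active lane touches memory, adding the contribution of
 * every active lane in one update. Returns how many memory updates were
 * performed, which is 1 for any non-empty mask. */
static unsigned simulate_wave_increment(uint64_t *counter, uint64_t active,
                                        uint64_t step) {
  unsigned updates = 0;
  if (active == 0)
    return 0;
  unsigned leader = first_active_lane(active);
  for (unsigned lane = 0; lane < 64; ++lane) {
    if (((active >> lane) & 1u) == 0)
      continue; /* inactive lane: does nothing */
    if (lane == leader) {
      /* On a real GPU this would be a single atomic add. */
      *counter += step * count_active_lanes(active);
      ++updates;
    }
  }
  return updates;
}
```

With four active lanes and a step of 1, the counter rises by 4 after one memory update, which is the behaviour the name under discussion would have to describe.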