[clang] [llvm] [Offload] Provide a kernel library useable by the offload runtime (PR #104168)

Wed Aug 14 14:45:49 PDT 2024

================
@@ -0,0 +1,53 @@
+//===-- Kernels/Memory.cpp - Memory related kernel definitions ------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+//===----------------------------------------------------------------------===//
+
+#include <cstdint>
+
+#define LAUNCH_BOUNDS(MIN, MAX)                                                \
+  __attribute__((launch_bounds(MAX), amdgpu_flat_work_group_size(MIN, MAX)))
+#define INLINE [[clang::always_inline]] inline
+#define KERNEL [[gnu::weak]] __global__
----------------
jhuber6 wrote:

I wonder if it's possible to make these `weak_odr` or something similar. It also shouldn't be a shock that I'm more in favor of just using pure C++ for this. I really want to make a resource dir header for these common routines so that we can use them with `-nostdlibinc`, though I remember there being a bug with launch bounds when using `[[clang::amdgpu_kernel]]` directly.
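To make the "pure C++" idea concrete, here is a rough sketch of what such a kernel could look like without `__global__`. This is only an illustration under assumptions, not what the PR implements: the name `sketch_memset` is hypothetical, launch bounds are left out because of the bug mentioned above, and the host fallback exists only so the snippet also builds without an AMDGPU target.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: when targeting AMDGPU, mark the function as a kernel with the
// C++11 attribute spelling instead of the CUDA/HIP `__global__` keyword.
// On any other target, fall back to a plain weak function.
#if defined(__AMDGPU__)
#define KERNEL [[gnu::weak, clang::amdgpu_kernel]]
#else
#define KERNEL [[gnu::weak]]
#endif

// Hypothetical memset-style kernel: fills Size bytes at Dst with Value.
extern "C" KERNEL void sketch_memset(uint8_t *Dst, uint8_t Value,
                                     size_t Size) {
#if defined(__AMDGPU__)
  // One byte per work-item; work-items past the end do nothing.
  size_t Id = size_t(__builtin_amdgcn_workgroup_id_x()) *
                  __builtin_amdgcn_workgroup_size_x() +
              __builtin_amdgcn_workitem_id_x();
  if (Id < Size)
    Dst[Id] = Value;
#else
  // Host fallback: a serial loop with the same observable result.
  for (size_t I = 0; I < Size; ++I)
    Dst[I] = Value;
#endif
}
```

The device path assumes the launch covers at least `Size` work-items in the x dimension; a grid-stride loop would lift that restriction.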

Also, I feel like there might be existing solutions for this. Maybe we should ask someone like @arsenm or @yxsamliu whether or not there's some precedent for emitting kernels for `memset` that the runtime calls.

https://github.com/llvm/llvm-project/pull/104168