[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

Fri Jun 7 17:24:24 PDT 2024

================
@@ -0,0 +1,31 @@
+/*===------ LLVM/Offload helpers for kernel languages (CUDA/HIP) -*- c++ -*-===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+ *
+ *===-----------------------------------------------------------------------===
+ */
+
+#include <stdlib.h>
+
+#define __host__ __attribute__((host))
+#define __device__ __attribute__((device))
+#define __global__ __attribute__((global))
+#define __shared__ __attribute__((shared))
+#define __constant__ __attribute__((constant))
+#define __managed__ __attribute__((managed))
+
+extern "C" {
+
+typedef struct dim3 {
+  dim3() {}
+  dim3(unsigned x) : x(x) {}
+  unsigned x = 0, y = 0, z = 0;
+} dim3;
+
+// TODO: For some reason the CUDA device compilation requires this declaration
+// to be present but it should not.
----------------
jdoerfert wrote:

I mean that device code generation never emits __XXXPushCallConfiguration. That is a host-only call, generated by the compiler at the kernel launch site. Nevertheless, things are currently set up such that the device-side compilation still requires a valid declaration, which is weird, to say the least.
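
To illustrate why this is a host-only call, here is a rough host-side sketch of the sequence a CUDA-style launch `kernel<<<grid, block>>>(arg)` is lowered to: push the launch configuration, then call the kernel's host stub, which pops the configuration and launches. All names (`demoPushCallConfiguration`, `demoPopCallConfiguration`, `demoKernelStub`, `demoLaunch`, `Dim3`, `LaunchConfig`) are hypothetical stand-ins, not the declarations in this patch, and the stub only computes a thread count instead of launching anything.

```cpp
// Sketch only: every name here is hypothetical and mimics, in plain host
// C++, the two-step launch sequence a CUDA compiler generates. None of this
// is the actual LLVM/Offload API.
#include <cassert>
#include <cstddef>
#include <vector>

struct Dim3 {
  unsigned x, y, z;
  Dim3(unsigned x = 1, unsigned y = 1, unsigned z = 1) : x(x), y(y), z(z) {}
};

struct LaunchConfig {
  Dim3 grid, block;
  std::size_t sharedMem;
  void *stream;
};

// Host-side stack of pending launch configurations.
static std::vector<LaunchConfig> pendingConfigs;

// Stand-in for the push call: emitted at the launch site, in host code only.
unsigned demoPushCallConfiguration(Dim3 grid, Dim3 block,
                                   std::size_t sharedMem = 0,
                                   void *stream = nullptr) {
  LaunchConfig cfg;
  cfg.grid = grid;
  cfg.block = block;
  cfg.sharedMem = sharedMem;
  cfg.stream = stream;
  pendingConfigs.push_back(cfg);
  return 0; // 0: configuration recorded, proceed to call the stub
}

// Stand-in for the matching pop, called from the kernel's host stub.
LaunchConfig demoPopCallConfiguration() {
  assert(!pendingConfigs.empty() && "pop without a matching push");
  LaunchConfig cfg = pendingConfigs.back();
  pendingConfigs.pop_back();
  return cfg;
}

// Host stub of a hypothetical kernel. A real stub would pass cfg and the
// arguments to the runtime; this one just returns the total thread count.
unsigned long long demoKernelStub(int /*arg*/) {
  LaunchConfig cfg = demoPopCallConfiguration();
  unsigned long long blocks = 1ull * cfg.grid.x * cfg.grid.y * cfg.grid.z;
  unsigned long long threads = 1ull * cfg.block.x * cfg.block.y * cfg.block.z;
  return blocks * threads;
}

// Roughly what `kernel<<<grid, block>>>(arg)` turns into on the host.
unsigned long long demoLaunch(Dim3 grid, Dim3 block, int arg) {
  if (demoPushCallConfiguration(grid, block) != 0)
    return 0; // push failed: skip the stub call
  return demoKernelStub(arg);
}
```

Nothing in this sequence runs on the device, which is why needing the declaration during device compilation is surprising.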

https://github.com/llvm/llvm-project/pull/94549