[clang] [SYCL] Basic code generation for SYCL kernel caller offload entry point functions. (PR #133030)
Tom Honermann via cfe-commits
cfe-commits at lists.llvm.org
Mon Apr 7 11:09:29 PDT 2025
================
@@ -0,0 +1,186 @@
+// RUN: %clang_cc1 -fsycl-is-host -emit-llvm -triple x86_64-unknown-linux-gnu %s -o - | FileCheck --check-prefixes=CHECK-HOST,CHECK-HOST-LINUX %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-unknown-linux-gnu -triple amdgcn-amd-amdhsa %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-AMDGCN %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-unknown-linux-gnu -triple nvptx-nvidia-cuda %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-NVPTX %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-unknown-linux-gnu -triple nvptx64-nvidia-cuda %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-NVPTX %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-unknown-linux-gnu -triple spir-unknown-unknown %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-SPIR %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-unknown-linux-gnu -triple spir64-unknown-unknown %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-SPIR %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-unknown-linux-gnu -triple spirv32-unknown-unknown %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-SPIR %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-unknown-linux-gnu -triple spirv64-unknown-unknown %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-SPIR %s
+// RUN: %clang_cc1 -fsycl-is-host -emit-llvm -triple x86_64-pc-windows-msvc %s -o - | FileCheck --check-prefixes=CHECK-HOST,CHECK-HOST-WINDOWS %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-pc-windows-msvc -triple amdgcn-amd-amdhsa %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-AMDGCN %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-pc-windows-msvc -triple nvptx-nvidia-cuda %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-NVPTX %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-pc-windows-msvc -triple nvptx64-nvidia-cuda %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-NVPTX %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-pc-windows-msvc -triple spir-unknown-unknown %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-SPIR %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-pc-windows-msvc -triple spir64-unknown-unknown %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-SPIR %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-pc-windows-msvc -triple spirv32-unknown-unknown %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-SPIR %s
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm -aux-triple x86_64-pc-windows-msvc -triple spirv64-unknown-unknown %s -o - | FileCheck --check-prefixes=CHECK-DEVICE,CHECK-SPIR %s
+
+// Test the generation of SYCL kernel caller functions. These functions are
+// generated from functions declared with the sycl_kernel_entry_point attribute
+// and emitted during device compilation. They are not emitted during host
+// compilation.
+
+template <typename KernelName, typename KernelType>
+[[clang::sycl_kernel_entry_point(KernelName)]]
+void kernel_single_task(KernelType kernelFunc) {
+ kernelFunc();
+}
+
+struct single_purpose_kernel_name;
+struct single_purpose_kernel {
+ void operator()() const {}
+};
+
+[[clang::sycl_kernel_entry_point(single_purpose_kernel_name)]]
+void single_purpose_kernel_task(single_purpose_kernel kernelFunc) {
+ kernelFunc();
+}
+
+int main() {
+ int capture;
+ kernel_single_task<class lambda_kernel_name>(
+ [=]() {
+ (void) capture;
+ });
+ single_purpose_kernel obj;
+ single_purpose_kernel_task(obj);
+}
+
+// Verify that SYCL kernel caller functions are not emitted during host
+// compilation.
+//
+// CHECK-HOST-NOT: _ZTS26single_purpose_kernel_name
+// CHECK-HOST-NOT: _ZTSZ4mainE18lambda_kernel_name
+
+// Verify that sycl_kernel_entry_point attributed functions are not emitted
+// during device compilation.
+//
+// CHECK-DEVICE-NOT: single_purpose_kernel_task
+// CHECK-DEVICE-NOT: kernel_single_task
+
+// Verify that no code is generated for the bodies of sycl_kernel_entry_point
+// attributed functions during host compilation. ODR-use of these functions may
+// require them to be emitted, but they have no effect if called.
+//
+// CHECK-HOST-LINUX: define dso_local void @_Z26single_purpose_kernel_task21single_purpose_kernel() #[[LINUX_ATTR0:[0-9]+]] {
+// CHECK-HOST-LINUX-NEXT: entry:
+// CHECK-HOST-LINUX-NEXT: %kernelFunc = alloca %struct.single_purpose_kernel, align 1
+// CHECK-HOST-LINUX-NEXT: ret void
----------------
tahonermann wrote:
These functions may be called on the host. Any ODR-use (for a function template) suffices for the device-side entry point function to be emitted, but the common case is that such ODR-use will be due to a function call; that is what is done in the Intel SYCL library. When such calls occur, the function body is not actually evaluated on the host; that is why the body is emitted as a no-op. This behavior was established by https://github.com/llvm/llvm-project/pull/122379 per the following addition to `CodeGenFunction::EmitSimpleStmt()` in `clang/lib/CodeGen/CGStmt.cpp`:
```
bool CodeGenFunction::EmitSimpleStmt(const Stmt *S,
                                     ArrayRef<const Attr *> Attrs) {
  ...
  case Stmt::SYCLKernelCallStmtClass:
    // SYCL kernel call statements are generated as wrappers around the body
    // of functions declared with the sycl_kernel_entry_point attribute. Such
    // functions are used to specify how a SYCL kernel (a function object) is
    // to be invoked; the SYCL kernel call statement contains a transformed
    // variation of the function body and is used to generate a SYCL kernel
    // caller function; a function that serves as the device side entry point
    // used to execute the SYCL kernel. The sycl_kernel_entry_point attributed
    // function is invoked by host code in order to trigger emission of the
    // device side SYCL kernel caller function and to generate metadata needed
    // by SYCL run-time library implementations; the function is otherwise
    // intended to have no effect. As such, the function body is not evaluated
    // as part of the invocation during host compilation (and the function
    // should not be called or emitted during device compilation); the SYCL
    // kernel call statement is thus handled as a null statement for the
    // purpose of code generation.
    break;
  ...
}
```
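As a rough illustration of the host-side behavior described in this comment, here is a minimal sketch in plain C++ that models it without requiring `-fsycl-is-host`. It is not code from the PR: `host_entry_point_model`, `counting_kernel`, `counting_kernel_name`, `invocations`, and `call_entry_point_on_host` are hypothetical names, and the empty function body stands in for what host code generation produces once the SYCL kernel call statement is handled as a null statement.

```cpp
#include <cassert>

// Hypothetical counter used only to observe whether the kernel body runs.
static int invocations = 0;

// Hypothetical kernel type, analogous to single_purpose_kernel in the test.
struct counting_kernel {
  void operator()() const { ++invocations; }
};

// Hypothetical kernel name type, analogous to single_purpose_kernel_name.
struct counting_kernel_name;

// Model of a sycl_kernel_entry_point function as seen by host code
// generation. In real source the body would be `kernelFunc();`, but on the
// host that body is treated as a null statement, so this model leaves it
// empty. The attribute itself is omitted here (assumption: it is only
// meaningful under SYCL host or device compilation).
template <typename KernelName, typename KernelType>
void host_entry_point_model(KernelType kernelFunc) {
  (void)kernelFunc; // the argument is copied in but never invoked
}

// Calls the modeled entry point the way a SYCL library might on the host,
// and reports how many times the kernel's operator() actually ran.
int call_entry_point_on_host() {
  invocations = 0;
  host_entry_point_model<counting_kernel_name>(counting_kernel{});
  return invocations;
}
```

Calling `call_entry_point_on_host()` ODR-uses the `host_entry_point_model` specialization, which in the real implementation is what triggers emission of the device-side kernel caller, while the kernel's `operator()` is never run on the host.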
https://github.com/llvm/llvm-project/pull/133030