[clang] [clang] Fix inconsistencies with the device_kernel attr on different targets (PR #161905)
Nick Sarnie via cfe-commits
cfe-commits at lists.llvm.org
Mon Oct 6 07:22:08 PDT 2025
================
@@ -419,9 +419,11 @@ void AMDGPUTargetCodeGenInfo::setTargetAttributes(
return;
const FunctionDecl *FD = dyn_cast_or_null<FunctionDecl>(D);
- if (FD)
+ if (FD) {
setFunctionDeclAttributes(FD, F, M);
-
+ if (FD->hasAttr<DeviceKernelAttr>() && !M.getLangOpts().OpenCL)
+ F->setCallingConv(llvm::CallingConv::AMDGPU_KERNEL);
----------------
sarnex wrote:
There's no calling convention (CC) for OpenCL specifically; it uses the target's. But OpenCL is very particular about what it wants to be a kernel in IR, and its kernel stub concept seems to complicate things.
These tests fail if I remove the `!M.getLangOpts().OpenCL` from the above code:
```
Failed Tests (13):
Clang :: CodeGenOpenCL/addr-space-struct-arg.cl
Clang :: CodeGenOpenCL/amdgpu-abi-struct-arg-byref.cl
Clang :: CodeGenOpenCL/amdgpu-abi-struct-coerce.cl
Clang :: CodeGenOpenCL/amdgpu-call-kernel.cl
Clang :: CodeGenOpenCL/amdgpu-enqueue-kernel.cl
Clang :: CodeGenOpenCL/amdgpu-printf.cl
Clang :: CodeGenOpenCL/builtins-amdgcn-gws-insts.cl
Clang :: CodeGenOpenCL/builtins-fp-atomics-gfx8.cl
Clang :: CodeGenOpenCL/implicit-addrspacecast-function-parameter.cl
Clang :: CodeGenOpenCL/opencl-kernel-call.cl
Clang :: CodeGenOpenCL/visibility.cl
Clang :: Frontend/amdgcn-machine-analysis-remarks.cl
Clang :: Misc/backend-resource-limit-diagnostics.cl
```
I looked at `CodeGenOpenCL/amdgpu-printf.cl`, and the problem there is that the IR function `__clang_ocl_kern_imp_test_printf_noargs` is now getting marked as a kernel:
```
define dso_local amdgpu_kernel void @__clang_ocl_kern_imp_test_printf_noargs() ...
```
which is not expected and is causing the failure.
So it seems that for OpenCL, having the attr doesn't by itself mean the function should be a kernel in IR: the `__clang_ocl_kern_imp_` stub should not be a kernel, but the normal non-stub function should be, as we see the below in both the good and the failing case:
```
define dso_local amdgpu_kernel void @test_printf_noargs()
```
The semantic fact (that having the attr doesn't mean the function will have the kernel CC in IR) was true before my changes. The OpenCL handling has many checks to see if something has the attr, and those checks currently return true for these stub functions, so just removing the attr would probably break things. I expect it would be a mess to untangle. If you have any ideas, let me know; I'm happy to try, but I don't know anything about OpenCL.
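To make the failure mode concrete, here is a minimal stand-alone sketch of the condition in the hunk above. It is only a model: `CallingConv`, `FunctionModel`, `ccWithGuard`, and `ccWithoutGuard` are hypothetical names invented for illustration, not Clang or LLVM types, and the sketch assumes (per the description above) that the OpenCL stub carries the attr. How the non-stub OpenCL kernel gets its calling convention happens outside this hunk and is not modeled.

```cpp
// Hypothetical stand-ins; none of these are the real Clang/LLVM classes.
enum class CallingConv { Default, AMDGPUKernel };

struct FunctionModel {
  bool HasDeviceKernelAttr; // models FD->hasAttr<DeviceKernelAttr>()
  bool LangIsOpenCL;        // models M.getLangOpts().OpenCL
};

// Models the condition added in the hunk: the kernel CC is set here only
// when the attr is present and the language is not OpenCL.
constexpr CallingConv ccWithGuard(FunctionModel F) {
  return (F.HasDeviceKernelAttr && !F.LangIsOpenCL) ? CallingConv::AMDGPUKernel
                                                    : CallingConv::Default;
}

// Models the same code with the OpenCL check removed.
constexpr CallingConv ccWithoutGuard(FunctionModel F) {
  return F.HasDeviceKernelAttr ? CallingConv::AMDGPUKernel
                               : CallingConv::Default;
}

// Assumption: the OpenCL stub function still reports the attr.
constexpr FunctionModel OpenCLStub{true, true};
// A kernel with the attr in a non-OpenCL language mode.
constexpr FunctionModel NonOpenCLKernel{true, false};

static_assert(ccWithGuard(OpenCLStub) == CallingConv::Default,
              "with the guard, this hunk leaves the OpenCL stub alone");
static_assert(ccWithoutGuard(OpenCLStub) == CallingConv::AMDGPUKernel,
              "without the guard, the stub is marked as a kernel");
static_assert(ccWithGuard(NonOpenCLKernel) == CallingConv::AMDGPUKernel,
              "non-OpenCL kernels get the kernel CC from this hunk");
```

In this model, dropping the OpenCL check is what flips the stub to the kernel calling convention, which matches the unexpected `amdgpu_kernel` on `__clang_ocl_kern_imp_test_printf_noargs` in the failing test.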
https://github.com/llvm/llvm-project/pull/161905