[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)
John McCall via cfe-commits
cfe-commits at lists.llvm.org
Mon Dec 2 00:04:36 PST 2024
================
@@ -5692,7 +5692,10 @@ CGCallee CodeGenFunction::EmitCallee(const Expr *E) {
   // Resolve direct calls.
   } else if (auto DRE = dyn_cast<DeclRefExpr>(E)) {
     if (auto FD = dyn_cast<FunctionDecl>(DRE->getDecl())) {
-      return EmitDirectCallee(*this, FD);
+      auto CalleeDecl = FD->hasAttr<OpenCLKernelAttr>()
+                            ? GlobalDecl(FD, KernelReferenceKind::Stub)
+                            : FD;
+      return EmitDirectCallee(*this, CalleeDecl);
----------------
rjmccall wrote:
Hmm. It looks like the CUDA folks had this same problem and came up with an awkward workaround for it in `EmitDirectCallee`. We should really just be requesting the right `GlobalDecl` (GD) in the first place. Could you add a `getGlobalDeclForDirectCall` function that does the right thing for both modes? If it ends up causing complicated behavior or test changes in CUDA mode, feel free to exclude CUDA for now and just leave a comment saying that the workaround should be removed in favor of doing the right thing in that function.
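
For illustration only, the shape of the suggested helper could be sketched as below. This is a self-contained mock and not Clang code: `FunctionDecl`, `GlobalDecl`, and `KernelReferenceKind` here are simplified stand-ins for the real classes, the `IsOpenCLKernel` flag is a hypothetical substitute for `FD->hasAttr<OpenCLKernelAttr>()`, and the body of `getGlobalDeclForDirectCall` is an assumption about what "the right thing" means for the OpenCL case only. CUDA is deliberately left out, as the comment allows.

```cpp
// Simplified stand-ins for the Clang types named in the review.
// These are NOT the real clang::FunctionDecl / clang::GlobalDecl.
enum class KernelReferenceKind { Kernel, Stub };

struct FunctionDecl {
  // Hypothetical flag standing in for FD->hasAttr<OpenCLKernelAttr>().
  bool IsOpenCLKernel = false;
};

struct GlobalDecl {
  const FunctionDecl *FD;
  // Assumption: ordinary functions carry the default (Kernel) kind.
  KernelReferenceKind Kind;
};

// Sketch of the helper requested in the review: pick the GlobalDecl for a
// direct call once, up front, so the caller does not patch it afterwards.
GlobalDecl getGlobalDeclForDirectCall(const FunctionDecl *FD) {
  // A direct call to an OpenCL kernel should target the stub, matching
  // the ternary in the diff above.
  if (FD->IsOpenCLKernel)
    return GlobalDecl{FD, KernelReferenceKind::Stub};
  // TODO (per the review): CUDA is excluded for now; the existing
  // workaround in EmitDirectCallee would move here eventually.
  return GlobalDecl{FD, KernelReferenceKind::Kernel};
}
```

With a helper of this shape, the call site in the diff would reduce to a single `return EmitDirectCallee(*this, getGlobalDeclForDirectCall(FD));`, assuming `EmitDirectCallee` accepts a `GlobalDecl` as it does in the patch.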
https://github.com/llvm/llvm-project/pull/115821