[clang] [Clang][OpenCL][AMDGPU] Allow a kernel to call another kernel (PR #115821)

John McCall via cfe-commits cfe-commits at lists.llvm.org
Mon Dec 2 00:04:36 PST 2024


================
@@ -5692,7 +5692,10 @@ CGCallee CodeGenFunction::EmitCallee(const Expr *E) {
   // Resolve direct calls.
   } else if (auto DRE = dyn_cast<DeclRefExpr>(E)) {
     if (auto FD = dyn_cast<FunctionDecl>(DRE->getDecl())) {
-      return EmitDirectCallee(*this, FD);
+      auto CalleeDecl = FD->hasAttr<OpenCLKernelAttr>()
+                            ? GlobalDecl(FD, KernelReferenceKind::Stub)
+                            : FD;
+      return EmitDirectCallee(*this, CalleeDecl);
----------------
rjmccall wrote:

Hmm. It looks like the CUDA folks had this same problem and came up with an awkward workaround for it in `EmitDirectCallee`. We should really just be requesting the right GD in the first place. Could you add a `getGlobalDeclForDirectCall` function that does the right thing for both modes? If it ends up causing complicated behavior or test changes in CUDA mode, feel free to exclude CUDA for now and just leave a comment saying that the workaround should be removed in favor of doing the right thing in that function.
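
For illustration only, here is a rough sketch of the shape such a helper might take. It is not Clang code: `FunctionDecl`, `GlobalDecl`, and `KernelReferenceKind` below are simplified stand-ins so the snippet compiles on its own, the `IsOpenCLKernel` flag is a hypothetical substitute for `FD->hasAttr<OpenCLKernelAttr>()`, and the signature of `getGlobalDeclForDirectCall` is an assumption, since the review only proposes the name.

```cpp
#include <cassert>

// Simplified stand-ins for illustration; these are NOT Clang's real classes.
enum class KernelReferenceKind { Kernel, Stub };

struct FunctionDecl {
  // Hypothetical flag modelling FD->hasAttr<OpenCLKernelAttr>().
  bool IsOpenCLKernel = false;
};

struct GlobalDecl {
  const FunctionDecl *FD;
  // Only meaningful for kernels; the mock uses Kernel as a placeholder
  // default for ordinary functions.
  KernelReferenceKind Kind;
};

// Assumed signature for the helper proposed in the review: choose the
// GlobalDecl for a direct call up front, so that the callee-emission code
// does not have to patch it up afterwards.
GlobalDecl getGlobalDeclForDirectCall(const FunctionDecl *FD) {
  assert(FD && "a direct call needs a callee declaration");

  // A direct call to an OpenCL kernel should reference the stub, which
  // mirrors the ternary in the diff above.
  if (FD->IsOpenCLKernel)
    return GlobalDecl{FD, KernelReferenceKind::Stub};

  // CUDA is deliberately not handled in this sketch. Per the review, the
  // existing workaround in EmitDirectCallee would eventually be removed in
  // favor of handling it here.
  return GlobalDecl{FD, KernelReferenceKind::Kernel};
}
```

With a helper along these lines, the call site in `EmitCallee` would pass the helper's result to `EmitDirectCallee` instead of spelling out the kernel check inline.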

https://github.com/llvm/llvm-project/pull/115821

