[PATCH] D57488: [CUDA] add support for the new kernel launch API in CUDA-9.2+.
Artem Belevich via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Thu Jan 31 10:37:58 PST 2019
tra added inline comments.
================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:239
+ CGM.Error(CGF.CurFuncDecl->getLocation(),
+ "Can't find declaration for cudaLaunchKernel()"); // FIXME.
+ return;
----------------
jlebar wrote:
> Unfixed FIXME?
Fixed the comment. :-)
There's not much we can do without a declaration for cudaLaunchKernel, so reporting the error here is the best option.
================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:260
+ /*isVarArg=*/false),
+ "__cudaPopCallConfiguration");
+
----------------
jlebar wrote:
> I see lots of references to `__cudaPushCallConfiguration`, but this is the only reference I see to `__cudaPopCallConfiguration`. Is this a typo? Also are we supposed to emit matching push and pop function calls? Kind of weird to do one without the other...
The `pop` part is indeed used only here.
`Push` is something that takes user-specified parameters, so we get Sema to check them.
`Pop` is much simpler and does not have any direct user exposure, so we can just create and use it here.
As for matching, it is balanced. `Push` is called at the kernel launch site with the parameters of `<<<>>>`. `Pop` is done in the host-side kernel stub, where we retrieve those parameters and pass them to the CUDA runtime.
Essentially, push/pop are poor names for these functions, as the nesting is never more than one level deep. We could've just stashed the arguments in a fixed buffer somewhere.
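To illustrate the balance described above, here is a minimal host-only C++ sketch of the "fixed buffer" idea. It is a model, not the CUDA runtime's implementation or the code clang emits: every name in it (`Dim3`, `LaunchConfig`, `pushCallConfiguration`, `popCallConfiguration`, `fakeLaunchKernel`, `kernelStub`, `launchSite`) is a hypothetical stand-in, and the single thread-local slot is an assumption made only to show why one level of storage would be enough.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for illustration only; these are NOT the CUDA
// runtime's real types or entry points.
struct Dim3 { unsigned x, y, z; };

struct LaunchConfig {
  Dim3 grid;
  Dim3 block;
  std::size_t sharedMem;
  void *stream;
};

// Single-slot stash (assumed design): sufficient because a push is always
// consumed by exactly one pop before the next push happens.
static thread_local LaunchConfig PendingConfig;
static thread_local bool HasPending = false;

// Plays the role of "push" at the launch site. Returns 0 on success.
int pushCallConfiguration(Dim3 grid, Dim3 block, std::size_t sharedMem,
                          void *stream) {
  if (HasPending)
    return 1; // an earlier configuration was never consumed
  PendingConfig = LaunchConfig{grid, block, sharedMem, stream};
  HasPending = true;
  return 0;
}

// Plays the role of "pop" inside the host-side stub. Returns 0 on success.
int popCallConfiguration(LaunchConfig &out) {
  if (!HasPending)
    return 1; // nothing was pushed
  out = PendingConfig;
  HasPending = false;
  return 0;
}

// Records what the stub forwarded, standing in for the real launch call.
static LaunchConfig LastLaunch;
static int LaunchCount = 0;

void fakeLaunchKernel(const LaunchConfig &cfg) {
  LastLaunch = cfg;
  ++LaunchCount;
}

// Host-side kernel stub: retrieve the stashed parameters and forward them.
void kernelStub() {
  LaunchConfig cfg;
  if (popCallConfiguration(cfg) != 0)
    return; // no configuration available, so nothing to launch
  fakeLaunchKernel(cfg);
}

// Roughly what a launch site does in this model: push, then call the stub.
void launchSite(Dim3 grid, Dim3 block) {
  if (pushCallConfiguration(grid, block, 0, nullptr) == 0)
    kernelStub();
}
```

In this model, calling `launchSite` once leaves the slot empty again, which is the sense in which the push at the launch site and the pop in the stub are balanced.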
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D57488/new/
https://reviews.llvm.org/D57488