[PATCH] D57488: [CUDA] add support for the new kernel launch API in CUDA-9.2+.
Justin Lebar via Phabricator via cfe-commits
cfe-commits at lists.llvm.org
Wed Jan 30 17:07:08 PST 2019
jlebar accepted this revision.
jlebar added a comment.
This revision is now accepted and ready to land.
LGTM, mostly nits.
================
Comment at: clang/include/clang/Sema/Sema.h:10316
+ /// Returns the name of the launch configuration function.
+ std::string getCudaConfigureFuncName() const;
----------------
Could we be a little less vague, what exactly is the launch-configuration function? (Could be as simple as adding `e.g. cudaFooBar()`.)
================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:201
-void CGNVCUDARuntime::emitDeviceStubBody(CodeGenFunction &CGF,
- FunctionArgList &Args) {
+// CUDA 9.0+ uses new way to launch kernels. Parameters are packed in local
+// array and kernels are launched using cudaLaunchKernel().
----------------
nit `in a local array`
================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:212
+ VoidPtrTy, CharUnits::fromQuantity(16), "kernel_args",
+ llvm::ConstantInt::get(SizeTy, std::max(1UL, Args.size())));
+ // Store pointers to the arguments in a locally allocated launch_args.
----------------
Nit, s/`1UL`/`uint64{1}`/ or size_t, whatever this function takes. As-is we're baking in the assumption that unsigned long is the same as the type returned by Args.size(), which isn't necessarily true.
As an alternative, you could do `std::max<size_t>(1, Args.size())` or whatever the appropriate type is.
================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:239
+ CGM.Error(CGF.CurFuncDecl->getLocation(),
+ "Can't find declaration for cudaLaunchKernel()"); // FIXME.
+ return;
----------------
Unfixed FIXME?
================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:260
+ /*isVarArg=*/false),
+ "__cudaPopCallConfiguration");
+
----------------
I see lots of references to `__cudaPushCallConfiguration`, but this is the only reference I see to `__cudaPopCallConfiguration`. Is this a typo? Also are we supposed to emit matching push and pop function calls? Kind of weird to do one without the other...
================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:266
+ // Emit the call to cudaLaunch
+
+ llvm::Value *Kernel = CGF.Builder.CreatePointerCast(CGF.CurFn, VoidPtrTy);
----------------
Whitespace nit, maybe move this whitespace line before the comment?
================
Comment at: clang/lib/Headers/__clang_cuda_runtime_wrapper.h:429
+// CUDA runtime uses undocumented function to access kernel launch
+// configuration. We need to provide our own declaration for it here.
----------------
s/undocumented function/this undocumented function/?
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D57488/new/
https://reviews.llvm.org/D57488
More information about the cfe-commits
mailing list