[PATCH] D57488: [CUDA] add support for the new kernel launch API in CUDA-9.2+.

Justin Lebar via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Wed Jan 30 17:07:08 PST 2019


jlebar accepted this revision.
jlebar added a comment.
This revision is now accepted and ready to land.

LGTM, mostly nits.



================
Comment at: clang/include/clang/Sema/Sema.h:10316
 
+  /// Returns the name of the launch configuration function.
+  std::string getCudaConfigureFuncName() const;
----------------
Could we be a little less vague, what exactly is the launch-configuration function?  (Could be as simple as adding `e.g. cudaFooBar()`.)


================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:201
 
-void CGNVCUDARuntime::emitDeviceStubBody(CodeGenFunction &CGF,
-                                         FunctionArgList &Args) {
+// CUDA 9.0+ uses new way to launch kernels. Parameters are packed in local
+// array and kernels are launched using cudaLaunchKernel().
----------------
nit `in a local array`


================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:212
+      VoidPtrTy, CharUnits::fromQuantity(16), "kernel_args",
+      llvm::ConstantInt::get(SizeTy, std::max(1UL, Args.size())));
+  // Store pointers to the arguments in a locally allocated launch_args.
----------------
Nit, s/`1UL`/`uint64{1}`/ or size_t, whatever this function takes.  As-is we're baking in the assumption that unsigned long is the same as the type returned by Args.size(), which isn't necessarily true.

As an alternative, you could do `std::max<size_t>(1, Args.size())` or whatever the appropriate type is.


================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:239
+    CGM.Error(CGF.CurFuncDecl->getLocation(),
+              "Can't find declaration for cudaLaunchKernel()"); // FIXME.
+    return;
----------------
Unfixed FIXME?


================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:260
+                              /*isVarArg=*/false),
+      "__cudaPopCallConfiguration");
+
----------------
I see lots of references to `__cudaPushCallConfiguration`, but this is the only reference I see to `__cudaPopCallConfiguration`.  Is this a typo?  Also are we supposed to emit matching push and pop function calls?  Kind of weird to do one without the other...


================
Comment at: clang/lib/CodeGen/CGCUDANV.cpp:266
+  // Emit the call to cudaLaunch
+
+  llvm::Value *Kernel = CGF.Builder.CreatePointerCast(CGF.CurFn, VoidPtrTy);
----------------
Whitespace nit, maybe move this whitespace line before the comment?


================
Comment at: clang/lib/Headers/__clang_cuda_runtime_wrapper.h:429
 
+// CUDA runtime uses undocumented function to access kernel launch
+// configuration. We need to provide our own declaration for it here.
----------------
s/undocumented function/this undocumented function/?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D57488/new/

https://reviews.llvm.org/D57488





More information about the cfe-commits mailing list