[PATCH] End-to-end CUDA compilation.

Artem Belevich tra at google.com
Mon Apr 6 17:20:41 PDT 2015


I'm still working on the changes to address your comments, so "done" means "done but not submitted yet". I'll finish remaining bits and will update the patch tomorrow (Tue).


================
Comment at: include/clang/Driver/CC1Options.td:605
@@ -604,1 +604,3 @@
 
+def cuda_include_gpucode : Separate<["-"], "cuda-include-gpucode">,
+  HelpText<"Incorporate CUDA device-side code.">;
----------------
eliben wrote:
> Should we prefix all cuda-related flags with -f for consistency with the existing ones? Don't know if it makes sense given that the cl_ ones above (for example) have no -f, but at least the CUDA ones should be consistent among themselves
Just had a chat with chandlerc@ and echristo@ on the subject.
Consensus appears to be that options related to driver behavior should be --cuda-something[=value] and options passed down to cc1 -fcuda-something[=value]. I'll rename the options I've added accordingly.



================
Comment at: lib/CodeGen/CGCUDANV.cpp:38
@@ +37,3 @@
+  /// Convenience reference to LLVM Context
+  llvm::LLVMContext &VMContext;
+  /// Convenience reference to the current module
----------------
eliben wrote:
> s/VMContext/Context/
Done.

================
Comment at: lib/CodeGen/CGCUDANV.cpp:41
@@ -35,1 +40,3 @@
+  llvm::Module &TheModule;
+  llvm::SmallVector<llvm::GlobalVariable *, 16> FatbinHandles;
 
----------------
eliben wrote:
> Document FatbinHandles
Done.

================
Comment at: lib/CodeGen/CGCUDANV.cpp:88
@@ -56,1 +87,3 @@
+  VoidPtrPtrTy = VoidPtrTy->getPointerTo();
+
 }
----------------
eliben wrote:
> extra line
Removed.

================
Comment at: lib/CodeGen/CGCUDANV.cpp:241
@@ +240,3 @@
+    llvm::Constant *Values[] = {
+        llvm::ConstantInt::get(IntTy, 0x466243b1), // Fatbin wrapper magic.
+        llvm::ConstantInt::get(IntTy, 1),          // Fatbin version.
----------------
eliben wrote:
> Use a named constant for the magic number -- it will then document itself
> 
It would be an overkill IMO. There's nothing more informative I could add to the comment that it's a magic number. 

================
Comment at: lib/CodeGen/CGCUDANV.cpp:249
@@ +248,3 @@
+        ".cuda_fatbin_wrapper");
+    FatbinWrapper->setAlignment(8);
+
----------------
eliben wrote:
> Comment explaining why
That's what nvcc does. I don't know whether there's a good reason for it. Removing it does not seem to break loading of GPU binary, so I'll remove explicit alignment.

================
Comment at: lib/CodeGen/CGCUDARuntime.h:54
@@ +53,3 @@
+  /// kernel launch stub.
+  virtual void EmitDeviceStub(CodeGenFunction &CGF, FunctionArgList &Args);
+
----------------
eliben wrote:
> I'd move this to the implementation as well, along with EmittedKernels.
> 
> Just reading the documentation of this method makes little sense given that it lives in an abstract interface. The code will be easier to untangle if the interface stays completely functionality-free. At this time this won't even add code duplication since we just have a single implementation.
Done that already while I was moving EmittedKernels out. 

================
Comment at: lib/Driver/Driver.cpp:1235
@@ +1234,3 @@
+
+    std::unique_ptr<Action> HostAction(
+        new CudaHostAction(std::move(Current), CudaDeviceJobActions));
----------------
eliben wrote:
> you can just return new CudaHostAction... here, no?
Done.

================
Comment at: lib/Driver/Driver.cpp:1522
@@ -1405,1 +1521,3 @@
 
+#if DISABLED_FOR_NOW
+    // TODO: Cuda compilation has more than one input. Need to figure out how to
----------------
eliben wrote:
> Remove
That was not intended to be committed. Will fix shortly.

================
Comment at: lib/Driver/Tools.cpp:2591
@@ +2590,3 @@
+    // Cuda compilation mode may pass more than one file.
+    // Verify that all additional files were derived from the same source.
+    IsCuda = true;
----------------
eliben wrote:
> Can you explain a bit more why/what this means in the comment?
General assumption that compilation deals with a single source file.
When we're compiling CUDA, driver may generate additional build passes and we may end 
up with an action that has more than one action input. The check makes sure that all those inputs were results of  compilation of the same source file.

Hmm. That's another case where I need info about source file type. Let me see if I can add a function to dig that out from the action chain and then this loop will not be necessary as I can explicitly check whether we're compiling a CUDA file.

http://reviews.llvm.org/D8463

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the cfe-commits mailing list