[PATCH] End-to-end CUDA compilation.

Eli Bendersky eliben at google.com
Mon Apr 6 15:02:21 PDT 2015


A couple of replies to comments; will do another pass on the new revision


================
Comment at: include/clang/Driver/CC1Options.td:611
@@ -610,1 +610,3 @@
 
+def cuda_include_gpucode : Separate<["-"], "cuda-include-gpucode">,
+  HelpText<"Incorporate CUDA device-side code.">;
----------------
tra wrote:
> eliben wrote:
> > I'm wondering about the "gpucode" mnemonic :-) It's unusual and kinda ambiguous. What does gpucode mean here? PTX? Maybe PTX can be more explicit then?
> > 
> > PTX is probably not too specific since this flag begins with "cuda_" so it's already about the CUDA/PTX flow.
> > 
> > [this applies to other uses of "gpucode" too]
> It's actually an opaque blob. clang does not care what's in the file, as it just passes the bits to cudart, which passes it to the driver. The driver can digest PTX (which we pass in this case), but it will just as happily accept GPU code packed in fatbin or cubin formats. If/when we grow the ability to compile device-side to SASS, we would just do "-cuda-include-gpucode gpu-code-packed-in.cubin" and it should work with no other changes on the host side.
> 
> So, 'gpucode' was the best approximation I could come up with that would keep "GPU code in any shape or form as long as it's PTX/fatbin or cubin".
> 
> I'd be happy to change it. Suggestions?
I see - some generic mnemonic is needed, I agree (so PTX is not a good idea). But "--gpu-code" is an nvcc flag that means something completely different :-/ So "gpu code" here may still be confusing. Maybe "gpublob" or "gpuobject" or "gpubinary" or something like that. I can't think of a perfect solution right now.

I'll leave it to your discretion.

================
Comment at: include/clang/Frontend/CodeGenOptions.h:164
@@ +163,3 @@
+  /// List of CUDA GPU code blobs to incorporate
+  std::vector<std::string> CudaGpuCodeFiles;
+
----------------
tra wrote:
> eliben wrote:
> > s/Files/Blobs/ or "strings"? And as above, maybe PTX would be better than "GpuCode"
> It's a vector of strings containing names of files that contain GPU code blobs, whatever their format may be. I'll rename the variable to CudaGpuCodeFileNames and will update the comment to reflect that.
> 
> How about this?
>   /// A list of file names passed with -cuda-include-gpucode options to forward
>   /// to CUDA runtime back-end for incorporating them into host-side object
>   /// file.
>   std::vector<std::string> CudaGpuCodeFileNames;
> 
> 
Yeah, if this is for file names, it's a good idea to have "FileNames" in the name.

================
Comment at: lib/CodeGen/CGCUDARuntime.h:42
@@ -34,1 +41,3 @@
 
+  llvm::SmallVector<llvm::Function *, 16> EmittedKernels;
+  llvm::SmallVector<llvm::GlobalVariable *, 16> FatbinHandles;
----------------
tra wrote:
> eliben wrote:
> > It would really be great not to have data inside this abstract interface; is this necessary?
> > 
> > Note that "fatbin handles" sounds very NVIDIA CUDA runtime specific, though this interface is allegedly generic :)
> List of generated kernels is something that I expect to be useful for all subclasses of CUDARuntime. 
> That's why I've put EmittedKernels there, plus a non-virtual method EmitDeviceStub() to populate it.
> 
> FatbinHandles, on the other hand, is indeed cudart-specific. I've moved it into CGCUDANV.
I would still remove EmittedKernels for now; we only have a single CUDA runtime upstream at this time, so this feels redundant, and it makes the runtime interface / implementation barrier less clean than it should be. In the future, if/when new runtime implementations are added, we'll figure out the best way to factor out common code.

YAGNI, essentially :)

http://reviews.llvm.org/D8463

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/

More information about the cfe-commits mailing list