[PATCH] D14370: [doc] Compile CUDA with LLVM

Tue Nov 10 10:45:16 PST 2015

tra added inline comments.

================
Comment at: docs/CompileCudaWithLLVM.rst:83-85
@@ +82,5 @@
+  #include <cuda.h>
+  #include <cuda_runtime.h>
+  #include <helper_cuda.h> // for checkCudaErrors
+  #include <cuda_builtin_vars.h>
+
----------------
clang will implicitly -include cuda_runtime.h (nvcc does, too), so the source does not need to include it.

clang's cuda_runtime.h wrapper will include cuda_builtin_vars.h, so including it explicitly here is not necessary either.

helper_cuda.h comes from the CUDA samples. I would suggest adding a note that the CUDA samples need to be installed as well, because it's possible to have CUDA installed without them.
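
For illustration, a trimmed include block could look roughly like the sketch below. The kernel `scale` and the helper `scale_on_gpu` are made up for this example and are not taken from the patch. It assumes the CUDA samples are installed so that helper_cuda.h is found, and that the compiler pre-includes cuda_runtime.h as described above.

```cuda
// Sketch only. <cuda_runtime.h> and <cuda_builtin_vars.h> are intentionally
// not included: the compiler is expected to pre-include cuda_runtime.h, and
// clang's wrapper for it is expected to pull in cuda_builtin_vars.h.
#include <cuda.h>
#include <helper_cuda.h> // for checkCudaErrors; ships with the CUDA samples
#include <cstdio>

// Hypothetical kernel: shows that threadIdx is usable without an explicit
// include of cuda_builtin_vars.h.
__global__ void scale(float a, float *x) { x[threadIdx.x] *= a; }

// Hypothetical host helper: scales n floats in place on the device.
// n must not exceed the maximum number of threads per block.
void scale_on_gpu(float a, float *host, int n) {
  float *dev = nullptr; // nullptr needs -std=c++11
  const size_t bytes = n * sizeof(float);
  checkCudaErrors(cudaMalloc(&dev, bytes));
  checkCudaErrors(cudaMemcpy(dev, host, bytes, cudaMemcpyHostToDevice));
  scale<<<1, n>>>(a, dev);
  checkCudaErrors(cudaDeviceSynchronize());
  checkCudaErrors(cudaMemcpy(host, dev, bytes, cudaMemcpyDeviceToHost));
  checkCudaErrors(cudaFree(dev));
}

int main() {
  float data[4] = {1.0f, 2.0f, 3.0f, 4.0f};
  scale_on_gpu(2.0f, data, 4);
  for (int i = 0; i < 4; ++i)
    std::printf("%g\n", data[i]);
  return 0;
}
```

The only samples dependency left is helper_cuda.h, which is why the note about installing the samples still matters.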

================
Comment at: docs/CompileCudaWithLLVM.rst:129
@@ +128,3 @@
+
+  $ clang++ -o axpy -I<CUDA install path>/include -I<CUDA install path>/samples/common/inc -L<CUDA install path>/<lib64 or lib> axpy.cu -lcudart_static -lcuda -ldl -lrt -pthread
+  $ ./axpy
----------------
"-I<CUDA install path>/include" is unnecessary; clang would add it itself.

You also need to add -std=c++11 in order to use nullptr.

I've also found a weird issue with my patch: without optimizations, the kernel launch fails (silently in your example). For the time being, compile with -O2. I'll find and fix the problem ASAP.
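
To make a failed launch visible rather than silent, the example could check for errors right after the launch. The sketch below is one way to do that. `CHECK_CUDA`, `axpy_on_device` and the file name are hypothetical and not part of the patch or of CUDA. I have not verified that this catches the specific failure described above.

```cuda
// Sketch only. Build line adapted from the command quoted above, with the
// suggested flags (the samples include path is dropped because this file
// does not use helper_cuda.h):
//   clang++ -std=c++11 -O2 -o axpy_check -L<CUDA install path>/<lib64 or lib> \
//     axpy_check.cu -lcudart_static -lcuda -ldl -lrt -pthread
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: print the CUDA error string and exit if a runtime
// call did not return cudaSuccess.
#define CHECK_CUDA(call)                                          \
  do {                                                            \
    cudaError_t err_ = (call);                                    \
    if (err_ != cudaSuccess) {                                    \
      std::fprintf(stderr, "%s:%d: %s\n", __FILE__, __LINE__,     \
                   cudaGetErrorString(err_));                     \
      std::exit(EXIT_FAILURE);                                    \
    }                                                             \
  } while (0)

__global__ void axpy(float a, const float *x, float *y) {
  y[threadIdx.x] = a * x[threadIdx.x] + y[threadIdx.x];
}

// Computes y = a * x + y for n elements (n <= threads per block).
void axpy_on_device(float a, const float *x, float *y, int n) {
  float *dx = nullptr, *dy = nullptr;
  const size_t bytes = n * sizeof(float);
  CHECK_CUDA(cudaMalloc(&dx, bytes));
  CHECK_CUDA(cudaMalloc(&dy, bytes));
  CHECK_CUDA(cudaMemcpy(dx, x, bytes, cudaMemcpyHostToDevice));
  CHECK_CUDA(cudaMemcpy(dy, y, bytes, cudaMemcpyHostToDevice));
  axpy<<<1, n>>>(a, dx, dy);
  CHECK_CUDA(cudaGetLastError());      // errors reported at launch time
  CHECK_CUDA(cudaDeviceSynchronize()); // errors raised while the kernel ran
  CHECK_CUDA(cudaMemcpy(y, dy, bytes, cudaMemcpyDeviceToHost));
  CHECK_CUDA(cudaFree(dx));
  CHECK_CUDA(cudaFree(dy));
}

int main() {
  float x[4] = {1.0f, 2.0f, 3.0f, 4.0f};
  float y[4] = {10.0f, 20.0f, 30.0f, 40.0f};
  axpy_on_device(2.0f, x, y, 4);
  for (int i = 0; i < 4; ++i)
    std::printf("y[%d] = %g\n", i, y[i]);
  return 0;
}
```

A kernel launch returns no status itself, so cudaGetLastError() is queried immediately afterwards, and cudaDeviceSynchronize() reports anything that goes wrong while the kernel executes.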

http://reviews.llvm.org/D14370