[PATCH] D14370: [doc] Compile CUDA with LLVM
Artem Belevich via llvm-commits
llvm-commits at lists.llvm.org
Thu Nov 5 16:58:42 PST 2015
tra added inline comments.
================
Comment at: docs/CompileCudaWithLLVM.rst:207
@@ +206,3 @@
+
+ $ clang++ -I<CUDA install path>/include -I<CUDA install path>/samples/common/inc -L<CUDA install path>/<lib64 or lib> axpy.cc -lcudart_static -lcuda -ldl -lrt -pthread
+
----------------
jingyue wrote:
> tra wrote:
> > You can pass device-side PTX to the host's cc1 with "-fcuda-include-gpubinary axpy.ptx"
> > and clang will embed PTX into host object file and will generate code to register kernels so that kernels can be launched with <<<...>>> without any additional steps.
> >
> >
> Can you clarify how to do this? I tried using `-Xclang` to set the `-fcuda-include-gpubinary` flag, but got the following.
>
> ```
> $ clang++ -Xclang -fcuda-include-gpubinary -Xclang axpy.ptx axpy.cc -I$CUDA_ROOT/include -I$CUDA_ROOT/samples/common/inc -L$CUDA_ROOT/lib64 -lcudart_static -lcuda -ldl -lrt -pthread
> axpy.cc:39:3: error: use of undeclared identifier 'axpy'
> axpy<<<1, kDataLen>>>(a, device_x, device_y);
> ^
> axpy.cc:39:9: error: expected expression
> axpy<<<1, kDataLen>>>(a, device_x, device_y);
> ^
> axpy.cc:39:23: error: expected expression
> axpy<<<1, kDataLen>>>(a, device_x, device_y);
> ^
> axpy.cc:39:25: warning: expression result unused [-Wunused-value]
> axpy<<<1, kDataLen>>>(a, device_x, device_y);
> ^
> axpy.cc:39:28: warning: expression result unused [-Wunused-value]
> axpy<<<1, kDataLen>>>(a, device_x, device_y);
> ^~~~~~~~
> 2 warnings and 3 errors generated.
> ```
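The kernel must be present in axpy.cu during host compilation so that the compiler can generate the host-side stub for the kernel launch. That is why your command fails: the host-only axpy.cc no longer declares `axpy`. So -fcuda-include-gpubinary only works when the source is not split into separate host and device files.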
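
For illustration, an unsplit axpy.cu could look roughly like the sketch below. This is a minimal example, not the exact file from the doc: the kernel body, the value of kDataLen and the input data are assumptions, and error checking is omitted. The point is only that the `__global__` definition is visible in the same translation unit as the `<<<...>>>` launch.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// The kernel definition is visible during host compilation, so the compiler
// can emit the host-side stub that the <<<...>>> launch below refers to.
// (Kernel body is an assumption made for this sketch.)
__global__ void axpy(float a, float* x, float* y) {
  y[threadIdx.x] = a * x[threadIdx.x];
}

int main() {
  const int kDataLen = 4;  // assumed size, one thread per element
  float a = 2.0f;
  float host_x[kDataLen] = {1.0f, 2.0f, 3.0f, 4.0f};
  float host_y[kDataLen];

  // Allocate device buffers and copy the input over.
  float* device_x;
  float* device_y;
  cudaMalloc(&device_x, kDataLen * sizeof(float));
  cudaMalloc(&device_y, kDataLen * sizeof(float));
  cudaMemcpy(device_x, host_x, kDataLen * sizeof(float),
             cudaMemcpyHostToDevice);

  // Same launch as in the failing command; it compiles here because
  // 'axpy' is declared in this file.
  axpy<<<1, kDataLen>>>(a, device_x, device_y);

  // Wait for the kernel, then copy the result back and print it.
  cudaDeviceSynchronize();
  cudaMemcpy(host_y, device_y, kDataLen * sizeof(float),
             cudaMemcpyDeviceToHost);
  for (int i = 0; i < kDataLen; ++i) {
    printf("y[%d] = %f\n", i, host_y[i]);
  }

  cudaFree(device_x);
  cudaFree(device_y);
  return 0;
}
```

If the `__global__` definition is moved out into a separate device-only file, the host compilation hits the same "undeclared identifier" error shown in your log.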
http://reviews.llvm.org/D14370