[PATCH] D14370: [doc] Compile CUDA with LLVM

Fri Nov 6 10:06:16 PST 2015

I'll let you do that after this patch. You know those options much better
than I do.

On Fri, Nov 6, 2015 at 9:59 AM, Artem Belevich <tra at google.com> wrote:

> tra added inline comments.
>
> ================
> Comment at: docs/CompileCudaWithLLVM.rst:207
> @@ +206,3 @@
> +
> +     $ clang++ -I<CUDA install path>/include -I<CUDA install path>/samples/common/inc -L<CUDA install path>/<lib64 or lib> axpy.cc -lcudart_static -lcuda -ldl -lrt -pthread
> +
> ----------------
> jingyue wrote:
> > tra wrote:
> > > jingyue wrote:
> > > > tra wrote:
> > > > > You can pass device-side PTX to the host's cc1 with
> > > > > "-fcuda-include-gpubinary axpy.ptx", and clang will embed the PTX
> > > > > into the host object file and generate code to register the kernels,
> > > > > so that they can be launched with <<<...>>> without any additional
> > > > > steps.
> > > > >
> > > > >
> > > > Can you clarify how to do this? I tried using `-Xclang` to set the
> > > > `-fcuda-include-gpubinary` flag, but got the following.
> > > >
> > > > ```
> > > > $ clang++ -Xclang -fcuda-include-gpubinary -Xclang axpy.ptx axpy.cc -I$CUDA_ROOT/include -I$CUDA_ROOT/samples/common/inc -L$CUDA_ROOT/lib64 -lcudart_static -lcuda -ldl -lrt -pthread
> > > > axpy.cc:39:3: error: use of undeclared identifier 'axpy'
> > > >   axpy<<<1, kDataLen>>>(a, device_x, device_y);
> > > >   ^
> > > > axpy.cc:39:9: error: expected expression
> > > >   axpy<<<1, kDataLen>>>(a, device_x, device_y);
> > > >         ^
> > > > axpy.cc:39:23: error: expected expression
> > > >   axpy<<<1, kDataLen>>>(a, device_x, device_y);
> > > >                       ^
> > > > axpy.cc:39:25: warning: expression result unused [-Wunused-value]
> > > >   axpy<<<1, kDataLen>>>(a, device_x, device_y);
> > > >                         ^
> > > > axpy.cc:39:28: warning: expression result unused [-Wunused-value]
> > > >   axpy<<<1, kDataLen>>>(a, device_x, device_y);
> > > >                            ^~~~~~~~
> > > > 2 warnings and 3 errors generated.
> > > > ```
> > > The kernel must be present in axpy.cu during host compilation so that the
> > > compiler can generate the host-side stub for the kernel launch, so it only
> > > works without splitting.
> > Still have issues with that.
> >
> > However, I managed to apply your three pending patches, and the patched
> > version works great! So, I think it makes more sense for this document to
> > ask early adopters to apply the patches and try the more functional patched
> > version. Agree?
> Sure. The patches simplify a large portion of this section down to
>
> ```
> clang++ -o axpy [...] axpy.cu
>
> ```
> I'll need to add details on the various CUDA-related options I've added to
> clang.
> Do you want to incorporate them into this patch, or should I do that after
> you've committed the docs?
>
>
> http://reviews.llvm.org/D14370