[llvm] r259307 - [doc] improve the doc for CUDA

Tue Feb 2 15:52:34 PST 2016

----- Original Message -----
> From: "Jingyue Wu" <jingyue at google.com>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: "llvm-commits" <llvm-commits at lists.llvm.org>
> Sent: Tuesday, February 2, 2016 5:39:36 PM
> Subject: Re: [llvm] r259307 - [doc] improve the doc for CUDA
> 
> 
> I thought the website would pick it up automatically. How can I push
> that to the website?

It should. Maybe something is broken now? cc'ing Tanya; she might know.

 -Hal

> 
> 
> On Tue, Feb 2, 2016 at 3:01 PM, Hal Finkel < hfinkel at anl.gov > wrote:
> 
> 
> Hi Jingyue,
> 
> Thanks for updating this! FWIW, however, these changes don't yet seem
> to be reflected on the web site
> (http://llvm.org/docs/CompileCudaWithLLVM.html).
> 
> -Hal
> 
> 
> 
> ----- Original Message -----
> > From: "Jingyue Wu via llvm-commits" < llvm-commits at lists.llvm.org >
> > To: llvm-commits at lists.llvm.org
> > Sent: Saturday, January 30, 2016 5:48:47 PM
> > Subject: [llvm] r259307 - [doc] improve the doc for CUDA
> > 
> > Author: jingyue
> > Date: Sat Jan 30 17:48:47 2016
> > New Revision: 259307
> > 
> > URL: http://llvm.org/viewvc/llvm-project?rev=259307&view=rev
> > Log:
> > [doc] improve the doc for CUDA
> > 
> > 1. Mentioned that CUDA support works best with trunk.
> > 2. Simplified the example by removing its dependency on the CUDA
> > samples.
> > 3. Explain the --cuda-gpu-arch flag.
> > 
> > Modified:
> > llvm/trunk/docs/CompileCudaWithLLVM.rst
> > 
> > Modified: llvm/trunk/docs/CompileCudaWithLLVM.rst
> > URL: http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CompileCudaWithLLVM.rst?rev=259307&r1=259306&r2=259307&view=diff
> > ==============================================================================
> > --- llvm/trunk/docs/CompileCudaWithLLVM.rst (original)
> > +++ llvm/trunk/docs/CompileCudaWithLLVM.rst Sat Jan 30 17:48:47 2016
> > @@ -18,9 +18,11 @@ familiarity with CUDA. Information about
> > How to Build LLVM with CUDA Support
> > ===================================
> > 
> > -Below is a quick summary of downloading and building LLVM. Consult the `Getting
> > -Started <http://llvm.org/docs/GettingStarted.html>`_ page for more details on
> > -setting up LLVM.
> > +CUDA support is still in development and works the best in the trunk version
> > +of LLVM. Below is a quick summary of downloading and building the trunk
> > +version. Consult the `Getting Started
> > +<http://llvm.org/docs/GettingStarted.html>`_ page for more details on setting
> > +up LLVM.
> > 
> > #. Checkout LLVM
> > 
> > @@ -60,8 +62,6 @@ which multiplies a ``float`` array by a
> > 
> > .. code-block:: c++
> > 
> > - #include <helper_cuda.h> // for checkCudaErrors
> > -
> > #include <iostream>
> > 
> > __global__ void axpy(float a, float* x, float* y) {
> > @@ -78,25 +78,25 @@ which multiplies a ``float`` array by a
> > // Copy input data to device.
> > float* device_x;
> > float* device_y;
> > - checkCudaErrors(cudaMalloc(&device_x, kDataLen * sizeof(float)));
> > - checkCudaErrors(cudaMalloc(&device_y, kDataLen * sizeof(float)));
> > - checkCudaErrors(cudaMemcpy(device_x, host_x, kDataLen * sizeof(float),
> > - cudaMemcpyHostToDevice));
> > + cudaMalloc(&device_x, kDataLen * sizeof(float));
> > + cudaMalloc(&device_y, kDataLen * sizeof(float));
> > + cudaMemcpy(device_x, host_x, kDataLen * sizeof(float),
> > + cudaMemcpyHostToDevice);
> > 
> > // Launch the kernel.
> > axpy<<<1, kDataLen>>>(a, device_x, device_y);
> > 
> > // Copy output data to host.
> > - checkCudaErrors(cudaDeviceSynchronize());
> > - checkCudaErrors(cudaMemcpy(host_y, device_y, kDataLen * sizeof(float),
> > - cudaMemcpyDeviceToHost));
> > + cudaDeviceSynchronize();
> > + cudaMemcpy(host_y, device_y, kDataLen * sizeof(float),
> > + cudaMemcpyDeviceToHost);
> > 
> > // Print the results.
> > for (int i = 0; i < kDataLen; ++i) {
> > std::cout << "y[" << i << "] = " << host_y[i] << "\n";
> > }
> > 
> > - checkCudaErrors(cudaDeviceReset());
> > + cudaDeviceReset();
> > return 0;
> > }
> > 
> > @@ -104,16 +104,20 @@ The command line for compilation is simi
> > 
> > .. code-block:: console
> > 
> > - $ clang++ -o axpy -I<CUDA install path>/samples/common/inc -L<CUDA install path>/<lib64 or lib> axpy.cu -lcudart_static -lcuda -ldl -lrt -pthread
> > + $ clang++ axpy.cu -o axpy --cuda-gpu-arch=<GPU arch> \
> > + -L<CUDA install path>/<lib64 or lib> \
> > + -lcudart_static -ldl -lrt -pthread
> > $ ./axpy
> > y[0] = 2
> > y[1] = 4
> > y[2] = 6
> > y[3] = 8
> > 
> > -Note that ``helper_cuda.h`` comes from the CUDA samples, so you need the
> > -samples installed for this example. ``<CUDA install path>`` is the root
> > -directory where you installed CUDA SDK, typically ``/usr/local/cuda``.
> > +``<CUDA install path>`` is the root directory where you installed CUDA SDK,
> > +typically ``/usr/local/cuda``. ``<GPU arch>`` is `the compute capability of
> > +your GPU <https://developer.nvidia.com/cuda-gpus>`_. For example, if you want
> > +to run your program on a GPU with compute capability of 3.5, you should specify
> > +``--cuda-gpu-arch=sm_35``.
> > 
> > Optimizations
> > =============
> > 
> > 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory