[llvm] r259307 - [doc] improve the doc for CUDA

Tanya Lattner via llvm-commits llvm-commits at lists.llvm.org
Fri Feb 5 23:40:28 PST 2016


> On Feb 2, 2016, at 3:52 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> ----- Original Message -----
>> From: "Jingyue Wu" <jingyue at google.com>
>> To: "Hal Finkel" <hfinkel at anl.gov>
>> Cc: "llvm-commits" <llvm-commits at lists.llvm.org>
>> Sent: Tuesday, February 2, 2016 5:39:36 PM
>> Subject: Re: [llvm] r259307 - [doc] improve the doc for CUDA
>> 
>> 
>> I thought the website would pick it up automatically. How can I push
>> that to the website?
> 
> It should. Maybe something is broken now? cc'ing Tanya; she might know.
> 

This should now be working again. Please confirm you are seeing the correct html documents.

-Tanya

> -Hal
> 
>> 
>> 
>> On Tue, Feb 2, 2016 at 3:01 PM, Hal Finkel < hfinkel at anl.gov > wrote:
>> 
>> 
>> Hi Jingyue,
>> 
>> Thanks for updating this! FWIW, however, these changes don't yet seem
>> to be reflected on the web site
>> (http://llvm.org/docs/CompileCudaWithLLVM.html).
>> 
>> -Hal
>> 
>> 
>> 
>> ----- Original Message -----
>>> From: "Jingyue Wu via llvm-commits" < llvm-commits at lists.llvm.org >
>>> To: llvm-commits at lists.llvm.org
>>> Sent: Saturday, January 30, 2016 5:48:47 PM
>>> Subject: [llvm] r259307 - [doc] improve the doc for CUDA
>>> 
>>> Author: jingyue
>>> Date: Sat Jan 30 17:48:47 2016
>>> New Revision: 259307
>>> 
>>> URL: http://llvm.org/viewvc/llvm-project?rev=259307&view=rev
>>> Log:
>>> [doc] improve the doc for CUDA
>>> 
>>> 1. Mentioned that CUDA support works best with trunk.
>>> 2. Simplified the example by removing its dependency on the CUDA
>>> samples.
>>> 3. Explain the --cuda-gpu-arch flag.
>>> 
>>> Modified:
>>> llvm/trunk/docs/CompileCudaWithLLVM.rst
>>> 
>>> Modified: llvm/trunk/docs/CompileCudaWithLLVM.rst
>>> URL:
>>> http://llvm.org/viewvc/llvm-project/llvm/trunk/docs/CompileCudaWithLLVM.rst?rev=259307&r1=259306&r2=259307&view=diff
>>> ==============================================================================
>>> --- llvm/trunk/docs/CompileCudaWithLLVM.rst (original)
>>> +++ llvm/trunk/docs/CompileCudaWithLLVM.rst Sat Jan 30 17:48:47
>>> 2016
>>> @@ -18,9 +18,11 @@ familiarity with CUDA. Information about
>>> How to Build LLVM with CUDA Support
>>> ===================================
>>> 
>>> -Below is a quick summary of downloading and building LLVM. Consult the `Getting
>>> -Started <http://llvm.org/docs/GettingStarted.html>`_ page for more details on
>>> -setting up LLVM.
>>> +CUDA support is still in development and works the best in the trunk version
>>> +of LLVM. Below is a quick summary of downloading and building the trunk
>>> +version. Consult the `Getting Started
>>> +<http://llvm.org/docs/GettingStarted.html>`_ page for more details on setting
>>> +up LLVM.
>>> 
>>> #. Checkout LLVM
>>> 
>>> @@ -60,8 +62,6 @@ which multiplies a ``float`` array by a
>>> 
>>> .. code-block:: c++
>>> 
>>> - #include <helper_cuda.h> // for checkCudaErrors
>>> -
>>> #include <iostream>
>>> 
>>> __global__ void axpy(float a, float* x, float* y) {
>>> @@ -78,25 +78,25 @@ which multiplies a ``float`` array by a
>>> // Copy input data to device.
>>> float* device_x;
>>> float* device_y;
>>> -  checkCudaErrors(cudaMalloc(&device_x, kDataLen * sizeof(float)));
>>> -  checkCudaErrors(cudaMalloc(&device_y, kDataLen * sizeof(float)));
>>> -  checkCudaErrors(cudaMemcpy(device_x, host_x, kDataLen * sizeof(float),
>>> -                             cudaMemcpyHostToDevice));
>>> +  cudaMalloc(&device_x, kDataLen * sizeof(float));
>>> +  cudaMalloc(&device_y, kDataLen * sizeof(float));
>>> +  cudaMemcpy(device_x, host_x, kDataLen * sizeof(float),
>>> +             cudaMemcpyHostToDevice);
>>> 
>>> // Launch the kernel.
>>> axpy<<<1, kDataLen>>>(a, device_x, device_y);
>>> 
>>> // Copy output data to host.
>>> -  checkCudaErrors(cudaDeviceSynchronize());
>>> -  checkCudaErrors(cudaMemcpy(host_y, device_y, kDataLen * sizeof(float),
>>> -                             cudaMemcpyDeviceToHost));
>>> +  cudaDeviceSynchronize();
>>> +  cudaMemcpy(host_y, device_y, kDataLen * sizeof(float),
>>> +             cudaMemcpyDeviceToHost);
>>> 
>>> // Print the results.
>>> for (int i = 0; i < kDataLen; ++i) {
>>> std::cout << "y[" << i << "] = " << host_y[i] << "\n";
>>> }
>>> 
>>> - checkCudaErrors(cudaDeviceReset());
>>> + cudaDeviceReset();
>>> return 0;
>>> }
>>> 
>>> @@ -104,16 +104,20 @@ The command line for compilation is simi
>>> 
>>> .. code-block:: console
>>> 
>>> -  $ clang++ -o axpy -I<CUDA install path>/samples/common/inc -L<CUDA install path>/<lib64 or lib> axpy.cu -lcudart_static -lcuda -ldl -lrt -pthread
>>> +  $ clang++ axpy.cu -o axpy --cuda-gpu-arch=<GPU arch> \
>>> +      -L<CUDA install path>/<lib64 or lib> \
>>> +      -lcudart_static -ldl -lrt -pthread
>>> $ ./axpy
>>> y[0] = 2
>>> y[1] = 4
>>> y[2] = 6
>>> y[3] = 8
>>> 
>>> -Note that ``helper_cuda.h`` comes from the CUDA samples, so you need the
>>> -samples installed for this example. ``<CUDA install path>`` is the root
>>> -directory where you installed CUDA SDK, typically ``/usr/local/cuda``.
>>> +``<CUDA install path>`` is the root directory where you installed CUDA SDK,
>>> +typically ``/usr/local/cuda``. ``<GPU arch>`` is `the compute capability of
>>> +your GPU <https://developer.nvidia.com/cuda-gpus>`_. For example, if you want
>>> +to run your program on a GPU with compute capability of 3.5, you should specify
>>> +``--cuda-gpu-arch=sm_35``.
>>> 
>>> Optimizations
>>> =============
>>> 
>>> 
>>> _______________________________________________
>>> llvm-commits mailing list
>>> llvm-commits at lists.llvm.org
>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>> 
>> 
>> --
>> Hal Finkel
>> Assistant Computational Scientist
>> Leadership Computing Facility
>> Argonne National Laboratory
>> 
>> 
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
