[libc-commits] [libc] [libc] Update GPU documentation pages (PR #84076)

Tue Mar 5 16:13:36 PST 2024

================
@@ -9,63 +8,107 @@ Using libc for GPUs
   :depth: 4
   :local:
 
-Building the GPU library
-========================
+Using the GPU C library
+=======================
 
-LLVM's libc GPU support *must* be built with an up-to-date ``clang`` compiler
-due to heavy reliance on ``clang``'s GPU support. This can be done automatically
-using the LLVM runtimes support. The GPU build is done using cross-compilation
-to the GPU architecture. This project currently supports AMD and NVIDIA GPUs
-which can be targeted using the appropriate target name. The following
-invocation will enable a cross-compiling build for the GPU architecture and
-enable the ``libc`` project only for them.
+Once you have finished :ref:`building<libc_gpu_building>` the GPU C library it
+can be used to run libc or libm functions directly on the GPU. Currently, not
+all C standard functions are supported on the GPU. Consult the :ref:`list of
+supported functions<libc_gpu_support>` for a comprehensive list.
 
-.. code-block:: sh
+The GPU C library supports two main usage modes. The first is as a supplementary
+library for offloading languages such as OpenMP, CUDA, or HIP. These aim to
+provide standard system utilities similarly to existing vendor libraries. The
+second method treats the GPU as a hosted target by compiling C or C++ for it
+directly. This is more similar to targeting OpenCL and is primarily used for
+testing.
+
+Offloading usage
+----------------
 
-  $> cd llvm-project  # The llvm-project checkout
-  $> mkdir build
-  $> cd build
-  $> cmake ../llvm -G Ninja                                               \
-     -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt"                       \
-     -DLLVM_ENABLE_RUNTIMES="openmp"                                      \
-     -DCMAKE_BUILD_TYPE=<Debug|Release>   \ # Select build type
-     -DCMAKE_INSTALL_PREFIX=<PATH>        \ # Where 'libcgpu.a' will live
-     -DRUNTIMES_nvptx64-nvidia-cuda_LLVM_ENABLE_RUNTIMES=libc             \
-     -DRUNTIMES_amdgcn-amd-amdhsa_LLVM_ENABLE_RUNTIMES=libc               \
-     -DLLVM_RUNTIME_TARGETS=default;amdgcn-amd-amdhsa;nvptx64-nvidia-cuda
-  $> ninja install
-
-Since we want to include ``clang``, ``lld`` and ``compiler-rt`` in our
-toolchain, we list them in ``LLVM_ENABLE_PROJECTS``. To ensure ``libc`` is built
-using a compatible compiler and to support ``openmp`` offloading, we list them
-in ``LLVM_ENABLE_RUNTIMES`` to build them after the enabled projects using the
-newly built compiler. ``CMAKE_INSTALL_PREFIX`` specifies the installation
-directory in which to install the ``libcgpu-nvptx.a`` and ``libcgpu-amdgpu.a``
-libraries and headers along with LLVM. The generated headers will be placed in
-``include/<gpu-triple>``.
-
-Usage
-=====
-
-Once the static archive has been built it can be linked directly
-with offloading applications as a standard library. This process is described in
+Offloading languages like CUDA, HIP, or OpenMP work by compiling a single source
+file for both the host target and a list of offloading devices. In order to
+support standard compilation flows, the ``clang`` driver uses fat binaries to
 the `clang documentation <https://clang.llvm.org/docs/OffloadingDesign.html>`_.
-This linking mode is used by the OpenMP toolchain, but is currently opt-in for
-the CUDA and HIP toolchains through the ``--offload-new-driver``` and
-``-fgpu-rdc`` flags. A typical usage will look this this:
+The This linking mode is used by the OpenMP toolchain, but is currently opt-in
+for the CUDA and HIP toolchains through the ``--offload-new-driver``` and
+``-fgpu-rdc`` flags.
+
+The installation should contain a static library called ``libcgpu-amdgpu.a`` or
+``libcgpu-nvptx.a`` depending on which GPU architectures your build targeted.
+These contain fat binaries compatible with the offloading toolchain such that
+they can be used directly.
 
 .. code-block:: sh
 
-  $> clang foo.c -fopenmp --offload-arch=gfx90a -lcgpu
+  $> clang opnemp.c -fopenmp --offload-arch=gfx90a -lcgpu-amdgpu
+  $> clang cuda.cu --offload-arch=sm_80 --offload-new-driver -fgpu-rdc -lcgpu-nvptx
+  $> clang hip.hip --offload-arch=gfx940 --offload-new-driver -fgpu-rdc -lcgpu-amdgpu
+
+This will automatically link in the needed function definitions if they were
+required used by the user's application. Normally using the ``-fgpu-rdc`` option
----------------
agozillon wrote:

can probably remove "used" from the sentence

https://github.com/llvm/llvm-project/pull/84076