[libc-commits] [libc] 807f058 - [libc][Docs] Begin improving documentation for the GPU libc
Joseph Huber via libc-commits
libc-commits at lists.llvm.org
Wed Apr 26 08:31:03 PDT 2023
Author: Joseph Huber
Date: 2023-04-26T10:30:54-05:00
New Revision: 807f0584874d61b0eec5a3ed988402387560534c
URL: https://github.com/llvm/llvm-project/commit/807f0584874d61b0eec5a3ed988402387560534c
DIFF: https://github.com/llvm/llvm-project/commit/807f0584874d61b0eec5a3ed988402387560534c.diff
LOG: [libc][Docs] Begin improving documentation for the GPU libc
This patch updates some of the documentation for the GPU libc project.
There is a lot of work still to be done, but this sets the general
outline.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D149194
Added:
libc/docs/gpu/index.rst
libc/docs/gpu/rpc.rst
libc/docs/gpu/support.rst
libc/docs/gpu/testing.rst
libc/docs/gpu/using.rst
Modified:
libc/docs/index.rst
Removed:
libc/docs/gpu_mode.rst
################################################################################
diff --git a/libc/docs/gpu/index.rst b/libc/docs/gpu/index.rst
new file mode 100644
index 0000000000000..0ea54a7235459
--- /dev/null
+++ b/libc/docs/gpu/index.rst
@@ -0,0 +1,18 @@
+.. _libc_gpu:
+
+=============
+libc for GPUs
+=============
+
+.. note:: This feature is very experimental and may change in the future.
+
+The *GPU* support for LLVM's libc project aims to make a subset of the standard
+C library available on GPU-based accelerators. Navigate using the links below to
+learn more about this project.
+
+.. toctree::
+
+ using
+ support
+ testing
+ rpc
diff --git a/libc/docs/gpu/rpc.rst b/libc/docs/gpu/rpc.rst
new file mode 100644
index 0000000000000..bdc2c4ac312cf
--- /dev/null
+++ b/libc/docs/gpu/rpc.rst
@@ -0,0 +1,17 @@
+.. _libc_gpu_rpc:
+
+======================
+Remote Procedure Calls
+======================
+
+.. contents:: Table of Contents
+ :depth: 4
+ :local:
+
+Remote Procedure Call Implementation
+====================================
+
+Certain features from the standard C library, such as allocation or printing,
+require support from the operating system, which is not available on the GPU.
+We instead implement a remote procedure call (RPC) interface that lets the GPU
+submit work to a host server, which forwards it to the host system.
diff --git a/libc/docs/gpu/support.rst b/libc/docs/gpu/support.rst
new file mode 100644
index 0000000000000..59fdb61966838
--- /dev/null
+++ b/libc/docs/gpu/support.rst
@@ -0,0 +1,88 @@
+.. _libc_gpu_support:
+
+===================
+Supported Functions
+===================
+
+.. include:: ../check.rst
+
+.. contents:: Table of Contents
+ :depth: 4
+ :local:
+
+The following functions and headers are supported at least partially on the
+device. Some functions are implemented fully on the GPU, while others require a
+:ref:`remote procedure call <libc_gpu_rpc>`.
+
+ctype.h
+-------
+
+============= ========= ============
+Function Name Available RPC Required
+============= ========= ============
+isalnum |check|
+isalpha |check|
+isascii |check|
+isblank |check|
+iscntrl |check|
+isdigit |check|
+isgraph |check|
+islower |check|
+isprint |check|
+ispunct |check|
+isspace |check|
+isupper |check|
+isxdigit |check|
+toascii |check|
+tolower |check|
+toupper |check|
+============= ========= ============
+
+string.h
+--------
+
+============= ========= ============
+Function Name Available RPC Required
+============= ========= ============
+bcmp |check|
+bzero |check|
+memccpy |check|
+memchr |check|
+memcmp |check|
+memcpy |check|
+memmove |check|
+mempcpy |check|
+memrchr |check|
+memset |check|
+stpcpy |check|
+stpncpy |check|
+strcat |check|
+strchr |check|
+strcmp |check|
+strcpy |check|
+strcspn |check|
+strlcat |check|
+strlcpy |check|
+strlen |check|
+strncat |check|
+strncmp |check|
+strncpy |check|
+strnlen |check|
+strpbrk |check|
+strrchr |check|
+strspn |check|
+strstr |check|
+strtok |check|
+strtok_r |check|
+strdup
+strndup
+============= ========= ============
+
+stdlib.h
+--------
+
+============= ========= ============
+Function Name Available RPC Required
+============= ========= ============
+atoi |check|
+============= ========= ============
diff --git a/libc/docs/gpu/testing.rst b/libc/docs/gpu/testing.rst
new file mode 100644
index 0000000000000..09e875aea1366
--- /dev/null
+++ b/libc/docs/gpu/testing.rst
@@ -0,0 +1,32 @@
+.. _libc_gpu_testing:
+
+
+============================
+Testing the GPU libc library
+============================
+
+.. contents:: Table of Contents
+ :depth: 4
+ :local:
+
+Testing Infrastructure
+======================
+
+The testing support in LLVM's libc implementation for GPUs is designed to mimic
+the standard unit tests as much as possible. We use the :ref:`remote procedure call
+<libc_gpu_rpc>` support to provide the necessary utilities like printing from
+the GPU. Execution is performed by emitting a ``_start`` kernel from the GPU
+that is then called by an external loader utility. This is an example of how
+this can be done manually:
+
+.. code-block:: sh
+
+ $> clang++ crt1.o test.cpp --target=amdgcn-amd-amdhsa -mcpu=gfx90a -flto
+ $> ./amdhsa_loader --threads 1 --blocks 1 a.out
+ Test Passed!
+
+Unlike the exported ``libcgpu.a``, the testing infrastructure can only support a
+single architecture at a time. This is either detected automatically, or set
+manually by the user using ``LIBC_GPU_TEST_ARCHITECTURE``. The latter is useful
+in cases where the user does not build LLVM's libc on the machine with the GPU
+used for testing.
diff --git a/libc/docs/gpu/using.rst b/libc/docs/gpu/using.rst
new file mode 100644
index 0000000000000..6808f05ad13b6
--- /dev/null
+++ b/libc/docs/gpu/using.rst
@@ -0,0 +1,87 @@
+.. _libc_gpu_usage:
+
+
+===================
+Using libc for GPUs
+===================
+
+.. contents:: Table of Contents
+ :depth: 4
+ :local:
+
+Building the GPU library
+========================
+
+LLVM's libc GPU support *must* be built with an up-to-date ``clang`` compiler
+due to heavy reliance on ``clang``'s GPU support. This can be done automatically
+using the ``LLVM_ENABLE_RUNTIMES=libc`` option. To enable libc for the GPU,
+enable the ``LIBC_GPU_BUILD`` option. By default, ``libcgpu.a`` will be built
+using every supported GPU architecture. To restrict the number of architectures
+built, either set ``LLVM_LIBC_GPU_ARCHITECTURES`` to the list of desired
+architectures manually or use ``native`` to detect the GPUs on your system. A
+typical ``cmake`` configuration will look like this:
+
+.. code-block:: sh
+
+ $> cd llvm-project # The llvm-project checkout
+ $> mkdir build
+ $> cd build
+ $> cmake ../llvm -G Ninja \
+ -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" \
+ -DLLVM_ENABLE_RUNTIMES="libc;openmp" \
+ -DCMAKE_BUILD_TYPE=<Debug|Release> \ # Select build type
+ -DLIBC_GPU_BUILD=ON \ # Build in GPU mode
+ -DLLVM_LIBC_GPU_ARCHITECTURES=all \ # Build all supported architectures
+ -DCMAKE_INSTALL_PREFIX=<PATH> \ # Where 'libcgpu.a' will live
+ $> ninja install
+
+Since we want to include ``clang``, ``lld`` and ``compiler-rt`` in our
+toolchain, we list them in ``LLVM_ENABLE_PROJECTS``. To ensure ``libc`` is built
+using a compatible compiler and to support ``openmp`` offloading, we list them
+in ``LLVM_ENABLE_RUNTIMES`` to build them after the enabled projects using the
+newly built compiler. ``CMAKE_INSTALL_PREFIX`` specifies the installation
+directory in which to install the ``libcgpu.a`` library and headers along with
+LLVM. The generated headers will be placed in ``include/gpu-none-llvm``.
+
+Usage
+=====
+
+Once the ``libcgpu.a`` static archive has been built, it can be linked directly
+with offloading applications as a standard library. This process is described in
+the `clang documentation <https://clang.llvm.org/docs/OffloadingDesign.html>`_.
+This linking mode is used by the OpenMP toolchain, but is currently opt-in for
+the CUDA and HIP toolchains through the ``--offload-new-driver`` and
+``-fgpu-rdc`` flags. A typical usage will look like this:
+
+.. code-block:: sh
+
+ $> clang foo.c -fopenmp --offload-arch=gfx90a -lcgpu
+
+The ``libcgpu.a`` static archive is a fat-binary containing LLVM-IR for each
+supported target device. The supported architectures can be seen using LLVM's
+``llvm-objdump`` with the ``--offloading`` flag:
+
+.. code-block:: sh
+
+ $> llvm-objdump --offloading libcgpu.a
+ libcgpu.a(strcmp.cpp.o): file format elf64-x86-64
+
+ OFFLOADING IMAGE [0]:
+ kind llvm ir
+ arch gfx90a
+ triple amdgcn-amd-amdhsa
+ producer none
+
+Because the device code is stored inside a fat binary, it can be difficult to
+inspect the resulting code. This can be done using the following utilities:
+
+.. code-block:: sh
+
+ $> llvm-ar x libcgpu.a strcmp.cpp.o
+ $> clang-offload-packager strcmp.cpp.o --image=arch=gfx90a,file=gfx90a.bc
+ $> opt -S gfx90a.bc
+ ...
+
+Please note that this fat binary format is provided for compatibility with
+existing offloading toolchains. The implementation in ``libc`` does not depend
+on any existing offloading languages and is completely freestanding.
diff --git a/libc/docs/gpu_mode.rst b/libc/docs/gpu_mode.rst
deleted file mode 100644
index b71b6eec5daee..0000000000000
--- a/libc/docs/gpu_mode.rst
+++ /dev/null
@@ -1,169 +0,0 @@
-.. _GPU_mode:
-
-==============
-GPU Mode
-==============
-
-.. include:: check.rst
-
-.. contents:: Table of Contents
- :depth: 4
- :local:
-
-.. note:: This feature is very experimental and may change in the future.
-
-The *GPU* mode of LLVM's libc is an experimental mode used to support calling
-libc routines during GPU execution. The goal of this project is to provide
-access to the standard C library on systems running accelerators. To begin using
-this library, build and install the ``libcgpu.a`` static archive following the
-instructions in :ref:`building_gpu_mode` and link with your offloading
-application.
-
-.. _building_gpu_mode:
-
-Building the GPU library
-========================
-
-LLVM's libc GPU support *must* be built using the same compiler as the final
-application to ensure relative LLVM bitcode compatibility. This can be done
-automatically using the ``LLVM_ENABLE_RUNTIMES=libc`` option. Furthermore,
-building for the GPU is only supported in :ref:`fullbuild_mode`. To enable the
-GPU build, set the target OS to ``gpu`` via ``LLVM_LIBC_TARGET_OS=gpu``. By
-default, ``libcgpu.a`` will be built using every supported GPU architecture. To
-restrict the number of architectures build, set ``LLVM_LIBC_GPU_ARCHITECTURES``
-to the list of desired architectures or use ``all``. A typical ``cmake``
-configuration will look like this:
-
-.. code-block:: sh
-
- $> cd llvm-project # The llvm-project checkout
- $> mkdir build
- $> cd build
- $> cmake ../llvm -G Ninja \
- -DLLVM_ENABLE_PROJECTS="clang;lld;compiler-rt" \
- -DLLVM_ENABLE_RUNTIMES="libc;openmp" \
- -DCMAKE_BUILD_TYPE=<Debug|Release> \ # Select build type
- -DLLVM_LIBC_FULL_BUILD=ON \ # We need the full libc
- -DLIBC_GPU_BUILD=ON \ # Build in GPU mode
- -DLLVM_LIBC_GPU_ARCHITECTURES=all \ # Build all supported architectures
- -DCMAKE_INSTALL_PREFIX=<PATH> \ # Where 'libcgpu.a' will live
- $> ninja install
-
-Since we want to include ``clang``, ``lld`` and ``compiler-rt`` in our
-toolchain, we list them in ``LLVM_ENABLE_PROJECTS``. To ensure ``libc`` is built
-using a compatible compiler and to support ``openmp`` offloading, we list them
-in ``LLVM_ENABLE_RUNTIMES`` to build them after the enabled projects using the
-newly built compiler. ``CMAKE_INSTALL_PREFIX`` specifies the installation
-directory in which to install the ``libcgpu.a`` library along with LLVM.
-
-Usage
-=====
-
-Once the ``libcgpu.a`` static archive has been built in
-:ref:`building_gpu_mode`, it can be linked directly with offloading applications
-as a standard library. This process is described in the `clang documentation
-<https://clang.llvm.org/docs/OffloadingDesign.html>_`. This linking mode is used
-by the OpenMP toolchain, but is currently opt-in for the CUDA and HIP toolchains
-using the ``--offload-new-driver``` and ``-fgpu-rdc`` flags. A typical usage
-will look this this:
-
-.. code-block:: sh
-
- $> clang foo.c -fopenmp --offload-arch=gfx90a -lcgpu
-
-The ``libcgpu.a`` static archive is a fat-binary containing LLVM-IR for each
-supported target device. The supported architectures can be seen using LLVM's
-objdump with the ``--offloading`` flag:
-
-.. code-block:: sh
-
- $> llvm-objdump --offloading libcgpu.a
- libcgpu.a(strcmp.cpp.o): file format elf64-x86-64
-
- OFFLOADING IMAGE [0]:
- kind llvm ir
- arch gfx90a
- triple amdgcn-amd-amdhsa
- producer <none>
-
-Because the device code is stored inside a fat binary, it can be difficult to
-inspect the resulting code. This can be done using the following utilities:
-
-.. code-block:: sh
-
- $> llvm-ar x libcgpu.a strcmp.cpp.o
- $> clang-offload-packager strcmp.cpp.o --image=arch=gfx90a,file=gfx90a.bc
- $> opt -S out.bc
- ...
-
-Supported Functions
-===================
-
-The following functions and headers are supported at least partially on the
-device. Currently, only basic device functions that do not require an operating
-system are supported on the device. Supporting functions like `malloc` using an
-RPC mechanism is a work-in-progress.
-
-ctype.h
--------
-
-============= =========
-Function Name Available
-============= =========
-isalnum |check|
-isalpha |check|
-isascii |check|
-isblank |check|
-iscntrl |check|
-isdigit |check|
-isgraph |check|
-islower |check|
-isprint |check|
-ispunct |check|
-isspace |check|
-isupper |check|
-isxdigit |check|
-toascii |check|
-tolower |check|
-toupper |check|
-============= =========
-
-string.h
---------
-
-============= =========
-Function Name Available
-============= =========
-bcmp |check|
-bzero |check|
-memccpy |check|
-memchr |check|
-memcmp |check|
-memcpy |check|
-memmove |check|
-mempcpy |check|
-memrchr |check|
-memset |check|
-stpcpy |check|
-stpncpy |check|
-strcat |check|
-strchr |check|
-strcmp |check|
-strcpy |check|
-strcspn |check|
-strlcat |check|
-strlcpy |check|
-strlen |check|
-strncat |check|
-strncmp |check|
-strncpy |check|
-strnlen |check|
-strpbrk |check|
-strrchr |check|
-strspn |check|
-strstr |check|
-strtok |check|
-strtok_r |check|
-strdup
-strndup
-============= =========
diff --git a/libc/docs/index.rst b/libc/docs/index.rst
index 90422617403e6..5e9a602b5a96a 100644
--- a/libc/docs/index.rst
+++ b/libc/docs/index.rst
@@ -52,7 +52,7 @@ stages there is no ABI stability in any form.
usage_modes
overlay_mode
fullbuild_mode
- gpu_mode
+ gpu/index.rst
.. toctree::
:hidden: