[libc-commits] [libc] [libc] Update GPU documentation pages (PR #84076)
Nick Desaulniers via libc-commits
libc-commits at lists.llvm.org
Tue Mar 5 15:32:14 PST 2024
================
@@ -78,11 +121,120 @@ inspect the resulting code. This can be done using the following utilities:
.. code-block:: sh
- $> llvm-ar x libcgpu.a strcmp.cpp.o
- $> clang-offload-packager strcmp.cpp.o --image=arch=gfx90a,file=gfx90a.bc
- $> opt -S out.bc
- ...
+ $> llvm-ar x libcgpu.a strcmp.cpp.o
+ $> clang-offload-packager strcmp.cpp.o --image=arch=generic,file=gfx90a.bc
+ $> opt -S gfx90a.bc
+ ...
Please note that this fat binary format is provided for compatibility with
existing offloading toolchains. The implementation in ``libc`` does not depend
on any existing offloading languages and is completely freestanding.
+
+Direct compilation
+------------------
+
+Instead of using standard offloading languages, we can also target the GPU
+directly using C and C++ to create a GPU executable, similarly to OpenCL. This is
+done by targeting the GPU architecture directly using `clang's cross compilation
+support <https://clang.llvm.org/docs/CrossCompilation.html>`_. This is the method
+that the GPU C library uses both to provide its definitions and to run tests.
+
+This allows us to easily define GPU specific libraries and programs that fit
+well into existing tools. In order to target the GPU effectively we rely heavily
+on the compiler's intrinsic and built-in functions. For example, the following
+function gets the thread identifier in the 'x' dimension on both AMD and NVIDIA GPUs.
+
+.. code-block:: c++
+
+ uint32_t get_thread_id_x() {
+ #if defined(__AMDGPU__)
+   return __builtin_amdgcn_workitem_id_x();
+ #elif defined(__NVPTX__)
+   return __nvvm_read_ptx_sreg_tid_x();
+ #else
+ #error "Unsupported platform"
+ #endif
+ }
+
+We can then compile this for both NVPTX and AMDGPU into LLVM-IR using the
+following commands.
+
+.. code-block:: sh
+
+ $> clang id.c --target=amdgcn-amd-amdhsa -mcpu=native -nogpulib -flto -c
+ $> clang id.c --target=nvptx64-nvidia-cuda -march=native -nogpulib -flto -c
+
+This support allows us to treat the GPU as a hosted environment by providing a C
+library and startup object just like a standard C library running on the host
+machine. Then, in order to execute these images we provide a loader utility to
+launch the executable on the GPU similar to a cross-compiling emulator.
+
+Building for AMDGPU targets
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The AMDGPU target supports several features natively by virtue of using ``lld``
+as its linker. The installation will include the ``include/amdgcn-amd-amdhsa``
+and ``lib/amdgcn-amd-amdhsa`` directories that contain the necessary code to use
+the library. We can directly link against ``libc.a`` and use LTO to generate the
+final executable. The ``crt1.o`` object contains the kernels necessary to launch
+the ``main`` function.
+
+.. code-block:: c++
+
+ #include <stdio.h>
+
+ int main() { fputs("Hello from AMDGPU!\n", stdout); }
+
+This program can be compiled with the C library support. Note that ``-flto``
+and ``-mcpu=`` should be specified. This is because the GPU subarchitectures do
+not have strict backwards compatibility. Use ``-mcpu=help`` for accepted
+arguments or ``-mcpu=native`` to target the system's installed GPUs if present.
+Once built, we use the ``amdhsa-loader`` utility to launch execution on the GPU.
+The loader is built if the ``hsa_runtime64`` library was found at build time.
+
+.. code-block:: sh
+
+ $> clang hello.c --target=amdgcn-amd-amdhsa -mcpu=native -flto -lc <install>/lib/amdgcn-amd-amdhsa/crt1.o
----------------
nickdesaulniers wrote:
Do you need `-mcpu=native`? Or `-march=native` below (L233)? Consider removing if not.
https://github.com/llvm/llvm-project/pull/84076