[Openmp-commits] [openmp] 55d043c - [OpenMP][Docs] Document multi-architecture binary handling

Tue Oct 11 11:41:01 PDT 2022

Author: Joseph Huber
Date: 2022-10-11T13:40:41-05:00
New Revision: 55d043cc06497533264eb0aa210f4d8b39eef378

URL: https://github.com/llvm/llvm-project/commit/55d043cc06497533264eb0aa210f4d8b39eef378
DIFF: https://github.com/llvm/llvm-project/commit/55d043cc06497533264eb0aa210f4d8b39eef378.diff

LOG: [OpenMP][Docs] Document multi-architecture binary handling

Summary:
This patch adds some documentation in the FAQ regarding
multi-architecture binary support using the `--offload-arch` flag.

Added: 
    

Modified: 
    openmp/docs/SupportAndFAQ.rst

Removed: 
    


################################################################################
diff --git a/openmp/docs/SupportAndFAQ.rst b/openmp/docs/SupportAndFAQ.rst
index 212f6682a1ce3..4ce64a40e9905 100644
--- a/openmp/docs/SupportAndFAQ.rst
+++ b/openmp/docs/SupportAndFAQ.rst
@@ -329,3 +329,67 @@ can fail. If this warning is triggered it means that the kernel may run out of
 stack memory during execution and crash. The environment variable
 ``LIBOMPTARGET_STACK_SIZE`` can be used to increase the stack size if this
 occurs.
+
+Q: Can OpenMP offloading compile for multiple architectures?
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Since LLVM version 15.0, OpenMP offloading supports offloading to multiple
+architectures at once. This allows executables to run on different targets,
+such as offloading to AMD and NVIDIA GPUs simultaneously, as well as on
+multiple sub-architectures of the same target. Additionally, device code is
+only extracted from static library archive members when its architecture is
+actually used, allowing users to create generic libraries.
+
+The architecture can be specified manually using ``--offload-arch=``. If
+``--offload-arch=`` is present and no ``-fopenmp-targets=`` flag is present,
+then the targets will be inferred from the architectures. Conversely, if
+``-fopenmp-targets=`` is present with no ``--offload-arch=``, then the target
+architecture will be set to a default value, usually the architecture supported
+by the system LLVM was built on.
+
+For example, an executable can be built that runs on both AMDGPU and NVIDIA
+hardware, provided that the necessary build tools are installed for both.
+
+.. code-block:: shell
+
+   clang example.c -fopenmp --offload-arch=gfx90a --offload-arch=sm_80
+
+If only the architectures are given, the target triples are inferred from
+them; otherwise, the triples can be specified manually.
+
+.. code-block:: shell
+
+   clang example.c -fopenmp -fopenmp-targets=amdgcn-amd-amdhsa,nvptx64-nvidia-cuda \
+      -Xopenmp-target=amdgcn-amd-amdhsa --offload-arch=gfx90a \
+      -Xopenmp-target=nvptx64-nvidia-cuda --offload-arch=sm_80
+
+When linking against a static library that contains device code for multiple 
+architectures, only the images used by the executable will be extracted.
+
+.. code-block:: shell
+
+   clang example.c -fopenmp --offload-arch=gfx90a,sm_70,sm_80 -c
+   llvm-ar rcs libexample.a example.o
+   clang app.c libexample.a -fopenmp --offload-arch=gfx90a -o app
+
+The supported device images can be viewed using the ``--offloading`` option with 
+``llvm-objdump``.
+
+.. code-block:: shell
+
+   clang example.c -fopenmp --offload-arch=gfx90a --offload-arch=sm_80 -o example
+   llvm-objdump --offloading example
+
+   example:  file format elf64-x86-64
+
+   OFFLOADING IMAGE [0]:
+   kind            elf
+   arch            gfx90a
+   triple          amdgcn-amd-amdhsa
+   producer        openmp
+
+   OFFLOADING IMAGE [1]:
+   kind            elf
+   arch            sm_80
+   triple          nvptx64-nvidia-cuda
+   producer        openmp
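
Editorial note (not part of the commit): the ``llvm-objdump --offloading``
report shown in the diff can be condensed into one line per embedded image.
The sketch below is only an illustration. ``list_offload_images`` and
``sample_report`` are hypothetical helper names, the sample text is modelled
on the listing in the diff, and the parsing assumes that each image prints its
``arch`` field before its ``triple`` field, as that listing does.

```shell
#!/bin/sh
# Hypothetical helper: reads an `llvm-objdump --offloading` report on stdin
# and prints one "<arch> <triple>" line per offloading image.
# Assumption: within each image, the "arch" line precedes the "triple" line.
list_offload_images() {
  awk '
    $1 == "arch"   { arch = $2 }        # remember the most recent architecture
    $1 == "triple" { print arch, $2 }   # emit the pair once the triple is seen
  '
}

# Sample report modelled on the listing in the diff, so the sketch can be
# exercised without an offloading toolchain installed.
sample_report() {
  cat <<'EOF'
example:  file format elf64-x86-64

OFFLOADING IMAGE [0]:
kind            elf
arch            gfx90a
triple          amdgcn-amd-amdhsa
producer        openmp

OFFLOADING IMAGE [1]:
kind            elf
arch            sm_80
triple          nvptx64-nvidia-cuda
producer        openmp
EOF
}

sample_report | list_offload_images
```

Against a real binary, the same helper would be fed from the tool itself, for
example ``llvm-objdump --offloading example | list_offload_images``.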