r338049 - [OPENMP] What's new for OpenMP in clang.
Alexey Bataev via cfe-commits
cfe-commits at lists.llvm.org
Thu Jul 26 10:53:45 PDT 2018
Author: abataev
Date: Thu Jul 26 10:53:45 2018
New Revision: 338049
URL: http://llvm.org/viewvc/llvm-project?rev=338049&view=rev
Log:
[OPENMP] What's new for OpenMP in clang.
Updated ReleaseNotes + Status of the OpenMP support in clang.
Modified:
cfe/trunk/docs/OpenMPSupport.rst
cfe/trunk/docs/ReleaseNotes.rst
Modified: cfe/trunk/docs/OpenMPSupport.rst
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/OpenMPSupport.rst?rev=338049&r1=338048&r2=338049&view=diff
==============================================================================
--- cfe/trunk/docs/OpenMPSupport.rst (original)
+++ cfe/trunk/docs/OpenMPSupport.rst Thu Jul 26 10:53:45 2018
@@ -10,13 +10,15 @@
.. role:: partial
.. role:: good
+.. contents::
+ :local:
+
==================
OpenMP Support
==================
-Clang fully supports OpenMP 3.1 + some elements of OpenMP 4.5. Clang supports offloading to X86_64, AArch64 and PPC64[LE] devices.
-Support for Cuda devices is not ready yet.
-The status of major OpenMP 4.5 features support in Clang.
+Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64,
+PPC64[LE] and has `basic support for Cuda devices`_.
Standalone directives
=====================
@@ -35,7 +37,7 @@ Standalone directives
* #pragma omp target: :good:`Complete`.
-* #pragma omp declare target: :partial:`Partial`. No full codegen support.
+* #pragma omp declare target: :good:`Complete`.
* #pragma omp teams: :good:`Complete`.
@@ -64,5 +66,66 @@ Combined directives
* #pragma omp target teams distribute parallel for [simd]: :good:`Complete`.
-Clang does not support any constructs/updates from upcoming OpenMP 5.0 except for `reduction`-based clauses in the `task` and `target`-based directives.
-In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and mac OS.
+Clang does not support any constructs/updates from upcoming OpenMP 5.0 except
+for `reduction`-based clauses in the `task` and `target`-based directives.
+
+In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools
+Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and
+macOS.
+
+.. _basic support for Cuda devices:
+
+Cuda devices support
+====================
+
+Directives execution modes
+--------------------------
+
+Clang code generation for target regions supports two modes: the SPMD and
+non-SPMD modes. Clang chooses one of these two modes automatically based on the
+way directives and clauses on those directives are used. The SPMD mode uses a
+simplified set of runtime functions, which improves performance at the cost of
+not supporting some OpenMP features. The non-SPMD mode is the most generic mode and
+supports all currently available OpenMP features. The compiler will always
+attempt to use the SPMD mode wherever possible. SPMD mode will not be used if:
+
+ - The target region contains an `if()` clause that refers to a `parallel`
+ directive.
+
+ - The target region contains a `parallel` directive with a `num_threads()`
+ clause.
+
+ - The target region contains user code (other than OpenMP-specific
+ directives) in between the `target` and the `parallel` directives.
+
+Data-sharing modes
+------------------
+
+Clang supports two data-sharing modes for Cuda devices: `Generic` and `Cuda`.
+The default mode is `Generic`. `Cuda` mode can give additional performance and
+is activated with the `-fopenmp-cuda-mode` flag. In `Generic` mode, all local
+variables that can be shared in the parallel regions are stored in global
+memory. In `Cuda` mode, local variables are not shared between the threads, and
+it is the user's responsibility to share the required data between the threads
+in the parallel regions.
+
+Features not supported or with limited support for Cuda devices
+---------------------------------------------------------------
+
+- Reductions across the teams are not supported yet.
+
+- Cancellation constructs are not supported.
+
+- Doacross loop nest is not supported.
+
+- User-defined reductions are supported only for trivial types.
+
+- Nested parallelism: inner parallel regions are executed sequentially.
+
+- Static linking of libraries containing device code is not supported yet.
+
+- Automatic translation of math functions in target regions to device-specific
+ math functions is not implemented yet.
+
+- Debug information for OpenMP target regions is not supported yet.
+
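As an illustration of the execution-mode rules in the hunk above, here is a minimal C sketch. The function names `saxpy_spmd` and `scale_generic` are hypothetical and not part of the commit. Which mode Clang actually selects is decided inside the compiler, so the comments only restate the conditions listed in the documentation. The local variable `factor` is also the kind of variable the data-sharing section is about: it is declared in the target region and then used inside the parallel region.

```c
#include <assert.h>

/* Combined directive: nothing sits between `target` and `parallel`, and there
 * is no if() or num_threads() clause, so none of the listed conditions that
 * rule out SPMD mode applies here. */
void saxpy_spmd(int n, float a, const float *x, float *y) {
  assert(n >= 0);
  #pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
  for (int i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}

/* User code between `target` and `parallel`: per the documentation this is
 * one of the cases where SPMD mode will not be used. */
void scale_generic(int n, float a, float *x) {
  assert(n >= 0);
  #pragma omp target map(tofrom: x[0:n])
  {
    /* Local of the target region that the parallel region reads. */
    float factor = a * 2.0f;
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
      x[i] *= factor;
  }
}
```

Without `-fopenmp` the pragmas are ignored and both functions run sequentially on the host with the same results, which makes the sketch easy to check before trying it on a device.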
Modified: cfe/trunk/docs/ReleaseNotes.rst
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=338049&r1=338048&r2=338049&view=diff
==============================================================================
--- cfe/trunk/docs/ReleaseNotes.rst (original)
+++ cfe/trunk/docs/ReleaseNotes.rst Thu Jul 26 10:53:45 2018
@@ -216,7 +216,21 @@ OpenCL C Language Changes in Clang
OpenMP Support in Clang
----------------------------------
-- ...
+- Clang gained basic support for OpenMP 4.5 offloading for the NVPTX target.
+ To compile your program for the NVPTX target, use the following options:
+ `-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda` for 64-bit platforms or
+ `-fopenmp -fopenmp-targets=nvptx-nvidia-cuda` for 32-bit platforms.
+
+- Passing options to the OpenMP device offloading toolchain can be done using
+ the `-Xopenmp-target=<triple> -opt=val` flag. The `-opt=val` option is then
+ forwarded to the OpenMP device offloading toolchain described by the triple.
+ For example, the compute capability can be passed to the OpenMP NVPTX
+ offloading toolchain as follows:
+ `-Xopenmp-target=nvptx64-nvidia-cuda -march=sm_60`. When only one target
+ offload toolchain is specified with the `-fopenmp-targets=<triples>` option,
+ the triple can be omitted: `-Xopenmp-target -march=sm_60`.
+
+- Other bugfixes.
CUDA Support in Clang
---------------------
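To make the NVPTX options from the ReleaseNotes.rst hunk concrete, the following is a small sketch of an offloaded loop together with a possible build line. The file name `sum.c` and the function name `sum_on_device` are hypothetical, and `sm_60` is simply the value used in the release note. The reduction stays within a single `target parallel for` region, because the support document lists reductions across teams as not yet supported on Cuda devices.

```c
#include <assert.h>
#include <stddef.h>

/* Possible build line (assumes the source is saved as sum.c; every flag is
 * taken from the release note):
 *
 *   clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
 *         -Xopenmp-target -march=sm_60 sum.c
 */
long sum_on_device(const int *a, int n) {
  assert(a != NULL && n >= 0);
  long sum = 0;
  /* `sum` is mapped tofrom so the reduced value is copied back to the host. */
  #pragma omp target parallel for reduction(+:sum) map(to: a[0:n]) map(tofrom: sum)
  for (int i = 0; i < n; ++i)
    sum += a[i];
  return sum;
}
```

Because only one offload target is named, the short `-Xopenmp-target` form without a triple is used, as the release note describes. Compiled without `-fopenmp`, the pragma is ignored and the loop runs on the host.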