r338049 - [OPENMP] What's new for OpenMP in clang.
Jonas Hahnfeld via cfe-commits
cfe-commits at lists.llvm.org
Sun Jul 29 09:41:21 PDT 2018
I just noticed that UsersManual says: "Clang supports all OpenMP 3.1
directives and clauses." Maybe this should link to OpenMPSupport?
On 2018-07-26 19:53, Alexey Bataev via cfe-commits wrote:
> Author: abataev
> Date: Thu Jul 26 10:53:45 2018
> New Revision: 338049
>
> URL: http://llvm.org/viewvc/llvm-project?rev=338049&view=rev
> Log:
> [OPENMP] What's new for OpenMP in clang.
>
> Updated ReleaseNotes + Status of the OpenMP support in clang.
>
> Modified:
> cfe/trunk/docs/OpenMPSupport.rst
> cfe/trunk/docs/ReleaseNotes.rst
>
> Modified: cfe/trunk/docs/OpenMPSupport.rst
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/OpenMPSupport.rst?rev=338049&r1=338048&r2=338049&view=diff
> ==============================================================================
> --- cfe/trunk/docs/OpenMPSupport.rst (original)
> +++ cfe/trunk/docs/OpenMPSupport.rst Thu Jul 26 10:53:45 2018
> @@ -10,13 +10,15 @@
> .. role:: partial
> .. role:: good
>
> +.. contents::
> + :local:
> +
> ==================
> OpenMP Support
> ==================
>
> -Clang fully supports OpenMP 3.1 + some elements of OpenMP 4.5. Clang
> supports offloading to X86_64, AArch64 and PPC64[LE] devices.
> -Support for Cuda devices is not ready yet.
> -The status of major OpenMP 4.5 features support in Clang.
> +Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64,
> AArch64,
> +PPC64[LE] and has `basic support for Cuda devices`_.
>
> Standalone directives
> =====================
> @@ -35,7 +37,7 @@ Standalone directives
>
> * #pragma omp target: :good:`Complete`.
>
> -* #pragma omp declare target: :partial:`Partial`. No full codegen
> support.
> +* #pragma omp declare target: :good:`Complete`.
>
> * #pragma omp teams: :good:`Complete`.
>
> @@ -64,5 +66,66 @@ Combined directives
>
> * #pragma omp target teams distribute parallel for [simd]:
> :good:`Complete`.
>
> -Clang does not support any constructs/updates from upcoming OpenMP
> 5.0 except for `reduction`-based clauses in the `task` and
> `target`-based directives.
> -In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP
> Tools Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux,
> Windows, and mac OS.
> +Clang does not support any constructs/updates from upcoming OpenMP 5.0
> except
> +for `reduction`-based clauses in the `task` and `target`-based
> directives.
> +
> +In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP
> Tools
> +Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux,
> Windows, and mac OS.
> +
> +.. _basic support for Cuda devices:
> +
> +Cuda devices support
> +====================
> +
> +Directives execution modes
> +--------------------------
> +
> +Clang code generation for target regions supports two modes: the SPMD
> and
> +non-SPMD modes. Clang chooses one of these two modes automatically
> based on the
> +way directives and clauses on those directives are used. The SPMD mode
> uses a
> +simplified set of runtime functions, thus increasing performance at the
> cost of
> +not supporting some OpenMP features. The non-SPMD mode is the most generic
> mode and
> +supports all currently available OpenMP features. The compiler will
> always
> +attempt to use the SPMD mode wherever possible. SPMD mode will not be
> used if:
> +
> + - The target region contains an `if()` clause that refers to a
> `parallel`
> + directive.
> +
> + - The target region contains a `parallel` directive with a
> `num_threads()`
> + clause.
> +
> + - The target region contains user code (other than OpenMP-specific
> + directives) in between the `target` and the `parallel`
> directives.
> +
> +Data-sharing modes
> +------------------
> +
> +Clang supports two data-sharing models for Cuda devices: `Generic` and
> `Cuda`
> +modes. The default mode is `Generic`. `Cuda` mode can give
> additional
> +performance and can be activated using the `-fopenmp-cuda-mode` flag.
> In
> +`Generic` mode all local variables that can be shared in the parallel
> regions
> +are stored in the global memory. In `Cuda` mode local variables are
> not shared
> +between the threads and it is the user's responsibility to share the
> required data
> +between the threads in the parallel regions.
> +
> +Features not supported or with limited support for Cuda devices
> +---------------------------------------------------------------
> +
> +- Reductions across the teams are not supported yet.
> +
> +- Cancellation constructs are not supported.
> +
> +- Doacross loop nest is not supported.
> +
> +- User-defined reductions are supported only for trivial types.
> +
> +- Nested parallelism: inner parallel regions are executed
> sequentially.
> +
> +- Static linking of libraries containing device code is not supported
> yet.
> +
> +- Automatic translation of math functions in target regions to
> device-specific
> + math functions is not implemented yet.
> +
> +- Debug information for OpenMP target regions is not supported yet.
> +
>
> Modified: cfe/trunk/docs/ReleaseNotes.rst
> URL:
> http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=338049&r1=338048&r2=338049&view=diff
> ==============================================================================
> --- cfe/trunk/docs/ReleaseNotes.rst (original)
> +++ cfe/trunk/docs/ReleaseNotes.rst Thu Jul 26 10:53:45 2018
> @@ -216,7 +216,21 @@ OpenCL C Language Changes in Clang
> OpenMP Support in Clang
> ----------------------------------
>
> -- ...
> +- Clang gained basic support for OpenMP 4.5 offloading for NVPTX
> target.
> + To compile your program for NVPTX target use the following options:
> + `-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda` for 64-bit
> platforms or
> + `-fopenmp -fopenmp-targets=nvptx-nvidia-cuda` for 32-bit platforms.
> +
> +- Passing options to the OpenMP device offloading toolchain can be
> done using
> + the `-Xopenmp-target=<triple> -opt=val` flag. In this way the
> `-opt=val`
> + option will be forwarded to the respective OpenMP device offloading
> toolchain
> + described by the triple. For example passing the compute capability
> to
> + the OpenMP NVPTX offloading toolchain can be done as follows:
> + `-Xopenmp-target=nvptx64-nvidia-cuda -march=sm_60`. When only one
> + target offload toolchain is specified under the
> `-fopenmp-targets=<triples>`
> + option, the triple can be omitted: `-Xopenmp-target
> -march=sm_60`.
> +
> +- Other bugfixes.
>
> CUDA Support in Clang
> ---------------------
>
>
> _______________________________________________
> cfe-commits mailing list
> cfe-commits at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
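
For readers of the quoted "Directives execution modes" and "Data-sharing
modes" text, here is a rough C sketch of the two shapes of target region it
describes. The function names are hypothetical and chosen only for
illustration. Which mode Clang actually picks is decided by the compiler, so
the comments state what the quoted rules suggest, not a guarantee. Built
without any OpenMP flags the pragmas are ignored and the loops run on the
host, so the arithmetic is the same either way.

```c
/* Hypothetical helpers illustrating the quoted SPMD rules.
 * Possible build line, using the flags named in the release notes above:
 *   clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda \
 *         -Xopenmp-target -march=sm_60 example.c
 */

/* Candidate for SPMD mode: a combined directive, so there is no user code
 * between `target` and `parallel`, and no if() or num_threads() clause. */
void scale_spmd_candidate(float *a, int n, float s) {
#pragma omp target teams distribute parallel for map(tofrom: a[0:n])
  for (int i = 0; i < n; ++i)
    a[i] *= s;
}

/* Expected to fall back to non-SPMD mode under the quoted rules: there is
 * user code between `target` and `parallel`, and the `parallel` directive
 * carries a num_threads() clause. */
void scale_generic_candidate(float *a, int n, float s) {
#pragma omp target map(tofrom: a[0:n])
  {
    /* User code between `target` and `parallel`. `factor` is a local
     * variable that the parallel region shares; per the quoted text,
     * `Generic` data-sharing mode keeps such variables in global memory. */
    float factor = s;
#pragma omp parallel for num_threads(64)
    for (int i = 0; i < n; ++i)
      a[i] *= factor;
  }
}
```

Both functions compute the same result. The difference is only in how the
directives are arranged, which is what the quoted documentation says drives
the choice of execution mode.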