r338049 - [OPENMP] What's new for OpenMP in clang.

Alexey Bataev via cfe-commits cfe-commits at lists.llvm.org
Sun Jul 29 10:59:27 PDT 2018


Yes, that would be good

Best regards,
Alexey Bataev

> On 29 July 2018 at 12:41, Jonas Hahnfeld via cfe-commits <cfe-commits at lists.llvm.org> wrote:
> 
> I just noticed that UsersManual says: "Clang supports all OpenMP 3.1 directives and clauses." Maybe this should link to OpenMPSupport?
> 
>> On 2018-07-26 19:53, Alexey Bataev via cfe-commits wrote:
>> Author: abataev
>> Date: Thu Jul 26 10:53:45 2018
>> New Revision: 338049
>> URL: http://llvm.org/viewvc/llvm-project?rev=338049&view=rev
>> Log:
>> [OPENMP] What's new for OpenMP in clang.
>> Updated ReleaseNotes + Status of the OpenMP support in clang.
>> Modified:
>>    cfe/trunk/docs/OpenMPSupport.rst
>>    cfe/trunk/docs/ReleaseNotes.rst
>> Modified: cfe/trunk/docs/OpenMPSupport.rst
>> URL:
>> http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/OpenMPSupport.rst?rev=338049&r1=338048&r2=338049&view=diff
>> ==============================================================================
>> --- cfe/trunk/docs/OpenMPSupport.rst (original)
>> +++ cfe/trunk/docs/OpenMPSupport.rst Thu Jul 26 10:53:45 2018
>> @@ -10,13 +10,15 @@
>> .. role:: partial
>> .. role:: good
>> +.. contents::
>> +   :local:
>> +
>> ==================
>> OpenMP Support
>> ==================
>> -Clang fully supports OpenMP 3.1 + some elements of OpenMP 4.5. Clang supports offloading to X86_64, AArch64 and PPC64[LE] devices.
>> -Support for Cuda devices is not ready yet.
>> -The status of major OpenMP 4.5 features support in Clang.
>> +Clang fully supports OpenMP 4.5. Clang supports offloading to X86_64, AArch64,
>> +PPC64[LE] devices and has `basic support for Cuda devices`_.
>> Standalone directives
>> =====================
>> @@ -35,7 +37,7 @@ Standalone directives
>> * #pragma omp target: :good:`Complete`.
>> -* #pragma omp declare target: :partial:`Partial`.  No full codegen support.
>> +* #pragma omp declare target: :good:`Complete`.
>> * #pragma omp teams: :good:`Complete`.
>> @@ -64,5 +66,66 @@ Combined directives
>> * #pragma omp target teams distribute parallel for [simd]: :good:`Complete`.
>> -Clang does not support any constructs/updates from upcoming OpenMP 5.0 except for `reduction`-based clauses in the `task` and `target`-based directives.
>> -In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows, and mac OS.
>> +Clang does not support any constructs/updates from upcoming OpenMP 5.0 except
>> +for `reduction`-based clauses in the `task` and `target`-based directives.
>> +
>> +In addition, the LLVM OpenMP runtime `libomp` supports the OpenMP Tools
>> +Interface (OMPT) on x86, x86_64, AArch64, and PPC64 on Linux, Windows,
>> +and mac OS.
>> +
>> +.. _basic support for Cuda devices:
>> +
>> +Cuda devices support
>> +====================
>> +
>> +Directives execution modes
>> +--------------------------
>> +
>> +Clang code generation for target regions supports two modes: the SPMD and
>> +non-SPMD modes. Clang chooses one of these two modes automatically based on the
>> +way directives and clauses on those directives are used. The SPMD mode uses a
>> +simplified set of runtime functions, increasing performance at the cost of not
>> +supporting some OpenMP features. The non-SPMD mode is the most generic mode and
>> +supports all currently available OpenMP features. The compiler will always
>> +attempt to use the SPMD mode wherever possible. SPMD mode will not be used if:
>> +
>> +   - The target region contains an `if()` clause that refers to a `parallel`
>> +     directive.
>> +
>> +   - The target region contains a `parallel` directive with a `num_threads()`
>> +     clause.
>> +
>> +   - The target region contains user code (other than OpenMP-specific
>> +     directives) in between the `target` and the `parallel` directives.
>> +
>> +Data-sharing modes
>> +------------------
>> +
>> +Clang supports two data-sharing models for Cuda devices: `Generic` and `Cuda`
>> +modes. The default mode is `Generic`. `Cuda` mode can give additional
>> +performance and can be activated using the `-fopenmp-cuda-mode` flag. In
>> +`Generic` mode all local variables that can be shared in the parallel regions
>> +are stored in the global memory. In `Cuda` mode local variables are not shared
>> +between the threads and it is the user's responsibility to share the required
>> +data between the threads in the parallel regions.
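As a sketch of the kind of pattern affected (hypothetical example, not from the commit): a local variable declared in a target region and touched by every thread of an enclosed parallel region must actually be shared. In the default `Generic` mode the compiler places such a variable in global memory; under `-fopenmp-cuda-mode` it would stay thread-private, and the user would have to arrange the sharing explicitly. Serially (without `-fopenmp`) the function below runs with one thread:

```c
/* `visible` is declared outside the parallel region but updated by all
   of its threads, so it needs a shared storage location on the device.
   Generic data-sharing mode "globalizes" it automatically; Cuda mode
   does not. */
int shared_increment(void) {
    int visible = 0; /* candidate for globalization in Generic mode */
    #pragma omp target map(tofrom: visible)
    {
        #pragma omp parallel
        {
            #pragma omp atomic
            visible += 1; /* each thread updates the same variable */
        }
    }
    return visible; /* number of threads that ran the parallel region */
}
```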
>> +
>> +Features not supported or with limited support for Cuda devices
>> +---------------------------------------------------------------
>> +
>> +- Reductions across the teams are not supported yet.
>> +
>> +- Cancellation constructs are not supported.
>> +
>> +- Doacross loop nest is not supported.
>> +
>> +- User-defined reductions are supported only for trivial types.
>> +
>> +- Nested parallelism: inner parallel regions are executed sequentially.
>> +
>> +- Static linking of libraries containing device code is not supported yet.
>> +
>> +- Automatic translation of math functions in target regions to device-specific
>> +  math functions is not implemented yet.
>> +
>> +- Debug information for OpenMP target regions is not supported yet.
>> +
>> Modified: cfe/trunk/docs/ReleaseNotes.rst
>> URL:
>> http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/ReleaseNotes.rst?rev=338049&r1=338048&r2=338049&view=diff
>> ==============================================================================
>> --- cfe/trunk/docs/ReleaseNotes.rst (original)
>> +++ cfe/trunk/docs/ReleaseNotes.rst Thu Jul 26 10:53:45 2018
>> @@ -216,7 +216,21 @@ OpenCL C Language Changes in Clang
>> OpenMP Support in Clang
>> ----------------------------------
>> -- ...
>> +- Clang gained basic support for OpenMP 4.5 offloading for the NVPTX target.
>> +  To compile your program for the NVPTX target, use the following options:
>> +  `-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda` for 64-bit platforms or
>> +  `-fopenmp -fopenmp-targets=nvptx-nvidia-cuda` for 32-bit platforms.
>> +
>> +- Passing options to the OpenMP device offloading toolchain can be done using
>> +  the `-Xopenmp-target=<triple> -opt=val` flag. In this way the `-opt=val`
>> +  option will be forwarded to the respective OpenMP device offloading toolchain
>> +  described by the triple. For example, passing the compute capability to
>> +  the OpenMP NVPTX offloading toolchain can be done as follows:
>> +  `-Xopenmp-target=nvptx64-nvidia-cuda -march=sm_60`. When only one
>> +  target offload toolchain is specified under the `-fopenmp-targets=<triples>`
>> +  option, the triple can be skipped: `-Xopenmp-target -march=sm_60`.
>> +
>> +- Other bugfixes.
>> CUDA Support in Clang
>> ---------------------
>> _______________________________________________
>> cfe-commits mailing list
>> cfe-commits at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
> 

