[flang-commits] [flang] Parallel runtime library design doc (PRIF) (PR #76088)

Katherine Rasmussen via flang-commits flang-commits at lists.llvm.org
Tue Jun 4 14:16:11 PDT 2024


================
@@ -0,0 +1,1987 @@
+<!--===- docs/CoarrayFortranRuntime.md
+
+   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+   See https://llvm.org/LICENSE.txt for license information.
+   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+-->
+# Parallel Runtime Interface for Fortran (PRIF) Specification, Revision 0.3
+
+Dan Bonachea  
+Katherine Rasmussen  
+Brad Richardson  
+Damian Rouson  
+Lawrence Berkeley National Laboratory, USA  
+<fortran at lbl.gov>  
+
+# Abstract
+
+This document specifies an interface to support the parallel features of
+Fortran, named the Parallel Runtime Interface for Fortran (PRIF). PRIF is a
+proposed solution in which the runtime library is responsible for coarray
+allocation, deallocation and accesses, image synchronization, atomic operations,
+events, and teams. In this interface, the compiler is responsible for
+transforming the invocation of Fortran-level parallel features into procedure
+calls to the necessary PRIF procedures. The interface is designed for
+portability across shared- and distributed-memory machines, different operating
+systems, and multiple architectures. Implementations of this interface are
+intended as an augmentation for the compiler's own runtime library. With an
+implementation-agnostic interface, alternative parallel runtime libraries may be
+developed that support the same interface. One benefit of this approach is the
+ability to vary the communication substrate. A central aim of this document is
+to define a parallel runtime interface in standard Fortran syntax, which enables
+us to leverage Fortran to succinctly express various properties of the procedure
+interfaces, including argument attributes.
+
+> **WORK IN PROGRESS** This document is still a draft and may continue to evolve.    
+> Feedback and questions should be directed to: <fortran at lbl.gov>
+
+\newpage
+# Change Log
+
+## Revision 0.1
+
+* Identify parallel features
+* Sketch out high-level design
+* Decide on compiler vs PRIF responsibilities
+
+## Revision 0.2 (Dec. 2023)
+
+* Change name to PRIF
+* Fill out interfaces to all PRIF provided procedures
+* Write descriptions, discussions and overviews of various features, arguments, etc.
+
+## Revision 0.3 (May 2024)
+
+* `prif_(de)allocate` are renamed to `prif_(de)allocate_coarray`
+* `prif_(de)allocate_non_symmetric` are renamed to `prif_(de)allocate`
+* `prif_local_data_size` renamed to `prif_size_bytes` and
+  add a client note about the procedure
+* Update interface to `prif_base_pointer` by replacing three arguments, `coindices`,
+  `team`, and `team_number`, with one argument `image_num`. Update the semantics
+  of `prif_base_pointer`, as it is no longer responsible for resolving the coindices and
+  team information into a number that represents the image on the initial team before
+  returning the address. That is now expected to occur before the `prif_base_pointer`
+  call and passed into the `image_num` argument.
+* Add target attribute on `coarray_handles` argument to `prif_deallocate_coarray`
+* Add pointer attribute on `handle` argument to `coarray_cleanup` callback for `prif_allocate_coarray`
+* Add target attribute on `value` argument to `prif_put` and `prif_get`
+* Add new PRIF-specific constant `PRIF_STAT_OUT_OF_MEMORY`
+* Clarify that remote pointers passed to various procedures must reference storage
+  allocated using `prif_allocate_coarray` or `prif_allocate`
+* Clarify description of the `allocated_memory` argument for
+  the procedures `prif_allocate_coarray` and `prif_allocate`
+* Clarify descriptions of `event_var_ptr`, `lock_var_ptr`, and `notify_ptr`
+* Clarify descriptions for `prif_stop`, `prif_put`, `prif_get`,
+  intrinsic derived types, sections about `MOVE_ALLOC` and coarray accesses
+* Replace the phrase "local completion" with the phrase "source completion",
+  and add the new phrase to the glossary
+* Clarify that `prif_stop` should be used to initiate normal termination
+* Describe the `operation` argument to `prif_co_reduce`
+* Rename and clarify the cobounds arguments to `prif_alias_create`
+* Clarify the descriptions of `source_image`/`result_image` arguments to collective calls
+* Clarify completion semantics for atomic operations
+* Rename `coindices` argument names to `cosubscripts` to more closely correspond with
+  the terms used in the Fortran standard
+* Rename `local_buffer` and `local_buffer_stride` arg names
+  to `current_image_buffer` and `current_image_buffer_stride`
+* Update `coindexed-object` references to _coindexed-named-object_ to match
+  the term change in the most recent Fortran 2023 standard
+* Convert several explanatory sections to "Notes"
+* Add implementation note about the PRIF API being defined in Fortran
+* Add section "How to read the PRIF specification"
+* Add section "Glossary"
+* Improve description of the `final_func` arg to `prif_allocate_coarray`
+  and move some of the previous description to a client note
+
+\newpage
+# Problem Description
+
+In order to be fully Fortran 2023 compliant, a Fortran compiler needs support for
+what is commonly referred to as Coarray Fortran, which includes features
+related to parallelism. These features include the following statements,
+subroutines, functions, types, and kind type parameters:
+
+* **Statements:**
+  - _Synchronization:_ `SYNC ALL`, `SYNC IMAGES`, `SYNC MEMORY`, `SYNC TEAM`
+  - _Events:_ `EVENT POST`, `EVENT WAIT`
+  - _Notify:_ `NOTIFY WAIT`
+  - _Error termination:_ `ERROR STOP`
+  - _Locks:_ `LOCK`, `UNLOCK`
+  - _Failed images:_ `FAIL IMAGE`
+  - _Teams:_ `FORM TEAM`, `CHANGE TEAM`
+  - _Critical sections:_ `CRITICAL`, `END CRITICAL`
+* **Intrinsic functions:** `NUM_IMAGES`, `THIS_IMAGE`, `LCOBOUND`, `UCOBOUND`,
+  `TEAM_NUMBER`, `GET_TEAM`, `FAILED_IMAGES`, `STOPPED_IMAGES`, `IMAGE_STATUS`,
+  `COSHAPE`, `IMAGE_INDEX`
+* **Intrinsic subroutines:**
+  - _Collective subroutines:_ `CO_SUM`, `CO_MAX`, `CO_MIN`, `CO_REDUCE`, `CO_BROADCAST`
+  - _Atomic subroutines:_ `ATOMIC_ADD`, `ATOMIC_AND`, `ATOMIC_CAS`,
+    `ATOMIC_DEFINE`, `ATOMIC_FETCH_ADD`, `ATOMIC_FETCH_AND`, `ATOMIC_FETCH_OR`,
+    `ATOMIC_FETCH_XOR`, `ATOMIC_OR`, `ATOMIC_REF`, `ATOMIC_XOR`
+  - _Other subroutines:_ `EVENT_QUERY`
+* **Types, kind type parameters, and values:**
+  - _Intrinsic derived types:_ `EVENT_TYPE`, `TEAM_TYPE`, `LOCK_TYPE`, `NOTIFY_TYPE`
+  - _Atomic kind type parameters:_ `ATOMIC_INT_KIND` and `ATOMIC_LOGICAL_KIND`
+  - _Values:_ `STAT_FAILED_IMAGE`, `STAT_LOCKED`, `STAT_LOCKED_OTHER_IMAGE`,
+    `STAT_STOPPED_IMAGE`, `STAT_UNLOCKED`, `STAT_UNLOCKED_FAILED_IMAGE`
+
+In addition to supporting syntax related to the above features,
+compilers will also need to be able to handle new execution concepts such as
+image control. The image control concept affects the behaviors of some
+statements that were introduced in Fortran expressly for supporting parallel
+programming, but image control also affects the behavior of some statements
+that pre-existed parallelism in standard Fortran:
+
+* **Image control statements:**
+  - _Pre-existing statements_: `ALLOCATE`, `DEALLOCATE`, `STOP`, `END`,
+    a `CALL` to `MOVE_ALLOC` with coarray arguments
+  - _New statements:_ `SYNC ALL`, `SYNC IMAGES`, `SYNC MEMORY`, `SYNC TEAM`,
+    `CHANGE TEAM`, `END TEAM`, `CRITICAL`, `END CRITICAL`, `EVENT POST`,
+    `EVENT WAIT`, `FORM TEAM`, `LOCK`, `UNLOCK`, `NOTIFY WAIT`
+
+One consequence of the statements being categorized as image control statements
+will be the need to restrict code movement by optimizing compilers.
+
+# Proposed Solution
+
+This specification proposes an interface to support the above features,
+named Parallel Runtime Interface for Fortran (PRIF). By defining an
+implementation-agnostic interface, we envision facilitating the development of
+alternative parallel runtime libraries that support the same interface. One
+benefit of this approach is the ability to vary the communication substrate.
+A central aim of this document is to specify a parallel runtime interface in
+standard Fortran syntax, which enables us to leverage Fortran to succinctly
+express various properties of the procedure interfaces, including argument
+attributes. See [Rouson and Bonachea (2022)] for additional details.
+
+## Parallel Runtime Interface for Fortran (PRIF)
+
+The Parallel Runtime Interface for Fortran is a proposed interface in which the
+PRIF implementation is responsible for coarray allocation, deallocation and
+accesses, image synchronization, atomic operations, events, and teams. In this
+interface, the compiler is responsible for transforming the invocation of
+Fortran-level parallel features to add procedure calls to the necessary PRIF
+procedures. Below you can find a table showing the delegation of tasks
+between the compiler and the PRIF implementation. The interface is designed for
+portability across shared- and distributed-memory machines, different operating
+systems, and multiple architectures. 
+
+Implementations of PRIF are intended as an
+augmentation for the compiler's own runtime library. While the interface can
+support multiple implementations, we envision needing to build the PRIF implementation
+as part of installing the compiler. The procedures and types provided
+for direct invocation as part of the PRIF implementation shall be defined in a
+Fortran module with the name `prif`.
+
+## Delegation of tasks between the Fortran compiler and the PRIF implementation
+
+The following table outlines which tasks will be the responsibility of the
+Fortran compiler and which tasks will be the responsibility of the PRIF
+implementation. An 'X' in the "Fortran compiler" column indicates that the compiler has
+the primary responsibility for that task, while an 'X' in the "PRIF implementation"
+column indicates that the compiler will invoke the PRIF implementation to perform
+the task and the PRIF implementation has primary responsibility for the task's
+implementation. See the [Procedure descriptions](#prif-procedures)
+for the list of PRIF implementation procedures that the compiler will invoke.
+
+|                                                      Tasks                                                                       |  Fortran compiler  | PRIF implementation |
+|----------------------------------------------------------------------------------------------------------------------------------|--------------------|---------------------|
+| Establish and initialize static coarrays prior to `main`                                                                         |         X          |                     |
+| Track corank of coarrays                                                                                                         |         X          |                     |
+| Track local coarrays for implicit deallocation when exiting a scope                                                              |         X          |                     |
+| Initialize a coarray with `SOURCE=` as part of `ALLOCATE`                                                                        |         X          |                     |
+| Provide `prif_critical_type` coarrays for `CRITICAL`                                                                             |         X          |                     |
+| Provide final subroutine for all derived types that are finalizable or that have allocatable components that appear in a coarray |         X          |                     |
+| Track variable allocation status, including resulting from use of `MOVE_ALLOC`                                                   |         X          |                     |
+|                                                                                                                                  |                    |                     |
+| Intrinsics related to parallelism, e.g. `NUM_IMAGES`, `COSHAPE`, `IMAGE_INDEX`                                                   |                    |          X          |
+| Allocate and deallocate a coarray                                                                                                |                    |          X          |
+| Reference a _coindexed-named-object_                                                                                             |                    |          X          |
+| Team statements/constructs: `FORM TEAM`, `CHANGE TEAM`, `END TEAM`                                                                |                    |          X          |
+| Team stack abstraction                                                                                                           |                    |          X          |
+| Track coarrays for implicit deallocation at `END TEAM`                                                                           |                    |          X          |
+| Atomic subroutines, e.g. `ATOMIC_FETCH_ADD`                                                                                      |                    |          X          |
+| Collective subroutines, e.g. `CO_BROADCAST`, `CO_SUM`                                                                            |                    |          X          |
+| Synchronization statements, e.g. `SYNC ALL`, `SYNC TEAM`                                                                         |                    |          X          |
+| Events: `EVENT POST`, `EVENT WAIT`                                                                                               |                    |          X          |
+| Locks: `LOCK`, `UNLOCK`                                                                                                          |                    |          X          |
+| `CRITICAL` construct                                                                                                             |                    |          X          |
+| `NOTIFY WAIT` statement                                                                                                          |                    |          X          |
+
+| **NOTE**: Caffeine - LBNL's Implementation of the Parallel Runtime Interface for Fortran |
+| ---------------- |
+| Implementations for much of the Parallel Runtime Interface for Fortran exist in [Caffeine], a parallel runtime library supporting coarray Fortran compilers. Caffeine will continue to be developed in order to fully implement PRIF. Caffeine targets the [GASNet-EX] exascale networking middleware; however, PRIF is deliberately agnostic to details of the communication substrate. As such, it should be possible to develop PRIF implementations targeting other substrates, including the Message Passing Interface ([MPI]). |
+
+## How to read the PRIF specification
+
+The following types and procedures align with corresponding types and procedures
+from the Fortran standard. In many cases, the correspondence is clear from the identifiers.
+For example, the PRIF procedure `prif_num_images` corresponds to the intrinsic function
+`NUM_IMAGES` that is defined in the Fortran standard. In other cases, the correspondence
+may be less clear and is stated explicitly.
+
+In order to avoid redundancy, some details are not included below as the corresponding
+descriptions in the Fortran standard contain the detailed descriptions of what is
+required by the language. For example, this document references the term _coindexed-named-object_
+multiple times, but does not define it since it is part of the language and the Fortran
+standard defines it. As such, in order to fully understand the PRIF specification, it is
+critical to read and reference the Fortran standard alongside it. Additionally, the
+descriptions in the PRIF specification use language similar to that of the
+Fortran standard, including terms such as 'shall'. Where PRIF uses terms not defined in
+the standard, their definitions may be found in the [`Glossary`](#glossary).
+
+# PRIF Types and Constants
+
+## Fortran Intrinsic Derived Types
+
+These types will be defined by the PRIF implementation. The
+compiler will use these PRIF-provided implementation definitions for the corresponding
+types in the compiler's implementation of the `ISO_FORTRAN_ENV` module. This
+enables the internal structure of each given type to be tailored as needed for
+a given PRIF implementation.
+
+### `prif_team_type`
+
+* implementation for `TEAM_TYPE` from `ISO_FORTRAN_ENV`
+
+### `prif_event_type`
+
+* implementation for `EVENT_TYPE` from `ISO_FORTRAN_ENV`
+
+### `prif_lock_type`
+
+* implementation for `LOCK_TYPE` from `ISO_FORTRAN_ENV`
+
+### `prif_notify_type`
+
+* implementation for `NOTIFY_TYPE` from `ISO_FORTRAN_ENV`
+
+## Constants in `ISO_FORTRAN_ENV`
+
+These values will be defined in the PRIF implementation and it is proposed that the
+compiler will use a rename to use the PRIF implementation definitions for these
+values in the compiler's implementation of the `ISO_FORTRAN_ENV` module.
+
+### `PRIF_ATOMIC_INT_KIND`
+
+This shall be set to an implementation-defined value from the compiler-provided `INTEGER_KINDS`
+array.
+
+### `PRIF_ATOMIC_LOGICAL_KIND`
+
+This shall be set to an implementation-defined value from the compiler-provided `LOGICAL_KINDS`
+array.
+
+### `PRIF_CURRENT_TEAM`
+
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation. It shall be distinct from the values `PRIF_INITIAL_TEAM` and
+`PRIF_PARENT_TEAM`.
+
+### `PRIF_INITIAL_TEAM`
+
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation. It shall be distinct from the values `PRIF_CURRENT_TEAM` and
+`PRIF_PARENT_TEAM`.
+
+### `PRIF_PARENT_TEAM`
+
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation. It shall be distinct from the values `PRIF_CURRENT_TEAM` and
+`PRIF_INITIAL_TEAM`.
+
+### `PRIF_STAT_FAILED_IMAGE`
+
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation to be negative if the implementation cannot detect failed images
+and positive otherwise. It shall be distinct from all other stat constants
+defined by this specification.
+
+### `PRIF_STAT_LOCKED`
+
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation. It shall be distinct from all other stat constants
+defined by this specification.
+
+### `PRIF_STAT_LOCKED_OTHER_IMAGE`
+
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation. It shall be distinct from all other stat constants
+defined by this specification.
+
+### `PRIF_STAT_STOPPED_IMAGE`
+
+This shall be a positive value of type `integer(c_int)` that is defined by the
+implementation. It shall be distinct from all other stat constants
+defined by this specification.
+
+### `PRIF_STAT_UNLOCKED`
+
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation. It shall be distinct from all other stat constants
+defined by this specification.
+
+### `PRIF_STAT_UNLOCKED_FAILED_IMAGE`
+
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation. It shall be distinct from all other stat constants
+defined by this specification.
+
+## PRIF-Specific Constants
+
+This constant is not defined by the Fortran standard.
+
+### `PRIF_STAT_OUT_OF_MEMORY`
+
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation. It shall be distinct from all other stat constants
+defined by this specification. It shall indicate a low-memory condition
+and may be returned by `prif_allocate_coarray` or `prif_allocate`.
+
+## PRIF-Specific Types
+
+These derived types are defined by the PRIF implementation and the contents are
+opaque to the compiler. They don't correspond directly to types mandated
+by the Fortran specification, but rather are helper types used in PRIF to 
+provide the parallel Fortran features.
+
+### `prif_coarray_handle`
+
+* a derived type provided by the PRIF implementation whose contents are opaque to the
+  compiler. It represents a reference to a coarray descriptor and is passed 
+  back and forth across PRIF for coarray operations. 
+* Each coarray descriptor maintains some "context data" on a per-image basis, which the compiler may
+  use to support proper implementation of coarray arguments, especially with
+  respect to `MOVE_ALLOC` operations on allocatable coarrays.
+  This is accessed/set with the procedures `prif_get_context_handle` and
+  `prif_set_context_handle`. PRIF does not interpret the contents of this context data in
+  any way, and it is only accessible on the current image. The context data is
+  a property of the allocated coarray object, and is thus shared between all
+  handles and aliases that refer to the same coarray allocation (i.e. those
+  created from a call to `prif_alias_create`).
+
+### `prif_critical_type`
+
+* a derived type provided by the PRIF implementation that is opaque to the
+  compiler and is used for implementing `critical` blocks
+
+# PRIF Procedures
+
+**The PRIF API provides implementations of parallel Fortran features, as specified
+in Fortran 2023. For any given `prif_*` procedure that corresponds to a Fortran
+procedure or statement of similar name, the constraints and semantics match those
+of the analogous parallel Fortran feature, and the constraints and semantics
+associated with each argument to the `prif_*` procedure match those of the
+analogous argument to the parallel Fortran feature, except where this document
+explicitly specifies otherwise. In particular, any required synchronization
+is performed by the PRIF implementation unless otherwise specified.**
+
+| **IMPLEMENTATION NOTE**: |
+| ---------------- |
+| The PRIF API is defined as a set of Fortran language procedures and supporting types, and as such, an implementation of PRIF cannot be expressed solely in C/C++. However, C/C++ can be used to implement portions of the PRIF procedures via calls to BIND(C) procedures. |
+
+Where possible, optional arguments are used for optional parts or different
+forms of statements or procedures. In some cases the different forms or presence
+of certain options change the return type or rank, and in those cases a generic
+interface with different specific procedures is used.
+
+## Common Arguments
+
+There are multiple Common Arguments sections throughout the specification that
+outline details of the arguments that are common for the following sections
+of procedure interfaces.
+
+### Integer and Pointer Arguments
+
+There are several categories of arguments where the PRIF implementation will need
+pointers and/or integers. These fall broadly into the following categories.
+
+1. `integer(c_intptr_t)`: Anything containing a pointer representation where
+   the compiler might be expected to perform pointer arithmetic
+2. `type(c_ptr)` and `type(c_funptr)`: Anything containing a pointer to an
+   object/function where the compiler is expected only to pass it (back) to the
+   PRIF implementation
+3. `integer(c_size_t)`: Anything containing an object size, in units of bytes
+   or elements, e.g. shape, element_size
+4. `integer(c_ptrdiff_t)`: strides between elements for non-contiguous coarray
+   accesses
+5. `integer(c_int)`: Integer arguments corresponding to image index and
+   stat arguments. The most common integer arguments appearing in Fortran
+   code are expected to be of default integer kind, `c_int` is expected to
+   correspond to that kind, and there is no reason to expect these arguments
+   to have values that are not representable in this kind.
+6. `integer(c_intmax_t)`: Bounds, cobounds, indices, cosubscripts, and any other
----------------
ktras wrote:

**LBL response:** Because integer arguments to many Fortran intrinsics may be any size integer, PRIF needs to support the widest integer the compiler supports. This is by definition c_intmax_t, which POSIX defined for exactly this type of use case. This is often equivalent to c_int64_t, but toolchains are permitted to support even wider types here. Note that in particular, specifying c_size_t here as you suggest would prevent use of 64-bit cobounds and team_numbers on ILP32 platforms (which is still a meaningful concept).
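
The size relationship described above can be checked on a given toolchain. The following is a minimal sketch, not part of the PRIF specification: the program name `compare_c_kinds` is hypothetical, and it assumes a compiler whose `ISO_C_BINDING` module provides valid (non-negative) `c_int`, `c_size_t`, and `c_intmax_t` kind values. It prints the storage size in bits of each kind and reports whether `c_intmax_t` is at least as wide as `c_size_t`.

```fortran
! Illustrative check only; not part of the PRIF specification.
! Assumes c_int, c_size_t and c_intmax_t are all valid kinds here.
program compare_c_kinds
  use, intrinsic :: iso_c_binding, only: c_int, c_size_t, c_intmax_t
  implicit none

  ! storage_size returns the size, in bits, of a value of the given kind.
  print '(a,i0)', "bits in integer(c_int):      ", storage_size(0_c_int)
  print '(a,i0)', "bits in integer(c_size_t):   ", storage_size(0_c_size_t)
  print '(a,i0)', "bits in integer(c_intmax_t): ", storage_size(0_c_intmax_t)

  ! Reports T when c_intmax_t is at least as wide as c_size_t.
  print '(a,l1)', "c_intmax_t at least as wide as c_size_t: ", &
    storage_size(0_c_intmax_t) >= storage_size(0_c_size_t)
end program compare_c_kinds
```

On an ILP32 target one would expect `c_size_t` to be narrower than 64 bits, which is the case where using it for cobounds or team numbers would rule out 64-bit values.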

https://github.com/llvm/llvm-project/pull/76088

