[flang-commits] [clang-tools-extra] [flang] [llvm] Parallel runtime library design doc (PR #76088)

Brad Richardson via flang-commits flang-commits at lists.llvm.org
Wed Dec 20 10:04:43 PST 2023


https://github.com/everythingfunctional created https://github.com/llvm/llvm-project/pull/76088

This document specifies the interface design for supporting the parallel features of flang.

From 6d59127f993b915a79ad247a45d17f8d2d8f4c06 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 4 Jan 2023 17:09:40 -0800
Subject: [PATCH 01/33] Create first draft of the design doc for Coarray
 Fortran features.

---
 flang/docs/CoarrayFortranRuntime.md | 56 +++++++++++++++++++++++++++++
 1 file changed, 56 insertions(+)
 create mode 100644 flang/docs/CoarrayFortranRuntime.md

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
new file mode 100644
index 00000000000000..ef16c138905b79
--- /dev/null
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -0,0 +1,56 @@
+<!--===- docs/CoarrayFortranRuntime.md
+
+   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+   See https://llvm.org/LICENSE.txt for license information.
+   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+-->
+
+# Problem description
+  In order to be fully Fortran 2018 compliant, Flang needs to add support for what is commonly referred to as coarray fortran,
+  which includes features related to parallelism. These features include coarrays, teams, events, and the statements, intrinsic
+  subroutines and functions that support them. These statements include `sync-all-stmt`, `sync-images-stmt`, `sync-memory-stmt`,
+  `sync-team-stmt`, `event-post-stmt`, `event-wait-stmt`,` error-stop-stmt`, `lock-stmt`. The intrinsic functions include
+  `num_images`, `this_image`, `lcobound`, `ucobound`, `team_number`, `get_team`, `failed_images`, `stopped_images`,
+  `image_status`, `coshape`, `image_index`. The intrinsic subroutines include `event_query`, the collectives: `co_sum`, `co_max`,
+  `co_min`, `co_reduce`, `co_broadcast` and the atomics: `atomic_add`, `atomic_and`, `atomic_cas`, `atomic_define`,
+  `atomic_fetch_add`, `atomic_fetch_and`, `atomic_fetch_or`, `atomic_fetch_xor`, `atomic_or`, `atomic_ref`, `atomic_xor`.
+
+
+
+# Proposed solution
+  This design document proposes to use [Caffeine](https://github.com/berkeleylab/caffeine), a parallel runtime library, to handle the runtime support for coarray Fortran features.
+
+
+
+# Implementation details overview
+  This design document proposes the design of Flang features and discusses how Flang will interface with Caffeine, the
+  runtime library. It outlines which tasks will be the responsibility of Flang and which tasks will be the responsibility
+  of Caffeine.
+
+## Coarray Runtime Library Caffeine
+  Caffeine is a parallel runtime library that aims to support Fortran compilers with a programming-model-agnostic application
+  binary interface (ABI) to various communication libraries. Current work is on supporting the ABI with the GASNet-EX
+  exascale-ready networking middleware.
+
+## Delegation of tasks between Flang and Caffeine
+
+| Tasks | Flang | Caffeine |
+| ----  | ----- | -------- |
+| Track corank of coarrays                |     ✓     |           |
+| Track teams associated with a coarray   |     ✓     |           |
+| Assigning variables of type `team-type` |     ✓     |           |
+| Track when implicit coarray deallocation needs to occur when exiting a scope |     ✓     |           |
+| Implementing the intrinsic `coshape`    |     ?     |     ?     |
+| Team stack abstraction                  |           |     ✓     |
+| `form-team-stmt`                        |           |     ✓     |
+| `change-team-stmt`                      |           |     ✓     |
+| `end-team-stmt`                         |           |     ✓     |
+| Allocate a coarray                      |           |     ✓     |
+| Deallocate a coarray                    |           |     ✓     |
+| Reference a coarray                     |           |     ✓     |
+
+
+
+# Testing plan
+[tbd]

From 9cebd3dacb4009599a91f21bea75523b7a2da3aa Mon Sep 17 00:00:00 2001
From: Damian Rouson <rouson at lbl.gov>
Date: Thu, 5 Jan 2023 14:45:41 -0800
Subject: [PATCH 02/33] doc: edit Problem Description & Proposed Solution (#83)

* doc: edit Problem Description & Proposed Solution

This commit
1. Attempts to make the feature and statement list exhaustive.
2. Adds types, kind type parameters, and values that explicitly support parallelism.
3. Adds a list of image control statements, including those that pre-existed parallelism in Fortran.
4. Makes the use of statement syntax more uniform, e.g., switch `sync-all-stmt` to `sync all`.
5. Adds a few links.

* Update flang/docs/CoarrayFortranRuntime.md

Co-authored-by: Katherine Rasmussen <krasmussen at lbl.gov>

* Update flang/docs/CoarrayFortranRuntime.md

Co-authored-by: Katherine Rasmussen <krasmussen at lbl.gov>
---
 flang/docs/CoarrayFortranRuntime.md | 44 +++++++++++++++++++++--------
 1 file changed, 33 insertions(+), 11 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index ef16c138905b79..fda36b55abe72c 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -7,21 +7,37 @@
 -->
 
 # Problem description
-  In order to be fully Fortran 2018 compliant, Flang needs to add support for what is commonly referred to as coarray fortran,
-  which includes features related to parallelism. These features include coarrays, teams, events, and the statements, intrinsic
-  subroutines and functions that support them. These statements include `sync-all-stmt`, `sync-images-stmt`, `sync-memory-stmt`,
-  `sync-team-stmt`, `event-post-stmt`, `event-wait-stmt`,` error-stop-stmt`, `lock-stmt`. The intrinsic functions include
+  In order to be fully Fortran 2018 compliant, Flang needs to add support for what is commonly referred to as coarray fortran, which includes features related to parallelism. These features include the following statements, subroutines, functions, types, and kind type parameters:
+  
+  * **Statements:** 
+    - _Synchronization:_ `sync all`, `sync images`, `sync memory`, `sync team` 
+    - _Events:_ `event post`, `event wait`
+    - _Error termination:_ `error stop`
+    - _Locks:_ `lock`, `unlock` 
+    - _Failed images:_ `fail image`
+    - _Teams:_ `form team`, `change team`
+    - _Critical sections:_ `critical`, `end critical`
+  * **Intrinsic functions:**
   `num_images`, `this_image`, `lcobound`, `ucobound`, `team_number`, `get_team`, `failed_images`, `stopped_images`,
-  `image_status`, `coshape`, `image_index`. The intrinsic subroutines include `event_query`, the collectives: `co_sum`, `co_max`,
-  `co_min`, `co_reduce`, `co_broadcast` and the atomics: `atomic_add`, `atomic_and`, `atomic_cas`, `atomic_define`,
-  `atomic_fetch_add`, `atomic_fetch_and`, `atomic_fetch_or`, `atomic_fetch_xor`, `atomic_or`, `atomic_ref`, `atomic_xor`.
-
+  `image_status`, `coshape`, `image_index`
+  * **Intrinsic subroutines:**
+    - _Collective subroutines:_ `co_sum`, `co_max`, `co_min`, `co_reduce`, `co_broadcast`
+    - _Atomic subroutines:_ `atomic_add`, `atomic_and`, `atomic_cas`, `atomic_define`,
+  `atomic_fetch_add`, `atomic_fetch_and`, `atomic_fetch_or`, `atomic_fetch_xor`, `atomic_or`, `atomic_ref`, `atomic_xor`
+    - _Other subroutines:_ `event_query`
+  * **Types, kind type parameters, and values:**
+    - _Intrinsic derived types:_ `event_type`, `team_type`
+    - _Atomic kind type parameters:_ `atomic_int_kind` and `atomic_logical_kind`
+    - _Values:_ `stat_failed_image`, `stat_locked`, `stat_locked_other_image`, `stat_stopped_image`, `stat_unlocked`, `stat_unlocked_failed_image`
 
+In addition to being able to support syntax related to the above features, compilers will also need to be able to handle new execution concepts such as image control.  The image control concept affects the behaviors of some statements that were introduced in Fortran expressly for supporting parallel programming, but image control also affects the behavior of some statements that pre-existed parallelism in standard Fortran:
+ * **Image control statements:**
+   - _Pre-existing statements_: `allocate`, `deallocate`, `stop`, `end`, a `call` referencing `move_alloc` with coarray arguments
+   - _New statements:_ `sync all`, `sync images`, `sync memory`, `sync team`, `change team`, `end team`, `critical`, `end critical`, `event post`, `event wait`, `form team`, `lock`, `unlock`
+One consequence of the statements being categorizing statements as image control will be the need to restrict code movement by optimizing compilers.
 
 # Proposed solution
-  This design document proposes to use [Caffeine](https://github.com/berkeleylab/caffeine), a parallel runtime library, to handle the runtime support for coarray Fortran features.
-
-
+  This design document proposes an application programming interface (API) to support the above features.  Implementations of some parts of the API exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers.  By defining a library-agnostic API, we envision facilitating the development of alternative parallel runtime libraries that support the same API.  One benefit of this approach is the ability to vary the communication substrate.  For example, Caffeine uses the [GASNet-EX] exascale networking middleware, whereas it might also be possible to develop wrappers that would support the proposed API with [OpenCoarrays], which uses the Message Passing Interface ([MPI]). A central aim of this document is to use a parallel runtime API in standard Fortran syntax, which enables us to leverage the Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
 
 # Implementation details overview
   This design document proposes the design of Flang features and discusses how Flang will interface with Caffeine, the
@@ -54,3 +70,9 @@
 
 # Testing plan
 [tbd]
+
+[Caffeine]: https://go.lbl.gov/caffeine
+[GASNet-EX]: https://go.lbl.gov/gasnet
+[OpenCoarrays]: https://github.com/sourceryinstitute/opencoarrays
+[MPI]: https://www.mpi-forum.org
+[Rouson and Bonachea (2022)]: https://doi.org/10.25344/S4459B

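The image control statements listed in the patch above divide execution into segments. As a minimal sketch of why that restricts code movement (standard Fortran 2018 only; the program and variable names are illustrative and appear nowhere in the patch), each `sync all` below orders a remote write against the accesses around it, so a compiler must not hoist the final read above the second `sync all`:

```fortran
program image_control_sketch
  implicit none
  integer :: mailbox[*]   ! scalar coarray: one integer on every image
  integer :: me, n, right

  me = this_image()
  n = num_images()
  right = merge(1, me + 1, me == n)   ! right-hand neighbour, wrapping around

  mailbox = 0
  sync all   ! segment boundary: every image has initialised its mailbox

  mailbox[right] = me   ! coindexed assignment, i.e. a remote write ("put")
  sync all   ! segment boundary: all puts precede the reads below

  ! This local read is ordered after the neighbour's put by the sync all above.
  print '(a,i0,a,i0)', 'image ', me, ' received ', mailbox
end program image_control_sketch
```

With gfortran this should build as a single-image program using `-fcoarray=single`; multi-image runs need a parallel runtime such as the ones discussed in the document.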
From df4a4236c5d853ff42ad32251234dc7cf69e3bfb Mon Sep 17 00:00:00 2001
From: Dan Bonachea <dobonachea at lbl.gov>
Date: Mon, 9 Jan 2023 15:37:06 -0500
Subject: [PATCH 03/33] Clarify responsibility for implicit coarray
 deallocation

---
 flang/docs/CoarrayFortranRuntime.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index fda36b55abe72c..96f84139ddca60 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -56,8 +56,9 @@ One consequence of the statements being categorizing statements as image control
 | Track corank of coarrays                |     ✓     |           |
 | Track teams associated with a coarray   |     ✓     |           |
 | Assigning variables of type `team-type` |     ✓     |           |
-| Track when implicit coarray deallocation needs to occur when exiting a scope |     ✓     |           |
+| Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
 | Implementing the intrinsic `coshape`    |     ?     |     ?     |
+| Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
 | Team stack abstraction                  |           |     ✓     |
 | `form-team-stmt`                        |           |     ✓     |
 | `change-team-stmt`                      |           |     ✓     |

From 7489ff39a16db7c4bd7372e4a1b35878a18981c6 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 11 Jan 2023 18:24:18 -0800
Subject: [PATCH 04/33] Add draft for new section titled Compiler Facing
 Caffeine API.

---
 flang/docs/CoarrayFortranRuntime.md | 103 ++++++++++++++++++++++++++--
 1 file changed, 98 insertions(+), 5 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 96f84139ddca60..cc23bfa7bf0ab8 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -8,12 +8,12 @@
 
 # Problem description
   In order to be fully Fortran 2018 compliant, Flang needs to add support for what is commonly referred to as coarray fortran, which includes features related to parallelism. These features include the following statements, subroutines, functions, types, and kind type parameters:
-  
-  * **Statements:** 
-    - _Synchronization:_ `sync all`, `sync images`, `sync memory`, `sync team` 
+
+  * **Statements:**
+    - _Synchronization:_ `sync all`, `sync images`, `sync memory`, `sync team`
     - _Events:_ `event post`, `event wait`
     - _Error termination:_ `error stop`
-    - _Locks:_ `lock`, `unlock` 
+    - _Locks:_ `lock`, `unlock`
     - _Failed images:_ `fail image`
     - _Teams:_ `form team`, `change team`
     - _Critical sections:_ `critical`, `end critical`
@@ -56,6 +56,7 @@ One consequence of the statements being categorizing statements as image control
 | Track corank of coarrays                |     ✓     |           |
 | Track teams associated with a coarray   |     ✓     |           |
 | Assigning variables of type `team-type` |     ✓     |           |
+| Translate critical construct to lock/unlock |     ✓     |           |
 | Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
 | Implementing the intrinsic `coshape`    |     ?     |     ?     |
 | Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
@@ -65,9 +66,101 @@ One consequence of the statements being categorizing statements as image control
 | `end-team-stmt`                         |           |     ✓     |
 | Allocate a coarray                      |           |     ✓     |
 | Deallocate a coarray                    |           |     ✓     |
-| Reference a coarray                     |           |     ✓     |
+| Reference a coindexed-object           |           |     ✓     |
+
+Add to table: teams, events, synchronization statements, critical construct, locks
+
+## Compiler facing Caffeine API
+
+### Puts and Gets
+
+Current pseudo code. May not stay in design doc.
+
+```
+  module subroutine caf_put_blocking(coarray, coindices, target, value, team, team_number, stat)
+    implicit none
+    type(caf_co_handle), intent(in) :: coarray
+    integer, intent(in) :: coindices(:)
+    type(*), dimension(..), intent(in) :: target, value
+    type(team_type), optional, intent(in) :: team
+    integer, optional, intent(in) :: team_number
+    integer, optional, intent(out) :: stat
+  end subroutine
+
+  module subroutine caf_get_blocking(coarray, coindices, source, value, team, team_number, stat)
+    implicit none
+    type(caf_co_handle), intent(in) :: coarray
+    integer, intent(in) :: coindices(:)
+    type(*), dimension(..), intent(in) :: source
+    type(*), dimension(..), intent(inout) :: value
+    type(team_type), optional, intent(in) :: team
+    integer, optional, intent(in) :: team_number
+    integer, optional, intent(out) :: stat
+  end subroutine
+```
+  * **caf_put_blocking:**
+    -   Description: ...
+    -   Procedure Interface: `subroutine caf_put_blocking(coarray, coindices, target, value, team, team_number, stat)`
+
+  * **caf_get_blocking:**
+    -   Description: ...
+    -   Procedure Interface: `subroutine caf_get_blocking(coarray, coindices, source, value, team, team_number, stat)`
+
+  Arguments to `caf_put_blocking` and `caf_get_blocking`:
+
+| Argument | Type | Rank | Dimensions | Intent | Additional attributes | Notes |
+| -------- | ---- | ---- | ---------- | ------ | --------------------- | ----- |
+| `coarray` | `caf_co_handle` | 0 | n/a | `intent(in)` | n/a | caf_co_handle will be a derived type provided by Caffeine. This argument is a handle for the established coarray. This handle will be created when the coarray is established. |
+| `coindices` | `integer` | 1 | dimension(:) | `intent(in)` | n/a | ----- |
+| `target` (puts), `source` (gets) | `type(*)` | assumed | dimension(..) | `intent(in)` | n/a | ----- |
+| `value`  | `type(*)` | assumed | dimension(..) | `intent(in)` for puts, `intent(inout)` for gets | n/a | ----- |
+| `team` | `team_type` | 0 | n/a | `intent(in)` | optional | Both optional arguments `team` and `team_number` shall not be present in the same call|
+| `team_number` |  `integer` | 0 | n/a | `intent(in)` | optional | Both optional arguments `team` and `team_number` shall not be present in the same call|
+| `stat` | `integer` | 0 | n/a | `intent(out)` | optional | ----- |
+
+  * **Asynchrony:**
+    -   Could use a handle-based or a fence-based approach
+    -   Handle based - each call returns an individual operation handle, which the compiler synchronizes later
+    -   Fence based - operation handles are implicit, closer to MPI
+
+### Atomic subroutines
+
+  * **caf_atomic_define:**
+    -   Description: ...
+    -   Procedure Interface: ...
+    -   Arguments: ...
+
+  * **caf_atomic_ref:**
+    -   Description: ...
+    -   Procedure Interface: ...
+    -   Arguments: ...
+
+  * **caf_atomic_add:**
+    -   Description: Blocking atomic operation...
+    -   Procedure Interface:   `subroutine caf_atomic_add(coarray, coindices, offset, value, stat)` or `subroutine caf_atomic_add(coarray, coindices, target, value, stat)`
+    -   Arguments: ...
+
+
+Current pseudo code. May not stay in design doc.
 
+Option 1 with offset:
+```
+  module subroutine caf_atomic_add(coarray, coindices, offset, value, stat) ! blocking atomic operation
+    type(caf_co_handle) :: coarray
+    integer, intent(in) :: coindices(:)
+    integer :: offset
+    integer(kind=atomic_int_kind) :: value
+  end subroutine
+```
 
+Option 2 with target:
+```
+  module subroutine caf_atomic_add(coarray, coindices, target, value, stat) ! blocking atomic operation
+    type(caf_co_handle) :: coarray
+    integer, intent(in) :: coindices(:)
+    type(*), intent(in) :: target
+  end subroutine
+```
 
 # Testing plan
 [tbd]

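The argument table in the patch above says that `team` and `team_number` shall not both be present in the same call. A minimal sketch of how a runtime might enforce that rule follows. Everything in it is an assumption for illustration: the module, the procedure name, the selector constants, the integer stand-in for `type(team_type)`, and the nonzero error code are not part of the proposed interface. The choice to error-terminate when `stat` is absent follows the usual Fortran convention, which the draft does not yet specify.

```fortran
module caf_arg_check_sketch
  implicit none
  private
  public :: resolve_team_selector
  ! Hypothetical selector values, not part of the proposed interface.
  integer, parameter, public :: use_current_team = 0, use_team = 1, use_team_number = 2
contains
  ! Reports which team selector the caller supplied; flags an error if both were given.
  subroutine resolve_team_selector(selector, team_handle, team_number, stat)
    integer, intent(out) :: selector
    integer, optional, intent(in) :: team_handle   ! stand-in for type(team_type)
    integer, optional, intent(in) :: team_number
    integer, optional, intent(out) :: stat
    integer :: local_stat

    local_stat = 0
    selector = use_current_team
    if (present(team_handle) .and. present(team_number)) then
      local_stat = 1   ! hypothetical error code: both selectors present
    else if (present(team_handle)) then
      selector = use_team
    else if (present(team_number)) then
      selector = use_team_number
    end if

    if (present(stat)) then
      stat = local_stat
    else if (local_stat /= 0) then
      error stop "team and team_number shall not both be present"
    end if
  end subroutine resolve_team_selector
end module caf_arg_check_sketch

program demo_arg_check
  use caf_arg_check_sketch
  implicit none
  integer :: selector, stat

  call resolve_team_selector(selector, team_number=3, stat=stat)
  print *, selector == use_team_number, stat == 0

  call resolve_team_selector(selector, team_handle=1, team_number=3, stat=stat)
  print *, stat /= 0
end program demo_arg_check
```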
From 0af43d90def1674a8ea71b392a7331b7e7c8954b Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Sun, 29 Jan 2023 13:31:03 -0800
Subject: [PATCH 05/33] Add more drafts for allocating and deallocating
 coarrays.

---
 flang/docs/CoarrayFortranRuntime.md | 40 ++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index cc23bfa7bf0ab8..19fcc13f386978 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -58,7 +58,8 @@ One consequence of the statements being categorizing statements as image control
 | Assigning variables of type `team-type` |     ✓     |           |
 | Translate critical construct to lock/unlock |     ✓     |           |
 | Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
-| Implementing the intrinsic `coshape`    |     ?     |     ?     |
+| Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
+| Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index` and keeping track of corank    |     ?     |     ?     |
 | Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
 | Team stack abstraction                  |           |     ✓     |
 | `form-team-stmt`                        |           |     ✓     |
@@ -68,10 +69,47 @@ One consequence of the statements being categorizing statements as image control
 | Deallocate a coarray                    |           |     ✓     |
 | Reference a coindexed-object           |           |     ✓     |
 
+
 Add to table: teams, events, synchronization statements, critical construct, locks
 
 ## Compiler facing Caffeine API
 
+
+### Allocation and deallocation
+
+Draft:
+
+caf_allocate is called when the compiler wants to allocate a coarray, or when there is a statically declared coarray
+
+Compiler-tracks-codescriptor
+```
+  module subroutine caf_allocate(coarray_handle, local_slice)
+    implicit none
+    type(caf_co_handle), intent(out) :: coarray_handle
+    type(*), dimension(..), intent(inout) :: local_slice
+  end subroutine
+```
+In this case, compiler would provide `image_index`, `coshape`, `lcobound`, `ucobound` and keep track of corank
+
+Caffeine-tracks-codescriptor
+```
+  module subroutine caf_allocate(lbounds, sizes, coarray_handle, local_slice)
+    implicit none
+    type(caf_co_handle), intent(out) :: coarray_handle
+    type(*), dimension(..), intent(inout) :: local_slice
+    integer, dimension(:), intent(in) :: lbounds, sizes !precondition these args must be same size
+  end subroutine
+```
+In this case, Caffeine would provide `image_index`, `coshape`, `lcobound`, `ucobound` and keep track of corank
+
+
+```
+  module subroutine caf_deallocate(coarray_handles)
+    implicit none
+    type(caf_co_handle), dimension(:), intent(out) :: coarray_handles
+  end subroutine
+```
+
 ### Puts and Gets
 
 Current pseudo code. May not stay in design doc.

From 5a1046fcb08b07a336b1895461c5e94bd625f84d Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Tue, 7 Feb 2023 10:02:11 -0800
Subject: [PATCH 06/33] Update responsibilities table and add internal notes.

---
 flang/docs/CoarrayFortranRuntime.md | 30 ++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 19fcc13f386978..b1fcf29feb475e 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -59,7 +59,8 @@ One consequence of the statements being categorizing statements as image control
 | Translate critical construct to lock/unlock |     ✓     |           |
 | Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
 | Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
-| Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index` and keeping track of corank    |     ?     |     ?     |
+| Keeping track of corank |     ✓     |     ?      |
+| Implementing the intrinsics `coshape`, `lcobound`, `ucobound`, and `image_index`  |     ?     |     ?     |
 | Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
 | Team stack abstraction                  |           |     ✓     |
 | `form-team-stmt`                        |           |     ✓     |
@@ -89,7 +90,7 @@ Compiler-tracks-codescriptor
     type(*), dimension(..), intent(inout) :: local_slice
   end subroutine
 ```
-In this case, compiler would provide `image_index`, `coshape`, `lcobound`, `ucobound` and keep track of corank
+In this case, the compiler would provide `image_index`, `coshape`, `lcobound`, `ucobound`
 
 Caffeine-tracks-codescriptor
 ```
@@ -100,7 +101,7 @@ Caffeine-tracks-codescriptor
     integer, dimension(:), intent(in) :: lbounds, sizes !precondition these args must be same size
   end subroutine
 ```
-In this case, Caffeine would provide `image_index`, `coshape`, `lcobound`, `ucobound` and keep track of corank
+In this case, Caffeine would provide `image_index`, `coshape`, `lcobound`, `ucobound`
 
 
 ```
@@ -200,6 +201,29 @@ Option 2 with target:
   end subroutine
 ```
 
+
+
+## Berkeley Lab internal Notes: (REMOVE before submission)
+
+### `caf_co_handle`
+
+   The following is Fortran-heavy pseudocode, not the exact implementation we plan
+   ```
+   type caf_co_handle
+     type(c_ptr) :: base_addr
+     integer, allocatable, dimension(:) :: lbounds, sizes
+     !integer :: established_team ! probably not necessary unless we want to bounds check
+   end type
+   ```
+
+flexible array member in c
+
+### Caffeine internals for coarray accesses
+  A coarray access could start with image_index, called with the same coarray coindices and team identifier argument.
+  That call would return a single integer, which is the image number in that team.
+  That integer can then be passed to an internal query about the team and the global image index, and it can also be used for bounds checking.
+
+
 # Testing plan
 [tbd]
 

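The internal notes in the patch above describe mapping coindices to a single image number within a team. A minimal sketch of that mapping follows, using the column-major rule that standard Fortran defines for `image_index`. The function name, the `team_size` argument, and the treatment of the last entry of `sizes` as unbounded (mirroring the `*` final codimension) are assumptions made for this example, not part of the draft interface.

```fortran
module image_index_sketch
  implicit none
contains
  ! Maps cosubscripts to a 1-based image number in a team, or 0 when out of range.
  pure function image_from_coindices(coindices, lbounds, sizes, team_size) result(image)
    integer, intent(in) :: coindices(:), lbounds(:), sizes(:)
    integer, intent(in) :: team_size
    integer :: image
    integer :: i, stride, offset

    image = 0   ! 0 signals an invalid set of cosubscripts
    if (size(coindices) /= size(lbounds) .or. size(lbounds) /= size(sizes)) return

    stride = 1
    offset = 0
    do i = 1, size(coindices)
      if (coindices(i) < lbounds(i)) return
      ! The final codimension has no fixed extent; team_size bounds it below.
      if (i < size(coindices) .and. coindices(i) - lbounds(i) >= sizes(i)) return
      offset = offset + (coindices(i) - lbounds(i)) * stride
      stride = stride * sizes(i)
    end do

    if (offset + 1 > team_size) return
    image = offset + 1
  end function image_from_coindices
end module image_index_sketch

program demo_image_index
  use image_index_sketch
  implicit none
  ! Models a declaration such as: integer :: x[2, 0:2, *] running on 12 images.
  print *, image_from_coindices([2, 1, 2], [1, 0, 1], [2, 3, 2], 12)   ! 1 + 1*1 + 1*2 + 1*6 = 10
  print *, image_from_coindices([3, 0, 1], [1, 0, 1], [2, 3, 2], 12)   ! 0: first cosubscript out of range
end program demo_image_index
```

The same offset could also serve the bounds check mentioned in the notes, since an out-of-range result is reported before any communication is attempted.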
From 6b54a5deddac19507d244dae2c720301c359c1e3 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 1 Mar 2023 09:54:46 -0800
Subject: [PATCH 07/33] Update interfaces for puts and gets.

---
 flang/docs/CoarrayFortranRuntime.md | 70 ++++++++++++++++++++++++-----
 1 file changed, 58 insertions(+), 12 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index b1fcf29feb475e..6221d909064a0d 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -113,10 +113,19 @@ In this case, Caffeine would provide `image_index`, `coshape`, `lcobound`, `ucob
 
 ### Puts and Gets
 
+
+Semantics: Puts and gets will maintain serial dependencies for the issuing image.
+           A non-blocking get has to be started and finished in the same segment.
+           (same non-blocking semantics will likely apply to collectives, use caf_wait_for, caf_try_for, etc)
+           (should change team and critical be non-blocking? sync-all?)
+
 Current pseudo code. May not stay in design doc.
 
+Will have fence based puts
+split phased gets
+
 ```
-  module subroutine caf_put_blocking(coarray, coindices, target, value, team, team_number, stat)
+  module subroutine caf_put(coarray, coindices, team, team_number, target, value, stat)
     implicit none
     type(caf_co_handle), intent(in) :: coarray
     integer, intent(in) :: coindices(:)
@@ -126,24 +135,55 @@ Current pseudo code. May not stay in design doc.
     integer, optional, intent(out) :: stat
   end subroutine
 
-  module subroutine caf_get_blocking(coarray, coindices, source, value, team, team_number, stat)
+  ! any puts that are still in flight need to be committed
+  ! throw away any caches
+  ! not a synchronizing operation
+  ! caf_end_segment is a side effect of image control stmts
+  module subroutine caf_end_segment()
+    implicit none
+  end subroutine
+
+  module subroutine caf_get_blocking(coarray, coindices, team, team_number, source, value, stat)
     implicit none
     type(caf_co_handle), intent(in) :: coarray
     integer, intent(in) :: coindices(:)
-    type(*), dimension(..), intent(in) :: source
+    type(*), dimension(..), intent(in) :: source ! useful to get the "shape" of the thing, not the value of this dummy arg, compiler needs to ensure this dummy arg is not a copy for this strategy to work, compiler's codegen needs to ensure that this (and other subroutine calls) are not using copies for this arg
     type(*), dimension(..), intent(inout) :: value
     type(team_type), optional, intent(in) :: team
     integer, optional, intent(in) :: team_number
     integer, optional, intent(out) :: stat
   end subroutine
-```
-  * **caf_put_blocking:**
-    -   Description: ...
-    -   Procedure Interface: `subroutine caf_put_blocking(coarray, coindices, target, value, team, team_number, stat)`
 
+  module subroutine caf_get_non_blocking(coarray, coindices, team, team_number, source, value, stat, async_handle)
+    implicit none
+    type(caf_co_handle), intent(in) :: coarray
+    integer, intent(in) :: coindices(:)
+    type(*), dimension(..), intent(in) :: source
+    type(*), dimension(..), intent(inout) :: value ! may need asynchronous attribute or may be implicitly asynchronous
+    type(team_type), optional, intent(in) :: team
+    integer, optional, intent(in) :: team_number
+    integer, optional, intent(out) :: stat
+    type(caf_async_handle), intent(out) :: async_handle
+  end subroutine
+
+  ! waits until the operation completes
+  ! consumes handle
+  module subroutine caf_wait_for(async_handle)
+    implicit none
+    type(caf_async_handle), intent(inout) :: async_handle
+  end subroutine
+
+  ! consumes handle IF finished
+  module subroutine caf_try_for(async_handle, finished)
+    implicit none
+    type(caf_async_handle), intent(inout) :: async_handle
+    logical, intent(out) :: finished
+  end subroutine
+
+```
   * **caf_get_blocking:**
     -   Description: ...
-    -   Procedure Interface: `subroutine caf_get_blocking(coarray, coindices, source, value, team, team_number, stat)`
+    -   Procedure Interface: `subroutine caf_get_blocking(coarray, coindices, team, team_number, source, value, stat)`
 
   Arguments to `caf_put_blocking` and `caf_get_blocking`:
 
@@ -187,8 +227,7 @@ Option 1 with offset:
   module subroutine caf_atomic_add(coarray, coindices, offset, value, stat) ! blocking atomic operation
     type(caf_co_handle) :: coarray
     integer, intent(in) :: coindices(:)
-    integer :: offset
-    integer(kind=atomic_int_kind) :: value
+    integer :: offset, value, stat
   end subroutine
 ```
 
@@ -196,8 +235,9 @@ Option 2 with target:
 ```
   module subroutine caf_atomic_add(coarray, coindices, target, value, stat) ! blocking atomic operation
     type(caf_co_handle) :: coarray
-    integer, intent(in) :: coindices(:)
-    type(*), intent(in) :: target
+    integer, intent(in) :: coindices(:) ! names image num
+    integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
+    integer :: value, stat
   end subroutine
 ```
 
@@ -216,6 +256,12 @@ Option 2 with target:
    end type
    ```
 
+TODOs:
+    allow for non-blocking collective subroutines
+    need to be able to track puts in flight, may need a write buffer, record boundaries in a hash table struct
+    every single rma needs to check the table to see if there is a conflicting overlap
+    could add caching
+
 flexible array member in c
 
 ### Caffeine internals for coarray accesses

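The patch above gives `caf_wait_for` and `caf_try_for` different handle semantics: the first always consumes the handle, the second consumes it only if the operation has finished. The single-image mock below sketches that contract so the polling pattern can be seen end to end. It is not Caffeine code: `mock_async_handle`, `mock_start`, `mock_try_for`, `mock_wait_for`, and the countdown that stands in for a real in-flight get are all invented for this illustration.

```fortran
module async_handle_sketch
  implicit none
  private
  public :: mock_async_handle, mock_start, mock_try_for, mock_wait_for

  ! Hypothetical stand-in for caf_async_handle; a real runtime would wrap a
  ! communication-layer event rather than a countdown.
  type mock_async_handle
    logical :: active = .false.
    integer :: polls_left = 0
  end type mock_async_handle
contains
  ! Pretends to start a non-blocking get that finishes after polls_needed polls.
  subroutine mock_start(handle, polls_needed)
    type(mock_async_handle), intent(out) :: handle
    integer, intent(in) :: polls_needed
    handle%active = .true.
    handle%polls_left = polls_needed
  end subroutine mock_start

  ! Consumes the handle only if the operation has finished.
  subroutine mock_try_for(handle, finished)
    type(mock_async_handle), intent(inout) :: handle
    logical, intent(out) :: finished
    if (.not. handle%active) error stop "handle is not active"
    handle%polls_left = max(handle%polls_left - 1, 0)
    finished = handle%polls_left == 0
    if (finished) handle%active = .false.
  end subroutine mock_try_for

  ! Blocks until the operation finishes, then consumes the handle.
  subroutine mock_wait_for(handle)
    type(mock_async_handle), intent(inout) :: handle
    logical :: finished
    do
      call mock_try_for(handle, finished)
      if (finished) exit
    end do
  end subroutine mock_wait_for
end module async_handle_sketch

program demo_async_handle
  use async_handle_sketch
  implicit none
  type(mock_async_handle) :: handle
  logical :: finished
  integer :: overlapped

  ! Polling pattern: do local work until the get is reported finished.
  call mock_start(handle, polls_needed=3)
  overlapped = 0
  do
    call mock_try_for(handle, finished)
    if (finished) exit
    overlapped = overlapped + 1   ! local work overlapped with the get
  end do
  print *, 'units of overlapped work:', overlapped

  ! Blocking pattern: wait, after which the handle must not be reused.
  call mock_start(handle, polls_needed=2)
  call mock_wait_for(handle)
end program demo_async_handle
```

Under the rule stated in the patch, both the start and the completion of a non-blocking get would have to fall within the same segment, so neither loop may span an image control statement.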
From 84b5f603bc2d3122ef53dcc6b7866bd706a1ee1b Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 1 Mar 2023 13:45:04 -0800
Subject: [PATCH 08/33] Update description in proposed solution and add table
 of contents with the types, arguments, and procedures being proposed.

---
 flang/docs/CoarrayFortranRuntime.md | 91 ++++++++++++++++++++++++-----
 1 file changed, 75 insertions(+), 16 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 6221d909064a0d..b2a6517d511bac 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -26,7 +26,7 @@
   `atomic_fetch_add`, `atomic_fetch_and`, `atomic_fetch_or`, `atomic_fetch_xor`, `atomic_or`, `atomic_ref`, `atomic_xor`
     - _Other subroutines:_ `event_query`
   * **Types, kind type parameters, and values:**
-    - _Intrinsic derived types:_ `event_type`, `team_type`
+    - _Intrinsic derived types:_ `event_type`, `team_type`, `lock_type`
     - _Atomic kind type parameters:_ `atomic_int_kind` and `atomic_logical_kind`
     - _Values:_ `stat_failed_image`, `stat_locked`, `stat_locked_other_image`, `stat_stopped_image`, `stat_unlocked`, `stat_unlocked_failed_image`
 
@@ -34,20 +34,15 @@ In addition to being able to support syntax related to the above features, compi
  * **Image control statements:**
    - _Pre-existing statements_: `allocate`, `deallocate`, `stop`, `end`, a `call` referencing `move_alloc` with coarray arguments
    - _New statements:_ `sync all`, `sync images`, `sync memory`, `sync team`, `change team`, `end team`, `critical`, `end critical`, `event post`, `event wait`, `form team`, `lock`, `unlock`
-One consequence of the statements being categorizing statements as image control will be the need to restrict code movement by optimizing compilers.
+One consequence of the statements being categorized as image control statements will be the need to restrict code movement by optimizing compilers.
 
 # Proposed solution
-  This design document proposes an application programming interface (API) to support the above features.  Implementations of some parts of the API exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers.  By defining a library-agnostic API, we envision facilitating the development of alternative parallel runtime libraries that support the same API.  One benefit of this approach is the ability to vary the communication substrate.  For example, Caffeine uses the [GASNet-EX] exascale networking middleware, whereas it might also be possible to develop wrappers that would support the proposed API with [OpenCoarrays], which uses the Message Passing Interface ([MPI]). A central aim of this document is to use a parallel runtime API in standard Fortran syntax, which enables us to leverage the Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
+  This design document proposes an interface to support the above features, named the Fortran Parallel Runtime Interface.  Implementations of some parts of the interface exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers.  By defining a library-agnostic interface, we envision facilitating the development of alternative parallel runtime libraries that support the same interface.  One benefit of this approach is the ability to vary the communication substrate.  For example, Caffeine uses the [GASNet-EX] exascale networking middleware, whereas it might also be possible to develop wrappers that would support the proposed interface with [OpenCoarrays], which uses the Message Passing Interface ([MPI]). A central aim of this document is to use a parallel runtime interface in standard Fortran syntax, which enables us to leverage Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
 
-# Implementation details overview
-  This design document proposes the design of Flang features and discusses how Flang will interface with Caffeine, the
-  runtime library. It outlines which tasks will be the responsibility of Flang and which tasks will be the responsibility
-  of Caffeine.
+# Interface overview
+  This document proposes a design for the Fortran Parallel Runtime Interface. It outlines which tasks will be the responsibility of the Fortran compiler and which tasks will be the responsibility of the runtime library. For the rest of the document, we will refer to the design in terms of Flang and Caffeine.
 
-## Coarray Runtime Library Caffeine
-  Caffeine is a parallel runtime library that aims to support Fortran compilers with a programming-model-agnostic application
-  binary interface (ABI) to various communication libraries. Current work is on supporting the ABI with the GASNet-EX
-  exascale-ready networking middleware.
+## Fortran Parallel Runtime Interface
 
 ## Delegation of tasks between Flang and Caffeine
 
@@ -68,12 +63,69 @@ One consequence of the statements being categorizing statements as image control
 | `end-team-stmt`                         |           |     ✓     |
 | Allocate a coarray                      |           |     ✓     |
 | Deallocate a coarray                    |           |     ✓     |
-| Reference a coindexed-object           |           |     ✓     |
+| Reference a coindexed-object            |           |     ✓     |
 
 
 Add to table: teams, events, synchronization statements, critical construct, locks
 
-## Compiler facing Caffeine API
+## Compiler facing Caffeine interface
+
+### Types
+(TODO: add hyperlinks to the discussion of each type description)
+
+ Provided Fortran types
+   * `caf_event_type`
+   * `caf_team_type`
+   * `caf_lock_type`
+
+ Caffeine specific types
+   * `caf_co_handle`
+   * `caf_async_handle`
+   * `caf_source_loc`   (NOTE: REMOVE: something like this is needed for critical constructs) (Does compiler control implementation of the type, or just provide the information and Caffeine controls the implementation?) OR deal with critical constructs by rewriting critical constructs as blocks with lock and unlocks (BURDENSOME because lock_type has to be coarray, this is the rationale for not rewriting, if we need it)
+
+### Common arguments
+
+   `coarray`, `coindices`, `target`, `value`, `team`, `team_number`, `stat`
+
+### Procedures (just names)
+
+(TODO: add hyperlinks to the discussion of each procedure description)
+
+   Collectives
+     `caf_co_broadcast`, `caf_co_max`, `caf_co_min`, `caf_co_reduce`, `caf_co-sum`
+
+   Program startup and shutdown
+     `caf_init`, `caf_error_stop`, `caf_stop`, `caf_fail_image`, etc. (TODO, fill in)
+
+   Allocation and deallocation
+     `caf_allocate`, `caf_deallocate`
+
+   Coarray Access
+      `caf_put`, `caf_get_blocking`, `caf_get_async`
+
+   Operation Synchronization
+      `caf_async_wait_for`, `caf_async_try_for`, `caf_sync_memory`
+
+   Image Synchronization
+      `caf_sync_all`, `caf_sync_images`, `caf_lock`, `caf_unlock`, `caf_critical`
+
+   Events
+      `caf_event_post`, `caf_event_wait`, `caf_event_query`
+
+   Teams
+     `caf_change_team`, `caf_end_team`, `caf_form_team`, `caf_sync_team`, `caf_get_team`, `caf_team_number`
+
+   Atomic Memory Operation
+      `caf_atomic_add`, `caf_atomic_and`, `caf_atomic_cas`, `caf_atomic_define`, `caf_atomic_fetch_add`, `caf_atomic_fetch_and`, `caf_atomic_fetch_or`, `caf_atomic_fetch_xor`, `caf_atomic_or`, `caf_atomic_ref`, `caf_atomic_xor`
+
+   Coarray Queries
+     `caf_lcobound`, `caf_ucobound`, `caf_coshape`, `caf_image_index`
+
+   Image Queries
+     `caf_num_images`, `caf_this_image`, `caf_failed_images`, `caf_stopped_images`,`caf_image_status`
+
+
+### Procedure descriptions
 
 
 ### Allocation and deallocation
@@ -154,7 +206,7 @@ split phased gets
     integer, optional, intent(out) :: stat
   end subroutine
 
-  module subroutine caf_get_non_blocking(coarray, coindices, team, team_number, source, value, stat, async_handle)
+  module subroutine caf_get_async(coarray, coindices, team, team_number, source, value, stat, async_handle)
     implicit none
     type(caf_co_handle), intent(in) :: coarray
     integer, intent(in) :: coindices(:)
@@ -181,6 +233,12 @@ split phased gets
   end subroutine
 
 ```
+
+  * **caf_put:**
+    -   Description:
+    -   Procedure Interface: `subroutine caf_put(coarray, coindices, team, team_number, target, value, stat)`
+
+
   * **caf_get_blocking:**
     -   Description: ...
     -   Procedure Interface: `subroutine caf_get_blocking(coarray, coindices, team, team_number, source, value, stat)`
@@ -201,8 +259,8 @@ split phased gets
     -   Could be handle based or fence based approaches
     -   Handle based - return can individual operation handle, later on compiler synchronizes handle
     -   Fence based - implicit handle operations, closer to MPI
-
-### Atomic subroutines
+#
+## Atomic subroutines
 
   * **caf_atomic_define:**
     -   Description: ...
@@ -261,6 +319,7 @@ TODOs:
     need to be able to track puts in flight, may need a write buffer, record boundaries in a hash table struct
     every single rma needs to check the table to see if there is a conflicting overlap
     could add caching
+    if the constants (stat_failed_image, etc) are compiler provided, we need to get C access to these values
 
 flexible array member in c
 

>From d179564a8c189924f683a6625f9d396f19c59fb8 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 1 Mar 2023 16:56:00 -0800
Subject: [PATCH 09/33] First pass at updating formating of content describing
 the arguments and the procedures with links, etc.

---
 flang/docs/CoarrayFortranRuntime.md | 605 +++++++++++++++++++++-------
 1 file changed, 465 insertions(+), 140 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index b2a6517d511bac..8cec0adb6b445c 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -44,33 +44,7 @@ One consequence of the statements being categorized as image control statements
 
 ## Fortran Parallel Runtime Interface
 
-## Delegation of tasks between Flang and Caffeine
-
-| Tasks | Flang | Caffeine |
-| ----  | ----- | -------- |
-| Track corank of coarrays                |     ✓     |           |
-| Track teams associated with a coarray   |     ✓     |           |
-| Assigning variables of type `team-type` |     ✓     |           |
-| Translate critical construct to lock/unlock |     ✓     |           |
-| Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
-| Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
-| Keeping track of corank |     ✓     |     ?      |
-| Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index`  |     ?     |     ?     |
-| Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
-| Team stack abstraction                  |           |     ✓     |
-| `form-team-stmt`                        |           |     ✓     |
-| `change-team-stmt`                      |           |     ✓     |
-| `end-team-stmt`                         |           |     ✓     |
-| Allocate a coarray                      |           |     ✓     |
-| Deallocate a coarray                    |           |     ✓     |
-| Reference a coindexed-object            |           |     ✓     |
-
-
-Add to table: teams, events, synchronization statements, critical construct, locks
-
-## Compiler facing Caffeine interface
-
-### Types
+## Types
 (TODO: add hyperlinks to the discussion of each type description)
 
  Provided Fortran types
@@ -79,66 +53,472 @@ Add to table: teams, events, synchronization statements, critical construct, loc
    * `caf_lock_type`
 
  Caffeine specific types
-   * `caf_co_handle`
-   * `caf_async_handle`
-   * `caf_source_loc`   (NOTE: REMOVE: something like this is needed for critical constructs) (Does compiler control implementation of the type, or just provide the information and Caffeine controls the implementation?) OR deal with critical constructs by rewriting critical constructs as blocks with lock and unlocks (BURDENSOME because lock_type has to be coarray, this is the rationale for not rewriting, if we need it)
+   * `caf_co_handle_t`: a derived type that is provided by the runtime library and is opaque to the compiler.
+   * `caf_async_handle_t`
+   * `caf_source_loc`   (REMOVE_NOTE: something like this is needed for critical constructs) (Does compiler control implementation of the type, or just provide the information and Caffeine controls the implementation?) OR deal with critical constructs by rewriting critical constructs as blocks with lock and unlocks (BURDENSOME because lock_type has to be coarray, this is the rationale for not rewriting, if we need it)
 
-### Common arguments
+## Common arguments
 
-   `coarray`, `coindices`, `target`, `value`, `team`, `team_number`, `stat`
+   [`coarray_handle`](#coarray_handle), [`coindices`](#coindices), [`target`](#target), [`value`](#value), [`team`](#team), [`team_number`](#team_number), [`stat`](#stat)
 
-### Procedures (just names)
-
-(TODO: add hyperlinks to the discussion of each procedure description)
+## Procedures
 
    Collectives
-     `caf_co_broadcast`, `caf_co_max`, `caf_co_min`, `caf_co_reduce`, `caf_co-sum`
+     [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum)
 
    Program startup and shutdown
-     `caf_init`, `caf_error_stop`, `caf_stop`, `caf_fail_image`, etc. (TODO, fill in)
+     [`caf_init`](#caf_init), [`caf_finalize`](#caf_finalize), [`caf_error_stop`](#caf_error_stop), [`caf_stop`](#caf_stop), [`caf_fail_image`](#caf_fail_image)
 
    Allocation and deallocation
-     `caf_allocate`, `caf_deallocate`
+     [`caf_allocate`](#caf_allocate), [`caf_deallocate`](#caf_deallocate)
 
    Coarray Access
-      `caf_put`, `caf_get_blocking`, `caf_get_async`
+     [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
 
    Operation Synchronization
-      `caf_async_wait_for`, `caf_async_try_for`, `caf_sync_memory`
+     [`caf_async_wait_for`](#caf_async_wait_for), [`caf_async_try_for`](#caf_async_try_for), [`caf_sync_memory`](#caf_sync_memory)
 
    Image Synchronization
-      `caf_sync_all`, `caf_sync_images`, `caf_lock`, `caf_unlock`, `caf_critical`
+     [`caf_sync_all`](#caf_sync_all), [`caf_sync_images`](#caf_sync_images), [`caf_lock`](#caf_lock), [`caf_unlock`](#caf_unlock), [`caf_critical`](#caf_critical)
 
    Events
-      `caf_event_post`, `caf_event_wait`, `caf_event_query`
+     [`caf_event_post`](#caf_event_post), [`caf_event_wait`](#caf_event_wait), [`caf_event_query`](#caf_event_query)
 
    Teams
-     `caf_change_team`, `caf_end_team`, `caf_form_team`, `caf_sync_team`, `caf_get_team`, `caf_team_number`
+     [`caf_change_team`](#caf_change_team), [`caf_end_team`](#caf_end_team), [`caf_form_team`](#caf_form_team), [`caf_sync_team`](#caf_sync_team), [`caf_get_team`](#caf_get_team), [`caf_team_number`](#caf_team_number)
 
    Atomic Memory Operation
-      `caf_atomic_add`, `caf_atomic_and`, `caf_atomic_cas`, `caf_atomic_define`, `caf_atomic_fetch_add`, `caf_atomic_fetch_and`, `caf_atomic_fetch_or`, `caf_atomic_fetch_xor`, `caf_atomic_or`, `caf_atomic_ref`, `caf_atomic_xor`
+     [`caf_atomic_add`](#caf_atomic_add), [`caf_atomic_and`](#caf_atomic_and), [`caf_atomic_cas`](#caf_atomic_cas), [`caf_atomic_define`](#caf_atomic_define), [`caf_atomic_fetch_add`](#caf_atomic_fetch_add), [`caf_atomic_fetch_and`](#caf_atomic_fetch_and), [`caf_atomic_fetch_or`](#caf_atomic_fetch_or), [`caf_atomic_fetch_xor`](#caf_atomic_fetch_xor), [`caf_atomic_or`](#caf_atomic_or), [`caf_atomic_ref`](#caf_atomic_ref), [`caf_atomic_xor`](#caf_atomic_xor)
 
    Coarray Queries
-     `caf_lcobound`, `caf_ucobound`, `caf_coshape`, `caf_image_index`
+     [`caf_lcobound`](#caf_lcobound), [`caf_ucobound`](#caf_ucobound), [`caf_coshape`](#caf_coshape), [`caf_image_index`](#caf_image_index)
 
    Image Queries
-     `caf_num_images`, `caf_this_image`, `caf_failed_images`, `caf_stopped_images`,`caf_image_status`
+     [`caf_num_images`](#caf_num_images), [`caf_this_image`](#caf_this_image), [`caf_failed_images`](#caf_failed_images), [`caf_stopped_images`](#caf_stopped_images), [`caf_image_status`](#caf_image_status)
+
+
+## Common arguments' descriptions
+
+ ### `coarray_handle`
+   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
+   * scalar of type `caf_co_handle_t`
+   * This argument is a handle for the established coarray. The handle will be created when the coarray is established.
+ ### `coarray_handles`
+   * array of type `caf_co_handle_t`
+ ### `coindices`
+   * 1d array of type `integer`, dimension(:)
+ ### `target`
+   * assumed-rank array of `type(*)`, `dimension(..)`
+ ### `value`
+ ### `team`
+ ### `team_number`
+ ### `stat`
+  * Argument for [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum)
+  * of type `integer`
+  * if no error condition occurs on that image, it is assigned the value `0`
+
+## Procedure descriptions
+
+### Collectives
+
+ #### `caf_co_broadcast`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**: [`stat`](#stat)
+
+ #### `caf_co_max`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**: [`stat`](#stat)
+
+ #### `caf_co_min`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**: [`stat`](#stat)
+
+ #### `caf_co_reduce`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**: [`stat`](#stat)
+
+ #### `caf_co_sum`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**: [`stat`](#stat)
+
+### Program startup and shutdown
+
+  When the compiler identifies a program that uses "Coarray Fortran" features, it will insert calls to `caf_init` and `caf_finalize`. These procedures ...
+
+ #### `caf_init`
+  * **Description**: (REMOVE_NOTE: should it be caf_caffeinate?)
+  * **Procedure Interface**: `function caf_init() result(exit_code)`
+  * **Result**: `exit_code` is an `integer` whose value ...
+
+ #### `caf_finalize`
+  * **Description**: (REMOVE_NOTE: should it be caf_decaffeinate?)
+  * **Procedure Interface**: `subroutine caf_finalize(exit_code)`
+  * **Arguments**: `exit_code` is `intent(in)` and ...
+
+ #### `caf_error_stop`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_stop`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_fail_image`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
 
+### Allocation and deallocation
 
-### Procedure descriptions
+ #### `caf_allocate`
+  * **Description**: Calls to `caf_allocate` will be inserted when the compiler wants to allocate a coarray or when there is a statically declared coarray. This procedure allocates memory for a coarray.
+  * **Procedure Interface**: Two options (see the linked pseudo code), depending on the as-yet-undecided choice of whether the compiler or the runtime library implements `image_index`, `coshape`, `lcobound`, and `ucobound`
+  * **Arguments**:
+  * [caf_allocate pseudo code](#caf_allocate-pseudo-code) (temporarily in design doc)
+
+ #### `caf_deallocate`
+  * **Description**:
+  * **Procedure Interface**: `subroutine caf_deallocate(coarray_handles)`
+  * **Arguments**: `coarray_handles` is `intent(out)`
+
+### Coarray Access
+
+ Coarray accesses will maintain serial dependencies for the issuing image. A non-blocking get has to be started and finished in the same segment. The interface provides puts that are fence-based and gets that are split-phased.
+
+
+ #### `caf_put`
+  * **Description**:
+  * **Procedure Interface**: `subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)`
+  * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`target`](#target) is `intent(in)`
+  * [caf_put pseudo code](#caf_put-pseudo-code) (temporarily in design doc)
+
+ #### `caf_get_blocking`
+  * **Description**:
+  * **Procedure Interface**: `subroutine caf_get_blocking(coarray_handle, coindices, team, team_number, source, value, stat)`
+  * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, `source` is `intent(in)`
+
+ #### `caf_get_async`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+###  Operation Synchronization
+
+ #### `caf_async_wait_for`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+  * [caf_async_wait_for pseudo code](#caf_async_wait_for-pseudo-code) (temporarily in design doc)
+
+ #### `caf_async_try_for`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+  * [caf_async_try_for pseudo code](#caf_async_try_for-pseudo-code) (temporarily in design doc)
+
+ #### `caf_sync_memory`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+### Image Synchronization
+
+ #### `caf_sync_all`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_sync_images`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_lock`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_unlock`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_critical`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+### Events
+
+ #### `caf_event_post`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_event_wait`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_event_query`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+### Teams
+
+ #### `caf_change_team`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_end_team`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_form_team`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_sync_team`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_get_team`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_team_number`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+### Atomic Memory Operation
+
+ #### `caf_atomic_add`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_atomic_and`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_atomic_cas`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_atomic_define`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_atomic_fetch_add`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_atomic_fetch_and`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_atomic_fetch_or`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_atomic_fetch_xor`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_atomic_or`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_atomic_ref`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_atomic_xor`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+### Coarray Queries
+
+ #### `caf_lcobound`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_ucobound`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_coshape`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_image_index`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+### Image Queries
+
+ #### `caf_num_images`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_this_image`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_failed_images`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_stopped_images`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
+ #### `caf_image_status`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
 
 
 ### Allocation and deallocation
 
+
+```
+  module subroutine caf_deallocate(coarray_handles)
+    implicit none
+    type(caf_co_handle_t), dimension(:), intent(out) :: coarray_handles
+  end subroutine
+```
+
+### Puts and Gets
+
+  Arguments to `caf_put` and `caf_get_blocking`:
+
+REMOVE_NOTE: remove the following table once its information has been integrated into the argument descriptions above
+
+| Argument | Type | Rank | Dimensions | Intent | Additional attributes | Notes |
+| -------- | ---- | ---- | ---------- | ------ | --------------------- | ----- |
+| `target` | `type(*)` | assumed | dimension(..) | `intent(in)` | n/a | ----- |
+| `value`  | `type(*)` | assumed | dimension(..) | `intent(in)` for puts, `intent(inout)` for gets | n/a | ----- |
+| `team` | `team_type` | 0 | n/a | `intent(in)` | optional | Both optional arguments `team` and `team_number` shall not be present in the same call|
+| `team_number` |  `integer` | 0 | n/a | `intent(in)` | optional | Both optional arguments `team` and `team_number` shall not be present in the same call|
+| `stat` | `integer` | 0 | n/a | `intent(out)` | optional | ----- |
+
+  * **Asynchrony:**
+    -   Could be handle based or fence based approaches
+    -   Handle based - return an individual operation handle; the compiler synchronizes on the handle later
+    -   Fence based - implicit handle operations, closer to MPI
+#
+## Atomic subroutines
+
+  * **caf_atomic_define:**
+    -   Description: ...
+    -   Procedure Interface: ...
+    -   Arguments: ...
+
+  * **caf_atomic_ref:**
+    -   Description: ...
+    -   Procedure Interface: ...
+    -   Arguments: ...
+
+  * **caf_atomic_add:**
+    -   Description: Blocking atomic operation...
+    -   Procedure Interface:   `subroutine caf_atomic_add(coarray_handle, coindices, offset, value, stat)` or `subroutine caf_atomic_add(coarray_handle, coindices, target, value, stat)`
+    -   Arguments: ...
+
+
+Current pseudo code. May not stay in design doc.
+
+Option 1 with offset:
+```
+  module subroutine caf_atomic_add(coarray_handle, coindices, offset, value, stat) ! blocking atomic operation
+    type(caf_co_handle_t) :: coarray_handle
+    integer, intent(in) :: coindices(:)
+    integer :: offset, value, stat
+  end subroutine
+```
+
+Option 2 with target:
+```
+  module subroutine caf_atomic_add(coarray_handle, coindices, target, value, stat) ! blocking atomic operation
+    type(caf_co_handle_t) :: coarray_handle
+    integer, intent(in) :: coindices(:) ! names image num
+    integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
+    integer :: value, stat
+  end subroutine
+```
+
+
+## Delegation of tasks between Flang and Caffeine
+
+| Tasks | Flang | Caffeine |
+| ----  | ----- | -------- |
+| Track corank of coarrays                |     ✓     |           |
+| Track teams associated with a coarray   |     ✓     |           |
+| Assigning variables of type `team-type` |     ✓     |           |
+| Translate critical construct to lock/unlock |     ✓     |           |
+| Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
+| Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
+| Keeping track of corank |     ✓     |     ?      |
+| Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index`  |     ?     |     ?     |
+| Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
+| Team stack abstraction                  |           |     ✓     |
+| `form-team-stmt`                        |           |     ✓     |
+| `change-team-stmt`                      |           |     ✓     |
+| `end-team-stmt`                         |           |     ✓     |
+| Allocate a coarray                      |           |     ✓     |
+| Deallocate a coarray                    |           |     ✓     |
+| Reference a coindexed-object            |           |     ✓     |
+
+
+Add to table: teams, events, synchronization statements, critical construct, locks
+
+
+
+
+
+Current pseudo code. May not stay in design doc.
+
 Draft:
 
-caf_allocate is called when the compiler wants to allocate a coarray, or when there is a statically declared coarray
+#### caf_allocate pseudo code
 
 Compiler-tracks-codescriptor
 ```
   module subroutine caf_allocate(coarray_handle, local_slice)
     implicit none
-    type(caf_co_handle), intent(out) :: coarray_handle
+    type(caf_co_handle_t), intent(out) :: coarray_handle
     type(*), dimension(..), intent(inout) :: local_slice
   end subroutine
 ```
@@ -148,7 +528,7 @@ Caffeine-tracks-codescriptor
 ```
   module subroutine caf_allocate(lbounds, sizes, coarray_handle, local_slice)
     implicit none
-    type(caf_co_handle), intent(out) :: coarray_handle
+    type(caf_co_handle_t), intent(out) :: coarray_handle
     type(*), dimension(..), intent(inout) :: local_slice
     integer, dimension(:), intent(in) :: lbounds, sizes !precondition these args must be same size
   end subroutine
@@ -156,37 +536,23 @@ Caffeine-tracks-codescriptor
 In this case, Caffeine would provide `image_index`, `coshape`, `lcobound`, `ucobound`
 
 
-```
-  module subroutine caf_deallocate(coarray_handles)
-    implicit none
-    type(caf_co_handle), dimension(:), intent(out) :: coarray_handles
-  end subroutine
-```
-
-### Puts and Gets
-
-
-Semantics: Puts and gets will maintain serial dependencies for the issuing image.
-           A non-blocking get has to be started and finished in the same segment.
-           (same non-blocking semantics will likely apply to collectives, use caf_wait_for, caf_try_for, etc)
-           (should change team and critical be non-blocking? sync-all?)
-
-Current pseudo code. May not stay in design doc.
-
-Will have fence based puts
-split phased gets
+#### caf_put pseudo code
 
 ```
-  module subroutine caf_put(coarray, coindices, team, team_number, target, value, stat)
+  module subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)
     implicit none
-    type(caf_co_handle), intent(in) :: coarray
+    type(caf_co_handle_t), intent(in) :: coarray_handle
     integer, intent(in) :: coindices(:)
     type(*), dimension(..), intent(in) :: target, value
     type(team_type), optional, intent(in) :: team
     integer, optional, intent(in) :: team_number
     integer, optional, intent(out) :: stat
   end subroutine
+```
 
+#### caf_end_segment pseudo code
+
+```
   ! any puts that are still in flight need to commited
   ! throw away any caches
   ! not synchronizing operation
@@ -194,10 +560,14 @@ split phased gets
   module subroutine caf_end_segment()
     implicit none
   end subroutine
+```
 
-  module subroutine caf_get_blocking(coarray, coindices, team, team_number, source, value, stat)
+#### caf_get_blocking pseudo code
+
+```
+  module subroutine caf_get_blocking(coarray_handle, coindices, team, team_number, source, value, stat)
     implicit none
-    type(caf_co_handle), intent(in) :: coarray
+    type(caf_co_handle_t), intent(in) :: coarray_handle
     integer, intent(in) :: coindices(:)
     type(*), dimension(..), intent(in) :: source ! useful to get the "shape" of the thing, not the value of this dummy arg, compiler needs to ensure this dummy arg is not a copy for this strategy to work, compiler's codegen needs to ensure that this (and other subroutine calls) are not using copies for this arg
     type(*), dimension(..), intent(inout) :: value
@@ -205,109 +575,64 @@ split phased gets
     integer, optional, intent(in) :: team_number
     integer, optional, intent(out) :: stat
   end subroutine
+```
 
-  module subroutine caf_get_async(coarray, coindices, team, team_number, source, value, stat, async_handle)
+#### caf_get_async pseudo code
+
+```
+  module subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)
     implicit none
-    type(caf_co_handle), intent(in) :: coarray
+    type(caf_co_handle_t), intent(in) :: coarray_handle
     integer, intent(in) :: coindices(:)
     type(*), dimension(..), intent(in) :: source
     type(*), dimension(..), intent(inout) :: value ! may need asynchronous attribute or may be implicitly asynchronous
     type(team_type), optional, intent(in) :: team
     integer, optional, intent(in) :: team_number
     integer, optional, intent(out) :: stat
-    type(caf_async_handle), intent(out) :: async_handle
+    type(caf_async_handle_t), intent(out) :: async_handle
   end subroutine
+```
 
+#### caf_async_wait_for pseudo code
+
+```
   ! waits until operation
   ! consumes handle
   module subroutine caf_wait_for(async_handle)
     implicit none
-    type(caf_async_handle), intent(inout) :: async_handle
+    type(caf_async_handle_t), intent(inout) :: async_handle
   end subroutine
+```
 
+#### caf_async_try_for pseudo code
+
+```
   ! consumes handle IF finished
   module subroutine caf_try_for(async_handle, finished)
     implicit none
-    type(caf_async_handle), intent(inout) :: async_handle
+    type(caf_async_handle_t), intent(inout) :: async_handle
     logical, intent(out) :: finished
   end subroutine
 
 ```
 
-  * **caf_put:**
-    -   Description:
-    -   Procedure Interface: `subroutine caf_put(coarray, coindices, team, team_number, target, value, stat)`
 
 
-  * **caf_get_blocking:**
-    -   Description: ...
-    -   Procedure Interface: `subroutine caf_get_blocking(coarray, coindices, team, team_number, source, value, stat)`
-
-  Arguments to `caf_put_blocking` and `caf_get_blocking`:
-
-| Argument | Type | Rank | Dimensions | Intent | Additional attributes | Notes |
-| -------- | ---- | ---- | ---------- | ------ | --------------------- | ----- |
-| `coarray` | `caf_co_handle` | 0 | n/a | `intent(in)` | n/a | caf_co_handle will be a derived type provided by Caffeine. This argument is a handle for the established coarray. This handle will be created when the coarray is established. |
-| `coindices` | `integer` | 1 | dimension(:) | `intent(in)` | n/a | ----- |
-| `target` | `type(*)` | 1 | dimension(..) | `intent(in)` | n/a | ----- |
-| `value`  | `type(*)` | 1 | dimension(..) | `intent(in)` for gets, `intent(inout)` for puts | n/a | ----- |
-| `team` | `team_type` | 0 | n/a | `intent(in)` | optional | Both optional arguments `team` and `team_number` shall not be present in the same call|
-| `team_number` |  `integer` | 0 | n/a | `intent(in)` | optional | Both optional arguments `team` and `team_number` shall not be present in the same call|
-| `stat` | `integer` | 0 | n/a | `intent(out)` | optional | ----- |
 
-  * **Asynchrony:**
-    -   Could be handle based or fence based approaches
-    -   Handle based - return can individual operation handle, later on compiler synchronizes handle
-    -   Fence based - implicit handle operations, closer to MPI
-#
-## Atomic subroutines
 
-  * **caf_atomic_define:**
-    -   Description: ...
-    -   Procedure Interface: ...
-    -   Arguments: ...
 
-  * **caf_atomic_ref:**
-    -   Description: ...
-    -   Procedure Interface: ...
-    -   Arguments: ...
-
-  * **caf_atomic_add:**
-    -   Description: Blocking atomic operation...
-    -   Procedure Interface:   `subroutine caf_atomic_add(coarray, coindicies, offset, value, stat)` or `subroutine caf_atomic_add(coarray, coindicies, target, value, stat)`
-    -   Arguments: ...
-
-
-Current pseudo code. May not stay in design doc.
-
-Option 1 with offset:
-```
-  module subroutine caf_atomic_add(coarray, coindicies, offset, value, stat) ! blocking atomic operation
-    type(caf_co_handle) :: coarray
-    integer, intent(in) :: coindices(:)
-    integer :: offset, value, stat
-  end subroutine
-```
-
-Option 2 with target:
-```
-  module subroutine caf_atomic_add(coarray, coindicies, target, value, stat) ! blocking atomic operation
-    type(caf_co_handle) :: coarray
-    integer, intent(in) :: coindices(:) ! names image num
-    integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
-    integer :: value, stat
-  end subroutine
-```
+## Berkeley Lab internal Notes: (REMOVE_NOTES before submission)
 
+Same non-blocking semantics (has to be started and finished in the same segment) will likely apply to collectives, use caf_wait_for, caf_try_for, etc
+Should change team and critical be non-blocking? sync-all?)
 
 
-## Berkeley Lab internal Notes: (REMOVE before submission)
 
-### `caf_co_handle`
+### `caf_co_handle_t`
 
    The following is a Fortran heavy pseudo code, not the exact implementation we plan
    ```
-   type caf_co_handle
+   type caf_co_handle_t
      type(c_ptr) :: base_addr
      integer, allocatable, dimension(:) :: lbounds, sizes
      !integer :: established_team ! probably not necessary unless we want to bounds check

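As a reading aid outside the patch itself: the `caf_co_handle_t` draft above stores `lbounds` and `sizes`. A minimal sketch of how a runtime could map `coindices` to an image index from those two arrays follows. It assumes that `lbounds` and `sizes` hold the lower cobounds and the extent of each codimension, which the draft does not state. The names `co_handle_sketch_t` and `image_index_of` are hypothetical, and `base_addr` is left out.

```fortran
module co_handle_sketch
  implicit none
  private
  public :: co_handle_sketch_t, image_index_of

  ! Hypothetical stand-in for caf_co_handle_t (base_addr omitted).
  type :: co_handle_sketch_t
    integer, allocatable :: lbounds(:), sizes(:)
  end type

contains

  ! Column-major mapping from cosubscripts to a 1-based image index.
  ! Returns 0 when the cosubscripts are out of range (an assumed convention).
  pure function image_index_of(handle, coindices) result(image)
    type(co_handle_sketch_t), intent(in) :: handle
    integer, intent(in) :: coindices(:)
    integer :: image
    integer :: i, stride

    image = 0
    if (size(coindices) /= size(handle%lbounds)) return

    image = 1
    stride = 1
    do i = 1, size(coindices)
      if (coindices(i) < handle%lbounds(i) .or. &
          coindices(i) >= handle%lbounds(i) + handle%sizes(i)) then
        image = 0
        return
      end if
      image = image + (coindices(i) - handle%lbounds(i)) * stride
      stride = stride * handle%sizes(i)
    end do
  end function

end module

program demo
  use co_handle_sketch
  implicit none
  type(co_handle_sketch_t) :: h

  h%lbounds = [1, 0]   ! codimensions [1:2, 0:2]
  h%sizes   = [2, 3]
  print *, image_index_of(h, [2, 1])   ! 1 + (2-1)*1 + (1-0)*2 = 4
end program
```

A real runtime would also have to bound the final codimension by the number of images in the relevant team, which this sketch ignores.
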
>From 081cd8535fb3a5affb79ff2ca4bb3f9f840f4b33 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Thu, 2 Mar 2023 15:52:15 -0800
Subject: [PATCH 10/33] Add descriptions of the types. Add more procedure
 descriptions. Remove arguments table after distributing all of its
 information in the new format of describing procedures.

---
 flang/docs/CoarrayFortranRuntime.md | 240 +++++++++++++---------------
 1 file changed, 114 insertions(+), 126 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 8cec0adb6b445c..bb3a782d117fa2 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -5,6 +5,8 @@
    SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
 -->
+# THIS IS A WORK IN PROGRESS - DECISIONS REGARDING THE DESIGNS DISCUSSED IN THIS DOCUMENT ARE ONGOING AND MAY CHANGE
+
 
 # Problem description
   In order to be fully Fortran 2018 compliant, Flang needs to add support for what is commonly referred to as coarray fortran, which includes features related to parallelism. These features include the following statements, subroutines, functions, types, and kind type parameters:
@@ -45,17 +47,10 @@ One consequence of the statements being categorized as image control statements
 ## Fortran Parallel Runtime Interface
 
 ## Types
-(TODO: add hyperlinks to the discussion of each type description)
 
- Provided Fortran types
-   * `caf_event_type`
-   * `caf_team_type`
-   * `caf_lock_type`
+ **Provided Fortran types:** [`caf_event_type`](#caf_event_type), [`caf_team_type`](#caf_team_type), [`caf_lock_type`](#caf_lock_type)
 
- Caffeine specific types
-   * `caf_co_handle_t` : `caf_co_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler.
-   * `caf_async_handle_t`
-   * `caf_source_loc`   (REMOVE_NOTE: something like this is needed for critical constructs) (Does compiler control implementation of the type, or just provide the information and Caffeine controls the implementation?) OR deal with critical constructs by rewriting critical constructs as blocks with lock and unlocks (BURDENSOME because lock_type has to be coarray, this is the rationale for not rewriting, if we need it)
+ **Caffeine specific types:** [`caf_co_handle_t`](#caf_co_handle_t), [`caf_async_handle_t`](#caf_async_handle_t), [`caf_source_loc_t`](#caf_source_loc_t)
 
 ## Common arguments
 
@@ -63,59 +58,97 @@ One consequence of the statements being categorized as image control statements
 
 ## Procedures
 
-   Collectives
+   **Collectives:**
      [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum)
 
-   Program startup and shutdown
+   **Program startup and shutdown:**
      [`caf_init`](#caf_init), [`caf_finalize`](#caf_finalize), [`caf_error_stop`](#caf_error_stop), [`caf_stop`](#caf_stop), [`caf_fail_image`](#caf_fail_image)
 
-   Allocation and deallocation
+   **Allocation and deallocation:**
      [`caf_allocate`](#caf_allocate), [`caf_deallocate`](#caf_deallocate)
 
-   Coarray Access
+   **Coarray Access:**
      [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
 
-   Operation Synchronization
+   **Operation Synchronization:**
      [`caf_async_wait_for`](#caf_aync_wait_for), [`caf_async_try_for`](#caf_async_try_for), [`caf_sync_memory`](#caf_sync_memory)
 
-   Image Synchronization
+   **Image Synchronization:**
      [`caf_sync_all`](#caf_sync_all), [`caf_sync_images`](#caf_sync_images), [`caf_lock`](#caf_lock), [`caf_unlock`](#caf_unlock), [`caf_critical`](#caf_critical)
 
-   Events
+   **Events:**
      [`caf_event_post`](#caf_event_post), [`caf_event_wait`](#caf_event_wait), [`caf_event_query`](#caf_event_query)
 
-   Teams
+   **Teams:**
      [`caf_change_team`](#caf_change_team), [`caf_end_team`](#caf_end_team), [`caf_form_team`](#caf_form_team), [`caf_sync_team`](#caf_sync_team), [`caf_get_team`](#caf_get_team), [`caf_team_number`](#caf_team_number)
 
-   Atomic Memory Operation
+   **Atomic Memory Operation:**
      [`caf_atomic_add`](#caf_atomic_add), [`caf_atomic_and`](#caf_atomic_and), [`caf_atomic_cas`](#caf_atomic_cas), [`caf_atomic_define`](#caf_atomic_define), [`caf_atomic_fetch_add`](#caf_atomic_fetch_add), [`caf_atomic_fetch_and`](#caf_atomic_fetch_and), [`caf_atomic_fetch_or`](#caf_atomic_fetch_or), [`caf_atomic_fetch_xor`](#caf_atomic_fetch_xor), [`caf_atomic_or`](#caf_atomic_or), [`caf_atomic_ref`](#caf_atomic_ref), [`caf_atomic_xor`](#caf_atomic_xor)
 
-   Coarray Queries
+   **Coarray Queries:**
      [`caf_lcobound`](#caf_lcobound), [`caf_ucobound`](#caf_ucobound), [`caf_coshape`](#caf_coshape), [`caf_image_index`](#caf_image_index)
 
-   Image Queries
+   **Image Queries:**
      [`caf_num_images`](#caf_num_images), [`caf_this_image`](#caf_this_image), [`caf_failed_images`](#caf_failed_images), [`caf_stopped_images`](#caf_stopped_images), [`caf_image_status`](#caf_image_status)
 
+## Types Descriptions
+
+ **Fortran Intrinsic Derived types**
+   These types will be defined in the runtime library and it is proposed that the compiler will use a rename to use the runtime library definitions for these types in the compiler's implementation of the `ISO_Fortran_Env` module.
+
+   ### `caf_team_type`
+     * implementation for `team_type` from `ISO_Fortran_Env`
+   ### `caf_event_type`
+     * implementation for `event_type` from `ISO_Fortran_Env`
+   ### `caf_lock_type`
+     * implementation for `lock_type` from `ISO_Fortran_Env`
+
+ **Caffeine specific types**
+   ### `caf_co_handle_t`
+     * `caf_co_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler.
+   ### `caf_async_handle_t`
+     * `caf_async_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler. This type will help the runtime library track and provide asynchrony.
+   ### `caf_source_loc_t`
+     * `caf_source_loc_t` will be used to track the location of the critical construct blocks. The runtime library will handle critical constructs, and not expect the compiler to rewrite them as blocks with lock and unlock statements. This would be burdensome on the compiler because a lock_type variable would need to be declared, but as it needs to be a coarray, it would have to hoist its (REMOVE_NOTE: reword?!?!) declaration. TODO_DECISION: The compiler will control the implementation of the type and pass it off to the runtime library OR The runtime library will control the implementation of the type and receive the required information from the compiler to create the needed instances of the type.
+
 
 ## Common arguments' descriptions
 
  ### `coarray_handle`
-   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
+   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
    * scalar of type `caf_co_handle_t`
    * This argument is a handle for the established coarray. The handle will be created when the coarray is established.
  ### `coarray_handles`
    * array of type `caf_co_handle_t`
+ ### `async_handle`
+   * Argument for [`caf_get_async`](#caf_get_async), [`caf_async_wait_for`](#caf_async_wait_for), [`caf_async_try_for`](#caf_async_try_for)
+   * scalar of type `caf_async_handle_t`
+   * This argument is
+ ### `finished`
+   * Argument for [`caf_async_try_for`](#caf_async_try_for)
+   * scalar of type `caf_async_handle_t`
+   * This argument is
  ### `coindices`
-   * 1d array of type `integer`, dimension(:)
+   * 1d assumed-shape array of type `integer`
  ### `target`
-   * 1d array of type(*), dimension(..)
+   * assumed-rank array of `type(*)`
+   * (REMOVE_NOTE: Is this note true for the puts and gets? And not just the atomics?) The location of this argument is the relevant information, not its value. This means that the compiler needs to ensure that when codegen (REMOVE_NOTE: ?) occurs, this argument is pass by reference and there is no copy made. The location of `target` is needed to compute the offset when the atomic operations' `atom` dummy argument is part of a derived type.
  ### `value`
+   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
+   * assumed-rank array of `type(*)`
+ ### `source`
+   * Argument for [`caf_get_async`](#caf_get_async)
+   * assumed-rank array of `type(*)`
  ### `team`
+   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
+   * scalar of type `team_type`
  ### `team_number`
+   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
+   * scalar of type `integer`
  ### `stat`
-  * Argument for [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum)
-  * of type `integer`
-  * if no error condition occurs on that image, it is assigned the value `0`
+  * Argument for [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum), [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
+  * scalar of type `integer`
+  * if no error condition occurs on that image, it is assigned the value `0` (REMOVE_NOTE: ?)
 
 ## Procedure descriptions
 
@@ -179,7 +212,7 @@ One consequence of the statements being categorized as image control statements
 
  #### `caf_allocate`
   * **Description**: Calls to `caf_allocate` will be inserted when the compiler wants to allocate a coarray or when there is a statically declared coarray. This procedure allocates memory for a coarray.
-  * **Procedure Interface**: 2 options, see link to pseudo code, based on (yet undecided) choice between whether the compiler implements `image_index`, `coshape`, `lcobound`, and `ucobound` or whether the runtime library implements those
+  * **Procedure Interface**: `subroutine caf_allocate(lbounds, sizes, coarray_handle, local_slice)`
   * **Arguments**:
   * [caf_allocate pseudo code](#caf_allocate-pseudo-code) (temporarily in design doc)
 
@@ -187,6 +220,7 @@ One consequence of the statements being categorized as image control statements
   * **Description**:
   * **Procedure Interface**: `subroutine caf_deallocate(coarray_handles)`
   * **Arguments**: `coarray_handles` is `intent(out)`
+  * [caf_deallocate pseudo code](#caf_deallocate-pseudo-code) (temporarily in design doc)
 
 ### Coarray Access
 
@@ -196,31 +230,33 @@ One consequence of the statements being categorized as image control statements
  #### `caf_put`
   * **Description**:
   * **Procedure Interface**: `subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)`
-  * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`target`](#target) is `intent(in)`
+  * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`target`](#target) is `intent(in)`, [`value`](#value) is `intent(inout)`, [`team`](#team) is `intent(in)` and `optional`, [`team_number`](#team_number) is `intent(in)` and `optional`, [`stat`](#stat) is `intent(out)` and `optional`
+  * **Notes**: The optional arguments `team` and `team_number` shall not both be present in the same call
   * [caf_put pseudo code](#caf_put-pseudo-code) (temporarily in design doc)
 
  #### `caf_get_blocking`
   * **Description**:
   * **Procedure Interface**: `subroutine caf_get_blocking(coarray_handle, coindices, team, team_number, source, value, stat)`
-  * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`target`](#target) is `intent(in)`
+  * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`source`](#source) is `intent(in)`, [`value`](#value) is `intent(inout)`, [`team`](#team) is `intent(in)` and `optional`, [`team_number`](#team_number) is `intent(in)` and `optional`, [`stat`](#stat) is `intent(out)` and `optional`
+  * **Notes**: The optional arguments `team` and `team_number` shall not both be present in the same call
 
  #### `caf_get_async`
   * **Description**:
-  * **Procedure Interface**:
-  * **Arguments**:
+  * **Procedure Interface**: `subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)`
+  * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`team`](#team) is `intent(in)` and `optional`, [`team_number`](#team_number) is `intent(in)` and `optional`, [`source`](#source) is `intent(in)`, [`value`](#value) is `intent(inout)`, [`stat`](#stat) is `intent(out)` and `optional`, [`async_handle`](#async_handle) is `intent(out)`
 
 ###  Operation Synchronization
 
  #### `caf_async_wait_for`
-  * **Description**:
-  * **Procedure Interface**:
-  * **Arguments**:
+  * **Description**: This procedure waits until the (REMOVE_NOTE: asynchronous?) operation is complete and then consumes the async handle
+  * **Procedure Interface**: `subroutine caf_async_wait_for(async_handle)`
+  * **Arguments**: [`async_handle`](#async_handle) is `intent(inout)`
   * [caf_async_wait_for pseudo code](#caf_async_wait_for-pseudo-code) (temporarily in design doc)
 
  #### `caf_async_try_for`
-  * **Description**:
-  * **Procedure Interface**:
-  * **Arguments**:
+  * **Description**: This procedure consumes the async handle if and only if the operation is complete
+  * **Procedure Interface**: `subroutine caf_async_try_for(async_handle, finished)`
+  * **Arguments**: [`async_handle`](#async_handle) is `intent(inout)`, [`finished`](#finished) is `intent(out)`
   * [caf_async_try_for pseudo code](#caf_async_try_for-pseudo-code) (temporarily in design doc)
 
  #### `caf_sync_memory`
@@ -306,10 +342,13 @@ One consequence of the statements being categorized as image control statements
 
 ### Atomic Memory Operation
 
+All atomic operations are blocking operations.
+
  #### `caf_atomic_add`
   * **Description**:
-  * **Procedure Interface**:
+  * **Procedure Interface**: TODO_DECISION: `subroutine caf_atomic_add(coarray_handle, coindices, offset, value, stat)` or `subroutine caf_atomic_add(coarray_handle, coindices, target, value, stat)`
   * **Arguments**:
+  * [caf_atomic_add pseudo code](#caf_atomic_add-pseudo-code) (temporarily in design doc)
 
  #### `caf_atomic_and`
   * **Description**:
@@ -411,75 +450,6 @@ One consequence of the statements being categorized as image control statements
   * **Arguments**:
 
 
-### Allocation and deallocation
-
-
-```
-  module subroutine caf_deallocate(coarray_handles)
-    implicit none
-    type(caf_co_handle_t), dimension(:), intent(out) :: coarray_handles
-  end subroutine
-```
-
-### Puts and Gets
-
-  Arguments to `caf_put_blocking` and `caf_get_blocking`:
-
-REMOVE_NOTE: remove following table as integrate information into above argument descriptions
-
-| Argument | Type | Rank | Dimensions | Intent | Additional attributes | Notes |
-| -------- | ---- | ---- | ---------- | ------ | --------------------- | ----- |
-| `target` | `type(*)` | 1 | dimension(..) | `intent(in)` | n/a | ----- |
-| `value`  | `type(*)` | 1 | dimension(..) | `intent(in)` for gets, `intent(inout)` for puts | n/a | ----- |
-| `team` | `team_type` | 0 | n/a | `intent(in)` | optional | Both optional arguments `team` and `team_number` shall not be present in the same call|
-| `team_number` |  `integer` | 0 | n/a | `intent(in)` | optional | Both optional arguments `team` and `team_number` shall not be present in the same call|
-| `stat` | `integer` | 0 | n/a | `intent(out)` | optional | ----- |
-
-  * **Asynchrony:**
-    -   Could be handle based or fence based approaches
-    -   Handle based - return can individual operation handle, later on compiler synchronizes handle
-    -   Fence based - implicit handle operations, closer to MPI
-#
-## Atomic subroutines
-
-  * **caf_atomic_define:**
-    -   Description: ...
-    -   Procedure Interface: ...
-    -   Arguments: ...
-
-  * **caf_atomic_ref:**
-    -   Description: ...
-    -   Procedure Interface: ...
-    -   Arguments: ...
-
-  * **caf_atomic_add:**
-    -   Description: Blocking atomic operation...
-    -   Procedure Interface:   `subroutine caf_atomic_add(coarray_handle, coindicies, offset, value, stat)` or `subroutine caf_atomic_add(coarray_handle, coindicies, target, value, stat)`
-    -   Arguments: ...
-
-
-Current pseudo code. May not stay in design doc.
-
-Option 1 with offset:
-```
-  module subroutine caf_atomic_add(coarray_handle, coindicies, offset, value, stat) ! blocking atomic operation
-    type(caf_co_handle_t) :: coarray_handle
-    integer, intent(in) :: coindices(:)
-    integer :: offset, value, stat
-  end subroutine
-```
-
-Option 2 with target:
-```
-  module subroutine caf_atomic_add(coarray_handle, coindicies, target, value, stat) ! blocking atomic operation
-    type(caf_co_handle_t) :: coarray_handle
-    integer, intent(in) :: coindices(:) ! names image num
-    integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
-    integer :: value, stat
-  end subroutine
-```
-
-
 ## Delegation of tasks between Flang and Caffeine
 
 | Tasks | Flang | Caffeine |
@@ -487,11 +457,10 @@ Option 2 with target:
 | Track corank of coarrays                |     ✓     |           |
 | Track teams associated with a coarray   |     ✓     |           |
 | Assigning variables of type `team-type` |     ✓     |           |
-| Translate critical construct to lock/unlock |     ✓     |           |
 | Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
 | Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
 | Keeping track of corank |     ✓     |     ?      |
-| Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index`  |     ?     |     ?     |
+| Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index`  |          |     ✓     |
 | Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
 | Team stack abstraction                  |           |     ✓     |
 | `form-team-stmt`                        |           |     ✓     |
@@ -506,35 +475,28 @@ Add to table: teams, events, synchronization statements, critical construct, loc
 
 
 
-
-
 Current pseudo code. May not stay in design doc.
 
 Draft:
 
 #### caf_allocate pseudo code
 
-Compiler-tracks-codescriptor
 ```
-  module subroutine caf_allocate(coarray_handle, local_slice)
+  module subroutine caf_allocate(lbounds, sizes, coarray_handle, local_slice)
     implicit none
     type(caf_co_handle_t), intent(out) :: coarray_handle
     type(*), dimension(..), intent(inout) :: local_slice
+    integer, dimension(:), intent(in) :: lbounds, sizes !precondition these args must be same size
   end subroutine
 ```
-In this case, compiler would provide `image_index`, `coshape`, `lcobound`, `ucobound`
 
-Caffeine-tracks-codescriptor
+#### caf_deallocate pseudo code
 ```
-  module subroutine caf_allocate(lbounds, sizes, coarray_handle, local_slice)
+  module subroutine caf_deallocate(coarray_handles)
     implicit none
-    type(caf_co_handle_t), intent(out) :: coarray_handle
-    type(*), dimension(..), intent(inout) :: local_slice
-    integer, dimension(:), intent(in) :: lbounds, sizes !precondition these args must be same size
+    type(caf_co_handle_t), dimension(:), intent(out) :: coarray_handles
   end subroutine
 ```
-In this case, Caffeine would provide `image_index`, `coshape`, `lcobound`, `ucobound`
-
 
 #### caf_put pseudo code
 
@@ -582,8 +544,8 @@ In this case, Caffeine would provide `image_index`, `coshape`, `lcobound`, `ucob
 ```
   module subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)
     implicit none
-    type(caf_co_handle_t), intent(in) :: coarray_handle
-    integer, intent(in) :: coindices(:)
+    type(caf_co_handle_t),  intent(in) :: coarray_handle
+    integer, dimension(:),  intent(in) :: coindices
     type(*), dimension(..), intent(in) :: source
     type(*), dimension(..), intent(inout) :: value ! may need asynchronous attribute or may be implicitly asynchronous
     type(team_type), optional, intent(in) :: team
@@ -598,7 +560,7 @@ In this case, Caffeine would provide `image_index`, `coshape`, `lcobound`, `ucob
 ```
   ! waits until operation
   ! consumes handle
-  module subroutine caf_wait_for(async_handle)
+  module subroutine caf_async_wait_for(async_handle)
     implicit none
     type(caf_async_handle_t), intent(inout) :: async_handle
   end subroutine
@@ -608,7 +570,7 @@ In this case, Caffeine would provide `image_index`, `coshape`, `lcobound`, `ucob
 
 ```
   ! consumes handle IF finished
-  module subroutine caf_try_for(async_handle, finished)
+  module subroutine caf_async_try_for(async_handle, finished)
     implicit none
     type(caf_async_handle_t), intent(inout) :: async_handle
     logical, intent(out) :: finished
@@ -617,12 +579,38 @@ In this case, Caffeine would provide `image_index`, `coshape`, `lcobound`, `ucob
 ```
 
 
+#### caf_atomic_add pseudo code
 
+Option 1 with offset:
+```
+  module subroutine caf_atomic_add(coarray_handle, coindices, offset, value, stat)
+    type(caf_co_handle_t) :: coarray_handle
+    integer, intent(in) :: coindices(:)
+    integer :: offset, value, stat
+  end subroutine
+```
 
+Option 2 with target:
+```
+  module subroutine caf_atomic_add(coarray_handle, coindices, target, value, stat)
+    type(caf_co_handle_t) :: coarray_handle
+    integer, intent(in) :: coindices(:) ! names image num
+    integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
+    integer :: value, stat
+  end subroutine
+```
 
 
 ## Berkeley Lab internal Notes: (REMOVE_NOTES before submission)
 
+
+  * **Asynchrony:**
+    -   Could be handle based or fence based approaches
+    -   Handle based - return an individual operation handle; later on the compiler synchronizes the handle
+    -   Fence based - implicit handle operations, closer to MPI
+
+
+
 Same non-blocking semantics (has to be started and finished in the same segment) will likely apply to collectives, use caf_wait_for, caf_try_for, etc
 Should change team and critical be non-blocking? sync-all?)
 
@@ -639,7 +627,7 @@ Should change team and critical be non-blocking? sync-all?)
    end type
    ```
 
-TODOs:
+REMOVE_NOTEs:
     allow for non-blocking collective subroutines
     need to be able to track puts in flight, may need a write buffer, record boundaries in a hash table struct
     every single rma needs to check the table to see if there is a conflicting overlap

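As a reading aid outside the patch itself: the descriptions of `caf_async_wait_for` ("waits until the operation is complete and then consumes the async handle") and `caf_async_try_for` ("consumes the async handle if and only if the operation is complete") can be illustrated with a small self-contained mock. Everything here is hypothetical: `async_handle_sketch_t`, `start_op`, `try_for`, `wait_for`, and the poll counter that stands in for real communication progress are assumptions for illustration, not the proposed runtime interface.

```fortran
module async_sketch
  implicit none
  private
  public :: async_handle_sketch_t, start_op, try_for, wait_for

  ! Hypothetical stand-in for caf_async_handle_t.
  type :: async_handle_sketch_t
    logical :: active = .false.  ! .true. until the handle is consumed
    integer :: polls_left = 0    ! mock: polls remaining until "complete"
  end type

contains

  ! Mock of starting an asynchronous operation such as caf_get_async.
  subroutine start_op(async_handle, polls)
    type(async_handle_sketch_t), intent(out) :: async_handle
    integer, intent(in) :: polls
    async_handle%active = .true.
    async_handle%polls_left = polls
  end subroutine

  ! Consumes the handle only if the operation has finished.
  subroutine try_for(async_handle, finished)
    type(async_handle_sketch_t), intent(inout) :: async_handle
    logical, intent(out) :: finished
    if (async_handle%polls_left > 0) then
      async_handle%polls_left = async_handle%polls_left - 1
    end if
    finished = (async_handle%polls_left == 0)
    if (finished) async_handle%active = .false.
  end subroutine

  ! Blocks until the operation has finished, then consumes the handle.
  subroutine wait_for(async_handle)
    type(async_handle_sketch_t), intent(inout) :: async_handle
    logical :: finished
    do
      call try_for(async_handle, finished)
      if (finished) exit
    end do
  end subroutine

end module

program demo
  use async_sketch
  implicit none
  type(async_handle_sketch_t) :: h
  logical :: finished

  call start_op(h, polls=3)
  call try_for(h, finished)  ! not finished yet, so h stays usable
  ! ... independent local work could overlap with the transfer here ...
  call wait_for(h)           ! returns only once the operation is complete
  print *, 'handle still active: ', h%active
end program
```

The split lets generated code poll with the try variant while it has other work to do and fall back to the blocking variant at the end of the segment, where the notes above say the operation has to be finished.
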
>From c14e06aff67489c53d5c949b5919c53a9dfb0729 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Thu, 2 Mar 2023 16:00:08 -0800
Subject: [PATCH 11/33] Update formatting.

---
 flang/docs/CoarrayFortranRuntime.md | 34 ++++++++++++++---------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index bb3a782d117fa2..a0df8ec27b1ed8 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -96,56 +96,56 @@ One consequence of the statements being categorized as image control statements
  **Fortran Intrinsic Derived types**
    These types will be defined in the runtime library and it is proposed that the compiler will use a rename to use the runtime library definitions for these types in the compiler's implementation of the `ISO_Fortran_Env` module.
 
-   ### `caf_team_type`
+   #### `caf_team_type`
      * implementation for `team_type` from `ISO_Fortran_Env`
-   ### `caf_event_type`
+   #### `caf_event_type`
      * implementation for `event_type` from `ISO_Fortran_Env`
-   ### `caf_lock_type`
+   #### `caf_lock_type`
      * implementation for `lock_type` from `ISO_Fortran_Env`
 
  **Caffeine specific types**
-   ### `caf_co_handle_t`
+   #### `caf_co_handle_t`
      * `caf_co_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler.
-   ### `caf_async_handle_t`
+   #### `caf_async_handle_t`
      * `caf_async_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler. This type will help the runtime library track and provide asynchrony.
-   ### `caf_source_loc_t`
+   #### `caf_source_loc_t`
      * `caf_source_loc_t` will be used to track the location of the critical construct blocks. The runtime library will handle critical constructs, and not expect the compiler to rewrite them as blocks with lock and unlock statements. This would be burdensome on the compiler because a lock_type variable would need to be declared, but as it needs to be a coarray, it would have to hoist its (REMOVE_NOTE: reword?!?!) declaration. TODO_DECISION: The compiler will control the implementation of the type and pass it off to the runtime library OR The runtime library will control the implementation of the type and receive the required information from the compiler to create the needed instances of the type.
 
 
 ## Common arguments' descriptions
 
- ### `coarray_handle`
+ #### `coarray_handle`
    * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
    * scalar of type `caf_co_handle_t`
    * This argument is a handle for the established coarray. The handle will be created when the coarray is established.
- ### `coarray_handles`
+ #### `coarray_handles`
    * array of type `caf_co_handle_t`
- ### `async_handle`
+ #### `async_handle`
    * Argument for [`caf_get_async`](#caf_get_async), [`caf_async_wait_for`](#caf_async_wait_for), [`caf_async_try_for`](#caf_async_try_for)
    * scalar of type `caf_async_handle_t`
    * This argument is
- ### `finished`
+ #### `finished`
    * Argument for [`caf_async_try_for`](#caf_async_try_for)
    * scalar of type `caf_async_handle_t`
    * This argument is
- ### `coindices`
+ #### `coindices`
    * 1d assumed-shape array of type `integer`
- ### `target`
+ #### `target`
    * assumed-rank array of `type(*)`
    * (REMOVE_NOTE: Is this note true for the puts and gets? And not just the atomics?) The location of this argument is the relevant information, not its value. This means that the compiler needs to ensure that when codegen (REMOVE_NOTE: ?) occurs, this argument is pass by reference and there is no copy made. The location of `target` is needed to compute the offset when the atomic operations' `atom` dummy argument is part of a derived type.
- ### `value`
+ #### `value`
    * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
    * assumed-rank array of `type(*)`
- ### `source`
+ #### `source`
    * Argument for [`caf_get_async`](#caf_get_async)
    * assumed-rank array of `type(*)`
- ### `team`
+ #### `team`
    * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
    * scalar of type `team_type`
- ### `team_number`
+ #### `team_number`
    * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
    * scalar of type `integer`
- ### `stat`
+ #### `stat`
   * Argument for [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum), [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
   * scalar of type `integer`
   * if no error condition occurs on that image, it is assigned the value `0` (REMOVE_NOTE: ?)

>From fb5970eaff9b002f6b6aeee81e652b77759cc8b5 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Thu, 2 Mar 2023 16:14:14 -0800
Subject: [PATCH 12/33] More formatting changes.

---
 flang/docs/CoarrayFortranRuntime.md | 35 +++++++++++++++++------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index a0df8ec27b1ed8..56a68cfd432bd8 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -93,23 +93,24 @@ One consequence of the statements being categorized as image control statements
 
 ## Types Descriptions
 
- **Fortran Intrinsic Derived types**
+ ### Fortran Intrinsic Derived types
    These types will be defined in the runtime library and it is proposed that the compiler will use a rename to use the runtime library definitions for these types in the compiler's implementation of the `ISO_Fortran_Env` module.
 
-   #### `caf_team_type`
-     * implementation for `team_type` from `ISO_Fortran_Env`
-   #### `caf_event_type`
-     * implementation for `event_type` from `ISO_Fortran_Env`
-   #### `caf_lock_type`
-     * implementation for `lock_type` from `ISO_Fortran_Env`
+ #### `caf_team_type`
+   * implementation for `team_type` from `ISO_Fortran_Env`
+ #### `caf_event_type`
+   * implementation for `event_type` from `ISO_Fortran_Env`
+ #### `caf_lock_type`
+   * implementation for `lock_type` from `ISO_Fortran_Env`
 
- **Caffeine specific types**
-   #### `caf_co_handle_t`
-     * `caf_co_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler.
-   #### `caf_async_handle_t`
-     * `caf_async_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler. This type will help the runtime library track and provide asynchrony.
-   #### `caf_source_loc_t`
-     * `caf_source_loc_t` will be used to track the location of the critical construct blocks. The runtime library will handle critical constructs, and not expect the compiler to rewrite them as blocks with lock and unlock statements. This would be burdensome on the compiler because a lock_type variable would need to be declared, but as it needs to be a coarray, it would have to hoist its (REMOVE_NOTE: reword?!?!) declaration. TODO_DECISION: The compiler will control the implementation of the type and pass it off to the runtime library OR The runtime library will control the implementation of the type and receive the required information from the compiler to create the needed instances of the type.
+ ### Caffeine specific types
+
+ #### `caf_co_handle_t`
+   * `caf_co_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler.
+ #### `caf_async_handle_t`
+   * `caf_async_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler. This type will help the runtime library track and provide asynchrony.
+ #### `caf_source_loc_t`
+   * `caf_source_loc_t` will be used to track the location of the critical construct blocks. The runtime library will handle critical constructs, and not expect the compiler to rewrite them as blocks with lock and unlock statements. This would be burdensome on the compiler because a lock_type variable would need to be declared, but as it needs to be a coarray, it would have to hoist its (REMOVE_NOTE: reword?!?!) declaration. TODO_DECISION: The compiler will control the implementation of the type and pass it off to the runtime library OR The runtime library will control the implementation of the type and receive the required information from the compiler to create the needed instances of the type.
 
 
 ## Common arguments' descriptions
@@ -234,6 +235,12 @@ One consequence of the statements being categorized as image control statements
   * **Notes**: Both optional arguments `team` and `team_number` shall not be present in the same call
   * [caf_put pseudo code](#caf_put-pseudo-code) (temporarily in design doc)
 
+(REMOVE_NOTE): Is this procedure going to be visible to the compiler? If not, do we include discussions of it here?
+ #### `caf_end_segment`
+  * **Description**: This procedure ends a segment. Any puts that are still in flight will be committed (and any caches will be thrown away TODO_DECISION: if we decide to do caches). Calls to this procedure will be side effects of invocations of the image control statements. It is not a synchronizing operation.
+  * **Procedure Interface**: `subroutine caf_end_segment()`
+  * **Arguments**: n/a (REMOVE_NOTE: is this true? or is it just that we haven't sketched out the args yet?)
+
  #### `caf_get_blocking`
   * **Description**:
   * **Procedure Interface**: `subroutine caf_get_blocking(coarray_handle, coindices, team, team_number, source, value, stat)`

>From 526e7d6bdaddae817384eebfe6c0a3abbf0374b1 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Mon, 6 Mar 2023 09:59:35 -0800
Subject: [PATCH 13/33] Add some details to design doc.

---
 flang/docs/CoarrayFortranRuntime.md | 42 ++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 56a68cfd432bd8..9ba7eb18fb163f 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -46,6 +46,28 @@ One consequence of the statements being categorized as image control statements
 
 ## Fortran Parallel Runtime Interface
 
+  The Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, operation and image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compile and the runtime library.
+
+## Delegation of tasks between Flang and Caffeine
+
+| Tasks | Flang | Caffeine |
+| ----  | ----- | -------- |
+| Track corank of coarrays                |     ✓     |           |
+| Track teams associated with a coarray   |     ✓     |           |
+| Assigning variables of type `team-type` |     ✓     |           |
+| Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
+| Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
+| Keeping track of corank |     ✓     |     ?      |
+| Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index`  |          |     ✓     |
+| Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
+| Team stack abstraction                  |           |     ✓     |
+| `form-team-stmt`                        |           |     ✓     |
+| `change-team-stmt`                      |           |     ✓     |
+| `end-team-stmt`                         |           |     ✓     |
+| Allocate a coarray                      |           |     ✓     |
+| Deallocate a coarray                    |           |     ✓     |
+| Reference a coindexed-object             |           |     ✓     |
+
 ## Types
 
  **Provided Fortran types:** [`caf_event_type`](#caf_event_type), [`caf_team_type`](#caf_team_type), [`caf_lock_type`](#caf_lock_type)
@@ -116,20 +138,21 @@ One consequence of the statements being categorized as image control statements
 ## Common arguments' descriptions
 
  #### `coarray_handle`
-   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
-   * scalar of type `caf_co_handle_t`
+   * Argument for [`caf_allocate`](#caf_allocate), [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async) and all of the [atomic operations](#atomic-memory-operation)
+   * scalar of type [`caf_co_handle_t`](#caf_co_handle_t)
    * This argument is a handle for the established coarray. The handle will be created when the coarray is established.
  #### `coarray_handles`
-   * array of type `caf_co_handle_t`
+   * array of type [`caf_co_handle_t`](#caf_co_handle_t)
  #### `async_handle`
    * Argument for [`caf_get_async`](#caf_get_async), [`caf_async_wait_for`](#caf_async_wait_for), [`caf_async_try_for`](#caf_async_try_for)
-   * scalar of type `caf_async_handle_t`
+   * scalar of type [`caf_async_handle_t`](#caf_async_handle_t)
    * This argument is
  #### `finished`
    * Argument for [`caf_async_try_for`](#caf_async_try_for)
-   * scalar of type `caf_async_handle_t`
+   * scalar of type [`caf_async_handle_t`](#caf_async_handle_t)
    * This argument is
  #### `coindices`
+   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
    * 1d assumed-shape array of type `integer`
  #### `target`
    * assumed-rank array of `type(*)`
@@ -212,15 +235,15 @@ One consequence of the statements being categorized as image control statements
 ### Allocation and deallocation
 
  #### `caf_allocate`
-  * **Description**: Calls to `caf_allocate` will be inserted when the compiler wants to allocate a coarray or when there is a statically declared coarray. This procedure allocates memory for a coarray.
+  * **Description**: Calls to `caf_allocate` will be inserted when the compiler wants to allocate a coarray or when there is a statically declared coarray. This procedure allocates memory for a coarray. The `coarray_handle` dummy argument will pass back a handle that the runtime library will have created to be used for all future accesses and deallocation of the associated coarray.
   * **Procedure Interface**: `subroutine caf_allocate(lbounds, sizes, coarray_handle, local_slice)`
-  * **Arguments**:
+  * **Arguments**: [`lbounds`](#lbounds) is `intent(in)`, [`sizes`](#sizes) is `intent(in)`, [`coarray_handle`](#coarray_handle) is `intent(out)`, [`local_slice`](#local_slice) is `intent(inout)`
   * [caf_allocate pseudo code](#caf_allocate-pseudo-code) (temporarily in design doc)
 
  #### `caf_deallocate`
-  * **Description**:
+  * **Description**: This procedure
   * **Procedure Interface**: `subroutine caf_deallocate(coarray_handles)`
-  * **Arguments**: `coarray_handles` is `intent(out)`
+  * **Arguments**: [`coarray_handles`](#coarray_handles) is `intent(out)` (REMOVE_NOTE: is coarray_handles supposed to be `intent(out)`?)
   * [caf_deallocate pseudo code](#caf_deallocate-pseudo-code) (temporarily in design doc)
 
 ### Coarray Access
@@ -481,7 +504,6 @@ All atomic operations are blocking operations.
 Add to table: teams, events, synchronization statements, critical construct, locks
 
 
-
 Current pseudo code. May not stay in design doc.
 
 Draft:

>From b9b860adc50a0f176c6d43ff062325b223691fcc Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Mon, 6 Mar 2023 10:10:35 -0800
Subject: [PATCH 14/33] Remove duplicated task delegation table. Still haven't
 decided where it should be placed, but delete the duplicate created when
 playing around with placement.

---
 flang/docs/CoarrayFortranRuntime.md | 23 -----------------------
 1 file changed, 23 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 9ba7eb18fb163f..5c1b0b13ac6dd1 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -480,29 +480,6 @@ All atomic operations are blocking operations.
   * **Arguments**:
 
 
-## Delegation of tasks between Flang and Caffeine
-
-| Tasks | Flang | Caffeine |
-| ----  | ----- | -------- |
-| Track corank of coarrays                |     ✓     |           |
-| Track teams associated with a coarray   |     ✓     |           |
-| Assigning variables of type `team-type` |     ✓     |           |
-| Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
-| Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
-| Keeping track of corank |     ✓     |     ?      |
-| Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index`  |          |     ✓     |
-| Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
-| Team stack abstraction                  |           |     ✓     |
-| `form-team-stmt`                        |           |     ✓     |
-| `change-team-stmt`                      |           |     ✓     |
-| `end-team-stmt`                         |           |     ✓     |
-| Allocate a coarray                      |           |     ✓     |
-| Deallocate a coarray                    |           |     ✓     |
-| Reference a coindexed-object             |           |     ✓     |
-
-
-Add to table: teams, events, synchronization statements, critical construct, locks
-
 
 Current pseudo code. May not stay in design doc.
 

>From 04dcec559441bf7d4c18d0f9776ad0232d72ca39 Mon Sep 17 00:00:00 2001
From: Brad Richardson <everythingfunctional at protonmail.com>
Date: Mon, 6 Mar 2023 14:29:58 -0600
Subject: [PATCH 15/33] Fix a few grammatical mistakes

---
 flang/docs/CoarrayFortranRuntime.md | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 5c1b0b13ac6dd1..a0cc2cabc47c78 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -39,14 +39,14 @@ In addition to being able to support syntax related to the above features, compi
 One consequence of the statements being categorized as image control statements will be the need to restrict code movement by optimizing compilers.
 
 # Proposed solution
-  This design document proposes an interface to support the above features, named Fortran Parallel Runtime Interface.  Implementations of some parts of the interface exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers.  By defining a library-agnostic interface, we envision facilitating the development of alternative parallel runtime libraries that support the same interface.  One benefit of this approach is the ability to vary the communication substrate.  For example, Caffeine uses the [GASNet-EX] exascale networking middleware, whereas it might also be possible to develop wrappers that would support the proposed interface with [OpenCoarrays], which uses the Message Passing Interface ([MPI]). A central aim of this document is to use a parallel runtime interface in standard Fortran syntax, which enables us to leverage the Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
+  This design document proposes an interface to support the above features, named Fortran Parallel Runtime Interface.  Implementations of some parts of the interface exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers.  By defining a library-agnostic interface, we envision facilitating the development of alternative parallel runtime libraries that support the same interface.  One benefit of this approach is the ability to vary the communication substrate.  For example, Caffeine uses the [GASNet-EX] exascale networking middleware, whereas it might also be possible to develop wrappers that would support the proposed interface with [OpenCoarrays], which uses the Message Passing Interface ([MPI]). A central aim of this document is to use a parallel runtime interface in standard Fortran syntax, which enables us to leverage Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
 
 # Interface overview
   This document proposes a design for the Fortran Parallel Runtime Interface. It outlines which tasks will be the responsibility of the Fortran compiler and which tasks will be the responsibility of the runtime library. For the rest of the document, we will refer to the design in terms of Flang and Caffeine.
 
 ## Fortran Parallel Runtime Interface
 
-  The Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, operation and image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compile and the runtime library.
+  The Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library.
 
 ## Delegation of tasks between Flang and Caffeine
 
@@ -57,7 +57,6 @@ One consequence of the statements being categorized as image control statements
 | Assigning variables of type `team-type` |     ✓     |           |
 | Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
 | Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
-| Keeping track of corank |     ✓     |     ?      |
 | Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index`  |          |     ✓     |
 | Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
 | Team stack abstraction                  |           |     ✓     |

>From 210e2277b917457985538f81bb383651ad1524a4 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 8 Mar 2023 10:58:26 -0800
Subject: [PATCH 16/33] Update table.

---
 flang/docs/CoarrayFortranRuntime.md | 55 ++++++++++++++++-------------
 1 file changed, 30 insertions(+), 25 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index a0cc2cabc47c78..e0d05e2446652f 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -5,7 +5,7 @@
    SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
 -->
-# THIS IS A WORK IN PROGRESS - DECISIONS REGARDING THE DESIGNS DISCUSSED IN THIS DOCUMENT ARE ONGOING AND MAY CHANGE
+# THIS IS A WORK IN PROGRESS - DECISIONS REGARDING THE DESIGNS DISCUSSED IN THIS DOCUMENT ARE ONGOING AND MAY CHANGE AND THE DOCUMENT IS INCOMPLETE
 
 
 # Problem description
@@ -39,39 +39,44 @@ In addition to being able to support syntax related to the above features, compi
 One consequence of the statements being categorized as image control statements will be the need to restrict code movement by optimizing compilers.
 
 # Proposed solution
-  This design document proposes an interface to support the above features, named Fortran Parallel Runtime Interface.  Implementations of some parts of the interface exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers.  By defining a library-agnostic interface, we envision facilitating the development of alternative parallel runtime libraries that support the same interface.  One benefit of this approach is the ability to vary the communication substrate.  For example, Caffeine uses the [GASNet-EX] exascale networking middleware, whereas it might also be possible to develop wrappers that would support the proposed interface with [OpenCoarrays], which uses the Message Passing Interface ([MPI]). A central aim of this document is to use a parallel runtime interface in standard Fortran syntax, which enables us to leverage Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
+  This design document proposes an interface to support the above features, named Coarray Fortran Parallel Runtime Interface.  Implementations of some parts of the interface exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers.  By defining a library-agnostic interface, we envision facilitating the development of alternative parallel runtime libraries that support the same interface.  One benefit of this approach is the ability to vary the communication substrate.  For example, Caffeine uses the [GASNet-EX] exascale networking middleware, whereas it might also be possible to develop wrappers that would support the proposed interface with [OpenCoarrays], which uses the Message Passing Interface ([MPI]). A central aim of this document is to use a parallel runtime interface in standard Fortran syntax, which enables us to leverage Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
 
 # Interface overview
-  This document proposes a design for the Fortran Parallel Runtime Interface. It outlines which tasks will be the responsibility of the Fortran compiler and which tasks will be the responsibility of the runtime library. For the rest of the document, we will refer to the design in terms of Flang and Caffeine.
+  This document proposes a design for the Coarray Fortran Parallel Runtime Interface. It outlines which tasks will be the responsibility of the Fortran compiler and which tasks will be the responsibility of the runtime library.
 
-## Fortran Parallel Runtime Interface
+## Coarray Fortran (CAF) Parallel Runtime Interface
 
-  The Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library.
+  The Coarray Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library.
 
-## Delegation of tasks between Flang and Caffeine
+## Delegation of tasks between the Fortran compiler and the runtime library
 
-| Tasks | Flang | Caffeine |
+| Tasks | Fortran compiler | Runtime library |
 | ----  | ----- | -------- |
+| Establish and initialize static coarrays prior to `main`        |     ✓     |           |
 | Track corank of coarrays                |     ✓     |           |
-| Track teams associated with a coarray   |     ✓     |           |
 | Assigning variables of type `team-type` |     ✓     |           |
-| Track coarrays for implicit deallocation when exiting a scope |     ✓     |           |
+| Track local coarrays for implicit deallocation when exiting a scope |     ✓     |           |
 | Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
 | Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index`  |          |     ✓     |
-| Track allocatable coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
+| Track coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
 | Team stack abstraction                  |           |     ✓     |
 | `form-team-stmt`                        |           |     ✓     |
 | `change-team-stmt`                      |           |     ✓     |
 | `end-team-stmt`                         |           |     ✓     |
 | Allocate a coarray                      |           |     ✓     |
 | Deallocate a coarray                    |           |     ✓     |
-| Reference a coindexed-object             |           |     ✓     |
+| Reference a coindexed-object            |           |     ✓     |
+
+
+## Establish and initialize static coarrays prior to `main` (REMOVE_NOTE: MOVE SOMEWHERE BELOW)
+
+  The compiler will need to call `caf_init` and call `caf_allocate` ... for each coarray, in the right order, and then copy any initializers.
 
 ## Types
 
  **Provided Fortran types:** [`caf_event_type`](#caf_event_type), [`caf_team_type`](#caf_team_type), [`caf_lock_type`](#caf_lock_type)
 
- **Caffeine specific types:** [`caf_co_handle_t`](#caf_co_handle_t), [`caf_async_handle_t`](#caf_async_handle_t), [`caf_source_loc_t`](#caf_source_loc_t)
+ **Runtime library specific types:** [`caf_co_handle_t`](#caf_co_handle_t), [`caf_async_handle_t`](#caf_async_handle_t), [`caf_source_loc_t`](#caf_source_loc_t)
 
 ## Common arguments
 
@@ -89,7 +94,7 @@ One consequence of the statements being categorized as image control statements
      [`caf_allocate`](#caf_allocate), [`caf_deallocate`](#caf_deallocate)
 
    **Coarray Access:**
-     [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
+     [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
 
    **Operation Synchronization:**
      [`caf_async_wait_for`](#caf_aync_wait_for), [`caf_async_try_for`](#caf_async_try_for), [`caf_sync_memory`](#caf_sync_memory)
@@ -124,7 +129,7 @@ One consequence of the statements being categorized as image control statements
  #### `caf_lock_type`
    * implementation for `lock_type` from `ISO_Fortran_Env`
 
- ### Caffeine specific types
+ ### Runtime library specific types
 
  #### `caf_co_handle_t`
    * `caf_co_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler.
@@ -137,7 +142,7 @@ One consequence of the statements being categorized as image control statements
 ## Common arguments' descriptions
 
  #### `coarray_handle`
-   * Argument for [`caf_allocate`](#caf_allocate), [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async) and all of the [atomic operations](#atomic-memory-operation)
+   * Argument for [`caf_allocate`](#caf_allocate), [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async) and all of the [atomic operations](#atomic-memory-operation)
    * scalar of type [`caf_co_handle_t`](#caf_co_handle_t)
    * This argument is a handle for the established coarray. The handle will be created when the coarray is established.
  #### `coarray_handles`
@@ -151,25 +156,25 @@ One consequence of the statements being categorized as image control statements
    * scalar of type [`caf_async_handle_t`](#caf_async_handle_t)
    * This argument is
  #### `coindices`
-   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
+   * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
    * 1d assumed-shape array of type `integer`
  #### `target`
    * assumed-rank array of `type(*)`
    * (REMOVE_NOTE: Is this note true for the puts and gets? And not just the atomics?) The location of this argument is the relevant information, not its value. This means that the compiler needs to ensure that when codegen (REMOVE_NOTE: ?) occurs, this argument is pass by reference and there is no copy made. The location of `target` is needed to compute the offset when the atomic operations' `atom` dummy argument is part of a derived type.
  #### `value`
-   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking), [`caf_get_async`](#caf_get_async)
+   * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
    * assumed-rank array of `type(*)`
  #### `source`
    * Argument for [`caf_get_async`](#caf_get_async)
    * assumed-rank array of `type(*)`
  #### `team`
-   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
+   * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get)
    * scalar of type `team_type`
  #### `team_number`
-   * Argument for [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
+   * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get)
    * scalar of type `integer`
  #### `stat`
-  * Argument for [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum), [`caf_put`](#caf_put), [`caf_get_blocking`](#caf_get_blocking)
+  * Argument for [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum), [`caf_put`](#caf_put), [`caf_get`](#caf_get)
   * scalar of type `integer`
   * if no error condition occurs on that image, it is assigned the value `0` (REMOVE_NOTE: ?)
 
@@ -251,7 +256,7 @@ One consequence of the statements being categorized as image control statements
 
 
  #### `caf_put`
-  * **Description**:
+  * **Description**: Blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
   * **Procedure Interface**: `subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)`
   * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`target`](#target) is `intent(in)`, [`value`](#value) is `intent(inout)`, [`team`](#team) is `intent(in)` and `optional`, [`team_number`](#team_number) is `intent(in)` and `optional`, [`stat`](#stat) is `intent(out)` and `optional`
   * **Notes**: Both optional arguments `team` and `team_number` shall not be present in the same call
@@ -263,9 +268,9 @@ One consequence of the statements being categorized as image control statements
   * **Procedure Interface**: `subroutine caf_end_segment()`
   * **Arguments**: n/a (REMOVE_NOTE: is this true? or is it just that we haven't sketched out the args yet?)
 
- #### `caf_get_blocking`
+ #### `caf_get`
   * **Description**:
-  * **Procedure Interface**: `subroutine caf_get_blocking(coarray_handle, coindices, team, team_number, source, value, stat)`
+  * **Procedure Interface**: `subroutine caf_get(coarray_handle, coindices, team, team_number, source, value, stat)`
   * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`target`](#target) is `intent(in)`, [`value`](#value) is `intent(in)`, [`team`](#team) is `intent(in)` and `optional`, [`team_number`](#team_number) is `intent(in)` and `optional`, [`stat`](#stat) is `intent(out)` and `optional`
   * **Notes**: Both optional arguments `team` and `team_number` shall not be present in the same call
 
@@ -529,10 +534,10 @@ Draft:
   end subroutine
 ```
 
-#### caf_get_blocking pseudo code
+#### caf_get pseudo code
 
 ```
-  module subroutine caf_get_blocking(coarray_handle, coindices, team, team_number, source, value, stat)
+  module subroutine caf_get(coarray_handle, coindices, team, team_number, source, value, stat)
     implicit none
     type(caf_co_handle_t), intent(in) :: coarray_handle
     integer, intent(in) :: coindices(:)

>From 7e7e5b3dce22658d40d7899791687f2d491463b5 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 8 Mar 2023 17:50:05 -0800
Subject: [PATCH 17/33] Update proposed solutions description and move
 interfaces for procedures into the description areas.

---
 flang/docs/CoarrayFortranRuntime.md | 293 ++++++++++++----------------
 1 file changed, 130 insertions(+), 163 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index e0d05e2446652f..ac6491e6be8eae 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -39,20 +39,19 @@ In addition to being able to support syntax related to the above features, compi
 One consequence of the statements being categorized as image control statements will be the need to restrict code movement by optimizing compilers.
 
 # Proposed solution
-  This design document proposes an interface to support the above features, named Coarray Fortran Parallel Runtime Interface.  Implementations of some parts of the interface exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers.  By defining a library-agnostic interface, we envision facilitating the development of alternative parallel runtime libraries that support the same interface.  One benefit of this approach is the ability to vary the communication substrate.  For example, Caffeine uses the [GASNet-EX] exascale networking middleware, whereas it might also be possible to develop wrappers that would support the proposed interface with [OpenCoarrays], which uses the Message Passing Interface ([MPI]). A central aim of this document is to use a parallel runtime interface in standard Fortran syntax, which enables us to leverage Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
-
-# Interface overview
-  This document proposes a design for the Coarray Fortran Parallel Runtime Interface. It outlines which tasks will be the responsibility of the Fortran compiler and which tasks will be the responsibility of the runtime library.
+  This design document proposes an interface to support the above features, named Coarray Fortran Parallel Runtime Interface. By defining a library-agnostic interface, we envision facilitating the development of alternative parallel runtime libraries that support the same interface.  One benefit of this approach is the ability to vary the communication substrate. A central aim of this document is to use a parallel runtime interface in standard Fortran syntax, which enables us to leverage Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
 
 ## Coarray Fortran (CAF) Parallel Runtime Interface
 
-  The Coarray Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library.
+  The Coarray Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library. The interface is designed for portability across shared and distributed memory machines, different operating systems, and multiple architectures. The Caffeine implementation,[see below](#caffeine-lbl's-implementation-of-the-coarray-fortran-parallel-runtime-interface), of the Coarray Fortran Parallel Runtime Interface plans to support the following architectures: x86_64, PowerPC64, AArch64, with the possibility of supporting more as requested. Implementations of this interface is intended as an augmentation for the compiler's own runtime library. While the interface can support multiple implementations, we envision needing to build the runtime library as part of installing the compiler.
 
 ## Delegation of tasks between the Fortran compiler and the runtime library
 
+The following table outlines which tasks will be the responsibility of the Fortran compiler and which tasks will be the responsibility of the runtime library.
+
 | Tasks | Fortran compiler | Runtime library |
 | ----  | ----- | -------- |
-| Establish and initialize static coarrays prior to `main`        |     ✓     |           |
+| Establish and initialize static coarrays prior to `main` -[see more](#establish-and-initialize-static-coarrays-priorto-`main`)        |     ✓     |           |
 | Track corank of coarrays                |     ✓     |           |
 | Assigning variables of type `team-type` |     ✓     |           |
 | Track locals coarrays for implicit deallocation when exiting a scope |     ✓     |           |
@@ -67,11 +66,6 @@ One consequence of the statements being categorized as image control statements
 | Deallocate a coarray                    |           |     ✓     |
 | Reference a coindexed-object            |           |     ✓     |
 
-
-## Establish and initialize static coarrays prior to `main` (REMOVE_NOTE: MOVE SOMEWHERE BELOW)
-
-  Compiler will need to: call caf_init, call caf_allocate ... for each coarray and in the right order. And then copy any initializers.
-
 ## Types
 
  **Provided Fortran types:** [`caf_event_type`](#caf_event_type), [`caf_team_type`](#caf_team_type), [`caf_lock_type`](#caf_lock_type)
@@ -117,6 +111,11 @@ One consequence of the statements being categorized as image control statements
    **Image Queries:**
      [`caf_num_images`](#caf_num_images), [`caf_this_image`](#caf_this_image), [`caf_failed_images`](#caf_failed_images), [`caf_stopped_images`](#caf_stopped_images), [`caf_image_status`](#caf_image_status)
 
+
+### Caffeine - LBL's Implementation of the Coarray Fortran Parallel Runtime Interface
+  Implementations of some parts of the Coarray Fortran Parallel Runtime Interface exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers. Caffeine will continue to be developed in order to fully implement the proposed Coarray Fortran Parallel Runtime Interface. Caffeine uses the [GASNet-EX] exascale networking middleware but with the library-agnostic interface and the ability to vary the communication substrate, it might also be possible to develop wrappers that would support the proposed interface with [OpenCoarrays], which uses the Message Passing Interface ([MPI]).
+
+
 ## Types Descriptions
 
  ### Fortran Intrinsic Derived types
@@ -129,6 +128,10 @@ One consequence of the statements being categorized as image control statements
  #### `caf_lock_type`
    * implementation for `lock_type` from `ISO_Fortran_Env`
 
+
+--------------------------------------------------------------------
+
+
  ### Runtime library specific types
 
  #### `caf_co_handle_t`
@@ -136,7 +139,7 @@ One consequence of the statements being categorized as image control statements
  #### `caf_async_handle_t`
    * `caf_async_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler. This type will help the runtime library track and provide asynchrony.
  #### `caf_source_loc_t`
-   * `caf_source_loc_t` will be used to track the location of the critical construct blocks. The runtime library will handle critical constructs, and not expect the compiler to rewrite them as blocks with lock and unlock statements. This would be burdensome on the compiler because a lock_type variable would need to be declared, but as it needs to be a coarray, it would have to hoist its (REMOVE_NOTE: reword?!?!) declaration. TODO_DECISION: The compiler will control the implementation of the type and pass it off to the runtime library OR The runtime library will control the implementation of the type and receive the required information from the compiler to create the needed instances of the type.
+   * `caf_source_loc_t` will be used to track the location of the critical construct blocks. The runtime library will handle critical constructs, and not expect the compiler to rewrite them as blocks with lock and unlock statements. This would be burdensome on the compiler because a lock_type variable would need to be declared, but as it needs to be a coarray, it would have to hoist its (REMOVE_NOTE: reword?!?!) declaration. REMOVE_NOTE_TODO_DECISION: The compiler will control the implementation of the type and pass it off to the runtime library OR The runtime library will control the implementation of the type and receive the required information from the compiler to create the needed instances of the type.
 
 
 ## Common arguments' descriptions
@@ -239,16 +242,27 @@ One consequence of the statements being categorized as image control statements
 ### Allocation and deallocation
 
  #### `caf_allocate`
-  * **Description**: Calls to `caf_allocate` will be inserted when the compiler wants to allocate a coarray or when there is a statically declared coarray. This procedure allocates memory for a coarray. The `coarray_handle` dummy argument will pass back a handle that the runtime library will have created to be used for all future accesses and deallocation of the associated coarray.
-  * **Procedure Interface**: `subroutine caf_allocate(lbounds, sizes, coarray_handle, local_slice)`
-  * **Arguments**: [`lbounds`](#lbounds) is `intent(in)`, [`sizes`](#sizes) is `intent(in)`, [`coarray_handle`](#coarray_handle) is `intent(out)`, [`local_slice`](#local_slice) is `intent(inout)`
-  * [caf_allocate pseudo code](#caf_allocate-pseudo-code) (temporarily in design doc)
+  * **Description**: Calls to `caf_allocate` will be inserted when the compiler wants to allocate a coarray or when there is a statically declared coarray. This procedure allocates memory for a coarray. The `coarray_handle` dummy argument will pass back a handle that the runtime library will have created to be used for all future accesses and deallocation of the associated coarray. The `lbounds` and `sizes` arguments must be 1D arrays of the same size.
+  * **Procedure Interface**:
+  ```
+    module subroutine caf_allocate(lbounds, sizes, coarray_handle, local_slice)
+      implicit none
+      type(caf_co_handle_t), intent(out) :: coarray_handle
+      type(*), dimension(..), intent(inout) :: local_slice
+      integer, dimension(:), intent(in) :: lbounds, sizes
+    end subroutine
+  ```
+  REMOVE_NOTE: Fix pseudo code
 
  #### `caf_deallocate`
   * **Description**: This procedure
-  * **Procedure Interface**: `subroutine caf_deallocate(coarray_handles)`
-  * **Arguments**: [`coarray_handles`](#coarray_handles) is `intent(out)` (REMOVE_NOTE: is coarray_handles supposed to be `intent(out)`?)
-  * [caf_deallocate pseudo code](#caf_deallocate-pseudo-code) (temporarily in design doc)
+  * **Procedure Interface**:
+  ```
+    module subroutine caf_deallocate(coarray_handles)
+      implicit none
+      type(caf_co_handle_t), dimension(:), intent(out) :: coarray_handles ! REMOVE_NOTE: is coarray_handles supposed to be `intent(out)`?
+    end subroutine
+  ```
 
 ### Coarray Access
 
@@ -257,41 +271,90 @@ One consequence of the statements being categorized as image control statements
 
  #### `caf_put`
   * **Description**: Blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
-  * **Procedure Interface**: `subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)`
-  * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`target`](#target) is `intent(in)`, [`value`](#value) is `intent(inout)`, [`team`](#team) is `intent(in)` and `optional`, [`team_number`](#team_number) is `intent(in)` and `optional`, [`stat`](#stat) is `intent(out)` and `optional`
+  * **Procedure Interface**:
+  ```
+    module subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)
+      implicit none
+      type(caf_co_handle_t), intent(in) :: coarray_handle
+      integer, intent(in) :: coindices(:)
+      type(*), dimension(..), intent(in) :: target, value
+      type(team_type), optional, intent(in) :: team
+      integer, optional, intent(in) :: team_number
+      integer, optional, intent(out) :: stat
+    end subroutine
+  ```
   * **Notes**: Both optional arguments `team` and `team_number` shall not be present in the same call
-  * [caf_put pseudo code](#caf_put-pseudo-code) (temporarily in design doc)
 
-(REMOVE_NOTE): Is this procedure going to be visible to the compiler? If not, do we include discussions of it here?
+
  #### `caf_end_segment`
-  * **Description**: This procedure ends a segment. Any puts that are still in flight will be committed (and any caches will be thrown away TODO_DECISION: if we decide to do caches). Calls to this procedure will be side effects of invocations of the image control statements. It is not a synchronizing operation.
-  * **Procedure Interface**: `subroutine caf_end_segment()`
-  * **Arguments**: n/a (REMOVE_NOTE: is this true? or is it just that we haven't sketched out the args yet?)
+(REMOVE_NOTE): Is this procedure going to be visible to the compiler? If not, do we include discussions of it here?
+  * **Description**: This procedure ends a segment. Any puts that are still in flight will be committed (and any caches will be thrown away REMOVE_NOTE_TODO_DECISION: if we decide to do caches). Calls to this procedure will be side effects of invocations of the image control statements. It is not a synchronizing operation.
+  * **Procedure Interface**:
+  ```
+    module subroutine caf_end_segment()
+      implicit none
+      ! REMOVE_NOTE: are there no arguments? or is it just that we haven't sketched out the args yet?
+    end subroutine
+  ```
 
  #### `caf_get`
   * **Description**:
-  * **Procedure Interface**: `subroutine caf_get(coarray_handle, coindices, team, team_number, source, value, stat)`
-  * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`target`](#target) is `intent(in)`, [`value`](#value) is `intent(in)`, [`team`](#team) is `intent(in)` and `optional`, [`team_number`](#team_number) is `intent(in)` and `optional`, [`stat`](#stat) is `intent(out)` and `optional`
+  * **Procedure Interface**:
+  ```
+    module subroutine caf_get(coarray_handle, coindices, team, team_number, source, value, stat)
+      implicit none
+      type(caf_co_handle_t), intent(in) :: coarray_handle
+      integer, intent(in) :: coindices(:)
+      type(*), dimension(..), intent(in) :: source
+      type(*), dimension(..), intent(inout) :: value
+      type(team_type), optional, intent(in) :: team
+      integer, optional, intent(in) :: team_number
+      integer, optional, intent(out) :: stat
+    end subroutine
+  ```
   * **Notes**: Both optional arguments `team` and `team_number` shall not be present in the same call
 
  #### `caf_get_async`
   * **Description**:
-  * **Procedure Interface**: `subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)`
-  * **Arguments**: [`coarray_handle`](#coarray_handle) is `intent(in)`, [`coindices`](#coindices) is `intent(in)`, [`team`](#team) is `intent(in)` and `optional`, [`team_number`](#team_number) is `intent(in)` and `optional`, [`source`](#source) is `intent(in)`, [`value`](#value) is `intent(inout)`, [`stat`](#stat) is `intent(out)`, [`async_handle`](#async_handle) is `intent(out)`
+  * **Procedure Interface**:
+  ```
+    module subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)
+      implicit none
+      type(caf_co_handle_t),  intent(in) :: coarray_handle
+      integer, dimension(:),  intent(in) :: coindices
+      type(*), dimension(..), intent(in) :: source
+      type(*), dimension(..), intent(inout) :: value
+      type(team_type), optional, intent(in) :: team
+      integer, optional, intent(in) :: team_number
+      integer, optional, intent(out) :: stat
+      type(caf_async_handle_t), intent(out) :: async_handle
+    end subroutine
+  ```
+
 
 ###  Operation Synchronization
 
  #### `caf_async_wait_for`
   * **Description**: This procedure waits until (REMOVE_NOTE: asynchronous?) operation is complete and then consumes the async handle
-  * **Procedure Interface**: `subroutine caf_async_wait_for(async_handle)`
-  * **Arguments**: [`async_handle`](#async_handle) is `intent(inout)`
-  * [caf_async_wait_for pseudo code](#caf_async_wait_for-pseudo-code) (temporarily in design doc)
+  * **Procedure Interface**:
+  ```
+    module subroutine caf_async_wait_for(async_handle)
+      implicit none
+      type(caf_async_handle_t), intent(inout) :: async_handle
+    end subroutine
+  ```
 
  #### `caf_async_try_for`
   * **Description**: This procedure consumes the async handle if and only if the operation is complete
-  * **Procedure Interface**: `subroutine caf_async_try_for(async_handle, finished)`
-  * **Arguments**: [`async_handle`](#async_handle) is `intent(inout)`, [`finished`](#finished) is `intent(out)`
-  * [caf_async_try_for pseudo code](#caf_async_try_for-pseudo-code) (temporarily in design doc)
+  * **Procedure Interface**:
+  ```
+    module subroutine caf_async_try_for(async_handle, finished)
+      implicit none
+      type(caf_async_handle_t), intent(inout) :: async_handle
+      logical, intent(out) :: finished
+    end subroutine
+
+  ```
 
  #### `caf_sync_memory`
   * **Description**:
@@ -380,9 +443,25 @@ All atomic operations are blocking operations.
 
  #### `caf_atomic_add`
   * **Description**:
-  * **Procedure Interface**: TODO_DECISION: `subroutine caf_atomic_add(coarray_handle, coindicies, offset, value, stat)` or `subroutine caf_atomic_add(coarray_handle, coindicies, target, value, stat)`
-  * **Arguments**:
-  * [caf_atomic_add pseudo code](#caf_atomic_add-pseudo-code) (temporarily in design doc)
+  * **Procedure Interface**: REMOVE_NOTE_TODO_DECISION:
+  Option 1 with offset:
+  ```
+    module subroutine caf_atomic_add(coarray_handle, coindices, offset, value, stat)
+      type(caf_co_handle_t) :: coarray_handle
+      integer, intent(in) :: coindices(:)
+      integer :: offset, value, stat
+    end subroutine
+  ```
+
+  Option 2 with target:
+  ```
+    module subroutine caf_atomic_add(coarray_handle, coindices, target, value, stat)
+      type(caf_co_handle_t) :: coarray_handle
+      integer, intent(in) :: coindices(:) ! names image num
+      integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
+      integer :: value, stat
+    end subroutine
+  ```
 
  #### `caf_atomic_and`
   * **Description**:
@@ -484,135 +563,15 @@ All atomic operations are blocking operations.
   * **Arguments**:
 
 
+## Establish and initialize static coarrays prior to `main`
 
-Current pseudo code. May not stay in design doc.
-
-Draft:
-
-#### caf_allocate pseudo code
-
-```
-  module subroutine caf_allocate(lbounds, sizes, coarray_handle, local_slice)
-    implicit none
-    type(caf_co_handle_t), intent(out) :: coarray_handle
-    type(*), dimension(..), intent(inout) :: local_slice
-    integer, dimension(:), intent(in) :: lbounds, sizes !precondition these args must be same size
-  end subroutine
-```
-
-#### caf_deallocate pseudo code
-```
-  module subroutine caf_deallocate(coarray_handles)
-    implicit none
-    type(caf_co_handle_t), dimension(:), intent(out) :: coarray_handles
-  end subroutine
-```
-
-#### caf_put pseudo code
-
-```
-  module subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)
-    implicit none
-    type(caf_co_handle_t), intent(in) :: coarray_handle
-    integer, intent(in) :: coindices(:)
-    type(*), dimension(..), intent(in) :: target, value
-    type(team_type), optional, intent(in) :: team
-    integer, optional, intent(in) :: team_number
-    integer, optional, intent(out) :: stat
-  end subroutine
-```
-
-#### caf_end_segment pseudo code
-
-```
-  ! any puts that are still in flight need to commited
-  ! throw away any caches
-  ! not synchronizing operation
-  ! caf_end_segment is a side effect of image control stmts
-  module subroutine caf_end_segment()
-    implicit none
-  end subroutine
-```
-
-#### caf_get pseudo code
-
-```
-  module subroutine caf_get(coarray_handle, coindices, team, team_number, source, value, stat)
-    implicit none
-    type(caf_co_handle_t), intent(in) :: coarray_handle
-    integer, intent(in) :: coindices(:)
-    type(*), dimension(..), intent(in) :: source ! useful to get the "shape" of the thing, not the value of this dummy arg, compiler needs to ensure this dummy arg is not a copy for this strategy to work, compiler's codegen needs to ensure that this (and other subroutine calls) are not using copies for this arg
-    type(*), dimension(..), intent(inout) :: value
-    type(team_type), optional, intent(in) :: team
-    integer, optional, intent(in) :: team_number
-    integer, optional, intent(out) :: stat
-  end subroutine
-```
-
-#### caf_get_async pseudo code
-
-```
-  module subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)
-    implicit none
-    type(caf_co_handle_t),  intent(in) :: coarray_handle
-    integer, dimension(:),  intent(in) :: coindices
-    type(*), dimension(..), intent(in) :: source
-    type(*), dimension(..), intent(inout) :: value ! may need asynchronous attribute or may be implicitly asynchronous
-    type(team_type), optional, intent(in) :: team
-    integer, optional, intent(in) :: team_number
-    integer, optional, intent(out) :: stat
-    type(caf_async_handle_t), intent(out) :: async_handle
-  end subroutine
-```
-
-#### caf_async_wait_for pseudo code
-
-```
-  ! waits until operation
-  ! consumes handle
-  module subroutine caf_async_wait_for(async_handle)
-    implicit none
-    type(caf_async_handle_t), intent(inout) :: async_handle
-  end subroutine
-```
-
-#### caf_async_try_for pseudo code
-
-```
-  ! consumes handle IF finished
-  module subroutine caf_async_try_for(async_handle, finished)
-    implicit none
-    type(caf_async_handle_t), intent(inout) :: async_handle
-    logical, intent(out) :: finished
-  end subroutine
-
-```
-
-
-#### caf_atomic_add pseudo code
-
-Option 1 with offset:
-```
-  module subroutine caf_atomic_add(coarray_handle, coindicies, offset, value, stat)
-    type(caf_co_handle_t) :: coarray_handle
-    integer, intent(in) :: coindices(:)
-    integer :: offset, value, stat
-  end subroutine
-```
-
-Option 2 with target:
-```
-  module subroutine caf_atomic_add(coarray_handle, coindicies, target, value, stat)
-    type(caf_co_handle_t) :: coarray_handle
-    integer, intent(in) :: coindices(:) ! names image num
-    integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
-    integer :: value, stat
-  end subroutine
-```
+  (REMOVE_NOTE: complete this section, potentially move to earlier in doc) The compiler will need to call `caf_init`, call `caf_allocate` ... for each coarray and in the right order, and then copy any initializers.
 
 
 ## Berkeley Lab internal Notes: (REMOVE_NOTES before submission)
 
+  - Search for REMOVE_NOTE_TODO_DECISION to find locations where specific decisions/options are outlined, but not yet made.
+  - Search for, resolve, and remove all REMOVE_NOTE and REMOVE_NOTE_TODO_DECISION markers before finalizing this document.
 
   * **Asynchrony:**
     -   Could be handle based or fence based approaches
@@ -644,6 +603,14 @@ REMOVE_NOTEs:
     could add caching
     if the constants (stat_failed_image, etc) are compiler provided, we need to get C access to these values
 
+
+#### Implementation internal notes
+  * In `caf_allocate`, add a precondition that `lbounds` and `sizes` are the same size, enforced with an assert or a similar mechanism
+  * In `caf_get` (and others?), the `source` arg is useful for getting the "shape" of the thing, not its value. For this strategy to work, the compiler's codegen needs to ensure that this call (and other subroutine calls) does not pass a copy for this arg
+  * In `caf_get_async`, the `value` arg may need the `asynchronous` attribute or may be implicitly asynchronous
+
+
+
 flexible array member in c
 
 ### Caffeine internals for coarray accesses

>From b46260a2e37b3afd95e18922192068840db15e91 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Mon, 13 Mar 2023 15:30:47 -0700
Subject: [PATCH 18/33] Update design doc.

---
 flang/docs/CoarrayFortranRuntime.md | 32 ++++++++++++++++++-----------
 1 file changed, 20 insertions(+), 12 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index ac6491e6be8eae..cd83249ca92013 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -43,28 +43,31 @@ One consequence of the statements being categorized as image control statements
 
 ## Coarray Fortran (CAF) Parallel Runtime Interface
 
-  The Coarray Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library. The interface is designed for portability across shared and distributed memory machines, different operating systems, and multiple architectures. The Caffeine implementation,[see below](#caffeine-lbl's-implementation-of-the-coarray-fortran-parallel-runtime-interface), of the Coarray Fortran Parallel Runtime Interface plans to support the following architectures: x86_64, PowerPC64, AArch64, with the possibility of supporting more as requested. Implementations of this interface is intended as an augmentation for the compiler's own runtime library. While the interface can support multiple implementations, we envision needing to build the runtime library as part of installing the compiler.
+  The Coarray Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. The table below shows the delegation of tasks between the compiler and the runtime library. The interface is designed for portability across shared and distributed memory machines, different operating systems, and multiple architectures. The Caffeine implementation of the Coarray Fortran Parallel Runtime Interface ([see below](#caffeine-lbl's-implementation-of-the-coarray-fortran-parallel-runtime-interface)) plans to support the x86_64, PowerPC64, and AArch64 architectures, with the possibility of supporting more as requested. An implementation of this interface is intended to augment the compiler's own runtime library. While the interface can support multiple implementations, we envision needing to build the runtime library as part of installing the compiler.
 
 ## Delegation of tasks between the Fortran compiler and the runtime library
 
-The following table outlines which tasks will be the responsibility of the Fortran compiler and which tasks will be the responsibility of the runtime library.
+The following table outlines which tasks will be the responsibility of the Fortran compiler and which tasks will be the responsibility of the runtime library. A '✓' in the Fortran compiler column indicates that the compiler has the primary responsibility for that task, while a '✓' in the Runtime library column indicates that the compiler will invoke the runtime library to perform the task and the runtime library has primary responsibility for the task's implementation. See the [Runtime Interface Procedures](#runtime-interface-procedures) for the list of runtime library procedures that the compiler will invoke.
+
 
 | Tasks | Fortran compiler | Runtime library |
 | ----  | ----- | -------- |
-| Establish and initialize static coarrays prior to `main` -[see more](#establish-and-initialize-static-coarrays-priorto-`main`)        |     ✓     |           |
+| Establish and initialize static coarrays prior to `main` - [see more](#establish-and-initialize-static-coarrays-prior-to-`main`)        |     ✓     |           |
 | Track corank of coarrays                |     ✓     |           |
 | Assigning variables of type `team-type` |     ✓     |           |
 | Track locals coarrays for implicit deallocation when exiting a scope |     ✓     |           |
 | Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
-| Implementing the intrinsics `coshape`, `lcobound`, and `ucobound`, `image_index`  |          |     ✓     |
 | Track coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
-| Team stack abstraction                  |           |     ✓     |
-| `form-team-stmt`                        |           |     ✓     |
-| `change-team-stmt`                      |           |     ✓     |
-| `end-team-stmt`                         |           |     ✓     |
-| Allocate a coarray                      |           |     ✓     |
-| Deallocate a coarray                    |           |     ✓     |
+| Allocate and deallocate a coarray       |           |     ✓     |
 | Reference a coindexed-object            |           |     ✓     |
+| Team stack abstraction                  |           |     ✓     |
+| `form-team-stmt`, `change-team-stmt`, `end-team-stmt` |           |     ✓     |
+| Atomic subroutines                      |           |     ✓     |
+| Synchronization statements              |           |     ✓     |
+| Events              |           |     ✓     |
+| Locks              |           |     ✓     |
+| `critical-construct`             |           |     ✓     |
+
 
 ## Types
 
@@ -76,7 +79,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
    [`coarray_handle`](#coarray_handle), [`coindices`](#coindices), [`target`](#target), [`value`](#value), [`team`](#team), [`team_number`](#team_number), [`stat`](#stat)
 
-## Procedures
+## Runtime Interface Procedures
 
    **Collectives:**
      [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum)
@@ -94,7 +97,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
      [`caf_async_wait_for`](#caf_aync_wait_for), [`caf_async_try_for`](#caf_async_try_for), [`caf_sync_memory`](#caf_sync_memory)
 
    **Image Synchronization:**
-     [`caf_sync_all`](#caf_sync_all), [`caf_sync_images`](#caf_sync_images), [`caf_lock`](#caf_lock), [`caf_unlock`](#caf_unlock), [`caf_critical`](#caf_critical)
+     [`caf_sync_all`](#caf_sync_all), [`caf_sync_images`](#caf_sync_images), [`caf_lock`](#caf_lock), [`caf_unlock`](#caf_unlock), [`caf_critical`](#caf_critical), [`caf_end_critical`](#caf_end_critical)
 
    **Events:**
      [`caf_event_post`](#caf_event_post), [`caf_event_wait`](#caf_event_wait), [`caf_event_query`](#caf_event_query)
@@ -388,6 +391,11 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Procedure Interface**:
   * **Arguments**:
 
+#### `caf_end_critical`
+  * **Description**:
+  * **Procedure Interface**:
+  * **Arguments**:
+
 ### Events
 
  #### `caf_event_post`

>From ef932c2a276251495d4c05f511a8749be26d0d8e Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Mon, 13 Mar 2023 16:12:15 -0700
Subject: [PATCH 19/33] Update design doc.

---
 flang/docs/CoarrayFortranRuntime.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index cd83249ca92013..2be0861a2ed2f9 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -62,6 +62,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
 | Reference a coindexed-object            |           |     ✓     |
 | Team stack abstraction                  |           |     ✓     |
 | `form-team-stmt`, `change-team-stmt`, `end-team-stmt` |           |     ✓     |
+| Intrinsic functions related to Coarray Fortran, like `num_images`, etc |           |     ✓     |
 | Atomic subroutines                      |           |     ✓     |
 | Synchronization statements              |           |     ✓     |
 | Events              |           |     ✓     |

>From 1f1b93e6e3f0779d3681eea76bd7a03797eaa8cc Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 15 Mar 2023 14:26:56 -0700
Subject: [PATCH 20/33] Update design doc

---
 flang/docs/CoarrayFortranRuntime.md | 93 ++++++++++++++++++++---------
 1 file changed, 66 insertions(+), 27 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 2be0861a2ed2f9..14349d6f01ccbf 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -57,6 +57,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
 | Assigning variables of type `team-type` |     ✓     |           |
 | Track locals coarrays for implicit deallocation when exiting a scope |     ✓     |           |
 | Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
+| `caf_critical_id_t` |     ✓     |           |
 | Track coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
 | Allocate and deallocate a coarray       |           |     ✓     |
 | Reference a coindexed-object            |           |     ✓     |
@@ -64,6 +65,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
 | `form-team-stmt`, `change-team-stmt`, `end-team-stmt` |           |     ✓     |
 | Intrinsic functions related to Coarray Fortran, like `num_images`, etc |           |     ✓     |
 | Atomic subroutines                      |           |     ✓     |
+| Collective subroutines                      |           |     ✓     |
 | Synchronization statements              |           |     ✓     |
 | Events              |           |     ✓     |
 | Locks              |           |     ✓     |
@@ -123,7 +125,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
 ## Types Descriptions
 
  ### Fortran Intrinsic Derived types
-   These types will be defined in the runtime library and it is proposed that the compiler will use a rename to use the runtime library definitions for these types in the compiler's implementation of the `ISO_Fortran_Env` module.
+   These types will be defined in the runtime library and it is proposed that the compiler will use a rename to use the runtime library definitions for these types in the compiler's implementation of the `ISO_Fortran_Env` module. REMOVE_NOTE_TODO: add rationale for this
 
  #### `caf_team_type`
    * implementation for `team_type` from `ISO_Fortran_Env`
@@ -133,17 +135,18 @@ The following table outlines which tasks will be the responsibility of the Fortr
    * implementation for `lock_type` from `ISO_Fortran_Env`
 
 
---------------------------------------------------------------------
-
-
  ### Runtime library specific types
+   REMOVE_NOTE_TODO:    ADD general description of types
 
  #### `caf_co_handle_t`
    * `caf_co_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler.
  #### `caf_async_handle_t`
    * `caf_async_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler. This type will help the runtime library track and provide asynchrony.
- #### `caf_source_loc_t`
-   * `caf_source_loc_t` will be used to track the location of the critical construct blocks. The runtime library will handle critical constructs, and not expect the compiler to rewrite them as blocks with lock and unlock statements. This would be burdensome on the compiler because a lock_type variable would need to be declared, but as it needs to be a coarray, it would have to hoist its (REMOVE_NOTE: reword?!?!) declaration. REMOVE_NOTE_TODO_DECISION: The compiler will control the implementation of the type and pass it off to the runtime library OR The runtime library will control the implementation of the type and receive the required information from the compiler to create the needed instances of the type.
+ #### `caf_critical_id_t`
+   * `caf_critical_id_t` will be used to track the location of the critical construct blocks. REMOVE_NOTE_TODO: this type should be an int that is a unique identifier (may not need to be a derived type)
+
+need to have a state saying whether we are in a critical construct or not, if we are in one, then we can ignore any further nested critical constructs.
+
 
 
 ## Common arguments' descriptions
@@ -183,7 +186,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
  #### `stat`
   * Argument for [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum), [`caf_put`](#caf_put), [`caf_get`](#caf_get)
   * scalar of type `integer`
-  * if no error condition occurs on that image, it is assigned the value `0` (REMOVE_NOTE: ?)
+  * if no error condition occurs on that image, it is assigned the value `0`
 
 ## Procedure descriptions
 
@@ -219,12 +222,12 @@ The following table outlines which tasks will be the responsibility of the Fortr
   When the compiler identifies a program that uses "Coarray Fortran" features, it will insert calls to `caf_init` and `caf_finalize`. These procedures ...
 
  #### `caf_init`
-  * **Description**: (REMOVE_NOTE: should it be caf_caffeinate?)
+  * **Description**:
   * **Procedure Interface**: `function caf_init() result(exit_code)`
   * **Result**: `exit_code` is an `integer` whose value ...
 
  #### `caf_finalize`
-  * **Description**: (REMOVE_NOTE: should it be caf_decaffeinate?)
+  * **Description**:
   * **Procedure Interface**: `subroutine caf_finalize(exit_code)`
   * **Arguments**: `exit_code` is `intent(in)` and ...
 
@@ -247,24 +250,32 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
  #### `caf_allocate`
   * **Description**: Calls to `caf_allocate` will be inserted when the compiler wants to allocate a coarray or when there is a statically declared coarray. This procedure allocates memory for a coarray. The `coarray_handle` dummy argument will pass back a handle that the runtime library will have created to be used for all future accesses and deallocation of the associated coarray. The `lbounds` and `sizes` arguments must 1d arrays with the same dimensions.
+  The `co_lbounds` and `co_sizes` arguments must be 1d arrays with the same dimensions. The product of the difference of the co_lbounds and co_ubounds need to equal the number of team members. `local_slice` shall not be allocated on entry
+
+   NOTE: the runtime library will stash away the coshape information at this time, runtime will have a mapping between the co_handle_t and the coshape information.
+   NOTE: considering adding a mold argument that would allow for dynamic typing for local_slice
   * **Procedure Interface**:
   ```
-    module subroutine caf_allocate(lbounds, sizes, coarray_handle, local_slice)
+    module subroutine caf_allocate(co_lbounds, co_ubounds, lbounds, ubounds, coarray_handle, local_slice)
       implicit none
-      type(caf_co_handle_t), intent(out) :: coarray_handle
-      type(*), dimension(..), intent(inout) :: local_slice
-      integer, dimension(:), intent(in) :: lbounds, sizes
+      integer, kind(c_intmax_t), dimension(:), intent(in) :: co_lbounds, co_ubounds REMOVE_NOTE_TODO: (move these fake comments into descriptions of the arguments)! information about the coshape of the coarray being allocated
+      integer, kind(c_intmax_t), dimension(:), intent(in) :: lbounds, ubounds  ! information about the shape of the local_slice (non_coindexed object)
+      type(caf_co_handle_t), intent(out) :: coarray_handle ! represents the distributed object of the coarray on the corresponding team
+      type(*), dimension(..), allocatable, intent(out) :: local_slice
     end subroutine
   ```
-  REMOVE_NOTE: Fix pseudo code
 
  #### `caf_deallocate`
-  * **Description**: This procedure
+  * **Description**: This procedure. Has side effect that the allocatable local_slices represented by the coarray_handles will be destroyed. Compiler needs to know by association which local_slices are being deallocated.
+  if there is a local coarray that doesn't have the save attribute, enter a subroutine that
+
+when exiting a local scope where implicit deallocation of a coarray occurs, compiler must call this
+and when a deallocate stmt of a coarray occurs, compiler must call this
   * **Procedure Interface**:
   ```
     module subroutine caf_deallocate(coarray_handles)
       implicit none
-      type(caf_co_handle_t), dimension(:), intent(out) :: coarray_handles (REMOVE_NOTE: is coarray_handles supposed to be `intent(out)`?)
+      type(caf_co_handle_t), dimension(:), intent(in) :: coarray_handles
     end subroutine
   ```
 
@@ -290,17 +301,6 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Notes**: Both optional arguments `team` and `team_number` shall not be present in the same call
 
 
- #### `caf_end_segment`
-(REMOVE_NOTE): Is this procedure going to be visible to the compiler? If not, do we include discussions of it here?
-  * **Description**: This procedure ends a segment. Any puts that are still in flight will be committed (and any caches will be thrown away REMOVE_NOTE_TODO_DECISION: if we decide to do caches). Calls to this procedure will be side effects of invocations of the image control statements. It is not a synchronizing operation.
-  * **Procedure Interface**:
-  ```
-    module subroutine caf_end_segment()
-      implicit none
-      (REMOVE_NOTE: are there no arguments? or is it just that we haven't sketched out the args yet?)
-    end subroutine
-  ```
-
  #### `caf_get`
   * **Description**:
   * **Procedure Interface**:
@@ -578,6 +578,11 @@ All atomic operations are blocking operations.
 
 
 ## Berkeley Lab internal Notes: (REMOVE_NOTES before submission)
+  - REMOVE_NOTE_TODO_DECISION: Need to decide the thread semantics
+  - REMOVE_NOTE_TODO_DECISION: Are we going to have Caffeine be thread safe? Have a thread safety option? Is it a build time option? or runtime?
+        Dan advocates having a thread-safety option that is build time.
+  - Do we need to add any discussion of what it would look like when code has mixed OpenMP and Coarray Fortran?
+
 
   - Search for REMOVE_NOTE_TODO_DECISION to find locations where specific decisions/options are outlined, but not yet made.
   - Search for, resolve, and remove all REMOVE_NOTE and REMOVE_NOTE_TODO_DECISIONS before finalizing this document.
@@ -589,10 +594,25 @@ All atomic operations are blocking operations.
 
 
 
+POTENTIAL RATIONALE TO PRESENT SOMEWHERE maybe
+The runtime library will handle critical constructs, and not expect the compiler to rewrite them as blocks with lock and unlock statements. This would be burdensome on the compiler because a lock_type variable would need to be declared, but as it needs to be a coarray, it would have to hoist its (REMOVE_NOTE: reword?!?!) declaration.
+
 Same non-blocking semantics (has to be started and finished in the same segment) will likely apply to collectives, use caf_wait_for, caf_try_for, etc
 Should change team and critical be non-blocking? sync-all?)
 
 
+Caffeine internal procedure, so not part of the CAF PRI.
+ #### `caf_end_segment`
+  * **Description**: This procedure ends a segment. Any puts that are still in flight will be committed (and any caches will be thrown away REMOVE_NOTE_TODO_DECISION: if we decide to do caches). Calls to this procedure will be side effects of invocations of the image control statements. It is not a synchronizing operation.
+  * **Procedure Interface**:
+  ```
+    module subroutine caf_end_segment()
+      implicit none
+      (REMOVE_NOTE: are there no arguments? or is it just that we haven't sketched out the args yet?)
+    end subroutine
+  ```
+
+
 
 ### `caf_co_handle_t`
 
@@ -619,6 +639,25 @@ REMOVE_NOTEs:
   * In `caf_get_async`, the `value` arg may need asynchronous attribute or may be implicitly asynchronous
 
 
+REMOVE_NOTE_
+
+examples where if a user writes this example code, then the compiler should rewrite it to look like this other piece of example code
+
+
+TODO after writing example, try compiling it and see if it compiles at least until breaking at linking because no def for caf_allocate etc
+1. basic caf example
+    - static coarray declaration
+         - transform by adding caf_allocate and caf_deallocate calls
+    - write to coarrays
+    - read from coarrays
+    - sync-all-stmt (?)
+
+2. allocatable coarray example
+    - allocatable coarray with an initializer
+    - compiler transforms code and adds assignment statement after calls to caf_allocate
+
+3. implicit deallocation of a coarray example
+    - local, allocatable coarray -> compiler must insert caf_deallocate call
 
 flexible array member in c
 

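The "basic caf example" outlined in the notes of PATCH 20/33 (static coarray declaration, write to a coarray, read from a coarray, `sync all`) could start from user code along the lines of the sketch below. This is only an illustration, not part of the patch: the program and variable names are made up, it assumes a coarray-capable Fortran 2018 compiler, and the comments about where `caf_allocate` and `caf_deallocate` calls would be inserted are a reading of the draft text, not a settled lowering.

```fortran
program basic_caf_example
  ! User-written source only; no caf_* runtime calls appear here.
  implicit none

  ! Static coarray declaration. Per the draft, the compiler would insert a
  ! caf_allocate call to establish this coarray (exact placement is an
  ! assumption) and a matching caf_deallocate call before termination.
  integer :: mailbox[*]
  integer :: right, received

  ! Pick the next image, wrapping around to image 1.
  right = this_image() + 1
  if (right > num_images()) right = 1

  mailbox = 0                    ! local definition, not a coindexed-object
  sync all                       ! order the local definition before remote writes

  mailbox[right] = this_image()  ! write to a coindexed-object
  sync all                       ! order every write before any read

  received = mailbox[right]      ! read from a coindexed-object
  if (received /= this_image()) error stop "unexpected value in neighbor's mailbox"

  print *, "image", this_image(), "read back", received
end program basic_caf_example
```

Each image writes to a distinct neighbor, so the coindexed writes do not conflict, and the two `sync all` statements keep the local definition, the remote writes, and the remote reads in separate segments.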
>From 24f9699d291fa5c135b71ccfb526700354c50632 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 15 Mar 2023 18:23:37 -0700
Subject: [PATCH 21/33] Add first sketch of interfaces for the collective
 subroutines and update other parts of design doc.

---
 flang/docs/CoarrayFortranRuntime.md | 213 ++++++++++++++++++++--------
 1 file changed, 154 insertions(+), 59 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 14349d6f01ccbf..593011c568c39d 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -78,10 +78,6 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
  **Runtime library specific types:** [`caf_co_handle_t`](#caf_co_handle_t), [`caf_async_handle_t`](#caf_async_handle_t), [`caf_source_loc_t`](#caf_source_loc_t)
 
-## Common arguments
-
-   [`coarray_handle`](#coarray_handle), [`coindices`](#coindices), [`target`](#target), [`value`](#value), [`team`](#team), [`team_number`](#team_number), [`stat`](#stat)
-
 ## Runtime Interface Procedures
 
    **Collectives:**
@@ -151,29 +147,13 @@ need to have a state saying whether we are in a critical construct or not, if we
 
 ## Common arguments' descriptions
 
- #### `coarray_handle`
-   * Argument for [`caf_allocate`](#caf_allocate), [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async) and all of the [atomic operations](#atomic-memory-operation)
-   * scalar of type [`caf_co_handle_t`](#caf_co_handle_t)
-   * This argument is a handle for the established coarray. The handle will be created when the coarray is established.
- #### `coarray_handles`
-   * array of type [`caf_co_handle_t`](#caf_co_handle_t)
- #### `async_handle`
-   * Argument for [`caf_get_async`](#caf_get_async), [`caf_async_wait_for`](#caf_async_wait_for), [`caf_async_try_for`](#caf_async_try_for)
-   * scalar of type [`caf_async_handle_t`](#caf_async_handle_t)
-   * This argument is
+REMOVE_NOTE_TODO: finish moving these down below in the procedure description area
+
  #### `finished`
    * Argument for [`caf_async_try_for`](#caf_async_try_for)
    * scalar of type [`caf_async_handle_t`](#caf_async_handle_t)
    * This argument is
- #### `coindices`
-   * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
-   * 1d assumed-shape array of type `integer`
- #### `target`
-   * assumed-rank array of `type(*)`
-   * (REMOVE_NOTE: Is this note true for the puts and gets? And not just the atomics?) The location of this argument is the relevant information, not its value. This means that the compiler needs to ensure that when codegen (REMOVE_NOTE: ?) occurs, this argument is pass by reference and there is no copy made. The location of `target` is needed to compute the offset when the atomic operations' `atom` dummy argument is part of a derived type.
- #### `value`
-   * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
-   * assumed-rank array of `type(*)`
+
  #### `source`
    * Argument for [`caf_get_async`](#caf_get_async)
    * assumed-rank array of `type(*)`
@@ -183,53 +163,133 @@ need to have a state saying whether we are in a critical construct or not, if we
  #### `team_number`
    * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get)
    * scalar of type `integer`
- #### `stat`
-  * Argument for [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum), [`caf_put`](#caf_put), [`caf_get`](#caf_get)
-  * scalar of type `integer`
-  * if no error condition occurs on that image, it is assigned the value `0`
 
 ## Procedure descriptions
 
 ### Collectives
 
+ #### Common arguments
+  * **`a`**
+    * Argument for all the collective subroutines: [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum),
+    * may be any type
+    * is always `intent(inout)`
+    * for [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum) it is assigned the value computed by the collective operation if no error condition occurs and either `result_image` is absent or the executing image is the one identified by `result_image`; otherwise `a` becomes undefined
+    * for [`caf_co_broadcast`](#caf_co_broadcast), the value of the argument on the `source_image` is assigned to the `a` argument on all other images
+
+  * **`stat`**
+    * Argument for all the collective subroutines: [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum),
+    * is always of type `integer`
+    * is always `intent(out)`
+    * is assigned the value `0` when the execution of the procedure is successful
+    * is assigned a positive value when the execution of the procedure is not successful and the `a` argument becomes undefined
+
+  * **`errmsg`**
+    * Argument for all the collective subroutines: [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum),
+    * is always of type `character`
+    * is always `intent(inout)`
+    * if an error condition does not occur, the value is unchanged
+    * if an error condition occurs, an explanatory message is assigned to the argument
+
+
+  (REMOVE_NOTE_TODO: check the interfaces for these collectives, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something?)
+
  #### `caf_co_broadcast`
   * **Description**:
   * **Procedure Interface**:
+  ```
+     subroutine caf_co_broadcast(a, source_image, stat, errmsg)
+       implicit none
+       type(*), intent(inout), contiguous, target :: a(..)
+       integer, optional, intent(in) :: source_image
+       integer, optional, intent(out), target :: stat
+       character(len=*), intent(inout), optional, target :: errmsg
+     end subroutine
+  ```
   * **Arguments**: [`stat`](#stat)
 
  #### `caf_co_max`
   * **Description**:
   * **Procedure Interface**:
+  ```
+     subroutine caf_co_max(a, result_image, stat, errmsg)
+       implicit none
+       type(*), intent(inout), contiguous, target :: a(..)
+       integer, intent(in), optional, target :: result_image
+       integer, intent(out), optional, target :: stat
+       character(len=*), intent(inout), optional, target :: errmsg
+     end subroutine
+  ```
   * **Arguments**: [`stat`](#stat)
 
  #### `caf_co_min`
   * **Description**:
   * **Procedure Interface**:
+  ```
+     subroutine caf_co_min(a, result_image, stat, errmsg)
+       implicit none
+       type(*), intent(inout), contiguous, target :: a(..)
+       integer, intent(in), optional, target :: result_image
+       integer, intent(out), optional, target :: stat
+       character(len=*), intent(inout), optional, target :: errmsg
+     end subroutine
+  ```
   * **Arguments**: [`stat`](#stat)
 
  #### `caf_co_reduce`
   * **Description**:
   * **Procedure Interface**:
+  ```
+     subroutine caf_co_reduce(a, operation, result_image, stat, errmsg)
+       implicit none
+       type(*), intent(inout), contiguous, target :: a(..)
+       type(c_funptr), value :: operation
+       integer, intent(in), optional, target :: result_image
+       integer, intent(out), optional, target :: stat
+       character(len=*), intent(inout), optional, target :: errmsg
+     end subroutine
+  ```
   * **Arguments**: [`stat`](#stat)
 
  #### `caf_co_sum`
   * **Description**:
   * **Procedure Interface**:
+  ```
+     subroutine caf_co_sum(a, result_image, stat, errmsg)
+       implicit none
+       type(*), intent(inout), contiguous, target :: a(..)
+       integer, intent(in), target, optional :: result_image
+       integer, intent(out), target, optional :: stat
+       character(len=*), intent(inout), target, optional :: errmsg
+     end subroutine
+  ```
   * **Arguments**: [`stat`](#stat)
 
 ### Program startup and shutdown
 
-  When the compiler identifies a program that uses "Coarray Fortran" features, it will insert calls to `caf_init` and `caf_finalize`. These procedures ...
+  When the compiler identifies a program that uses "Coarray Fortran" features, it will insert calls to `caf_init` and `caf_finalize`. These procedures will initialize and terminate the Coarray Fortran environment.
 
  #### `caf_init`
-  * **Description**:
-  * **Procedure Interface**: `function caf_init() result(exit_code)`
-  * **Result**: `exit_code` is an `integer` whose value ...
+  * **Description**: This procedure will initialize the Coarray Fortran environment.
+  * **Procedure Interface**:
+    ```
+      function caf_init() result(exit_code)
+        implicit none
+        integer :: exit_code
+      end function
+    ```
+  * **Result**: `exit_code` is an `integer` whose value ... (REMOVE_NOTE_TODO: fill in)
 
  #### `caf_finalize`
-  * **Description**:
-  * **Procedure Interface**: `subroutine caf_finalize(exit_code)`
-  * **Arguments**: `exit_code` is `intent(in)` and ...
+  * **Description**: This procedure will terminate the Coarray Fortran environment.
+  * **Procedure Interface**:
+    ```
+      subroutine caf_finalize(exit_code)
+        implicit none
+        integer, intent(in) :: exit_code
+      end subroutine
+    ```
+  * **Arguments**:
+    * **`exit_code`**: is .. (REMOVE_NOTE_TODO: fill in)
 
  #### `caf_error_stop`
   * **Description**:
@@ -249,46 +309,60 @@ need to have a state saying whether we are in a critical construct or not, if we
 ### Allocation and deallocation
 
  #### `caf_allocate`
-  * **Description**: Calls to `caf_allocate` will be inserted when the compiler wants to allocate a coarray or when there is a statically declared coarray. This procedure allocates memory for a coarray. The `coarray_handle` dummy argument will pass back a handle that the runtime library will have created to be used for all future accesses and deallocation of the associated coarray. The `lbounds` and `sizes` arguments must 1d arrays with the same dimensions.
-  The `co_lbounds` and `co_sizes` arguments must be 1d arrays with the same dimensions. The product of the difference of the co_lbounds and co_ubounds need to equal the number of team members. `local_slice` shall not be allocated on entry
-
-   NOTE: the runtime library will stash away the coshape information at this time, runtime will have a mapping between the co_handle_t and the coshape information.
-   NOTE: considering adding a mold argument that would allow for dynamic typing for local_slice
+  * **Description**: This procedure allocates memory for a coarray. Calls to `caf_allocate` will be inserted by the compiler when there is an explicit coarray allocation or a statically declared coarray in the source code. The runtime library will stash away the coshape information at this time in order to internally track it during the lifetime of the coarray.
   * **Procedure Interface**:
   ```
-    module subroutine caf_allocate(co_lbounds, co_ubounds, lbounds, ubounds, coarray_handle, local_slice)
+    subroutine caf_allocate(co_lbounds, co_ubounds, lbounds, ubounds, coarray_handle, local_slice)
       implicit none
-      integer, kind(c_intmax_t), dimension(:), intent(in) :: co_lbounds, co_ubounds REMOVE_NOTE_TODO: (move these fake comments into descriptions of the arguments)! information about the coshape of the coarray being allocated
-      integer, kind(c_intmax_t), dimension(:), intent(in) :: lbounds, ubounds  ! information about the shape of the local_slice (non_coindexed object)
-      type(caf_co_handle_t), intent(out) :: coarray_handle ! represents the distributed object of the coarray on the corresponding team
+      integer(kind=c_intmax_t), dimension(:), intent(in) :: co_lbounds, co_ubounds
+      integer(kind=c_intmax_t), dimension(:), intent(in) :: lbounds, ubounds
+      type(caf_co_handle_t), intent(out) :: coarray_handle
       type(*), dimension(..), allocatable, intent(out) :: local_slice
     end subroutine
   ```
+  * **Further argument descriptions**:
+    * **`co_lbounds` and `co_ubounds`**: Shall be the lower and upper bounds of the coarray being allocated. Shall be 1d arrays with the same dimensions as each other. The product of the difference of the `co_lbounds` and `co_ubounds` shall equal the number of team members (REMOVE_NOTE_TODO: check wording).
+    * **`lbounds` and `ubounds`**: Shall be the lower and upper bounds of the `local_slice`. Shall be 1d arrays with the same dimensions as each other.
+    * **`coarray_handle`**: Represents the distributed object of the coarray on the corresponding team. Shall return the handle created by the runtime library that the compiler shall use for future coindexed-object references of the associated coarray and for deallocation of the associated coarray.
+    * **`local_slice`**: Shall not be allocated on entry. Shall return the
 
  #### `caf_deallocate`
-  * **Description**: This procedure. Has side effect that the allocatable local_slices represented by the coarray_handles will be destroyed. Compiler needs to know by association which local_slices are being deallocated.
-  if there is a local coarray that doesn't have the save attribute, enter a subroutine that
-
-when exiting a local scope where implicit deallocation of a coarray occurs, compiler must call this
-and when a deallocate stmt of a coarray occurs, compiler must call this
+  * **Description**: This procedure releases memory previously allocated for all of the coarrays associated with the handles in `coarray_handles`, resulting in the destruction of any associated `local_slices` received by the compiler after `caf_allocate` calls.  (REMOVE_NOTE_TODO: reword) The compiler will insert calls to this procedure when exiting a local scope where implicit deallocation of a coarray is mandated by the standard and when a coarray is explicitly deallocated through a `deallocate-stmt` in the source code.
   * **Procedure Interface**:
   ```
-    module subroutine caf_deallocate(coarray_handles)
+    subroutine caf_deallocate(coarray_handles)
       implicit none
       type(caf_co_handle_t), dimension(:), intent(in) :: coarray_handles
     end subroutine
   ```
+  * **Argument descriptions**:
+    * **`coarray_handles`**: Is an array of all of the handles for the coarrays that shall be deallocated.
+
 
 ### Coarray Access
 
  Coarray accesses will maintain serial dependencies for the issuing image. A non-blocking get has to be started and finished in the same segment. The interface provides puts that are fence-based and gets that are split phased.
 
+ #### Common arguments
+  * **`coarray_handle`**
+    * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
+    * scalar of type [`caf_co_handle_t`](#caf_co_handle_t)
+    * is a handle for the established coarray
+    * represents the distributed object of the coarray on the corresponding team
+  * **`coindices`**
+    * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
+    * 1d assumed-shape array of type `integer`
+    * (REMOVE_NOTE_TODO: fill in)
+  * **`value`**
+    * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
+    * assumed-rank array of `type(*)`
+    * is
 
  #### `caf_put`
-  * **Description**: Blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
+  * **Description**: This procedure assigns to a coarray. The compiler shall call this procedure when there is a coarray reference that is a `coindexed-object`. The compiler shall not (REMOVE_NOTE: need to?) call this procedure when the coarray reference is not a `coindexed-object`. This procedure blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
   * **Procedure Interface**:
   ```
-    module subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)
+    subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)
       implicit none
       type(caf_co_handle_t), intent(in) :: coarray_handle
       integer, intent(in) :: coindices(:)
@@ -298,14 +372,16 @@ and when a deallocate stmt of a coarray occurs, compiler must call this
       integer, optional, intent(out) :: stat
     end subroutine
   ```
-  * **Notes**: Both optional arguments `team` and `team_number` shall not be present in the same call
+  * **Further argument descriptions**:
+    * **`team` and `team_number`**: optional arguments that specify a team. They shall not both be present in the same call.
+    * **`value`**: The value that shall be assigned to (REMOVE_NOTE_TODO: fill in)
 
 
  #### `caf_get`
   * **Description**:
   * **Procedure Interface**:
   ```
-    module subroutine caf_get(coarray_handle, coindices, team, team_number, source, value, stat)
+    subroutine caf_get(coarray_handle, coindices, team, team_number, source, value, stat)
       implicit none
       type(caf_co_handle_t), intent(in) :: coarray_handle
       integer, intent(in) :: coindices(:)
@@ -322,7 +398,7 @@ and when a deallocate stmt of a coarray occurs, compiler must call this
   * **Description**:
   * **Procedure Interface**:
   ```
-    module subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)
+    subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)
       implicit none
       type(caf_co_handle_t),  intent(in) :: coarray_handle
       integer, dimension(:),  intent(in) :: coindices
@@ -338,11 +414,19 @@ and when a deallocate stmt of a coarray occurs, compiler must call this
 
 ###  Operation Synchronization
 
+
+ #### Common arguments
+  * **`async_handle`**
+    * Argument for [`caf_async_wait_for`](#caf_async_wait_for), [`caf_async_try_for`](#caf_async_try_for)
+    * scalar of type [`caf_async_handle_t`](#caf_async_handle_t)
+    * This argument is a handle used to track the asynchronous operation REMOVE_NOTE_TODO: reword and buff out this sentence
+
+
  #### `caf_async_wait_for`
   * **Description**: This procedure waits until (REMOVE_NOTE: asynchronous?) operation is complete and then consumes the async handle
   * **Procedure Interface**:
   ```
-    module subroutine caf_async_wait_for(async_handle)
+    subroutine caf_async_wait_for(async_handle)
       implicit none
       type(caf_async_handle_t), intent(inout) :: async_handle
     end subroutine
@@ -352,7 +436,7 @@ and when a deallocate stmt of a coarray occurs, compiler must call this
   * **Description**: This procedure consumes the async handle if and only if the operation is complete
   * **Procedure Interface**:
   ```
-    module subroutine caf_async_try_for(async_handle, finished)
+    subroutine caf_async_try_for(async_handle, finished)
       implicit none
       type(caf_async_handle_t), intent(inout) :: async_handle
       logical, intent(out) :: finished
@@ -450,12 +534,19 @@ and when a deallocate stmt of a coarray occurs, compiler must call this
 
 All atomic operations are blocking operations.
 
+ #### Common arguments
+  * **`target`**
+    * Argument for all of the atomics (REMOVE_NOTE_TODO_DECISION: have we decided to deal with atomics with the offset option or the target option?)
+    * assumed-rank array of `type(*)`
+    * The location of this argument is the relevant information, not its value. As such, the compiler needs to ensure that when codegen (REMOVE_NOTE: ?) occurs, this argument is passed by reference and no copy is made. The location of `target` is needed to compute the offset when the atomic operations' `atom` dummy argument is part of a derived type.
+
+
  #### `caf_atomic_add`
   * **Description**:
   * **Procedure Interface**: REMOVE_NOTE_TODO_DECISION:
   Option 1 with offset:
   ```
-    module subroutine caf_atomic_add(coarray_handle, coindicies, offset, value, stat)
+    subroutine caf_atomic_add(coarray_handle, coindicies, offset, value, stat)
       type(caf_co_handle_t) :: coarray_handle
       integer, intent(in) :: coindices(:)
       integer :: offset, value, stat
@@ -464,7 +555,7 @@ All atomic operations are blocking operations.
 
   Option 2 with target:
   ```
-    module subroutine caf_atomic_add(coarray_handle, coindicies, target, value, stat)
+    subroutine caf_atomic_add(coarray_handle, coindicies, target, value, stat)
       type(caf_co_handle_t) :: coarray_handle
       integer, intent(in) :: coindices(:) ! names image num
       integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
@@ -587,6 +678,9 @@ All atomic operations are blocking operations.
   - Search for REMOVE_NOTE_TODO_DECISION to find locations where specific decisions/options are outlined, but not yet made.
   - Search for, resolve, and remove all REMOVE_NOTE and REMOVE_NOTE_TODO_DECISIONS before finalizing this document.
 
+  - `caf_allocate` - after getting the basic necessities for this procedure sorted, consider adding a mold argument that would allow for dynamic typing for `local_slice`
+
+
   * **Asynchrony:**
     -   Could be handle based or fence based approaches
     -   Handle based - return can individual operation handle, later on compiler synchronizes handle
@@ -606,7 +700,7 @@ Caffeine internal procedure, so not part of the CAF PRI.
   * **Description**: This procedure ends a segment. Any puts that are still in flight will be committed (and any caches will be thrown away REMOVE_NOTE_TODO_DECISION: if we decide to do caches). Calls to this procedure will be side effects of invocations of the image control statements. It is not a synchronizing operation.
   * **Procedure Interface**:
   ```
-    module subroutine caf_end_segment()
+    subroutine caf_end_segment()
       implicit none
       (REMOVE_NOTE: are there no arguments? or is it just that we haven't sketched out the args yet?)
     end subroutine
@@ -658,6 +752,7 @@ TODO after writing example, try compiling it and see if it compiles at least unt
 
 3. implicit deallocation of a coarray example
     - local, allocatable coarray -> compiler must insert caf_deallocate call
+    - have multiple coarrays, with only one call to caf_deallocate since it takes an array of handles
 
 flexible array member in c
 

>From d2020172c3a548790df45b876d97fb495d65ab48 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 15 Mar 2023 20:56:12 -0700
Subject: [PATCH 22/33] Add first sketches of interfaces for `caf_error_stop`,
 `caf_stop`, `caf_num_images`, `caf_form_team`, and `caf_change_team` and
 update other parts of design doc.

---
 flang/docs/CoarrayFortranRuntime.md | 612 ++++++++++++++++++----------
 1 file changed, 405 insertions(+), 207 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 593011c568c39d..12973e05d9273d 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -57,7 +57,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
 | Assigning variables of type `team-type` |     ✓     |           |
 | Track locals coarrays for implicit deallocation when exiting a scope |     ✓     |           |
 | Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
-| `caf_critical_id_t` |     ✓     |           |
+| Provide unique identifiers for location of each `critical-construct` |     ✓     |           |
 | Track coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
 | Allocate and deallocate a coarray       |           |     ✓     |
 | Reference a coindexed-object            |           |     ✓     |
@@ -138,31 +138,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
    * `caf_co_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler.
  #### `caf_async_handle_t`
    * `caf_async_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler. This type will help the runtime library track and provide asynchrony.
- #### `caf_critical_id_t`
-   * `caf_critical_id_t` will be used to track the location of the critical construct blocks. REMOVE_NOTE_TODO: this type should be a int which is a unique identifier , (may not need to be a derived type)
 
-need to have a state saying whether we are in a critical construct or not, if we are in one, then we can ignore any further nested critical constructs.
-
-
-
-## Common arguments' descriptions
-
-REMOVE_NOTE_TODO: finish moving these down below in the procedure description area
-
- #### `finished`
-   * Argument for [`caf_async_try_for`](#caf_async_try_for)
-   * scalar of type [`caf_async_handle_t`](#caf_async_handle_t)
-   * This argument is
-
- #### `source`
-   * Argument for [`caf_get_async`](#caf_get_async)
-   * assumed-rank array of `type(*)`
- #### `team`
-   * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get)
-   * scalar of type `team_type`
- #### `team_number`
-   * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get)
-   * scalar of type `integer`
 
 ## Procedure descriptions
 
@@ -196,77 +172,77 @@ REMOVE_NOTE_TODO: finish moving these down below in the procedure description ar
  #### `caf_co_broadcast`
   * **Description**:
   * **Procedure Interface**:
-  ```
-     subroutine caf_co_broadcast(a, source_image, stat, errmsg)
-       implicit none
-       type(*), intent(inout), contiguous, target :: a(..)
-       integer, optional, intent(in) :: source_image
-       integer, optional, intent(out), target :: stat
-       character(len=*), intent(inout), optional, target :: errmsg
-     end subroutine
-  ```
-  * **Arguments**: [`stat`](#stat)
+    ```
+       subroutine caf_co_broadcast(a, source_image, stat, errmsg)
+         implicit none
+         type(*), intent(inout), contiguous, target :: a(..)
+         integer, optional, intent(in) :: source_image
+         integer, optional, intent(out), target :: stat
+         character(len=*), intent(inout), optional, target :: errmsg
+       end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_co_max`
   * **Description**:
   * **Procedure Interface**:
-  ```
-     subroutine caf_co_max(a, result_image, stat, errmsg)
-       implicit none
-       type(*), intent(inout), contiguous, target :: a(..)
-       integer, intent(in), optional, target :: result_image
-       integer, intent(out), optional, target :: stat
-       character(len=*), intent(inout), optional, target :: errmsg
-     end subroutine
-  ```
-  * **Arguments**: [`stat`](#stat)
+    ```
+       subroutine caf_co_max(a, result_image, stat, errmsg)
+         implicit none
+         type(*), intent(inout), contiguous, target :: a(..)
+         integer, intent(in), optional, target :: result_image
+         integer, intent(out), optional, target :: stat
+         character(len=*), intent(inout), optional, target :: errmsg
+       end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_co_min`
   * **Description**:
   * **Procedure Interface**:
-  ```
-     subroutine caf_co_min(a, result_image, stat, errmsg)
-       implicit none
-       type(*), intent(inout), contiguous, target :: a(..)
-       integer, intent(in), optional, target :: result_image
-       integer, intent(out), optional, target :: stat
-       character(len=*), intent(inout), optional, target :: errmsg
-     end subroutine
-  ```
-  * **Arguments**: [`stat`](#stat)
+    ```
+       subroutine caf_co_min(a, result_image, stat, errmsg)
+         implicit none
+         type(*), intent(inout), contiguous, target :: a(..)
+         integer, intent(in), optional, target :: result_image
+         integer, intent(out), optional, target :: stat
+         character(len=*), intent(inout), optional, target :: errmsg
+       end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_co_reduce`
   * **Description**:
   * **Procedure Interface**:
-  ```
-     subroutine caf_co_reduce(a, operation, result_image, stat, errmsg)
-       implicit none
-       type(*), intent(inout), contiguous, target :: a(..)
-       type(c_funptr), value :: operation
-       integer, intent(in), optional, target :: result_image
-       integer, intent(out), optional, target :: stat
-       character(len=*), intent(inout), optional, target :: errmsg
-     end subroutine
-  ```
-  * **Arguments**: [`stat`](#stat)
+    ```
+       subroutine caf_co_reduce(a, operation, result_image, stat, errmsg)
+         implicit none
+         type(*), intent(inout), contiguous, target :: a(..)
+         type(c_funptr), value :: operation
+         integer, intent(in), optional, target :: result_image
+         integer, intent(out), optional, target :: stat
+         character(len=*), intent(inout), optional, target :: errmsg
+       end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_co_sum`
   * **Description**:
   * **Procedure Interface**:
-  ```
-     subroutine caf_co_sum(a, result_image, stat, errmsg)
-       implicit none
-       type(*), intent(inout), contiguous, target :: a(..)
-       integer, intent(in), target, optional :: result_image
-       integer, intent(out), target, optional :: stat
-       character(len=*), intent(inout), target, optional :: errmsg
-     end subroutine
-  ```
-  * **Arguments**: [`stat`](#stat)
+    ```
+       subroutine caf_co_sum(a, result_image, stat, errmsg)
+         implicit none
+         type(*), intent(inout), contiguous, target :: a(..)
+         integer, intent(in), target, optional :: result_image
+         integer, intent(out), target, optional :: stat
+         character(len=*), intent(inout), target, optional :: errmsg
+       end subroutine
+    ```
+  * **Further argument descriptions**:
 
 ### Program startup and shutdown
 
-  When the compiler identifies a program that uses "Coarray Fortran" features, it will insert calls to `caf_init` and `caf_finalize`. These procedures will intalize and terminate the Coarray Fortran environment.
+  When the compiler identifies a program that uses "Coarray Fortran" features, it will insert calls to `caf_init` and `caf_finalize`. These procedures will initialize and terminate the Coarray Fortran environment.
 
  #### `caf_init`
   * **Description**: This procedure will initialize the Coarray Fortran environment.
@@ -288,38 +264,60 @@ REMOVE_NOTE_TODO: finish moving these down below in the procedure description ar
         integer, intent(in) :: exit_code
       end subroutine
     ```
-  * **Arguments**:
+  * **Further argument descriptions**:
     * **`exit_code`**: is .. (REMOVE_NOTE_TODO: fill in)
 
+  (REMOVE_NOTE_TODO: check the interfaces for caf_error_stop and caf_stop, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something? Change something?)
+
  #### `caf_error_stop`
-  * **Description**:
-  * **Procedure Interface**:
-  * **Arguments**:
+  * **Description**: This procedure stops all images and provides the `stop_code` passed, or `0` if no `stop_code` is passed, as the process exit status
+  * **Procedure Interface**: REMOVE_NOTE_TODO_DECISION: should error_stop be implemented with two optional arguments with the precondition that they both shall not be passed at the same time? Or have overloaded procedures?
+    ```
+      subroutine caf_error_stop(stop_code_int, stop_code_char)
+        integer, intent(in), optional :: stop_code_int
+        character(len=*), intent(in), optional :: stop_code_char
+      end subroutine
+    ```
+  * **Further argument descriptions**:
+    * **`stop_code_int` and `stop_code_char`**: shall not both be present in the same call (if only one procedure is provided instead of overloading `caf_error_stop`)
 
  #### `caf_stop`
-  * **Description**:
-  * **Procedure Interface**:
-  * **Arguments**:
+  * **Description**: This procedure synchronizes and stops the executing image. It provides the `stop_code`, or `0` if no `stop_code` is passed, as the process exit status.
+  * **Procedure Interface**:  REMOVE_NOTE_TODO_DECISION: should stop be implemented with two optional arguments with the precondition that they both shall not be passed at the same time? Or have overloaded procedures?
+    ```
+      subroutine caf_stop(stop_code_int, stop_code_char)
+        implicit none
+        integer, intent(in), optional :: stop_code_int
+        character(len=*), intent(in), optional :: stop_code_char
+      end subroutine
+
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_fail_image`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_fail_image(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
 ### Allocation and deallocation
 
  #### `caf_allocate`
   * **Description**: This procedure allocates memory for a coarray. Calls to `caf_allocate` will be inserted by the compiler when there is an explicit coarray allocation or a statically declared coarray in the source code. The runtime library will stash away the coshape information at this time in order to internally track it during the lifetime of the coarray.
   * **Procedure Interface**:
-  ```
-    subroutine caf_allocate(co_lbounds, co_ubounds, lbounds, ubounds, coarray_handle, local_slice)
-      implicit none
-      integer, kind(c_intmax_t), dimension(:), intent(in) :: co_lbounds, co_ubounds
-      integer, kind(c_intmax_t), dimension(:), intent(in) :: lbounds, ubounds
-      type(caf_co_handle_t), intent(out) :: coarray_handle
-      type(*), dimension(..), allocatable, intent(out) :: local_slice
-    end subroutine
-  ```
+    ```
+      subroutine caf_allocate(co_lbounds, co_ubounds, lbounds, ubounds, coarray_handle, local_slice)
+        implicit none
+        integer(kind=c_intmax_t), dimension(:), intent(in) :: co_lbounds, co_ubounds
+        integer(kind=c_intmax_t), dimension(:), intent(in) :: lbounds, ubounds
+        type(caf_co_handle_t), intent(out) :: coarray_handle
+        type(*), dimension(..), allocatable, intent(out) :: local_slice
+      end subroutine
+    ```
   * **Further argument descriptions**:
     * **`co_lbounds` and `co_ubounds`**: Shall be the the lower and upper bounds of the coarray being allocated. Shall be 1d arrays with the same dimensions as each other. The product of the difference of the `co_lbounds` and `co_ubounds` shall equal the number of team members (REMOVE_NOTE_TODO: check wording).
     * **`lbounds` and `ubounds`**: Shall be the the lower and upper bounds of the `local_slice`. Shall be 1d arrays with the same dimensions as each other.
@@ -329,12 +327,12 @@ REMOVE_NOTE_TODO: finish moving these down below in the procedure description ar
  #### `caf_deallocate`
   * **Description**: This procedure releases memory previously allocated for all of the coarrays associated with the handles in `coarray_handles`, resulting in the destruction of any associated `local_slices` received by the compiler after `caf_allocate` calls.  (REMOVE_NOTE_TODO: reword) The compiler will insert calls to this procedure when exiting a local scope where implicit deallocation of a coarray is mandated by the standard and when a coarray is explicitly deallocated through a `deallocate-stmt` in the source code.
   * **Procedure Interface**:
-  ```
-    subroutine caf_deallocate(coarray_handles)
-      implicit none
-      type(caf_co_handle_t), dimension(:), intent(in) :: coarray_handles
-    end subroutine
-  ```
+    ```
+      subroutine caf_deallocate(coarray_handles)
+        implicit none
+        type(caf_co_handle_t), dimension(:), intent(in) :: coarray_handles
+      end subroutine
+    ```
   * **Argument descriptions**:
     * **`coarray_handles`**: Is an array of all of the handles for the coarrays that shall be deallocated.
 
@@ -356,60 +354,66 @@ REMOVE_NOTE_TODO: finish moving these down below in the procedure description ar
   * **`value`**
     * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
     * assumed-rank array of `type(*)`
-    * is
+    * is (REMOVE_NOTE_TODO: fill in)
+  * **`source`**
+    * Argument for [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
+    * assumed-rank array of `type(*)`
+    * is (REMOVE_NOTE_TODO: fill in)
+  * **`team` and `team_number`**
+    * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
+    * are optional arguments that specify a team
+    * shall not both be present in the same call
 
  #### `caf_put`
   * **Description**: This procedure assigns to a coarray. The compiler shall call this procedure when there is a coarray reference that is a `coindexed-object`. The compiler shall not (REMOVE_NOTE: need to?) call this procedure when the coarray reference is not a `coindexed-object`. This procedure blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
   * **Procedure Interface**:
-  ```
-    subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)
-      implicit none
-      type(caf_co_handle_t), intent(in) :: coarray_handle
-      integer, intent(in) :: coindices(:)
-      type(*), dimension(..), intent(in) :: target, value
-      type(team_type), optional, intent(in) :: team
-      integer, optional, intent(in) :: team_number
-      integer, optional, intent(out) :: stat
-    end subroutine
-  ```
+    ```
+      subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)
+        implicit none
+        type(caf_co_handle_t), intent(in) :: coarray_handle
+        integer, intent(in) :: coindices(:)
+        type(*), dimension(..), intent(in) :: target, value
+        type(team_type), optional, intent(in) :: team
+        integer, optional, intent(in) :: team_number
+        integer, optional, intent(out) :: stat
+      end subroutine
+    ```
   * **Further argument descriptions**:
-    * **`team` and `team_number`**: optional arguments that specify a team. They shall not both be present in the same call.
     * **`value`**: The value that shall be assigned to (REMOVE_NOTE_TODO: fill in)
 
 
  #### `caf_get`
   * **Description**:
   * **Procedure Interface**:
-  ```
-    subroutine caf_get(coarray_handle, coindices, team, team_number, source, value, stat)
-      implicit none
-      type(caf_co_handle_t), intent(in) :: coarray_handle
-      integer, intent(in) :: coindices(:)
-      type(*), dimension(..), intent(in) :: source
-      type(*), dimension(..), intent(inout) :: value
-      type(team_type), optional, intent(in) :: team
-      integer, optional, intent(in) :: team_number
-      integer, optional, intent(out) :: stat
-    end subroutine
-  ```
-  * **Notes**: Both optional arguments `team` and `team_number` shall not be present in the same call
+    ```
+      subroutine caf_get(coarray_handle, coindices, team, team_number, source, value, stat)
+        implicit none
+        type(caf_co_handle_t), intent(in) :: coarray_handle
+        integer, intent(in) :: coindices(:)
+        type(*), dimension(..), intent(in) :: source
+        type(*), dimension(..), intent(inout) :: value
+        type(team_type), optional, intent(in) :: team
+        integer, optional, intent(in) :: team_number
+        integer, optional, intent(out) :: stat
+      end subroutine
+    ```
 
  #### `caf_get_async`
   * **Description**:
   * **Procedure Interface**:
-  ```
-    subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)
-      implicit none
-      type(caf_co_handle_t),  intent(in) :: coarray_handle
-      integer, dimension(:),  intent(in) :: coindices
-      type(*), dimension(..), intent(in) :: source
-      type(*), dimension(..), intent(inout) :: value
-      type(team_type), optional, intent(in) :: team
-      integer, optional, intent(in) :: team_number
-      integer, optional, intent(out) :: stat
-      type(caf_async_handle_t), intent(out) :: async_handle
-    end subroutine
-  ```
+    ```
+      subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)
+        implicit none
+        type(caf_co_handle_t),  intent(in) :: coarray_handle
+        integer, dimension(:),  intent(in) :: coindices
+        type(*), dimension(..), intent(in) :: source
+        type(*), dimension(..), intent(inout) :: value
+        type(team_type), optional, intent(in) :: team
+        integer, optional, intent(in) :: team_number
+        integer, optional, intent(out) :: stat
+        type(caf_async_handle_t), intent(out) :: async_handle
+      end subroutine
+    ```
 
 
 ###  Operation Synchronization
@@ -425,110 +429,201 @@ REMOVE_NOTE_TODO: finish moving these down below in the procedure description ar
  #### `caf_async_wait_for`
   * **Description**: This procedure waits until (REMOVE_NOTE: asynchronous?) operation is complete and then consumes the async handle
   * **Procedure Interface**:
-  ```
-    subroutine caf_async_wait_for(async_handle)
-      implicit none
-      type(caf_async_handle_t), intent(inout) :: async_handle
-    end subroutine
-  ```
+    ```
+      subroutine caf_async_wait_for(async_handle)
+        implicit none
+        type(caf_async_handle_t), intent(inout) :: async_handle
+      end subroutine
+    ```
 
  #### `caf_async_try_for`
   * **Description**: This procedure consumes the async handle if and only if the operation is complete
   * **Procedure Interface**:
-  ```
-    subroutine caf_async_try_for(async_handle, finished)
-      implicit none
-      type(caf_async_handle_t), intent(inout) :: async_handle
-      logical, intent(out) :: finished
-    end subroutine
-
-  ```
+    ```
+      subroutine caf_async_try_for(async_handle, finished)
+        implicit none
+        type(caf_async_handle_t), intent(inout) :: async_handle
+        logical, intent(out) :: finished
+      end subroutine
+    ```
+  * **Further argument descriptions**:
+    * **`finished`**: This argument is set to `.true.` if the asynchronous operation is complete
 
  #### `caf_sync_memory`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_sync_memory(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
 ### Image Synchronization
 
  #### `caf_sync_all`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_sync_all(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_sync_images`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_sync_images(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_lock`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_lock(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_unlock`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_unlock(fill in...)
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_critical`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_critical(critical_id, REMOVE_NOTE_TODO: fill in)
+        implicit none
+        integer(kind=c_int64_t), intent(in) :: critical_id
+      end subroutine
+    ```
+  * **Further argument descriptions**:
+    * **`critical_id`**: shall be a unique identifier for a critical construct. This unique identifier will be used by the runtime library to track the location of the critical construct blocks.
 
 #### `caf_end_critical`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_end_critical(critical_id, REMOVE_NOTE_TODO: fill in)
+        implicit none
+        integer(kind=c_int64_t), intent(in) :: critical_id
+      end subroutine
+    ```
+  * **Further argument descriptions**:
+    * **`critical_id`**: shall be the same unique identifier for the critical construct that was passed to the runtime library during the corresponding call to `caf_critical`. REMOVE_NOTE_TODO: reword?
 
 ### Events
 
  #### `caf_event_post`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_event_post(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_event_wait`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_event_wait(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_event_query`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_event_query(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
 ### Teams
+  (REMOVE_NOTE_TODO: check the interface for caf_change_team and caf_form_team, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something? Change something?)
 
  #### `caf_change_team`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_change_team(team)
+        implicit none
+        type(team_type), target, intent(in) :: team
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_end_team`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_end_team(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_form_team`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_form_team(num, team, new_index, stat, errmsg)
+        implicit none
+        integer,          intent(in)  :: num
+        type(team_type),  intent(out) :: team
+        integer,          intent(in),    optional :: new_index
+        integer,          intent(out),   optional :: stat
+        character(len=*), intent(inout), optional :: errmsg
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_sync_team`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_sync_team(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_get_team`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_get_team(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_team_number`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_team_number(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
 ### Atomic Memory Operation
 
@@ -545,122 +640,223 @@ All atomic operations are blocking operations.
   * **Description**:
   * **Procedure Interface**: REMOVE_NOTE_TODO_DECISION:
   Option 1 with offset:
-  ```
-    subroutine caf_atomic_add(coarray_handle, coindicies, offset, value, stat)
-      type(caf_co_handle_t) :: coarray_handle
-      integer, intent(in) :: coindices(:)
-      integer :: offset, value, stat
-    end subroutine
-  ```
+    ```
+      subroutine caf_atomic_add(coarray_handle, coindices, offset, value, stat)
+        implicit none
+        type(caf_co_handle_t) :: coarray_handle
+        integer, intent(in) :: coindices(:)
+        integer :: offset, value, stat
+      end subroutine
+    ```
 
   Option 2 with target:
-  ```
-    subroutine caf_atomic_add(coarray_handle, coindicies, target, value, stat)
-      type(caf_co_handle_t) :: coarray_handle
-      integer, intent(in) :: coindices(:) ! names image num
-      integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
-      integer :: value, stat
-    end subroutine
-  ```
+    ```
+      subroutine caf_atomic_add(coarray_handle, coindices, target, value, stat)
+        implicit none
+        type(caf_co_handle_t) :: coarray_handle
+        integer, intent(in) :: coindices(:) ! names image num
+        integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
+        integer :: value, stat
+      end subroutine
+    ```
 
  #### `caf_atomic_and`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_atomic_and(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_atomic_cas`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_atomic_cas(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_atomic_define`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_atomic_define(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_atomic_fetch_add`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_atomic_fetch_add(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_atomic_fetch_and`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_atomic_fetch_and(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_atomic_fetch_or`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_atomic_fetch_or(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_atomic_fetch_xor`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_atomic_fetch_xor(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_atomic_or`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_atomic_or(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_atomic_ref`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_atomic_ref(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_atomic_xor`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_atomic_xor(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
 ### Coarray Queries
 
  #### `caf_lcobound`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_lcobound(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_ucobound`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_ucobound(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_coshape`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_coshape(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_image_index`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_image_index(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
 ### Image Queries
 
  #### `caf_num_images`
   * **Description**:
-  * **Procedure Interface**:
-  * **Arguments**:
+  * **Procedure Interface**:   (REMOVE_NOTE_TODO: check the interface for caf_num_images, currently is same as the procedure in Caffeine, but this interface has not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something? Change something?)
+    ```
+      function caf_num_images(team, team_number) result(image_count)
+        implicit none
+        type(team_type), intent(in), optional :: team
+        integer, intent(in), optional :: team_number
+        integer :: image_count
+      end function
+    ```
+  * **Further argument descriptions**:
+  * **Result**:
 
  #### `caf_this_image`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_this_image(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_failed_images`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_failed_images(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_stopped_images`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_stopped_images(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
  #### `caf_image_status`
   * **Description**:
   * **Procedure Interface**:
-  * **Arguments**:
+    ```
+      subroutine caf_image_status(fill in...)
+        implicit none
+      end subroutine
+    ```
+  * **Further argument descriptions**:
 
 
 ## Establish and initialize static coarrays prior to `main`
@@ -668,12 +864,16 @@ All atomic operations are blocking operations.
   (REMOVE_NOTE: complete this section, potentially move to earlier in doc) Compiler will need to: call caf_init, call caf_allocate ... for each coarray and in the right order. And then copy any initializers.
 
 
-## Berkeley Lab internal Notes: (REMOVE_NOTES before submission)
+## Internal Development Notes: (REMOVE_NOTES before submission)
   - REMOVE_NOTE_TODO_DECISION: Need to decide the thread semantics
   - REMOVE_NOTE_TODO_DECISION: Are we going to have Caffeine be thread safe? Have a thread safety option? Is it a build time option? or runtime?
         Dan advocates having a thread-safety option that is build time.
   - Do we need to add any discussion of what it would look like when code has mixed OpenMP and Coarray Fortran?
 
+ - boilerplate was added for all of the interfaces and initially everything was made a subroutine
+ - when all the interfaces are done, check that none that could be a function was left as a subroutine because that aspect of the boilerplate was never changed
+
+  - critical construct - need to have a state saying whether we are in a critical construct or not, if we are in one, then we can ignore any further nested critical constructs.
 
   - Search for REMOVE_NOTE_TODO_DECISION to find locations where specific decisions/options are outlined, but not yet made.
   - Search for, resolve, and remove all REMOVE_NOTE and REMOVE_NOTE_TODO_DECISIONS before finalizing this document.
@@ -706,8 +906,6 @@ Caffeine internal procedure, so not part of the CAF PRI.
     end subroutine
   ```
 
-
-
 ### `caf_co_handle_t`
 
    The following is a Fortran heavy pseudo code, not the exact implementation we plan

>From 1bf84495fc7f9a85039a95fa2a82c1c010d10174 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 15 Mar 2023 21:02:07 -0700
Subject: [PATCH 23/33] Update design doc with more argument notes

---
 flang/docs/CoarrayFortranRuntime.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 12973e05d9273d..556b9273d8b85a 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -287,12 +287,13 @@ The following table outlines which tasks will be the responsibility of the Fortr
     ```
       subroutine caf_stop(stop_code_int, stop_code_char)
         implicit none
-        integer, intent(in) :: stop_code_int
-        character(len=*), intent(in) :: stop_code_char
+        integer, intent(in), optional :: stop_code_int
+        character(len=*), intent(in), optional :: stop_code_char
       end subroutine
 
     ```
   * **Further argument descriptions**:
+    * **`stop_code_int` and `stop_code_char`**: shall not both be present in the same call (applies if only one procedure is provided instead of overloading `caf_stop`)
 
  #### `caf_fail_image`
   * **Description**:
@@ -816,6 +817,7 @@ All atomic operations are blocking operations.
       end function
     ```
   * **Further argument descriptions**:
+    * **`team` and `team_number`**: optional arguments that specify a team. They shall not both be present in the same call.
   * **Result**:
 
  #### `caf_this_image`

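The "shall not both be present" rule added in this patch for `stop_code_int`/`stop_code_char` (and for `team`/`team_number`) can be checked with the `present` intrinsic. The following is a minimal sketch only: the module name `caf_stop_sketch` and the helper `check_stop_args` are hypothetical and are not part of the proposed interface.

```fortran
module caf_stop_sketch
  implicit none
contains
  ! Hypothetical helper (not part of the proposed interface): reports whether
  ! the two mutually exclusive optional arguments were supplied legally.
  subroutine check_stop_args(stop_code_int, stop_code_char, ok)
    integer, intent(in), optional :: stop_code_int
    character(len=*), intent(in), optional :: stop_code_char
    logical, intent(out) :: ok
    ! Legal calls pass neither argument or exactly one of them.
    ok = .not. (present(stop_code_int) .and. present(stop_code_char))
  end subroutine
end module

program demo_stop_args
  use caf_stop_sketch
  implicit none
  logical :: ok

  call check_stop_args(stop_code_int=3, ok=ok)
  if (.not. ok) error stop "integer-only call should be accepted"

  call check_stop_args(stop_code_char="done", ok=ok)
  if (.not. ok) error stop "character-only call should be accepted"

  call check_stop_args(ok=ok)
  if (.not. ok) error stop "call with no stop code should be accepted"

  call check_stop_args(3, "done", ok)
  if (ok) error stop "call with both stop codes should be rejected"

  print *, "optional-argument checks behaved as expected"
end program
```

Whether such a check belongs in the compiler or in a debug build of the runtime library is left open by the document; the sketch only shows that the condition is cheap to test.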
>From 534dea1b55e31a63d0e839b8e2c9806dc2c975ee Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 22 Mar 2023 12:27:57 -0700
Subject: [PATCH 24/33] Reorder arguments in some procedure interfaces

---
 flang/docs/CoarrayFortranRuntime.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 556b9273d8b85a..f439ea5f13b200 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -369,7 +369,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Description**: This procedure assigns to a coarray. The compiler shall call this procedure when there is a coarray reference that is a `coindexed-object`. The compiler shall not (REMOVE_NOTE: need to?) call this procedure when the coarray reference is not a `coindexed-object`. This procedure blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
   * **Procedure Interface**:
     ```
-      subroutine caf_put(coarray_handle, coindices, team, team_number, target, value, stat)
+      subroutine caf_put(coarray_handle, coindices, target, value, team, team_number, stat)
         implicit none
         type(caf_co_handle_t), intent(in) :: coarray_handle
         integer, intent(in) :: coindices(:)
@@ -387,7 +387,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_get(coarray_handle, coindices, team, team_number, source, value, stat)
+      subroutine caf_get(coarray_handle, coindices, source, value, team, team_number, stat)
         implicit none
         type(caf_co_handle_t), intent(in) :: coarray_handle
         integer, intent(in) :: coindices(:)
@@ -403,16 +403,16 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_get_async(coarray_handle, coindices, team, team_number, source, value, stat, async_handle)
+      subroutine caf_get_async(coarray_handle, coindices, source, value, async_handle, team, team_number, stat)
         implicit none
         type(caf_co_handle_t),  intent(in) :: coarray_handle
         integer, dimension(:),  intent(in) :: coindices
         type(*), dimension(..), intent(in) :: source
         type(*), dimension(..), intent(inout) :: value
+        type(caf_async_handle_t), intent(out) :: async_handle
         type(team_type), optional, intent(in) :: team
         integer, optional, intent(in) :: team_number
         integer, optional, intent(out) :: stat
-        type(caf_async_handle_t), intent(out) :: async_handle
       end subroutine
     ```
 

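One practical effect of listing the required `async_handle` dummy argument ahead of the optional `team`, `team_number`, and `stat` arguments is that a caller can pass every required argument positionally and add keywords only for the optional ones it supplies. The sketch below illustrates this with a hypothetical stand-in procedure, `get_async_like`, which uses plain integers in place of the handle and team types; it is not the proposed `caf_get_async`.

```fortran
module arg_order_sketch
  implicit none
contains
  ! Hypothetical stand-in: the required 'handle' precedes the optional arguments.
  subroutine get_async_like(value, handle, team_number, stat)
    integer, intent(inout) :: value
    integer, intent(out) :: handle
    integer, intent(in), optional :: team_number
    integer, intent(out), optional :: stat
    value = value + 1
    handle = 42
    if (present(team_number)) handle = handle + team_number
    if (present(stat)) stat = 0
  end subroutine
end module

program demo_arg_order
  use arg_order_sketch
  implicit none
  integer :: v, h, s

  v = 0
  ! All required arguments are passed positionally, no keywords needed.
  call get_async_like(v, h)
  if (v /= 1 .or. h /= 42) error stop "positional call failed"

  ! Optional arguments are added by keyword, in any combination.
  s = -1
  call get_async_like(v, h, stat=s)
  if (s /= 0) error stop "stat was not set"

  call get_async_like(v, h, team_number=2)
  if (h /= 44) error stop "team_number was not used"

  print *, "argument-order checks behaved as expected"
end program
```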
>From 02d949920cb1a9b41b0ef4a828b0b9e38e5ed8cf Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Wed, 29 Mar 2023 10:01:52 -0700
Subject: [PATCH 25/33] Update doc based on meeting

---
 flang/docs/CoarrayFortranRuntime.md | 54 +++++++++++++++++++++--------
 1 file changed, 40 insertions(+), 14 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index f439ea5f13b200..4cb6ea2dff2bef 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -43,7 +43,7 @@ One consequence of the statements being categorized as image control statements
 
 ## Coarray Fortran (CAF) Parallel Runtime Interface
 
-  The Coarray Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library. The interface is designed for portability across shared and distributed memory machines, different operating systems, and multiple architectures. The Caffeine implementation, [see below](#caffeine-lbl's-implementation-of-the-coarray-fortran-parallel-runtime-interface), of the Coarray Fortran Parallel Runtime Interface plans to support the following architectures: x86_64, PowerPC64, AArch64, with the possibility of supporting more as requested. Implementations of this interface is intended as an augmentation for the compiler's own runtime library. While the interface can support multiple implementations, we envision needing to build the runtime library as part of installing the compiler.
+  The Coarray Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library. The interface is designed for portability across shared and distributed memory machines, different operating systems, and multiple architectures. The Caffeine implementation, [see below](#caffeine-lbl's-implementation-of-the-coarray-fortran-parallel-runtime-interface), of the Coarray Fortran Parallel Runtime Interface plans to support the following architectures: x86_64, PowerPC64, AArch64, with the possibility of supporting more as requested. Implementations of this interface are intended as an augmentation for the compiler's own runtime library. While the interface can support multiple implementations, we envision needing to build the runtime library as part of installing the compiler. REMOVE_NOTE_TODO: write a sentence about how a module will be defined with the procedures. The name of the module must be `caf_pri`.
 
 ## Delegation of tasks between the Fortran compiler and the runtime library
 
@@ -255,8 +255,9 @@ The following table outlines which tasks will be the responsibility of the Fortr
     ```
   * **Result**: `exit_code` is an `integer` whose value ... (REMOVE_NOTE_TODO: fill in)
 
+  REMOVE_NOTE: remove caf_finalize, because it has been determined that it is redundant with caf_stop
  #### `caf_finalize`
-  * **Description**: This procedure will terminate the Coarray Fortran environment.
+  * **Description**: This procedure may or may not terminate the Coarray Fortran environment. REMOVE_NOTE_TODO_DECISION: does it terminate for sure or not? Can caf_init be called twice?
   * **Procedure Interface**:
     ```
       subroutine caf_finalize(exit_code)
@@ -294,6 +295,8 @@ The following table outlines which tasks will be the responsibility of the Fortr
     ```
   * **Further argument descriptions**:
     * **`stop_code_int` and `stop_code_char`**: shall not both be present in the same call (if provide only one procedure instead of overloading caf_stop)
+          * In `caf_stop`, runtime library will need to call c_exit() REMOVE_NOTE_TODO: fix this note
+
 
  #### `caf_fail_image`
   * **Description**:
@@ -313,10 +316,11 @@ The following table outlines which tasks will be the responsibility of the Fortr
     ```
       subroutine caf_allocate(co_lbounds, co_ubounds, lbounds, ubounds, coarray_handle, local_slice)
         implicit none
-        integer, kind(c_intmax_t), dimension(:), intent(in) :: co_lbounds, co_ubounds
-        integer, kind(c_intmax_t), dimension(:), intent(in) :: lbounds, ubounds
+        integer(kind=c_intmax_t), dimension(:), intent(in) :: co_lbounds, co_ubounds
+        integer(kind=c_intmax_t), dimension(:), intent(in) :: lbounds, ubounds
         type(caf_co_handle_t), intent(out) :: coarray_handle
-        type(*), dimension(..), allocatable, intent(out) :: local_slice
+        ! type(*), dimension(..), allocatable, intent(out) :: local_slice ! REMOVE_NOTE: when testing this interface out, didn't compile because `Assumed-type variable local_slice at (1) may not have the ALLOCATABLE, CODIMENSION, POINTER or VALUE attribute`
+        class(*), dimension(..), allocatable, intent(out) :: local_slice
       end subroutine
     ```
   * **Further argument descriptions**:
@@ -356,7 +360,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
     * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
     * assumed-rank array of `type(*)`
     * is (REMOVE_NOTE_TODO: fill in)
-  * **`source`**
+  * **`mold`**
     * Argument for [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
     * assumed-rank array of `type(*)`
     * is (REMOVE_NOTE_TODO: fill in)
@@ -369,11 +373,11 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Description**: This procedure assigns to a coarray. The compiler shall call this procedure when there is a coarray reference that is a `coindexed-object`. The compiler shall not (REMOVE_NOTE: need to?) call this procedure when the coarray reference is not a `coindexed-object`. This procedure blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
   * **Procedure Interface**:
     ```
-      subroutine caf_put(coarray_handle, coindices, target, value, team, team_number, stat)
+      subroutine caf_put(coarray_handle, coindices, mold, value, team, team_number, stat) !REMOVE_NOTE_TODO: make sure rename of target dummy arg to mold is changed in other places in doc as well
         implicit none
         type(caf_co_handle_t), intent(in) :: coarray_handle
         integer, intent(in) :: coindices(:)
-        type(*), dimension(..), intent(in) :: target, value
+        type(*), dimension(..), intent(in) :: mold, value
         type(team_type), optional, intent(in) :: team
         integer, optional, intent(in) :: team_number
         integer, optional, intent(out) :: stat
@@ -387,11 +391,11 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_get(coarray_handle, coindices, source, value, team, team_number, stat)
+      subroutine caf_get(coarray_handle, coindices, mold, value, team, team_number, stat)
         implicit none
         type(caf_co_handle_t), intent(in) :: coarray_handle
         integer, intent(in) :: coindices(:)
-        type(*), dimension(..), intent(in) :: source
+        type(*), dimension(..), intent(in) :: mold
         type(*), dimension(..), intent(inout) :: value
         type(team_type), optional, intent(in) :: team
         integer, optional, intent(in) :: team_number
@@ -403,11 +407,11 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_get_async(coarray_handle, coindices, source, value, async_handle, team, team_number, stat)
+      subroutine caf_get_async(coarray_handle, coindices, mold, value, async_handle, team, team_number, stat)
         implicit none
         type(caf_co_handle_t),  intent(in) :: coarray_handle
         integer, dimension(:),  intent(in) :: coindices
-        type(*), dimension(..), intent(in) :: source
+        type(*), dimension(..), intent(in) :: mold
         type(*), dimension(..), intent(inout) :: value
         type(caf_async_handle_t), intent(out) :: async_handle
         type(team_type), optional, intent(in) :: team
@@ -824,9 +828,9 @@ All atomic operations are blocking operations.
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_this_image(fill in...)
+      function caf_this_image(fill in...)
         implicit none
-      end subroutine
+      end function
     ```
   * **Further argument descriptions**:
 
@@ -954,6 +958,28 @@ TODO after writing example, try compiling it and see if it compiles at least unt
     - local, allocatable coarray -> compiler must insert caf_deallocate call
     - have multiple coarrays, with only one call to caf_deallocate since it takes an array of handles
 
+4. example with coarray initializer to express the idea written earlier that the compiler is responsible for this
+
+5. include somewhere in examples
+  ! integer :: example[2:4,3:*]
+  ! integer :: example[3:*]
+
+
+
+REMOVE_NOTE_TODO: FIX CAF_ALLOCATE LINK
+
+CAF_ALLOCATE implementation notes:
+  ! figure out how much space it needs, c_size_of
+  ! internally consult its own allocator
+  ! will know the memory address
+  ! associate the allocatable local slice with the local memory
+  ! once caf_allocate returns, example_local_slice will refer to the local slice of the coarray that is in the shared heap
+  ! create meta data block that contains info about a given coarray, stash away the cobounds
+
+
+UNIT TESTS TODOs
+  unit test with user code tries to directly call caf_{...} procedures, should be possible for user to do so
+
 flexible array member in c
 
 ### Caffeine internals for coarray accesses

>From b322570b2f29b5156a20710afb32571751f91435 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Mon, 14 Aug 2023 10:18:13 -0700
Subject: [PATCH 26/33] Update CAF_PRI design doc based on changes made while
 prototyping the interface. Includes the addition of an interface for
 `caf_allocate_non_symmetric`.

---
 flang/docs/CoarrayFortranRuntime.md | 53 ++++++++++++++++++++++-------
 1 file changed, 41 insertions(+), 12 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 4cb6ea2dff2bef..b0cff4f404e574 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -58,6 +58,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
 | Track locals coarrays for implicit deallocation when exiting a scope |     ✓     |           |
 | Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
 | Provide unique identifiers for location of each `critical-construct` |     ✓     |           |
+| Provide final subroutine for all derived types with allocatable components that appear in a coarray |     ✓     |           |
 | Track coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
 | Allocate and deallocate a coarray       |           |     ✓     |
 | Reference a coindexed-object            |           |     ✓     |
@@ -142,6 +143,11 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
 ## Procedure descriptions
 
+### sync-stat-list
+
+  * **`stat`** : TODO
+  * **`errmsg`** : There will be two optional arguments for this, one which is allocatable and one which is not. It is the compiler's responsibility to ensure the appropriate optional argument is passed. The allocatable argument will satisfy Fortran 2023 semantics.
+
 ### Collectives
 
  #### Common arguments
@@ -314,20 +320,36 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Description**: This procedure allocates memory for a coarray. Calls to `caf_allocate` will be inserted by the compiler when there is an explicit coarray allocation or a statically declared coarray in the source code. The runtime library will stash away the coshape information at this time in order to internally track it during the lifetime of the coarray.
   * **Procedure Interface**:
     ```
-      subroutine caf_allocate(co_lbounds, co_ubounds, lbounds, ubounds, coarray_handle, local_slice)
+      subroutine caf_allocate(co_lbounds, co_ubounds, lbounds, ubounds, element_length, final_func, coarray_handle, allocated_memory)
         implicit none
         integer(kind=c_intmax_t), dimension(:), intent(in) :: co_lbounds, co_ubounds
         integer(kind=c_intmax_t), dimension(:), intent(in) :: lbounds, ubounds
+        integer(kind=c_size_t) :: element_length
+        type(c_funptr), intent(in) :: final_func
         type(caf_co_handle_t), intent(out) :: coarray_handle
-        ! type(*), dimension(..), allocatable, intent(out) :: local_slice ! REMOVE_NOTE: when testing this interface out, didn't compile because `Assumed-type variable local_slice at (1) may not have the ALLOCATABLE, CODIMENSION, POINTER or VALUE attribute`
-        class(*), dimension(..), allocatable, intent(out) :: local_slice
+        type(c_ptr), intent(out) :: allocated_memory
       end subroutine
     ```
   * **Further argument descriptions**:
-    * **`co_lbounds` and `co_ubounds`**: Shall be the the lower and upper bounds of the coarray being allocated. Shall be 1d arrays with the same dimensions as each other. The product of the difference of the `co_lbounds` and `co_ubounds` shall equal the number of team members (REMOVE_NOTE_TODO: check wording).
+    * **`co_lbounds` and `co_ubounds`**: Shall be the lower and upper bounds of the coarray being allocated. Shall be 1d arrays with the same dimensions as each other. The product of the difference of the `co_lbounds` and `co_ubounds` shall equal the number of team members (REMOVE_NOTE_TODO: check wording).
     * **`lbounds` and `ubounds`**: Shall be the the lower and upper bounds of the `local_slice`. Shall be 1d arrays with the same dimensions as each other.
+    * **`element_length`**: Length of the element in the coarray (REMOVE_NOTE_TODO: reword)
+    * **`final_func`**: (REMOVE_NOTE_TODO: fill in)
     * **`coarray_handle`**: Represents the distributed object of the coarray on the corresponding team. Shall return the handle created by the runtime library that the compiler shall use for future coindexed-object references of the associated coarray and for deallocation of the associated coarray.
-    * **`local_slice`**: Shall not be allocated on entry. Shall return the
+    * **`allocated_memory`**: Shall not be allocated on entry. Shall return the
+
+ #### `caf_allocate_non_symmetric`
+  * **Description**: Used to allocate components of coarray objects. If the object to be allocated is polymorphic, it is the compiler's responsibility to set the dynamic type and it shall not be accessed remotely.
+  * **Procedure Interface**:
+    ```
+     module subroutine caf_allocate_non_symmetric(size_in_bytes, allocated_memory)
+       implicit none
+       integer(kind=c_size_t) :: size_in_bytes
+       type(c_ptr), intent(out) :: allocated_memory
+     end subroutine
+    ```
+  * **Further argument descriptions**:
+    * **``**:
 
  #### `caf_deallocate`
   * **Description**: This procedure releases memory previously allocated for all of the coarrays associated with the handles in `coarray_handles`, resulting in the destruction of any associated `local_slices` received by the compiler after `caf_allocate` calls.  (REMOVE_NOTE_TODO: reword) The compiler will insert calls to this procedure when exiting a local scope where implicit deallocation of a coarray is mandated by the standard and when a coarray is explicitly deallocated through a `deallocate-stmt` in the source code.
@@ -466,6 +488,9 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
 ### Image Synchronization
 
+  Compilers need an "optimization fence";
+  in C/C++, inline assembly is available to provide optimization fences.
+
  #### `caf_sync_all`
   * **Description**:
   * **Procedure Interface**:
@@ -506,24 +531,26 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Further argument descriptions**:
 
  #### `caf_critical`
-  * **Description**:
+  * **Description**: For each critical construct, the compiler shall define a coarray that shall only be used to begin and end the critical block. The coarray shall be a scalar coarray of type lock_type and the associated coarray handle shall be passed to the procedure.
   * **Procedure Interface**:
     ```
-      subroutine caf_critical(critical_id, REMOVE_NOTE_TODO: fill in)
+      subroutine caf_critical(critical_coarray, stat, REMOVE_NOTE_TODO: fill in)
         implicit none
-        integer(kind=c_int64_t), intent(in) :: critical_id
+        type(caf_co_handle_t), intent(in) :: critical_coarray
+        integer, optional, intent(out) :: stat
       end subroutine
     ```
   * **Further argument descriptions**:
-    * **`critical_id`**: shall be a unique identifier for a critical construct. This unique identifier will be used by the runtime library to track the location of the critical construct blocks.
+    * **`critical_coarray`**:
 
 #### `caf_end_critical`
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_end_critical(critical_id, REMOVE_NOTE_TODO: fill in)
+      subroutine caf_end_critical(critical_coarray, REMOVE_NOTE_TODO: fill in)
         implicit none
-        integer(kind=c_int64_t), intent(in) :: critical_id
+        type(caf_co_handle_t), intent(in) :: critical_coarray
+        integer, optional, intent(out) :: stat
       end subroutine
     ```
   * **Further argument descriptions**:
@@ -874,12 +901,13 @@ All atomic operations are blocking operations.
   - REMOVE_NOTE_TODO_DECISION: Need to decide the thread semantics
   - REMOVE_NOTE_TODO_DECISION: Are we going to have Caffeine be thread safe? Have a thread safety option? Is it a build time option? or runtime?
         Dan advocates having a thread-safety option that is build time.
+        Dan advocates having a debug versus optimized compilation mode (debug mode does sanity checks of preconditions)
   - Do we need to add any discussion of what it would look like when code has mixed OpenMP and Coarray Fortran?
 
  - boilerplate was added for all of the interfaces and initially everything was made a subroutine
  - when all the interfaces are done, check them to make sure there were no interfaces created that could be a function, but were left a subroutine because forgot to change that aspect of the boilerplate
 
-  - critical construct - need to have a state saying whether we are in a critical construct or not, if we are in one, then we can ignore any further nested critical constructs.
+  - critical construct - if we track state saying whether we are in a critical construct or not, then we can add checks for noncompliant behavior that are performed in debug mode
 
   - Search for REMOVE_NOTE_TODO_DECISION to find locations where specific decisions/options are outlined, but not yet made.
   - Search for, resolve, and remove all REMOVE_NOTE and REMOVE_NOTE_TODO_DECISIONS before finalizing this document.
@@ -979,6 +1007,7 @@ CAF_ALLOCATE implementation notes:
 
 UNIT TESTS TODOs
   unit test with user code tries to directly call caf_{...} procedures, should be possible for user to do so
+  tests that check to make sure copy-in/copy-out semantics are NOT occurring in the places where it matters
 
 flexible array member in c
 

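The `co_lbounds`/`co_ubounds` description in the patch above still carries a "check wording" note. As a sketch of the arithmetic involved, assuming the usual Fortran rules that the extent of a codimension is the upper cobound minus the lower cobound plus one and that cosubscripts map to image indices in column-major order, the example declaration `integer :: example[2:4,3:*]` from the notes can be worked through for six images (so the final upper cobound is 4). The module name `cobounds_sketch` and the function `image_from_cosubscripts` are hypothetical and not part of the proposed interface.

```fortran
module cobounds_sketch
  use, intrinsic :: iso_c_binding, only: c_intmax_t
  implicit none
contains
  ! Hypothetical helper: maps a set of cosubscripts to a 1-based image index
  ! using column-major ordering of the codimensions.
  pure function image_from_cosubscripts(co_lbounds, co_ubounds, cosubs) result(image)
    integer(kind=c_intmax_t), dimension(:), intent(in) :: co_lbounds, co_ubounds, cosubs
    integer(kind=c_intmax_t) :: image, stride
    integer :: i
    image = 1
    stride = 1
    do i = 1, size(cosubs)
      image = image + (cosubs(i) - co_lbounds(i)) * stride
      stride = stride * (co_ubounds(i) - co_lbounds(i) + 1)
    end do
  end function
end module

program demo_cobounds
  use, intrinsic :: iso_c_binding, only: c_intmax_t
  use cobounds_sketch
  implicit none
  ! Cobounds for `integer :: example[2:4,3:*]` when run with 6 images.
  integer(kind=c_intmax_t), parameter :: lb(2) = [2_c_intmax_t, 3_c_intmax_t]
  integer(kind=c_intmax_t), parameter :: ub(2) = [4_c_intmax_t, 4_c_intmax_t]

  ! The product of the codimension extents matches the number of images.
  if (product(ub - lb + 1) /= 6) error stop "extent product should be 6"

  if (image_from_cosubscripts(lb, ub, [2_c_intmax_t, 3_c_intmax_t]) /= 1) error stop "expected image 1"
  if (image_from_cosubscripts(lb, ub, [4_c_intmax_t, 3_c_intmax_t]) /= 3) error stop "expected image 3"
  if (image_from_cosubscripts(lb, ub, [2_c_intmax_t, 4_c_intmax_t]) /= 4) error stop "expected image 4"
  if (image_from_cosubscripts(lb, ub, [4_c_intmax_t, 4_c_intmax_t]) /= 6) error stop "expected image 6"

  print *, "cobound checks behaved as expected"
end program
```

When the number of images is not a multiple of the leading extents, the last codimension is only partly populated, so the product of extents can exceed the image count; the sketch does not cover that case.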
>From 3448d4731937cb319681222570963f1cbee2fd7e Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Mon, 14 Aug 2023 11:42:45 -0700
Subject: [PATCH 27/33] Update and add interfaces based on prototyping. Update
 interface for `caf_put` and add interfaces for `caf_put_strided` and
 `caf_deallocate_non_symmetric`.

---
 flang/docs/CoarrayFortranRuntime.md | 57 ++++++++++++++++++++++++-----
 1 file changed, 48 insertions(+), 9 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index b0cff4f404e574..42bf51c5ae6f9f 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -363,6 +363,18 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Argument descriptions**:
     * **`coarray_handles`**: Is an array of all of the handles for the coarrays that shall be deallocated.
 
+ #### `caf_deallocate_non_symmetric`
+  * **Description**: TODO: fill in
+  * **Procedure Interface**:
+    ```
+      subroutine caf_deallocate_non_symmetric(mem)
+        implicit none
+        type(c_ptr), intent(in) :: mem
+      end subroutine
+    ```
+  * **Argument descriptions**:
+    * **`mem`**:
+
 
 ### Coarray Access
 
@@ -395,20 +407,47 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Description**: This procedure assigns to a coarray. The compiler shall call this procedure when there is a coarray reference that is a `coindexed-object`. The compiler shall not (REMOVE_NOTE: need to?) call this procedure when the coarray reference is not a `coindexed-object`. This procedure blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
   * **Procedure Interface**:
     ```
-      subroutine caf_put(coarray_handle, coindices, mold, value, team, team_number, stat) !REMOVE_NOTE_TODO: make sure rename of target dummy arg to mold is changed in other places in doc as well
-        implicit none
-        type(caf_co_handle_t), intent(in) :: coarray_handle
-        integer, intent(in) :: coindices(:)
-        type(*), dimension(..), intent(in) :: mold, value
-        type(team_type), optional, intent(in) :: team
-        integer, optional, intent(in) :: team_number
-        integer, optional, intent(out) :: stat
-      end subroutine
+     ! both sides are contiguous
+     subroutine caf_put(coarray_handle, coindices, value, element_storage_size, first_element_addr, team, team_number, stat)
+       implicit none
+       type(caf_co_handle_t), intent(in) :: coarray_handle
+       integer, intent(in) :: coindices(:)
+       type(*), dimension(..), intent(in), contiguous :: value
+       integer(kind=c_size_t), intent(in) :: element_storage_size
+       type(c_ptr), intent(in) :: first_element_addr ! represents the address in the local slice corresponding to where the first element lives on the remote
+       type(team_type), optional, intent(in) :: team
+       integer, optional, intent(in) :: team_number
+       integer, optional, intent(out) :: stat
+     end subroutine caf_put
     ```
   * **Further argument descriptions**:
     * **`value`**: The value that shall be assigned to (REMOVE_NOTE_TODO: fill in)
 
 
+ #### `caf_put_strided`
+  * **Description**: TODO: fill in
+  * **Procedure Interface**:
+    ```
+     ! more general case
+     subroutine caf_put_strided( &
+         coarray_handle, coindices, value, element_storage_size, &
+         first_element_addr, extent, stride, team, team_number, stat)
+       implicit none
+       type(caf_co_handle_t), intent(in) :: coarray_handle
+       integer, intent(in) :: coindices(:)
+       type(*), dimension(..), intent(in) :: value
+       integer(kind=c_size_t), intent(in) :: element_storage_size
+       type(c_ptr), intent(in) :: first_element_addr ! represents the address in the local slice corresponding to where the first element lives on the remote
+       integer(kind=c_size_t), intent(in) :: extent(:)
+       integer(kind=c_ptrdiff_t), intent(in) :: stride(:)
+       type(team_type), optional, intent(in) :: team
+       integer, optional, intent(in) :: team_number
+       integer, optional, intent(out) :: stat
+     end subroutine
+    ```
+  * **Further argument descriptions**:
+    * **`coarray_handle`**:
+
  #### `caf_get`
   * **Description**:
   * **Procedure Interface**:

>From b7b4288d3fb3a2260718444e2ef3e65861c944fc Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Mon, 14 Aug 2023 11:47:44 -0700
Subject: [PATCH 28/33] Add interfaces for `caf_put_raw` and
 `caf_put_raw_strided`.

---
 flang/docs/CoarrayFortranRuntime.md | 32 +++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 42bf51c5ae6f9f..3db4ee60776df7 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -81,6 +81,8 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
 ## Runtime Interface Procedures
 
+TODO: Update this list and links with recently added procedures
+
    **Collectives:**
      [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum)
 
@@ -448,6 +450,36 @@ The following table outlines which tasks will be the responsibility of the Fortr
   * **Further argument descriptions**:
     * **`coarray_handle`**:
 
+ #### `caf_put_raw`
+  * **Description**: Assigns to `size` bytes on the image identified by `image_num`, starting at the remote address `remote_ptr`, copying from `local_buffer`.
+  * **Procedure Interface**:
+    ```
+      subroutine caf_put_raw(image_num, local_buffer, remote_ptr, size, stat)
+        implicit none
+        integer(kind=c_int), intent(in) :: image_num
+        type(c_ptr), intent(in) :: local_buffer
+        integer(kind=c_int64_t), intent(in) :: remote_ptr
+        integer(kind=c_size_t), intent(in) :: size
+        integer, optional, intent(out) :: stat
+      end subroutine
+    ```
+
+ #### `caf_put_raw_strided`
+  * **Description**: TODO: (reword) Assigns to `size` bytes on the image identified by `image_num`, starting at the remote address `remote_ptr` and stepping by the given strides, copying from `local_buffer`.
+  * **Procedure Interface**:
+    ```
+      subroutine caf_put_raw_strided(image_num, local_buffer, remote_ptr, size, extent, remote_ptr_stride, local_buffer_stride, stat)
+        implicit none
+        integer(kind=c_int), intent(in) :: image_num
+        type(c_ptr), intent(in) :: local_buffer
+        integer(kind=c_int64_t), intent(in) :: remote_ptr
+        integer(kind=c_size_t), intent(in) :: size
+        integer(kind=c_size_t), intent(in) :: extent(:)
+        integer(kind=c_ptrdiff_t), intent(in) :: remote_ptr_stride(:), local_buffer_stride(:)
+        integer, optional, intent(out) :: stat
+      end subroutine
+    ```
+
  #### `caf_get`
   * **Description**:
   * **Procedure Interface**:
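
Editorial sketch, not part of the patch: the `*_raw_strided` interfaces above do not yet state how `size`, `extent`, and the two stride arrays combine. One plausible reading, assumed here and not confirmed by the document, is that `size` is the byte length of each contiguous element, `extent(k)` is the element count along dimension `k`, and each stride is the byte offset between consecutive elements along that dimension. The program below models that reading on two local byte arrays. `copy_strided` and its argument names are hypothetical, and no runtime library procedure is called.

```fortran
program strided_copy_sketch
  use iso_c_binding, only: c_int8_t, c_size_t, c_ptrdiff_t
  implicit none

  integer(c_int8_t) :: src(0:23), dst(0:23)
  integer :: i

  src = [(int(i, c_int8_t), i = 0, 23)]   ! source bytes hold their own offsets
  dst = -1_c_int8_t                       ! sentinel marks untouched bytes

  ! Gather a 2 x 3 block of 2-byte elements from a sparse source layout
  ! into a packed destination layout (assumed units: strides in bytes).
  call copy_strided(dst, src, 2_c_size_t, [2_c_size_t, 3_c_size_t], &
                    dst_stride=[2_c_ptrdiff_t, 4_c_ptrdiff_t], &
                    src_stride=[4_c_ptrdiff_t, 8_c_ptrdiff_t])

  if (any(dst(0:11) /= int([0, 1, 4, 5, 8, 9, 12, 13, 16, 17, 20, 21], c_int8_t))) &
    error stop "unexpected packed bytes"
  if (any(dst(12:23) /= -1_c_int8_t)) error stop "bytes outside the block were modified"
  print '(*(i0,1x))', dst(0:11)

contains

  ! Hypothetical helper: copies product(extent) elements of elem_bytes bytes each.
  subroutine copy_strided(dst, src, elem_bytes, extent, dst_stride, src_stride)
    integer(c_int8_t), intent(inout) :: dst(0:)
    integer(c_int8_t), intent(in) :: src(0:)
    integer(c_size_t), intent(in) :: elem_bytes
    integer(c_size_t), intent(in) :: extent(:)
    integer(c_ptrdiff_t), intent(in) :: dst_stride(:), src_stride(:)
    integer(c_ptrdiff_t) :: idx(size(extent)), doff, soff
    integer :: d

    if (any(extent == 0)) return
    idx = 0
    do
      ! Byte offset of the current element in each layout.
      doff = sum(idx * dst_stride)
      soff = sum(idx * src_stride)
      dst(doff:doff + elem_bytes - 1) = src(soff:soff + elem_bytes - 1)

      ! Advance the multi-index like an odometer, first dimension fastest.
      d = 1
      do while (d <= size(extent))
        idx(d) = idx(d) + 1
        if (idx(d) < extent(d)) exit
        idx(d) = 0
        d = d + 1
      end do
      if (d > size(extent)) exit   ! every dimension wrapped: done
    end do
  end subroutine copy_strided

end program strided_copy_sketch
```

Under this reading a contiguous transfer is the special case in which each stride equals the byte span of the preceding dimensions, which is what separates `caf_put_raw` from `caf_put_raw_strided`.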

>From aab8719d210417fe639184615dd3574e5f7e1109 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Mon, 28 Aug 2023 17:43:43 -0700
Subject: [PATCH 29/33] Update design doc with interfaces for two atomics and
 update other interfaces, such as caf_critical and caf_end_critical.

---
 flang/docs/CoarrayFortranRuntime.md | 111 ++++++++++++++++++++++------
 1 file changed, 89 insertions(+), 22 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 3db4ee60776df7..76699ae7c447a2 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -81,8 +81,6 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
 ## Runtime Interface Procedures
 
-TODO: Update this list and links with recently added procedures
-
    **Collectives:**
      [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum)
 
@@ -90,10 +88,10 @@ TODO: Update this list and links with recently added procedures
      [`caf_init`](#caf_init), [`caf_finalize`](#caf_finalize), [`caf_error_stop`](#caf_error_stop), [`caf_stop`](#caf_stop), [`caf_fail_image`](#caf_fail_image)
 
    **Allocation and deallocation:**
-     [`caf_allocate`](#caf_allocate), [`caf_deallocate`](#caf_deallocate)
+     [`caf_allocate`](#caf_allocate), [`caf_deallocate`](#caf_deallocate), [`caf_allocate_non_symmetric`](#caf_allocate_non_symmetric), [`caf_deallocate_non_symmetric`](#caf_deallocate_non_symmetric)
 
    **Coarray Access:**
-     [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
+     [`caf_put`](#caf_put), [`caf_put_raw`](#caf_put_raw), [`caf_put_raw_strided`](#caf_put_raw_strided), [`caf_get`](#caf_get), [`caf_get_raw`](#caf_get_raw), [`caf_get_raw_strided`](#caf_get_raw_strided), [`caf_get_async`](#caf_get_async), [`caf_base_pointer`](#caf_base_pointer)
 
    **Operation Synchronization:**
      [`caf_async_wait_for`](#caf_aync_wait_for), [`caf_async_try_for`](#caf_async_try_for), [`caf_sync_memory`](#caf_sync_memory)
@@ -335,13 +333,13 @@ TODO: Update this list and links with recently added procedures
   * **Further argument descriptions**:
     * **`co_lbounds` and `co_ubounds`**: Shall be the lower and upper bounds of the coarray being allocated. Shall be 1d arrays with the same dimensions as each other. The product of the difference of the `co_lbounds` and `co_ubounds` shall equal the number of team members (REMOVE_NOTE_TODO: check wording).
     * **`lbounds` and `ubounds`**: Shall be the the lower and upper bounds of the `local_slice`. Shall be 1d arrays with the same dimensions as each other.
-    * **`element_length`**: Length of the element in the coarray (REMOVE_NOTE_TODO: reword)
-    * **`final_func`**: (REMOVE_NOTE_TODO: fill in)
+    * **`element_length`**: Length of the element
+    * **`final_func`**: Shall be a function pointer to the final function, if any, for derived types
     * **`coarray_handle`**: Represents the distributed object of the coarray on the corresponding team. Shall return the handle created by the runtime library that the compiler shall use for future coindexed-object references of the associated coarray and for deallocation of the associated coarray.
-    * **`allocated_memory`**: Shall not be allocated on entry. Shall return the
+    * **`allocated_memory`**: Shall not be allocated on entry. Shall return a pointer to the block of allocated memory for the Fortran object.
 
  #### `caf_allocate_non_symmetric`
-  * **Description**: Use to allocate components of coarray objects. If the object to be allocated is polymorphic, it is the compiler's responsibility to set the dynamic type and it shall not be accessed remotely.
+  * **Description**: This procedure allocates components of coarray objects. If the object to be allocated is polymorphic, it is the compiler's responsibility to set the dynamic type and it shall not be accessed remotely.
   * **Procedure Interface**:
     ```
      module subroutine caf_allocate_non_symmetric(size_in_bytes, allocated_memory)
@@ -351,7 +349,8 @@ TODO: Update this list and links with recently added procedures
      end subroutine
     ```
   * **Further argument descriptions**:
-    * **``**:
+    * **`size_in_bytes`**: The size, in bytes, of the object to be allocated.
+    * **`allocated_memory`**: Shall not be allocated on entry. Shall return a pointer to the block of allocated memory for the Fortran object.
 
  #### `caf_deallocate`
   * **Description**: This procedure releases memory previously allocated for all of the coarrays associated with the handles in `coarray_handles`, resulting in the destruction of any associated `local_slices` received by the compiler after `caf_allocate` calls.  (REMOVE_NOTE_TODO: reword) The compiler will insert calls to this procedure when exiting a local scope where implicit deallocation of a coarray is mandated by the standard and when a coarray is explicitly deallocated through a `deallocate-stmt` in the source code.
@@ -366,7 +365,7 @@ TODO: Update this list and links with recently added procedures
     * **`coarray_handles`**: Is an array of all of the handles for the coarrays that shall be deallocated.
 
  #### `caf_deallocate_non_symmetric`
-  * **Description**: TODO: fill in
+  * **Description**: This procedure releases memory previously allocated for a component of a derived type coarray.
   * **Procedure Interface**:
     ```
       subroutine caf_deallocate_non_symmetric(mem)
@@ -375,7 +374,7 @@ TODO: Update this list and links with recently added procedures
       end subroutine
     ```
   * **Argument descriptions**:
-    * **`mem`**:
+    * **`mem`**: Pointer to the block of memory to be released.
 
 
 ### Coarray Access
@@ -406,10 +405,9 @@ TODO: Update this list and links with recently added procedures
     * shall not both be present in the same call
 
  #### `caf_put`
-  * **Description**: This procedure assigns to a coarray. The compiler shall call this procedure when there is a coarray reference that is a `coindexed-object`. The compiler shall not (REMOVE_NOTE: need to?) call this procedure when the coarray reference is not a `coindexed-object`. This procedure blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
+  * **Description**: This procedure assigns to a coarray, when both sides of the assignment are contiguous. The compiler shall call this procedure when there is a coarray reference that is a `coindexed-object`. The compiler shall not (REMOVE_NOTE: need to?) call this procedure when the coarray reference is not a `coindexed-object`. This procedure blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
   * **Procedure Interface**:
     ```
-     ! both sides are contiguous
      subroutine caf_put(coarray_handle, coindices, value, element_storage_size, first_element_addr, team, team_number, stat)
        implicit none
        type(caf_co_handle_t), intent(in) :: coarray_handle
@@ -513,6 +511,50 @@ TODO: Update this list and links with recently added procedures
       end subroutine
     ```
 
+ #### `caf_get_raw`
+  * **Description**:
+  * **Procedure Interface**:
+    ```
+      subroutine caf_get_raw(image_num, local_buffer, remote_ptr, size, stat)
+        implicit none
+        integer(kind=c_int), intent(in) :: image_num
+        type(c_ptr), intent(in) :: local_buffer
+        integer(kind=c_int64_t), intent(in) :: remote_ptr
+        integer(kind=c_size_t), intent(in) :: size
+        integer, optional, intent(out) :: stat
+      end subroutine
+    ```
+
+ #### `caf_get_raw_strided`
+  * **Description**:
+  * **Procedure Interface**:
+    ```
+      subroutine caf_get_raw_strided(image_num, local_buffer, remote_ptr, size, extent, remote_ptr_stride, local_buffer_stride, stat)
+        implicit none
+        integer(kind=c_int), intent(in) :: image_num
+        type(c_ptr), intent(in) :: local_buffer
+        integer(kind=c_int64_t), intent(in) :: remote_ptr
+        integer(kind=c_size_t), intent(in) :: size
+        integer(kind=c_size_t), intent(in) :: extent(:)
+        integer(kind=c_ptrdiff_t), intent(in) :: remote_ptr_stride(:), local_buffer_stride(:)
+        integer, optional, intent(out) :: stat
+      end subroutine
+    ```
+
+ #### `caf_base_pointer`
+  * **Description**: This procedure provides a pointer to the base of the coarray elements on a given image and may be used in conjunction with `caf_get_raw`.
+  * **Procedure Interface**:
+    ```
+      function caf_base_pointer(coarray_handle, coindices, team, team_number, stat) result (raw_ptr_int)
+        implicit none
+        type(caf_co_handle_t), intent(in) :: coarray_handle
+        integer, intent(in) :: coindices(:)
+        integer(kind=c_int64_t) :: raw_ptr_int
+        type(team_type), optional, intent(in) :: team
+        integer, optional, intent(in) :: team_number
+        integer, optional, intent(out) :: stat
+      end function
+    ```
 
 ###  Operation Synchronization
 
@@ -602,10 +644,10 @@ TODO: Update this list and links with recently added procedures
   * **Further argument descriptions**:
 
  #### `caf_critical`
-  * **Description**: For each critical construct, the compiler shall define a coarray that shall only be used to begin and end the critical block. The coarray shall be a scalar coarray of type lock_type and the associated coarray handle shall be passed to the procedure.
+  * **Description**: For each critical construct, the compiler shall define a coarray that shall only be used to begin and end the critical block. The coarray shall be a scalar coarray of type `lock_type` and the associated coarray handle shall be passed to the procedure.
   * **Procedure Interface**:
     ```
-      subroutine caf_critical(critical_coarray, stat, REMOVE_NOTE_TODO: fill in)
+      subroutine caf_critical(critical_coarray, stat)
         implicit none
         type(caf_co_handle_t), intent(in) :: critical_coarray
         integer, optional, intent(out) :: stat
@@ -618,10 +660,9 @@ TODO: Update this list and links with recently added procedures
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_end_critical(critical_coarray, REMOVE_NOTE_TODO: fill in)
+      subroutine caf_end_critical(critical_coarray)
         implicit none
         type(caf_co_handle_t), intent(in) :: critical_coarray
-        integer, optional, intent(out) :: stat
       end subroutine
     ```
   * **Further argument descriptions**:
@@ -775,10 +816,24 @@ All atomic operations are blocking operations.
 
  #### `caf_atomic_cas`
   * **Description**:
-  * **Procedure Interface**:
+  * **Procedure Interfaces**:
     ```
-      subroutine caf_atomic_cas(fill in...)
+      subroutine caf_atomic_cas_int_raw(image_num, atom_remote_ptr, old, compare, new, stat)
         implicit none
+        integer, intent(in) :: image_num
+        integer(kind=c_int64_t), intent(in) :: atom_remote_ptr
+        integer(kind=atomic_int_kind), intent(in)  :: compare, new
+        integer(kind=atomic_int_kind), intent(out) :: old
+        integer, optional, intent(out) :: stat
+      end subroutine
+
+      subroutine caf_atomic_cas_logical_raw(image_num, atom_remote_ptr, old, compare, new, stat)
+        implicit none
+        integer, intent(in) :: image_num
+        integer(kind=c_int64_t), intent(in) :: atom_remote_ptr
+        logical(kind=atomic_logical_kind), intent(in)  :: compare, new
+        logical(kind=atomic_logical_kind), intent(out) :: old
+        integer, optional, intent(out) :: stat
       end subroutine
     ```
   * **Further argument descriptions**:
@@ -797,8 +852,13 @@ All atomic operations are blocking operations.
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_atomic_fetch_add(fill in...)
+      subroutine caf_atomic_fetch_add_raw(image_num, atom_remote_ptr, value, old, stat)
         implicit none
+        integer, intent(in) :: image_num
+        integer(kind=c_int64_t), intent(in) :: atom_remote_ptr
+        integer(kind=atomic_int_kind), intent(in)  :: value
+        integer(kind=atomic_int_kind), intent(out) :: old
+        integer, optional, intent(out) :: stat
       end subroutine
     ```
   * **Further argument descriptions**:
@@ -899,9 +959,12 @@ All atomic operations are blocking operations.
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_image_index(fill in...)
+      function caf_image_index(coarray_handle, coindices)
         implicit none
-      end subroutine
+        type(caf_co_handle_t), intent(in) :: coarray_handle
+        integer, intent(in) :: coindices(:)
+        integer(kind=c_int) :: caf_image_index
+      end function
     ```
   * **Further argument descriptions**:
 
@@ -978,6 +1041,10 @@ All atomic operations are blocking operations.
  - boilerplate was added for all of the interfaces and initially everything was made a subroutine
  - when all the interfaces are done, check them to make sure there were no interfaces created that could be a function, but were left a subroutine because forgot to change that aspect of the boilerplate
 
+  - TODO: Question, reference standard: Can you explicitly deallocate a coarray in a child team that has been allocated by the parent team?
+  - need to understand more about how the change team construct affects allocation
+
+
   - critical construct - if we track state saying whether we are in a critical construct or not, then we can have additional checks for non compliant behavior that are done when in debug mode
 
   - Search for REMOVE_NOTE_TODO_DECISION to find locations where specific decisions/options are outlined, but not yet made.
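
Editorial sketch, not part of the patch: the `caf_atomic_cas_*_raw` and `caf_atomic_fetch_add_raw` interfaces added above still have empty descriptions. Assuming they are meant to mirror the Fortran intrinsics `atomic_cas` and `atomic_fetch_add`, the argument roles can be modelled as below. `model_cas` and `model_fetch_add` are hypothetical names; they act on an ordinary local variable, are not atomic, and take no `image_num` or `atom_remote_ptr`, so they illustrate only the value semantics.

```fortran
program atomic_semantics_sketch
  use iso_fortran_env, only: atomic_int_kind
  implicit none

  integer(atomic_int_kind) :: atom, old

  atom = 5

  ! compare matches the current value, so atom is replaced by new
  call model_cas(atom, old, compare=5_atomic_int_kind, new=9_atomic_int_kind)
  if (old /= 5 .or. atom /= 9) error stop "cas: expected a swap"

  ! compare no longer matches, so atom is left unchanged
  call model_cas(atom, old, compare=5_atomic_int_kind, new=1_atomic_int_kind)
  if (old /= 9 .or. atom /= 9) error stop "cas: expected no swap"

  ! fetch-add returns the prior value and then adds value
  call model_fetch_add(atom, 3_atomic_int_kind, old)
  if (old /= 9 .or. atom /= 12) error stop "fetch_add: unexpected result"

  print '(a,i0)', "final atom value: ", atom

contains

  ! Non-atomic model of compare-and-swap value semantics.
  subroutine model_cas(atom, old, compare, new)
    integer(atomic_int_kind), intent(inout) :: atom
    integer(atomic_int_kind), intent(out) :: old
    integer(atomic_int_kind), intent(in) :: compare, new
    old = atom                        ! old always receives the prior value
    if (atom == compare) atom = new   ! store happens only on a match
  end subroutine model_cas

  ! Non-atomic model of fetch-and-add value semantics.
  subroutine model_fetch_add(atom, value, old)
    integer(atomic_int_kind), intent(inout) :: atom
    integer(atomic_int_kind), intent(in) :: value
    integer(atomic_int_kind), intent(out) :: old
    old = atom
    atom = atom + value
  end subroutine model_fetch_add

end program atomic_semantics_sketch
```

In the runtime interface the read, compare, and store would have to happen as one indivisible operation on the remote image; this model shows only which values end up in `old` and in the atom.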

>From 4925808a70b2bcfef1676c9140a7c3ea1684f753 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Thu, 21 Sep 2023 08:43:36 -0700
Subject: [PATCH 30/33] Rename caf_finalize to caf_stop, leaving with two
 current options for caf_stop to be resolved in the future.

---
 flang/docs/CoarrayFortranRuntime.md | 34 ++++++++++++++---------------
 1 file changed, 16 insertions(+), 18 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 76699ae7c447a2..035375de32bef6 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -85,7 +85,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
      [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum)
 
    **Program startup and shutdown:**
-     [`caf_init`](#caf_init), [`caf_finalize`](#caf_finalize), [`caf_error_stop`](#caf_error_stop), [`caf_stop`](#caf_stop), [`caf_fail_image`](#caf_fail_image)
+     [`caf_init`](#caf_init), [`caf_stop`](#caf_stop), [`caf_error_stop`](#caf_error_stop), [`caf_fail_image`](#caf_fail_image)
 
    **Allocation and deallocation:**
      [`caf_allocate`](#caf_allocate), [`caf_deallocate`](#caf_deallocate), [`caf_allocate_non_symmetric`](#caf_allocate_non_symmetric), [`caf_deallocate_non_symmetric`](#caf_deallocate_non_symmetric)
@@ -248,7 +248,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
 ### Program startup and shutdown
 
-  When the compiler identifies a program that uses "Coarray Fortran" features, it will insert calls to `caf_init` and `caf_finalize`. These procedures will initialize and terminate the Coarray Fortran environment.
+  When the compiler identifies a program that uses "Coarray Fortran" features, it will insert calls to `caf_init` and `caf_stop`. These procedures will initialize and terminate the Coarray Fortran environment.
 
  #### `caf_init`
   * **Description**: This procedure will initialize the Coarray Fortran environment.
@@ -261,12 +261,12 @@ The following table outlines which tasks will be the responsibility of the Fortr
     ```
   * **Result**: `exit_code` is an `integer` whose value ... (REMOVE_NOTE_TODO: fill in)
 
-  REMOVE_NOTE: remove caf_finalize, because it has been determined that it iss redundant with caf_stop
- #### `caf_finalize`
+REMOVE_NOTE_TODO: resolve the following two options for caf_stop
+ #### `caf_stop`
   * **Description**: This procedure may or may terminate the Coarray Fortran environment. REMOVE_NOTE_TODO_DECISION: does it terminate for sure or not? can caf_init be called twice?
   * **Procedure Interface**:
     ```
-      subroutine caf_finalize(exit_code)
+      subroutine caf_stop(exit_code)
         implicit none
         integer, intent(in) :: exit_code
       end subroutine
@@ -276,19 +276,6 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
   (REMOVE_NOTE_TODO: check the interfaces for caf_error_stop and caf_stop, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something? Change something?)
 
- #### `caf_error_stop`
-  * **Description**: This procedure stops all images and provides the `stop_code` passed, or `0` if no `stop_code` is passed, as the process exit status
-  * **Procedure Interface**: REMOVE_NOTE_TODO_DECISION: should error_stop be implemented with two optional arguments with the precondition that they both shall not be passed at the same time? Or have overloaded procedures?
-    ```
-      subroutine caf_error_stop(stop_code_int, stop_code_char)
-        integer, intent(in), optional :: stop_code_int
-        character(len=*), intent(in), optional :: stop_code_char
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-    * **`stop_code_int` and `stop_code_char`**: shall not both be present in the same call (if provide only one procedure instead of overloading caf_error_stop)
-
- #### `caf_stop`
   * **Description**: This procedure synchronizes and stops the executing image. It provides the `stop_code` or `0` if no `stop_code` is passed, as the process exit status.
   * **Procedure Interface**:  REMOVE_NOTE_TODO_DECISION: should stop be implemented with two optional arguments with the precondition that they both shall not be passed at the same time? Or have overloaded procedures?
     ```
@@ -303,6 +290,17 @@ The following table outlines which tasks will be the responsibility of the Fortr
     * **`stop_code_int` and `stop_code_char`**: shall not both be present in the same call (if provide only one procedure instead of overloading caf_stop)
           * In `caf_stop`, runtime library will need to call c_exit() REMOVE_NOTE_TODO: fix this note
 
+ #### `caf_error_stop`
+  * **Description**: This procedure stops all images and provides the `stop_code` passed, or `0` if no `stop_code` is passed, as the process exit status
+  * **Procedure Interface**: REMOVE_NOTE_TODO_DECISION: should error_stop be implemented with two optional arguments with the precondition that they both shall not be passed at the same time? Or have overloaded procedures?
+    ```
+      subroutine caf_error_stop(stop_code_int, stop_code_char)
+        integer, intent(in), optional :: stop_code_int
+        character(len=*), intent(in), optional :: stop_code_char
+      end subroutine
+    ```
+  * **Further argument descriptions**:
+    * **`stop_code_int` and `stop_code_char`**: shall not both be present in the same call (if provide only one procedure instead of overloading caf_error_stop)
 
  #### `caf_fail_image`
   * **Description**:

>From 844ba58787683563a76d24cd032909dd8472125f Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Tue, 31 Oct 2023 10:48:02 -0700
Subject: [PATCH 31/33] Update coarray design doc with latest changes from
 prototyping. Includes updating interfaces related to teams and interfaces
 related to cobounds.

---
 flang/docs/CoarrayFortranRuntime.md | 226 ++++++++++------------------
 1 file changed, 80 insertions(+), 146 deletions(-)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/CoarrayFortranRuntime.md
index 035375de32bef6..4288a45000326e 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/CoarrayFortranRuntime.md
@@ -8,6 +8,11 @@
 # THIS IS A WORK IN PROGRESS - DECISIONS REGARDING THE DESIGNS DISCUSSED IN THIS DOCUMENT ARE ONGOING AND MAY CHANGE AND THE DOCUMENT IS INCOMPLETE
 
 
+#TODO:
+   add note in an appropriate place about move_alloc when args are coarrays, say that it is the compiler's responsibility to use the handles
+provided by the runtime library to do the necessary swap
+
+
 # Problem description
   In order to be fully Fortran 2018 compliant, Flang needs to add support for what is commonly referred to as coarray fortran, which includes features related to parallelism. These features include the following statements, subroutines, functions, types, and kind type parameters:
 
@@ -103,7 +108,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
      [`caf_event_post`](#caf_event_post), [`caf_event_wait`](#caf_event_wait), [`caf_event_query`](#caf_event_query)
 
    **Teams:**
-     [`caf_change_team`](#caf_change_team), [`caf_end_team`](#caf_end_team), [`caf_form_team`](#caf_form_team), [`caf_sync_team`](#caf_sync_team), [`caf_get_team`](#caf_get_team), [`caf_team_number`](#caf_team_number)
+     [`caf_change_team`](#caf_change_team), [`caf_end_team`](#caf_end_team), [`caf_form_team`](#caf_form_team), [`caf_sync_team`](#caf_sync_team), [`caf_get_team`](#caf_get_team), [`caf_team_number`](#caf_team_number), [`caf_alias_create`](#caf_alias_create), [`caf_alias_destroy`](#caf_alias_destroy)
 
    **Atomic Memory Operation:**
      [`caf_atomic_add`](#caf_atomic_add), [`caf_atomic_and`](#caf_atomic_and), [`caf_atomic_cas`](#caf_atomic_cas), [`caf_atomic_define`](#caf_atomic_define), [`caf_atomic_fetch_add`](#caf_atomic_fetch_add), [`caf_atomic_fetch_and`](#caf_atomic_fetch_and), [`caf_atomic_fetch_or`](#caf_atomic_fetch_or), [`caf_atomic_fetch_xor`](#caf_atomic_fetch_xor), [`caf_atomic_or`](#caf_atomic_or), [`caf_atomic_ref`](#caf_atomic_ref), [`caf_atomic_xor`](#caf_atomic_xor)
@@ -318,9 +323,9 @@ REMOVE_NOTE_TODO: resolve the following two options for caf_stop
   * **Description**: This procedure allocates memory for a coarray. Calls to `caf_allocate` will be inserted by the compiler when there is an explicit coarray allocation or a statically declared coarray in the source code. The runtime library will stash away the coshape information at this time in order to internally track it during the lifetime of the coarray.
   * **Procedure Interface**:
     ```
-      subroutine caf_allocate(co_lbounds, co_ubounds, lbounds, ubounds, element_length, final_func, coarray_handle, allocated_memory)
+      subroutine caf_allocate(lcobounds, ucobounds, lbounds, ubounds, element_length, final_func, coarray_handle, allocated_memory)
         implicit none
-        integer(kind=c_intmax_t), dimension(:), intent(in) :: co_lbounds, co_ubounds
+        integer(kind=c_intmax_t), dimension(:), intent(in) :: lcobounds, ucobounds
         integer(kind=c_intmax_t), dimension(:), intent(in) :: lbounds, ubounds
         integer(kind=c_size_t) :: element_length
         type(c_funptr), intent(in) :: final_func
@@ -329,7 +334,7 @@ REMOVE_NOTE_TODO: resolve the following two options for caf_stop
       end subroutine
     ```
   * **Further argument descriptions**:
-    * **`co_lbounds` and `co_ubounds`**: Shall be the lower and upper bounds of the coarray being allocated. Shall be 1d arrays with the same dimensions as each other. The product of the difference of the `co_lbounds` and `co_ubounds` shall equal the number of team members (REMOVE_NOTE_TODO: check wording).
+    * **`lcobounds` and `ucobounds`**: Shall be the lower and upper bounds of the coarray being allocated. Shall be 1d arrays with the same dimensions as each other. The product of the difference of the `lcobounds` and `ucobounds` shall equal the number of team members (REMOVE_NOTE_TODO: check wording).
     * **`lbounds` and `ubounds`**: Shall be the the lower and upper bounds of the `local_slice`. Shall be 1d arrays with the same dimensions as each other.
     * **`element_length`**: Length of the element
     * **`final_func`**: Shall be a function pointer to the final function, if any, for derived types
@@ -699,25 +704,30 @@ REMOVE_NOTE_TODO: resolve the following two options for caf_stop
   * **Further argument descriptions**:
 
 ### Teams
-  (REMOVE_NOTE_TODO: check the interface for caf_change_team and caf_form_team, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something? Change something?)
+  TODO: add general points related to our team design
+  The only time we create a new handle for an established coarray is in a `change-team-stmt`. The only times we create handles are in `caf_allocate` and in a `change-team-stmt`.
 
  #### `caf_change_team`
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_change_team(team)
+      subroutine caf_change_team(team_var, errmsg, stat)
         implicit none
-        type(team_type), target, intent(in) :: team
+        type(caf_team_type_t), intent(in) :: team_var
+        character(len=*), intent(out), optional :: errmsg
+        integer, intent(out), optional :: stat
       end subroutine
     ```
   * **Further argument descriptions**:
 
  #### `caf_end_team`
-  * **Description**:
+  * **Description**: During the execution of `caf_end_team`, the runtime library will implicitly synchronize, as the standard requires and deallocate any coarrays allocated during the change team construct. Prior to `caf_end_team`, the compiler is responsible for invoking [`caf_alias_destroy`](#caf_alias_destroy) for any `caf_co_handle_t` handles created during the change team construct.
   * **Procedure Interface**:
     ```
-      subroutine caf_end_team(fill in...)
+      subroutine caf_end_team(errmsg, stat)
         implicit none
+        character(len=*), intent(out), optional :: errmsg
+        integer, intent(out), optional :: stat
       end subroutine
     ```
   * **Further argument descriptions**:
@@ -726,13 +736,13 @@ REMOVE_NOTE_TODO: resolve the following two options for caf_stop
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_form_team(num, team, new_index, stat, errmsg)
+      subroutine caf_form_team(team_num, team_var, new_index, errmsg, stat)
         implicit none
-        integer,          intent(in)  :: num
-        type(team_type),  intent(out) :: team
-        integer,          intent(in),    optional :: new_index
-        integer,          intent(out),   optional :: stat
-        character(len=*), intent(inout), optional :: errmsg
+        integer(kind=c_int), intent(in) :: team_num
+        type(caf_team_type_t), intent(out) :: team_var
+        integer(kind=c_int), intent(in), optional :: new_index
+        character(len=*), intent(out), optional :: errmsg
+        integer, intent(out), optional :: stat
       end subroutine
     ```
   * **Further argument descriptions**:
@@ -767,6 +777,30 @@ REMOVE_NOTE_TODO: resolve the following two options for caf_stop
     ```
   * **Further argument descriptions**:
 
+ #### `caf_alias_create`
+  * **Description**: Create a new coarray handle that may be used for coarray queries about an aliased coarray, such as in a [`caf_change_team`](#caf_change_team)
+  * **Procedure Interface**:
+    ```
+      subroutine caf_alias_create(source_handle, alias_co_lbounds, alias_co_ubounds, alias_handle)
+        implicit none
+        type(caf_co_handle_t), intent(in) :: source_handle
+        integer(kind=c_intmax_t), dimension(:), intent(in) :: alias_co_lbounds, alias_co_ubounds
+        type(caf_co_handle_t), intent(out) :: alias_handle
+      end subroutine
+    ```
+  * **Further argument descriptions**:
+
+ #### `caf_alias_destroy`
+  * **Description**: REMOVE_NOTE_TODO: Keep the alias routines in the Teams section? Move to another more general section?
+  * **Procedure Interface**:
+    ```
+      subroutine caf_alias_destroy(alias_handle)
+        implicit none
+        type(caf_co_handle_t), intent(in) :: alias_handle
+      end subroutine
+    ```
+  * **Further argument descriptions**:
+
 ### Atomic Memory Operation
 
 All atomic operations are blocking operations.
@@ -924,21 +958,46 @@ All atomic operations are blocking operations.
 ### Coarray Queries
 
  #### `caf_lcobound`
-  * **Description**:
+  * **Description**: This procedure returns the lcobounds of the coarray referred to by the coarray_handle. The lcobounds will always be a 64bit integer and it is the compiler's responsibility to convert to a different kind, as needed by the user.
   * **Procedure Interface**:
     ```
-      subroutine caf_lcobound(fill in...)
-        implicit none
+      interface caf_lcobound
+        module procedure caf_lcobound_with_dim
+        module procedure caf_lcobound_no_dim
+      end interface caf_lcobound
+
+      subroutine caf_lcobound_with_dim(coarray_handle, dim, lcobound)
+        type(caf_co_handle_t), intent(in) :: coarray_handle
+        integer, intent(in) :: dim
+        integer(kind=c_int64_t), intent(out) :: lcobound
+      end subroutine
+
+      subroutine caf_lcobound_no_dim(coarray_handle, lcobounds)
+        type(caf_co_handle_t), intent(in) :: coarray_handle
+        integer(kind=c_int64_t), intent(out) :: lcobounds(:)
       end subroutine
     ```
   * **Further argument descriptions**:
 
  #### `caf_ucobound`
-  * **Description**:
+  * **Description**: This procedure returns the ucobounds of the coarray referred to by the coarray_handle. The ucobounds will always be a 64bit integer and it is the compiler's responsibility to convert to a different kind, as needed by the user.
+
   * **Procedure Interface**:
     ```
-      subroutine caf_ucobound(fill in...)
-        implicit none
+      interface caf_ucobound
+        module procedure caf_ucobound_with_dim
+        module procedure caf_ucobound_no_dim
+      end interface
+
+      subroutine caf_ucobound_with_dim(coarray_handle, dim, ucobound)
+        type(caf_co_handle_t), intent(in) :: coarray_handle
+        integer, intent(in) :: dim
+        integer(kind=c_int64_t), intent(out) :: ucobound
+      end subroutine
+
+      subroutine caf_ucobound_no_dim(coarray_handle, ucobounds)
+        type(caf_co_handle_t), intent(in) :: coarray_handle
+        integer(kind=c_int64_t), intent(out) :: ucobounds(:)
       end subroutine
     ```
   * **Further argument descriptions**:
@@ -1028,131 +1087,6 @@ All atomic operations are blocking operations.
 
   (REMOVE_NOTE: complete this section, potentially move to earlier in doc) Compiler will need to: call caf_init, call caf_allocate ... for each coarray and in the right order. And then copy any initializers.
 
-
-## Internal Development Notes: (REMOVE_NOTES before submission)
-  - REMOVE_NOTE_TODO_DECISION: Need to decide the thread semantics
-  - REMOVE_NOTE_TODO_DECISION: Are we going to have Caffeine be thread safe? Have a thread safety option? Is it a build time option? or runtime?
-        Dan advocates having a thread-safety option that is build time.
-        Dan advocates having a debug versus optimized compliation mode (Debug mode - does sanity checks of preconditions)
-  - Do we need to add any discussion of what it would look like when code has mixed OpenMP and Coarray Fortran?
-
- - boilerplate was added for all of the interfaces and initially everything was made a subroutine
- - when all the interfaces are done, check them to make sure there were no interfaces created that could be a function, but were left a subroutine because forgot to change that aspect of the boilerplate
-
-  - TODO: Question, reference standard: Can you explicitly deallocate a coarray in a child team that has been allocated by the parent team?
-  - need to understand more about how the change team construct affects allocation
-
-
-  - critical construct - if we track state saying whether we are in a critical construct or not, then we can have additional checks for non compliant behavior that are done when in debug mode
-
-  - Search for REMOVE_NOTE_TODO_DECISION to find locations where specific decisions/options are outlined, but not yet made.
-  - Search for, resolve, and remove all REMOVE_NOTE and REMOVE_NOTE_TODO_DECISIONS before finalizing this document.
-
-  - `caf_allocate` - after getting the basic necessities for this procedure sorted, considering adding a mold argument that would allow for dynamic typing for `local_slice`
-
-
-  * **Asynchrony:**
-    -   Could be handle based or fence based approaches
-    -   Handle based - return can individual operation handle, later on compiler synchronizes handle
-    -   Fence based - implicit handle operations, closer to MPI
-
-
-
-POTENTIAL RATIONALE TO PRESENT SOMEWHERE maybe
-The runtime library will handle critical constructs, and not expect the compiler to rewrite them as blocks with lock and unlock statements. This would be burdensome on the compiler because a lock_type variable would need to be declared, but as it needs to be a coarray, it would have to hoist its (REMOVE_NOTE: reword?!?!) declaration.
-
-Same non-blocking semantics (has to be started and finished in the same segment) will likely apply to collectives, use caf_wait_for, caf_try_for, etc
-Should change team and critical be non-blocking? sync-all?)
-
-
-Caffeine internal procedure, so not part of the CAF PRI.
- #### `caf_end_segment`
-  * **Description**: This procedure ends a segment. Any puts that are still in flight will be committed (and any caches will be thrown away REMOVE_NOTE_TODO_DECISION: if we decide to do caches). Calls to this procedure will be side effects of invocations of the image control statements. It is not a synchronizing operation.
-  * **Procedure Interface**:
-  ```
-    subroutine caf_end_segment()
-      implicit none
-      (REMOVE_NOTE: are there no arguments? or is it just that we haven't sketched out the args yet?)
-    end subroutine
-  ```
-
-### `caf_co_handle_t`
-
-   The following is a Fortran heavy pseudo code, not the exact implementation we plan
-   ```
-   type caf_co_handle_t
-     type(c_ptr) :: base_addr
-     integer, allocatable, dimension(:) :: lbounds, sizes
-     !integer :: established_team ! probably not necessary unless we want to bounds check
-   end type
-   ```
-
-REMOVE_NOTEs:
-    allow for non-blocking collective subroutines
-    need to be able to track puts in flight, may need a write buffer, record boundaries in a hash table struct
-    every single rma needs to check the table to see if there is a conflicting overlap
-    could add caching
-    if the constants (stat_failed_image, etc) are compiler provided, we need to get C access to these values
-
-
-#### Implementation internal notes
-  * In `caf_allocate`, add precondition that `lbounds` and `sizes` are the same size - use assert or other similar solution
-  * In `caf_get` (and others?), `source` arg useful to get the "shape" of the thing, not the value of this dummy arg, compiler needs to ensure this dummy arg is not a copy for this strategy to work, compiler's codegen needs to ensure that this (and other subroutine calls) are not using copies for this arg
-  * In `caf_get_async`, the `value` arg may need asynchronous attribute or may be implicitly asynchronous
-
-
-REMOVE_NOTE_
-
-examples where if a user writes this example code, then the compiler should rewrite it to look like this other piece of example code
-
-
-TODO after writing example, try compiling it and see if it compiles at least until breaking at linking because no def for caf_allocate etc
-1. basic caf example
-    - static coarray declaration
-         - transform by adding caf_allocate and caf_deallocate calls
-    - write to coarrays
-    - read from coarrays
-    - sync-all-stmt (?)
-
-2. allocatable coarray example
-    - allocatable coarray with an initializer
-    - compiler transforms code and adds assignment statement after calls to caf_allocate
-
-3. implicit deallocation of a coarray example
-    - local, allocatable coarray -> compiler must insert caf_deallocate call
-    - have multiple coarrays, with only one call to caf_deallocate since it takes an array of handles
-
-4. example with coarray initializer to express the idea written earlier that compiler is reponsible for this
-
-5. include somewhere in examples
-  ! integer :: example[2:4,3:*]
-  ! integer :: example[3:*]
-
-
-
-REMOVE_NOTE_TODO: FIX CAF_ALLOCATE LINK
-
-CAF_ALLOCATE implementation notes:
-  ! figure out how much space it needs, c_size_of
-  ! internally consult its own allocater
-  ! will know the memory address
-  ! associate the allocatable local slice with the local memory
-  ! once caf_allocate returns, example_local_slice will refer to the local slice of the coarray that is in the shared heap
-  ! create meta data block that contains info about a given coarray, stash away the cobounds
-
-
-UNIT TESTS TODOs
-  unit test with user code tries to directly call caf_{...} procedures, should be possible for user to do so
-  tests that check to make sure copy in copy out semantics are NOT occurring in the places that it matters
-
-flexible array member in c
-
-### Caffeine internals for coarray accesses
-  Coarray access could start with image_index, with the same coarray coindicies and team identifier argument
-  and would get back a single integer, which is the image number in that team
-  Then can use that to pass into internal query about team and global image index and can use it to bounds check.
-
-
 # Testing plan
 [tbd]
 

>From f959c81b75b25c07e767759f45e7333a2f63fcf0 Mon Sep 17 00:00:00 2001
From: Katherine Rasmussen <krasmussen at lbl.gov>
Date: Thu, 2 Nov 2023 16:35:21 -0700
Subject: [PATCH 32/33] Update parallel runtime library design doc. Rename doc
 file also.

---
 ...anRuntime.md => ParallelFortranRuntime.md} | 75 ++++++-------------
 1 file changed, 24 insertions(+), 51 deletions(-)
 rename flang/docs/{CoarrayFortranRuntime.md => ParallelFortranRuntime.md} (88%)

diff --git a/flang/docs/CoarrayFortranRuntime.md b/flang/docs/ParallelFortranRuntime.md
similarity index 88%
rename from flang/docs/CoarrayFortranRuntime.md
rename to flang/docs/ParallelFortranRuntime.md
index 4288a45000326e..42555641a543e6 100644
--- a/flang/docs/CoarrayFortranRuntime.md
+++ b/flang/docs/ParallelFortranRuntime.md
@@ -1,4 +1,4 @@
-<!--===- docs/CoarrayFortranRuntime.md
+<!--===- docs/ParallelFortranRuntime.md
 
    Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
    See https://llvm.org/LICENSE.txt for license information.
@@ -44,11 +44,11 @@ In addition to being able to support syntax related to the above features, compi
 One consequence of the statements being categorized as image control statements will be the need to restrict code movement by optimizing compilers.
 
 # Proposed solution
-  This design document proposes an interface to support the above features, named Coarray Fortran Parallel Runtime Interface. By defining a library-agnostic interface, we envision facilitating the development of alternative parallel runtime libraries that support the same interface.  One benefit of this approach is the ability to vary the communication substrate. A central aim of this document is to use a parallel runtime interface in standard Fortran syntax, which enables us to leverage Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
+  This design document proposes an interface to support the above features, named Fortran Parallel Runtime Interface. By defining a library-agnostic interface, we envision facilitating the development of alternative parallel runtime libraries that support the same interface.  One benefit of this approach is the ability to vary the communication substrate. A central aim of this document is to use a parallel runtime interface in standard Fortran syntax, which enables us to leverage Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
 
-## Coarray Fortran (CAF) Parallel Runtime Interface
+## Fortran Parallel Runtime Interface (FPRI)
 
-  The Coarray Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library. The interface is designed for portability across shared and distributed memory machines, different operating systems, and multiple architectures. The Caffeine implementation, [see below](#caffeine-lbl's-implementation-of-the-coarray-fortran-parallel-runtime-interface), of the Coarray Fortran Parallel Runtime Interface plans to support the following architectures: x86_64, PowerPC64, AArch64, with the possibility of supporting more as requested. Implementations of this interface is intended as an augmentation for the compiler's own runtime library. While the interface can support multiple implementations, we envision needing to build the runtime library as part of installing the compiler. REMOVE_NOTE_TODO: write sentence about how a module will be defined with the procedures. name of module must be: caf_pri.
+  The Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library. The interface is designed for portability across shared and distributed memory machines, different operating systems, and multiple architectures. The Caffeine implementation, [see below](#caffeine-lbl's-implementation-of-the-fortran-parallel-runtime-interface), of the Fortran Parallel Runtime Interface plans to support the following architectures: x86_64, PowerPC64, AArch64, with the possibility of supporting more as requested. Implementations of this interface is intended as an augmentation for the compiler's own runtime library. While the interface can support multiple implementations, we envision needing to build the runtime library as part of installing the compiler. REMOVE_NOTE_TODO: write sentence about how a module will be defined with the procedures. name of module must be: caf_pri.
 
 ## Delegation of tasks between the Fortran compiler and the runtime library
 
@@ -108,7 +108,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
      [`caf_event_post`](#caf_event_post), [`caf_event_wait`](#caf_event_wait), [`caf_event_query`](#caf_event_query)
 
    **Teams:**
-     [`caf_change_team`](#caf_change_team), [`caf_end_team`](#caf_end_team), [`caf_form_team`](#caf_form_team), [`caf_sync_team`](#caf_sync_team), [`caf_get_team`](#caf_get_team), [`caf_team_number`](#caf_team_number), [`caf_alias_create`](#caf_alias_create), [`caf_alias_destroy`](#caf_alias_destroy)
+     [`caf_change_team`](#caf_change_team), [`caf_end_team`](#caf_end_team), [`caf_form_team`](#caf_form_team), [`caf_sync_team`](#caf_sync_team), [`caf_get_team`](#caf_get_team), [`caf_team_number`](#caf_team_number)
 
    **Atomic Memory Operation:**
      [`caf_atomic_add`](#caf_atomic_add), [`caf_atomic_and`](#caf_atomic_and), [`caf_atomic_cas`](#caf_atomic_cas), [`caf_atomic_define`](#caf_atomic_define), [`caf_atomic_fetch_add`](#caf_atomic_fetch_add), [`caf_atomic_fetch_and`](#caf_atomic_fetch_and), [`caf_atomic_fetch_or`](#caf_atomic_fetch_or), [`caf_atomic_fetch_xor`](#caf_atomic_fetch_xor), [`caf_atomic_or`](#caf_atomic_or), [`caf_atomic_ref`](#caf_atomic_ref), [`caf_atomic_xor`](#caf_atomic_xor)
@@ -120,8 +120,8 @@ The following table outlines which tasks will be the responsibility of the Fortr
      [`caf_num_images`](#caf_num_images), [`caf_this_image`](#caf_this_image), [`caf_failed_images`](#caf_failed_images), [`caf_stopped_images`](#caf_stopped_images), [`caf_image_status`](#caf_image_status)
 
 
-### Caffeine - LBL's Implementation of the Coarray Fortran Parallel Runtime Interface
-  Implementations of some parts of the Coarray Fortran Parallel Runtime Interface exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers. Caffeine will continue to be developed in order to fully implement the proposed Coarray Fortran Parallel Runtime Interface. Caffeine uses the [GASNet-EX] exascale networking middleware but with the library-agnostic interface and the ability to vary the communication substrate, it might also be possible to develop wrappers that would support the proposed interface with [OpenCoarrays], which uses the Message Passing Interface ([MPI]).
+### Caffeine - LBL's Implementation of the Fortran Parallel Runtime Interface
+  Implementations of some parts of the Fortran Parallel Runtime Interface exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers. Caffeine will continue to be developed in order to fully implement the proposed Fortran Parallel Runtime Interface. Caffeine uses the [GASNet-EX] exascale networking middleware but with the library-agnostic interface and the ability to vary the communication substrate, it might also be possible to develop wrappers that would support the proposed interface with [OpenCoarrays], which uses the Message Passing Interface ([MPI]).
 
 
 ## Types Descriptions
@@ -178,7 +178,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
     * if an error condition occurs, an explanatory message is assigned to the argument
 
 
-  (REMOVE_NOTE_TODO: check the interfaces for these collectives, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something?)
+  (REMOVE_NOTE_TODO: check the interfaces for these collectives, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Fortran Parallel Runtime Interface. May need to add something?)
 
  #### `caf_co_broadcast`
   * **Description**:
@@ -253,10 +253,10 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
 ### Program startup and shutdown
 
-  When the compiler identifies a program that uses "Coarray Fortran" features, it will insert calls to `caf_init` and `caf_stop`. These procedures will initialize and terminate the Coarray Fortran environment.
+  When the compiler identifies a program that uses parallel Fortran features, it will insert calls to `caf_init` and `caf_stop`. These procedures will initialize and terminate the parallel Fortran environment.
 
  #### `caf_init`
-  * **Description**: This procedure will initialize the Coarray Fortran environment.
+  * **Description**: This procedure will initialize the parallel Fortran environment.
   * **Procedure Interface**:
     ```
       function caf_init() result(exit_code)
@@ -268,7 +268,7 @@ The following table outlines which tasks will be the responsibility of the Fortr
 
 REMOVE_NOTE_TODO: resolve the following two options for caf_stop
  #### `caf_stop`
-  * **Description**: This procedure may or may terminate the Coarray Fortran environment. REMOVE_NOTE_TODO_DECISION: does it terminate for sure or not? can caf_init be called twice?
+  * **Description**: This procedure may or may terminate the parallel Fortran environment. REMOVE_NOTE_TODO_DECISION: does it terminate for sure or not? can caf_init be called twice?
   * **Procedure Interface**:
     ```
       subroutine caf_stop(exit_code)
@@ -279,7 +279,7 @@ REMOVE_NOTE_TODO: resolve the following two options for caf_stop
   * **Further argument descriptions**:
     * **`exit_code`**: is .. (REMOVE_NOTE_TODO: fill in)
 
-  (REMOVE_NOTE_TODO: check the interfaces for caf_error_stop and caf_stop, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something? Change something?)
+  (REMOVE_NOTE_TODO: check the interfaces for caf_error_stop and caf_stop, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Fortran Parallel Runtime Interface. May need to add something? Change something?)
 
   * **Description**: This procedure synchronizes and stops the executing image. It provides the `stop_code` or `0` if no `stop_code` is passed, as the process exit status.
   * **Procedure Interface**:  REMOVE_NOTE_TODO_DECISION: should stop be implemented with two optional arguments with the precondition that they both shall not be passed at the same time? Or have overloaded procedures?
@@ -704,6 +704,7 @@ REMOVE_NOTE_TODO: resolve the following two options for caf_stop
   * **Further argument descriptions**:
 
 ### Teams
+  (REMOVE_NOTE_TODO: check the interface for caf_change_team and caf_form_team, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something? Change something?)
   TODO: add general points related to our team design
   The only time we create a new handle for an established coarray is in a change-team-stmt. The only times we create handles is `caf_allocate`, and a `change-team-stmt`.
 
@@ -711,23 +712,19 @@ REMOVE_NOTE_TODO: resolve the following two options for caf_stop
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_change_team(team_var, errmsg, stat)
+      subroutine caf_change_team(team)
         implicit none
-        type(caf_team_type_t), intent(in) :: team_var
-        character(len=*), intent(out), optional :: errmsg
-        integer, intent(out), optional :: stat
+        type(team_type), target, intent(in) :: team
       end subroutine
     ```
   * **Further argument descriptions**:
 
  #### `caf_end_team`
-  * **Description**: During the execution of `caf_end_team`, the runtime library will implicitly synchronize, as the standard requires and deallocate any coarrays allocated during the change team construct. Prior to `caf_end_team`, the compiler is responsible for invoking [`caf_alias_destroy`](#caf_alias_destroy) for any `caf_co_handle_t` handles created during the change team construct.
+  * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_end_team(errmsg, stat)
+      subroutine caf_end_team(fill in...)
         implicit none
-        character(len=*), intent(out), optional :: errmsg
-        integer, intent(out), optional :: stat
       end subroutine
     ```
   * **Further argument descriptions**:
@@ -736,13 +733,13 @@ REMOVE_NOTE_TODO: resolve the following two options for caf_stop
   * **Description**:
   * **Procedure Interface**:
     ```
-      subroutine caf_form_team(team_num, team_var, new_index, errmsg, stat)
+      subroutine caf_form_team(num, team, new_index, stat, errmsg)
         implicit none
-        integer(kind=c_int), intent(in) :: team_num
-        type(caf_team_type_t), intent(out) :: team_var
-        integer(kind=c_int), intent(in), optional :: new_index
-        character(len=*), intent(out), optional :: errmsg
-        integer, intent(out), optional :: stat
+        integer,          intent(in)  :: num
+        type(team_type),  intent(out) :: team
+        integer,          intent(in),    optional :: new_index
+        integer,          intent(out),   optional :: stat
+        character(len=*), intent(inout), optional :: errmsg
       end subroutine
     ```
   * **Further argument descriptions**:
@@ -777,30 +774,6 @@ REMOVE_NOTE_TODO: resolve the following two options for caf_stop
     ```
   * **Further argument descriptions**:
 
- #### `caf_alias_create`
-  * **Description**: Create a new coarray handle that may be used for coarray queries about an aliased coarray, such as in a [`caf_change_team`](#caf_change_team)
-  * **Procedure Interface**:
-    ```
-      subroutine caf_alias_create(source_handle, alias_co_lbounds, alias_co_ubounds, alias_handle)
-        implicit none
-        type(caf_co_handle_t), intent(in) :: source_handle
-        integer(kind=c_intmax_t), dimension(:), intent(in) :: alias_co_lbounds, alias_co_ubounds
-        type(caf_co_handle_t), intent(out) :: alias_handle
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_alias_destroy`
-  * **Description**: REMOVE_NOTE_TODO: Keep the alias routines in the Teams section? Move to another more general section?
-  * **Procedure Interface**:
-    ```
-      subroutine caf_alias_destroy(alias_handle)
-        implicit none
-        type(caf_co_handle_t), intent(in) :: alias_handle
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
 ### Atomic Memory Operation
 
 All atomic operations are blocking operations.
@@ -1029,7 +1002,7 @@ All atomic operations are blocking operations.
 
  #### `caf_num_images`
   * **Description**:
-  * **Procedure Interface**:   (REMOVE_NOTE_TODO: check the interface for caf_num_images, currently is same as the procedure in Caffeine, but this interface has not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something? Change something?)
+  * **Procedure Interface**:   (REMOVE_NOTE_TODO: check the interface for caf_num_images, currently is same as the procedure in Caffeine, but this interface has not yet been discussed and decided upon for the Fortran Parallel Runtime Interface. May need to add something? Change something?)
     ```
       function caf_num_images(team, team_number) result(image_count)
         implicit none

>From 1c0e485cae2c0583417ec92fc6d3c8a6fb448126 Mon Sep 17 00:00:00 2001
From: Brad Richardson <everythingfunctional at protonmail.com>
Date: Tue, 19 Dec 2023 15:55:42 -0600
Subject: [PATCH 33/33] update PRIF design doc

---
 flang/docs/ParallelFortranRuntime.md | 2740 ++++++++++++++++----------
 1 file changed, 1746 insertions(+), 994 deletions(-)

diff --git a/flang/docs/ParallelFortranRuntime.md b/flang/docs/ParallelFortranRuntime.md
index 42555641a543e6..392fefd7f927ec 100644
--- a/flang/docs/ParallelFortranRuntime.md
+++ b/flang/docs/ParallelFortranRuntime.md
@@ -1,1067 +1,1819 @@
-<!--===- docs/ParallelFortranRuntime.md
+<!--===- docs/CoarrayFortranRuntime.md
 
    Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
    See https://llvm.org/LICENSE.txt for license information.
    SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
 
 -->
-# THIS IS A WORK IN PROGRESS - DECISIONS REGARDING THE DESIGNS DISCUSSED IN THIS DOCUMENT ARE ONGOING AND MAY CHANGE AND THE DOCUMENT IS INCOMPLETE
-
-
-#TODO:
-   add note in an appropriate place about move_alloc when args are coarrays, say that it is the compilers responsibility to use the handles
-provided from runtime library to do necessary swap
-
+# Parallel Runtime Interface for Fortran (PRIF) Design Document, Revision 0.2
+
+Damian Rouson  
+Brad Richardson  
+Dan Bonachea  
+Katherine Rasmussen  
+Lawrence Berkeley National Laboratory, USA  
+lbl-flang at lbl.gov  
+
+# Abstract
+
+This design document proposes an interface to support the parallel features of
+Fortran, named the Parallel Runtime Interface for Fortran (PRIF). PRIF is a
+proposed solution in which the runtime library is responsible for coarray
+allocation, deallocation and accesses, image synchronization, atomic operations,
+events, and teams. In this interface, the compiler is responsible for
+transforming the invocation of Fortran-level parallel features into procedure
+calls to the necessary PRIF procedures. The interface is designed for
+portability across shared- and distributed-memory machines, different operating
+systems, and multiple architectures. Implementations of this interface are
+intended as an augmentation for the compiler's own runtime library. With an
+implementation-agnostic interface, alternative parallel runtime libraries may be
+developed that support the same interface. One benefit of this approach is the
+ability to vary the communication substrate. A central aim of this document is
+to define a parallel runtime interface in standard Fortran syntax, which enables
+us to leverage Fortran to succinctly express various properties of the procedure
+interfaces, including argument attributes.
+
+> **WORK IN PROGRESS** This is still a draft and may continue to evolve.
+> Feedback and questions should be directed to lbl-flang at lbl.gov.
+
+# Changelog
+
+## Rev. 0.1
+
+* Identify parallel features
+* Sketch out high-level design
+* Decide on compiler vs PRIF responsibilities
+
+## Rev. 0.2 (Dec. 2023)
+
+* Change name to PRIF
+* Fill out interfaces to all PRIF-provided procedures
+* Write descriptions, discussions and overviews of various features, arguments, etc.
 
 # Problem description
-  In order to be fully Fortran 2018 compliant, Flang needs to add support for what is commonly referred to as coarray fortran, which includes features related to parallelism. These features include the following statements, subroutines, functions, types, and kind type parameters:
-
-  * **Statements:**
-    - _Synchronization:_ `sync all`, `sync images`, `sync memory`, `sync team`
-    - _Events:_ `event post`, `event wait`
-    - _Error termination:_ `error stop`
-    - _Locks:_ `lock`, `unlock`
-    - _Failed images:_ `fail image`
-    - _Teams:_ `form team`, `change team`
-    - _Critical sections:_ `critical`, `end critical`
-  * **Intrinsic functions:**
-  `num_images`, `this_image`, `lcobound`, `ucobound`, `team_number`, `get_team`, `failed_images`, `stopped_images`,
-  `image_status`, `coshape`, `image_index`
-  * **Intrinsic subroutines:**
-    - _Collective subroutines:_ `co_sum`, `co_max`, `co_min`, `co_reduce`, `co_broadcast`
-    - _Atomic subroutines:_ `atomic_add`, `atomic_and`, `atomic_cas`, `atomic_define`,
-  `atomic_fetch_add`, `atomic_fetch_and`, `atomic_fetch_or`, `atomic_fetch_xor`, `atomic_or`, `atomic_ref`, `atomic_xor`
-    - _Other subroutines:_ `event_query`
-  * **Types, kind type parameters, and values:**
-    - _Intrinsic derived types:_ `event_type`, `team_type`, `lock_type`
-    - _Atomic kind type parameters:_ `atomic_int_kind` and `atomic_logical_kind`
-    - _Values:_ `stat_failed_image`, `stat_locked`, `stat_locked_other_image`, `stat_stopped_image`, `stat_unlocked`, `stat_unlocked_failed_image`
-
-In addition to being able to support syntax related to the above features, compilers will also need to be able to handle new execution concepts such as image control.  The image control concept affects the behaviors of some statements that were introduced in Fortran expressly for supporting parallel programming, but image control also affects the behavior of some statements that pre-existed parallism in standard Fortran:
- * **Image control statements:**
-   - _Pre-existing statements_: `allocate`, `deallocate`, `stop`, `end`, a `call` referencing `move_alloc` with coarray arguments
-   - _New statements:_ `sync all`, `sync images`, `sync memory`, `sync team`, `change team`, `end team`, `critical`, `end critical`, `event post`, `event wait`, `form team`, `lock`, `unlock`
-One consequence of the statements being categorized as image control statements will be the need to restrict code movement by optimizing compilers.
-
-# Proposed solution
-  This design document proposes an interface to support the above features, named Fortran Parallel Runtime Interface. By defining a library-agnostic interface, we envision facilitating the development of alternative parallel runtime libraries that support the same interface.  One benefit of this approach is the ability to vary the communication substrate. A central aim of this document is to use a parallel runtime interface in standard Fortran syntax, which enables us to leverage Fortran to succinctly express various properties of the procedure interfaces, including argument attributes.  See [Rouson and Bonachea (2022)] for additional details.
-
-## Fortran Parallel Runtime Interface (FPRI)
-
-  The Fortran Parallel Runtime Interface is a proposed interface in which the runtime library is responsible for coarray allocation, deallocation and accesses, image synchronization, atomic operations, events, and teams. In this interface, the compiler is responsible for transforming the source code to add Fortran procedure calls to the necessary runtime library procedures. Below you can find a table showing the delegation of tasks between the compiler and the runtime library. The interface is designed for portability across shared and distributed memory machines, different operating systems, and multiple architectures. The Caffeine implementation, [see below](#caffeine-lbl's-implementation-of-the-fortran-parallel-runtime-interface), of the Fortran Parallel Runtime Interface plans to support the following architectures: x86_64, PowerPC64, AArch64, with the possibility of supporting more as requested. Implementations of this interface is intended as an augmentation for the compiler's own runtime library. While the interface can support multiple implementations, we envision needing to build the runtime library as part of installing the compiler. REMOVE_NOTE_TODO: write sentence about how a module will be defined with the procedures. name of module must be: caf_pri.
-
-## Delegation of tasks between the Fortran compiler and the runtime library
-
-The following table outlines which tasks will be the responsibility of the Fortran compiler and which tasks will be the responsibility of the runtime library. A '✓' in the Fortran compiler column indicates that the compiler has the primary responsibility for that task, while a '✓' in the Runtime library column indicates that the compiler will invoke the runtime library to perform the task and the runtime library has primary responsibility for the task's implementation. See the [Runtime Interface Procedures](#runtime-interface-procedures) for the list of runtime library procedures that the compiler will invoke.
-
-
-| Tasks | Fortran compiler | Runtime library |
-| ----  | ----- | -------- |
-| Establish and initialize static coarrays prior to `main` - [see more](#establish-and-initialize-static-coarrays-prior-to-`main`)        |     ✓     |           |
-| Track corank of coarrays                |     ✓     |           |
-| Assigning variables of type `team-type` |     ✓     |           |
-| Track locals coarrays for implicit deallocation when exiting a scope |     ✓     |           |
-| Initialize a coarray with SOURCE= as part of allocate-stmt |     ✓     |           |
-| Provide unique identifiers for location of each `critical-construct` |     ✓     |           |
-| Provide final subroutine for all derived types with allocatable components that appear in a coarray |     ✓     |           |
-| Track coarrays for implicit deallocation at `end-team-stmt`  |           |     ✓     |
-| Allocate and deallocate a coarray       |           |     ✓     |
-| Reference a coindexed-object            |           |     ✓     |
-| Team stack abstraction                  |           |     ✓     |
-| `form-team-stmt`, `change-team-stmt`, `end-team-stmt` |           |     ✓     |
-| Intrinsic functions related to Coarray Fortran, like `num_images`, etc |           |     ✓     |
-| Atomic subroutines                      |           |     ✓     |
-| Collective subroutines                      |           |     ✓     |
-| Synchronization statements              |           |     ✓     |
-| Events              |           |     ✓     |
-| Locks              |           |     ✓     |
-| `critical-construct`             |           |     ✓     |
-
-
-## Types
-
- **Provided Fortran types:** [`caf_event_type`](#caf_event_type), [`caf_team_type`](#caf_team_type), [`caf_lock_type`](#caf_lock_type)
-
- **Runtime library specific types:** [`caf_co_handle_t`](#caf_co_handle_t), [`caf_async_handle_t`](#caf_async_handle_t), [`caf_source_loc_t`](#caf_source_loc_t)
-
-## Runtime Interface Procedures
-
-   **Collectives:**
-     [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum)
-
-   **Program startup and shutdown:**
-     [`caf_init`](#caf_init), [`caf_stop`](#caf_stop), [`caf_error_stop`](#caf_error_stop), [`caf_fail_image`](#caf_fail_image)
-
-   **Allocation and deallocation:**
-     [`caf_allocate`](#caf_allocate), [`caf_deallocate`](#caf_deallocate), [`caf_allocate_non_symmetric`](#caf_allocate_non_symmetric), [`caf_deallocate_non_symmetric`](#caf_deallocate_non_symmetric)
-
-   **Coarray Access:**
-     [`caf_put`](#caf_put), [`caf_put_raw`](#caf_put_raw), [`caf_put_raw_strided`](#caf_put_raw_strided), [`caf_get`](#caf_get), [`caf_get_raw`](#caf_get_raw), [`caf_get_raw_strided`](#caf_get_raw_strided), [`caf_get_async`](#caf_get_async), [`caf_base_pointer`](#caf_base_pointer)
-
-   **Operation Synchronization:**
-     [`caf_async_wait_for`](#caf_aync_wait_for), [`caf_async_try_for`](#caf_async_try_for), [`caf_sync_memory`](#caf_sync_memory)
-
-   **Image Synchronization:**
-     [`caf_sync_all`](#caf_sync_all), [`caf_sync_images`](#caf_sync_images), [`caf_lock`](#caf_lock), [`caf_unlock`](#caf_unlock), [`caf_critical`](#caf_critical), [`caf_end_critical`](#caf_end_critical)
-
-   **Events:**
-     [`caf_event_post`](#caf_event_post), [`caf_event_wait`](#caf_event_wait), [`caf_event_query`](#caf_event_query)
-
-   **Teams:**
-     [`caf_change_team`](#caf_change_team), [`caf_end_team`](#caf_end_team), [`caf_form_team`](#caf_form_team), [`caf_sync_team`](#caf_sync_team), [`caf_get_team`](#caf_get_team), [`caf_team_number`](#caf_team_number)
-
-   **Atomic Memory Operation:**
-     [`caf_atomic_add`](#caf_atomic_add), [`caf_atomic_and`](#caf_atomic_and), [`caf_atomic_cas`](#caf_atomic_cas), [`caf_atomic_define`](#caf_atomic_define), [`caf_atomic_fetch_add`](#caf_atomic_fetch_add), [`caf_atomic_fetch_and`](#caf_atomic_fetch_and), [`caf_atomic_fetch_or`](#caf_atomic_fetch_or), [`caf_atomic_fetch_xor`](#caf_atomic_fetch_xor), [`caf_atomic_or`](#caf_atomic_or), [`caf_atomic_ref`](#caf_atomic_ref), [`caf_atomic_xor`](#caf_atomic_xor)
-
-   **Coarray Queries:**
-     [`caf_lcobound`](#caf_lcobound), [`caf_ucobound`](#caf_ucobound), [`caf_coshape`](#caf_coshape), [`caf_image_index`](#caf_image_index)
 
-   **Image Queries:**
-     [`caf_num_images`](#caf_num_images), [`caf_this_image`](#caf_this_image), [`caf_failed_images`](#caf_failed_images), [`caf_stopped_images`](#caf_stopped_images), [`caf_image_status`](#caf_image_status)
+In order to be fully Fortran 2023 compliant, a Fortran compiler needs support for
+what is commonly referred to as Coarray Fortran, which includes features
+related to parallelism. These features include the following statements,
+subroutines, functions, types, and kind type parameters:
+
+* **Statements:**
+  - _Synchronization:_ `sync all`, `sync images`, `sync memory`, `sync team`
+  - _Events:_ `event post`, `event wait`
+  - _Notify:_ `notify wait`
+  - _Error termination:_ `error stop`
+  - _Locks:_ `lock`, `unlock`
+  - _Failed images:_ `fail image`
+  - _Teams:_ `form team`, `change team`
+  - _Critical sections:_ `critical`, `end critical`
+* **Intrinsic functions:** `num_images`, `this_image`, `lcobound`, `ucobound`,
+  `team_number`, `get_team`, `failed_images`, `stopped_images`, `image_status`,
+  `coshape`, `image_index`
+* **Intrinsic subroutines:**
+  - _Collective subroutines:_ `co_sum`, `co_max`, `co_min`, `co_reduce`, `co_broadcast`
+  - _Atomic subroutines:_ `atomic_add`, `atomic_and`, `atomic_cas`,
+    `atomic_define`, `atomic_fetch_add`, `atomic_fetch_and`, `atomic_fetch_or`,
+    `atomic_fetch_xor`, `atomic_or`, `atomic_ref`, `atomic_xor`
+  - _Other subroutines:_ `event_query`
+* **Types, kind type parameters, and values:**
+  - _Intrinsic derived types:_ `event_type`, `team_type`, `lock_type`, `notify_type`
+  - _Atomic kind type parameters:_ `atomic_int_kind` and `atomic_logical_kind`
+  - _Values:_ `stat_failed_image`, `stat_locked`, `stat_locked_other_image`,
+    `stat_stopped_image`, `stat_unlocked`, `stat_unlocked_failed_image`
+
+In addition to being able to support syntax related to the above features,
+compilers will also need to be able to handle new execution concepts such as
+image control. The image control concept affects the behaviors of some
+statements that were introduced in Fortran expressly for supporting parallel
+programming, but image control also affects the behavior of some statements
+that pre-existed parallelism in standard Fortran:
+
+* **Image control statements:**
+  - _Pre-existing statements_: `allocate`, `deallocate`, `stop`, `end`,
+    a `call` referencing `move_alloc` with coarray arguments
+  - _New statements:_ `sync all`, `sync images`, `sync memory`, `sync team`,
+    `change team`, `end team`, `critical`, `end critical`, `event post`,
+    `event wait`, `form team`, `lock`, `unlock`, `notify wait`
+
+One consequence of the statements being categorized as image control statements
+will be the need to restrict code movement by optimizing compilers.
 
+# Proposed solution
 
-### Caffeine - LBL's Implementation of the Fortran Parallel Runtime Interface
-  Implementations of some parts of the Fortran Parallel Runtime Interface exist in [Caffeine], a parallel runtime library targeting coarray Fortran compilers. Caffeine will continue to be developed in order to fully implement the proposed Fortran Parallel Runtime Interface. Caffeine uses the [GASNet-EX] exascale networking middleware but with the library-agnostic interface and the ability to vary the communication substrate, it might also be possible to develop wrappers that would support the proposed interface with [OpenCoarrays], which uses the Message Passing Interface ([MPI]).
-
+This design document proposes an interface to support the above features,
+named Parallel Runtime Interface for Fortran (PRIF). By defining an
+implementation-agnostic interface, we envision facilitating the development of
+alternative parallel runtime libraries that support the same interface. One
+benefit of this approach is the ability to vary the communication substrate.
+A central aim of this document is to define a parallel runtime interface in
+standard Fortran syntax, which enables us to leverage Fortran to succinctly
+express various properties of the procedure interfaces, including argument
+attributes. See [Rouson and Bonachea (2022)] for additional details.
+
+## Parallel Runtime Interface for Fortran (PRIF)
+
+The Parallel Runtime Interface for Fortran is a proposed interface in which the
+PRIF implementation is responsible for coarray allocation, deallocation and
+accesses, image synchronization, atomic operations, events, and teams. In this
+interface, the compiler is responsible for transforming the invocation of
+Fortran-level parallel features into procedure calls to the necessary PRIF
+procedures. The table below shows the delegation of tasks
+between the compiler and the PRIF implementation. The interface is designed for
+portability across shared- and distributed-memory machines, different operating
+systems, and multiple architectures. The Caffeine implementation,
+[see below](#caffeine---lbl's-implementation-of-the-parallel-runtime-interface-for-fortran),
+of the Parallel Runtime Interface for Fortran plans to support the following
+architectures: x86_64, PowerPC64, AArch64, with the possibility of supporting
+more as requested. Implementations of this interface are intended as an
+augmentation for the compiler's own runtime library. While the interface can
+support multiple implementations, we envision needing to build the PRIF implementation
+as part of installing the compiler. The procedures and types provided
+for direct invocation as part of the PRIF implementation shall be defined in a
+Fortran module with the name `prif`.
+
+## Delegation of tasks between the Fortran compiler and the PRIF implementation
+
+The following table outlines which tasks will be the responsibility of the
+Fortran compiler and which tasks will be the responsibility of the PRIF
+implementation. An 'X' in the "Fortran compiler" column indicates that the compiler has
+the primary responsibility for that task, while an 'X' in the "PRIF implementation"
+column indicates that the compiler will invoke the PRIF implementation to perform
+the task and the PRIF implementation has primary responsibility for the task's
+implementation. See the [Procedure descriptions](#procedure-descriptions)
+for the list of PRIF implementation procedures that the compiler will invoke.
+
+|                                                      Tasks                                                                       |  Fortran compiler  | PRIF implementation |
+|----------------------------------------------------------------------------------------------------------------------------------|--------------------|---------------------|
+| Establish and initialize static coarrays prior to `main`                                                                         |         X          |                     |
+| Track corank of coarrays                                                                                                         |         X          |                     |
+| Track local coarrays for implicit deallocation when exiting a scope                                                              |         X          |                     |
+| Initialize a coarray with `SOURCE=` as part of allocate-stmt                                                                     |         X          |                     |
+| Provide `lock_type` coarrays for `critical-construct`s                                                                           |         X          |                     |
+| Provide final subroutine for all derived types that are finalizable or that have allocatable components that appear in a coarray |         X          |                     |
+| Track variable allocation status, including resulting from use of `move_alloc`                                                   |         X          |                     |
+| Track coarrays for implicit deallocation at `end-team-stmt`                                                                      |                    |          X          |
+| Allocate and deallocate a coarray                                                                                                |                    |          X          |
+| Reference a coindexed-object                                                                                                     |                    |          X          |
+| Team stack abstraction                                                                                                           |                    |          X          |
+| `form-team-stmt`, `change-team-stmt`, `end-team-stmt`                                                                            |                    |          X          |
+| Intrinsic functions related to Coarray Fortran, like `num_images`, etc.                                                          |                    |          X          |
+| Atomic subroutines                                                                                                               |                    |          X          |
+| Collective subroutines                                                                                                           |                    |          X          |
+| Synchronization statements                                                                                                       |                    |          X          |
+| Events                                                                                                                           |                    |          X          |
+| Locks                                                                                                                            |                    |          X          |
+| `critical-construct`                                                                                                             |                    |          X          |
+
+### Caffeine - LBL's Implementation of the Parallel Runtime Interface for Fortran
+
+Implementations of some parts of the Parallel Runtime Interface for Fortran
+exist in [Caffeine], a parallel runtime library targeting coarray Fortran
+compilers. Caffeine will continue to be developed in order to fully implement
+the proposed Parallel Runtime Interface for Fortran. Caffeine uses the
+[GASNet-EX] exascale networking middleware, but with the implementation-agnostic
+interface and the ability to vary the communication substrate, it might also be
+possible to develop wrappers that would support the proposed interface with
+[OpenCoarrays], which uses the Message Passing Interface ([MPI]).
 
 ## Types Descriptions
 
- ### Fortran Intrinsic Derived types
-   These types will be defined in the runtime library and it is proposed that the compiler will use a rename to use the runtime library definitions for these types in the compiler's implementation of the `ISO_Fortran_Env` module. REMOVE_NOTE_TODO: add rationale for this
+### Fortran Intrinsic Derived types
 
- #### `caf_team_type`
-   * implementation for `team_type` from `ISO_Fortran_Env`
- #### `caf_event_type`
-   * implementation for `event_type` from `ISO_Fortran_Env`
- #### `caf_lock_type`
-   * implementation for `lock_type` from `ISO_Fortran_Env`
+These types will be defined in the PRIF implementation and it is proposed that the
+compiler will use a rename to use the PRIF implementation definitions for these
+types in the compiler's implementation of the `ISO_Fortran_Env` module. This
+enables the internal structure of each given type to be tailored as needed for
+a given implementation.
 
+#### `prif_team_type`
 
- ### Runtime library specific types
-   REMOVE_NOTE_TODO:    ADD general description of types
+* implementation for `team_type` from `ISO_Fortran_Env`
 
- #### `caf_co_handle_t`
-   * `caf_co_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler.
- #### `caf_async_handle_t`
-   * `caf_async_handle_t` will be a derived type provided by the runtime library and that will be opaque to the compiler. This type will help the runtime library track and provide asynchrony.
+#### `prif_event_type`
 
+* implementation for `event_type` from `ISO_Fortran_Env`
 
-## Procedure descriptions
+#### `prif_lock_type`
 
-### sync-stat-list
+* implementation for `lock_type` from `ISO_Fortran_Env`
 
-  * **`stat`** : TODO
-  * **`errmsg`** : There will be two optional arguments for this, one which is allocatable and one which is not. It is the compiler's responsibility to ensure the appropriate optional argument is passed. The allocatable argument will satisfy Fortan 2023 semantics.
+#### `prif_notify_type`
 
-### Collectives
+* implementation for `notify_type` from `ISO_Fortran_Env`
 
- #### Common arguments
-  * **`a`**
-    * Argument for all the collective subroutines: [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum),
-    * may be any type
-    * is always `intent(inout)`
-    * for [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum) it is assigned the value computed by the collective operation, if no error conditions occurs and if `result_image` is absent, or the executing image is the one identified by `result_image`, otherwise `a` becomes undefined
-    * for [`co_broadcast`](#co_broadcast), the value of the argument on the `source_image` is assigned to the `a` argument on all other images
-
-  * **`stat`**
-    * Argument for all the collective subroutines: [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum),
-    * is always of type `integer`
-    * is always `intent(out)`
-    * is assigned the value `0` when the execution of the procedure is succcessful
-    * is assigned a positive value when the execution of the procedure is not succcessful and the `a` argument becomes undefined
-
-  * **`errmsg`**
-    * Argument for all the collective subroutines: [`caf_co_broadcast`](#caf_co_broadcast), [`caf_co_max`](#caf_co_max), [`caf_co_min`](#caf_co_min), [`caf_co_reduce`](#caf_co_reduce), [`caf_co_sum`](#caf_co_sum),
-    * is always of type `integer`
-    * is always `intent(inout)`
-    * if an error condition does not occur, the value is unchanged
-    * if an error condition occurs, an explanatory message is assigned to the argument
-
-
-  (REMOVE_NOTE_TODO: check the interfaces for these collectives, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Fortran Parallel Runtime Interface. May need to add something?)
-
- #### `caf_co_broadcast`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-       subroutine caf_co_broadcast(a, source_image, stat, errmsg)
-         implicit none
-         type(*), intent(inout), contiguous, target :: a(..)
-         integer, optional, intent(in) :: source_image
-         integer, optional, intent(out), target :: stat
-         character(len=*), intent(inout), optional, target :: errmsg
-       end subroutine
-    ```
-  * **Further argument descriptions**:
+### Constants in `ISO_FORTRAN_ENV`
 
- #### `caf_co_max`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-       subroutine caf_co_max(a, result_image, stat, errmsg)
-         implicit none
-         type(*), intent(inout), contiguous, target :: a(..)
-         integer, intent(in), optional, target :: result_image
-         integer, intent(out), optional, target :: stat
-         character(len=*), intent(inout), optional, target :: errmsg
-       end subroutine
-    ```
-  * **Further argument descriptions**:
+These values will be defined in the PRIF implementation and it is proposed that the
+compiler will use a rename to use the PRIF implementation definitions for these
+values in the compiler's implementation of the `ISO_Fortran_Env` module.
 
- #### `caf_co_min`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-       subroutine caf_co_min(a, result_image, stat, errmsg)
-         implicit none
-         type(*), intent(inout), contiguous, target :: a(..)
-         integer, intent(in), optional, target :: result_image
-         integer, intent(out), optional, target :: stat
-         character(len=*), intent(inout), optional, target :: errmsg
-       end subroutine
-    ```
-  * **Further argument descriptions**:
+#### `PRIF_ATOMIC_INT_KIND`
 
- #### `caf_co_reduce`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-       subroutine caf_co_reduce(a, operation, result_image, stat, errmsg)
-         implicit none
-         type(*), intent(inout), contiguous, target :: a(..)
-         type(c_funptr), value :: operation
-         integer, intent(in), optional, target :: result_image
-         integer, intent(out), optional, target :: stat
-         character(len=*), intent(inout), optional, target :: errmsg
-       end subroutine
-    ```
-  * **Further argument descriptions**:
+This shall be set to an implementation-defined value from the `INTEGER_KINDS`
+array.
 
- #### `caf_co_sum`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-       subroutine caf_co_sum(a, result_image, stat, errmsg)
-         implicit none
-         type(*), intent(inout), contiguous, target :: a(..)
-         integer, intent(in), target, optional :: result_image
-         integer, intent(out), target, optional :: stat
-         character(len=*), intent(inout), target, optional :: errmsg
-       end subroutine
-    ```
-  * **Further argument descriptions**:
+#### `PRIF_ATOMIC_LOGICAL_KIND`
 
-### Program startup and shutdown
+This shall be set to an implementation-defined value from the `LOGICAL_KINDS`
+array.
 
-  When the compiler identifies a program that uses parallel Fortran features, it will insert calls to `caf_init` and `caf_stop`. These procedures will initialize and terminate the parallel Fortran environment.
+#### `PRIF_CURRENT_TEAM`
 
- #### `caf_init`
-  * **Description**: This procedure will initialize the parallel Fortran environment.
-  * **Procedure Interface**:
-    ```
-      function caf_init() result(exit_code)
-        implicit none
-        integer :: exit_code
-      end function
-    ```
-  * **Result**: `exit_code` is an `integer` whose value ... (REMOVE_NOTE_TODO: fill in)
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation and shall be distinct from the values `PRIF_INITIAL_TEAM` and
+`PRIF_PARENT_TEAM`.
 
-REMOVE_NOTE_TODO: resolve the following two options for caf_stop
- #### `caf_stop`
-  * **Description**: This procedure may or may terminate the parallel Fortran environment. REMOVE_NOTE_TODO_DECISION: does it terminate for sure or not? can caf_init be called twice?
-  * **Procedure Interface**:
-    ```
-      subroutine caf_stop(exit_code)
-        implicit none
-        integer, intent(in) :: exit_code
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-    * **`exit_code`**: is .. (REMOVE_NOTE_TODO: fill in)
+#### `PRIF_INITIAL_TEAM`
 
-  (REMOVE_NOTE_TODO: check the interfaces for caf_error_stop and caf_stop, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Fortran Parallel Runtime Interface. May need to add something? Change something?)
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation and shall be distinct from the values `PRIF_CURRENT_TEAM` and
+`PRIF_PARENT_TEAM`.
 
-  * **Description**: This procedure synchronizes and stops the executing image. It provides the `stop_code` or `0` if no `stop_code` is passed, as the process exit status.
-  * **Procedure Interface**:  REMOVE_NOTE_TODO_DECISION: should stop be implemented with two optional arguments with the precondition that they both shall not be passed at the same time? Or have overloaded procedures?
-    ```
-      subroutine caf_stop(stop_code_int, stop_code_char)
-        implicit none
-        integer, intent(in), optional :: stop_code_int
-        character(len=*), intent(in), optional :: stop_code_char
-      end subroutine
+#### `PRIF_PARENT_TEAM`
 
-    ```
-  * **Further argument descriptions**:
-    * **`stop_code_int` and `stop_code_char`**: shall not both be present in the same call (if provide only one procedure instead of overloading caf_stop)
-          * In `caf_stop`, runtime library will need to call c_exit() REMOVE_NOTE_TODO: fix this note
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation and shall be distinct from the values `PRIF_CURRENT_TEAM` and
+`PRIF_INITIAL_TEAM`.
 
- #### `caf_error_stop`
-  * **Description**: This procedure stops all images and provides the `stop_code` passed, or `0` if no `stop_code` is passed, as the process exit status
-  * **Procedure Interface**: REMOVE_NOTE_TODO_DECISION: should error_stop be implemented with two optional arguments with the precondition that they both shall not be passed at the same time? Or have overloaded procedures?
-    ```
-      subroutine caf_error_stop(stop_code_int, stop_code_char)
-        integer, intent(in), optional :: stop_code_int
-        character(len=*), intent(in), optional :: stop_code_char
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-    * **`stop_code_int` and `stop_code_char`**: shall not both be present in the same call (if provide only one procedure instead of overloading caf_error_stop)
+#### `PRIF_STAT_FAILED_IMAGE`
 
- #### `caf_fail_image`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_fail_image(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation. It shall be negative if the implementation cannot detect failed
+images and positive otherwise, and it shall be distinct from `PRIF_STAT_LOCKED`,
+`PRIF_STAT_LOCKED_OTHER_IMAGE`, `PRIF_STAT_STOPPED_IMAGE`, `PRIF_STAT_UNLOCKED` and
+`PRIF_STAT_UNLOCKED_FAILED_IMAGE`.
 
-### Allocation and deallocation
+#### `PRIF_STAT_LOCKED`
 
- #### `caf_allocate`
-  * **Description**: This procedure allocates memory for a coarray. Calls to `caf_allocate` will be inserted by the compiler when there is an explicit coarray allocation or a statically declared coarray in the source code. The runtime library will stash away the coshape information at this time in order to internally track it during the lifetime of the coarray.
-  * **Procedure Interface**:
-    ```
-      subroutine caf_allocate(lcobounds, ucobounds, lbounds, ubounds, element_length, final_func, coarray_handle, allocated_memory)
-        implicit none
-        integer(kind=c_intmax_t), dimension(:), intent(in) :: lcobounds, ucobounds
-        integer(kind=c_intmax_t), dimension(:), intent(in) :: lbounds, ubounds
-        integer(kind=c_size_t) :: element_length
-        type(c_funptr), intent(in) :: final_func
-        type(caf_co_handle_t), intent(out) :: coarray_handle
-        type(c_ptr), intent(out) :: allocated_memory
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-    * **`lcobounds` and `ucobounds`**: Shall be the lower and upper bounds of the coarray being allocated. Shall be 1d arrays with the same dimensions as each other. The product of the difference of the `lcobounds` and `ucobounds` shall equal the number of team members (REMOVE_NOTE_TODO: check wording).
-    * **`lbounds` and `ubounds`**: Shall be the the lower and upper bounds of the `local_slice`. Shall be 1d arrays with the same dimensions as each other.
-    * **`element_length`**: Length of the element
-    * **`final_func`**: Shall be a function pointer to the final function, if any, for derived types
-    * **`coarray_handle`**: Represents the distributed object of the coarray on the corresponding team. Shall return the handle created by the runtime library that the compiler shall use for future coindexed-object references of the associated coarray and for deallocation of the associated coarray.
-    * **`allocated_memory`**: Shall not be allocated on entry. Shall return a pointer to the block of allocated memory for the Fortran object.
-
- #### `caf_allocate_non_symmetric`
-  * **Description**: This procedure allocates components of coarray objects. If the object to be allocated is polymorphic, it is the compiler's responsibility to set the dynamic type and it shall not be accessed remotely.
-  * **Procedure Interface**:
-    ```
-     module subroutine caf_allocate_non_symmetric(size_in_bytes, allocated_memory)
-       implicit none
-       integer(kind=c_size_t) :: size_in_bytes
-       type(c_ptr), intent(out) :: allocated_memory
-     end subroutine
-    ```
-  * **Further argument descriptions**:
-    * **`size_in_bytes`**: The size, in bytes, of the object to be allocated.
-    * **`allocated_memory`**: Shall not be allocated on entry. Shall return a pointer to the block of allocated memory for the Fortran object.
-
- #### `caf_deallocate`
-  * **Description**: This procedure releases memory previously allocated for all of the coarrays associated with the handles in `coarray_handles`, resulting in the destruction of any associated `local_slices` received by the compiler after `caf_allocate` calls.  (REMOVE_NOTE_TODO: reword) The compiler will insert calls to this procedure when exiting a local scope where implicit deallocation of a coarray is mandated by the standard and when a coarray is explicitly deallocated through a `deallocate-stmt` in the source code.
-  * **Procedure Interface**:
-    ```
-      subroutine caf_deallocate(coarray_handles)
-        implicit none
-        type(caf_co_handle_t), dimension(:), intent(in) :: coarray_handles
-      end subroutine
-    ```
-  * **Argument descriptions**:
-    * **`coarray_handles`**: Is an array of all of the handles for the coarrays that shall be deallocated.
-
- #### `caf_deallocate_non_symmetric`
-  * **Description**: This procedure releases memory previously allocated for a component of a derived type coarray.
-  * **Procedure Interface**:
-    ```
-      subroutine caf_deallocate_non_symmetric(mem)
-        implicit none
-        type(c_ptr), intent(in) :: mem
-      end subroutine
-    ```
-  * **Argument descriptions**:
-    * **`mem`**: Pointer to the block of memory to be released.
-
-
-### Coarray Access
-
- Coarray accesses will maintain serial dependencies for the issuing image. A non-blocking get has to be started and finished in the same segment. The interface provides puts that are fence-based and gets that are split phased.
-
- #### Common arguments
-  * **`coarray_handle`**
-    * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
-    * scalar of type [`caf_co_handle_t`](#caf_co_handle_t)
-    * is a handle for the established coarray
-    * represents the distributed object of the coarray on the corresponding team
-  * **`coindices`**
-    * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
-    * 1d assumed-shape array of type `integer`
-    * (REMOVE_NOTE_TODO: fill in)
-  * **`value`**
-    * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
-    * assumed-rank array of `type(*)`
-    * is (REMOVE_NOTE_TODO: fill in)
-  * **`mold`**
-    * Argument for [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
-    * assumed-rank array of `type(*)`
-    * is (REMOVE_NOTE_TODO: fill in)
-  * **`team` and `team_number`**
-    * Argument for [`caf_put`](#caf_put), [`caf_get`](#caf_get), [`caf_get_async`](#caf_get_async)
-    * are optional arguments that specify a team
-    * shall not both be present in the same call
-
- #### `caf_put`
-  * **Description**: This procedure assigns to a coarray, when both sides of the assignment are contiguous. The compiler shall call this procedure when there is a coarray reference that is a `coindexed-object`. The compiler shall not (REMOVE_NOTE: need to?) call this procedure when the coarray reference is not a `coindexed-object`. This procedure blocks on local completion. (REMOVE_NOTE: eventually would like a caf_put that doesn't block on local completion).
-  * **Procedure Interface**:
-    ```
-     subroutine caf_put(coarray_handle, coindices, value, element_storage_size, first_element_addr, team, team_number, stat)
-       implicit none
-       type(caf_co_handle_t), intent(in) :: coarray_handle
-       integer, intent(in) :: coindices(:)
-       type(*), dimension(..), intent(in), contiguous :: value
-       integer(kind=c_size_t), intent(in) :: element_storage_size
-       type(c_ptr), intent(in) :: first_element_addr ! represents the address in the local slice corresponding to where the first element lives on the remote
-       type(team_type), optional, intent(in) :: team
-       integer, optional, intent(in) :: team_number
-       integer, optional, intent(out) :: stat
-     end subroutine caf_put
-    ```
-  * **Further argument descriptions**:
-    * **`value`**: The value that shall be assigned to (REMOVE_NOTE_TODO: fill in)
-
-
- #### `caf_put_strided`
-  * **Description**: TODO: fill in
-  * **Procedure Interface**:
-    ```
-     ! more general case
-     subroutine caf_put_strided( &
-         coarray_handle, coindices, value, element_storage_size, &
-         first_element_addr, extent, stride, team, team_number, stat)
-       implicit none
-       type(caf_co_handle_t), intent(in) :: coarray_handle
-       integer, intent(in) :: coindices(:)
-       type(*), dimension(..), intent(in) :: value
-       integer(kind=c_size_t), intent(in) :: element_storage_size
-       type(c_ptr), intent(in) :: first_element_addr ! represents the address in the local slice corresponding to where the first element lives on the remote
-       integer(kind=c_size_t) :: extent(:)
-       integer(kind=c_ptrdiff_t) :: stride(:)
-       type(team_type), optional, intent(in) :: team
-       integer, optional, intent(in) :: team_number
-       integer, optional, intent(out) :: stat
-     end subroutine
-    ```
-  * **Further argument descriptions**:
-    * **`coarray_handle`**:
-
- #### `caf_put_raw`
-  * **Description**: Assign to size number of bytes from image named by coindicies starting at remote pointer, copying from local_buffer.
-  * **Procedure Interface**:
-    ```
-      subroutine caf_put_raw(image_num, local_buffer, remote_ptr, size, stat)
-        implicit none
-        integer(kind=c_int), intent(in) :: image_num
-        type(c_ptr), intent(in) :: local_buffer
-        integer(kind=c_int64_t), intent(in) :: remote_ptr
-        integer(kind=c_size_t), intent(in) :: size
-        integer, optional, intent(out) :: stat
-      end subroutine
-    ```
-
- #### `caf_put_raw_strided`
-  * **Description**: TODO: (reword) Assign to size number of bytes from image named by coindicies starting at remote pointer, use stride to assign, copying from local_buffer.
-  * **Procedure Interface**:
-    ```
-       subroutine caf_put_raw_strided(image_num, local_buffer, remote_ptr, size, extent, remote_ptr_stride, local_buffer_stride, stat)
-        implicit none
-        integer(kind=c_int), intent(in) :: image_num
-        type(c_ptr), intent(in) :: local_buffer
-        integer(kind=c_int64_t), intent(in) :: remote_ptr
-        integer(kind=c_size_t), intent(in) :: size
-        integer(kind=c_size_t) :: extent(:)
-        integer(kind=c_ptrdiff_t) :: remote_ptr_stride(:), local_buffer_stride(:)
-        integer, optional, intent(out) :: stat
-      end subroutine
-    ```
-
- #### `caf_get`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_get(coarray_handle, coindices, mold, value, team, team_number, stat)
-        implicit none
-        type(caf_co_handle_t), intent(in) :: coarray_handle
-        integer, intent(in) :: coindices(:)
-        type(*), dimension(..), intent(in) :: mold
-        type(*), dimension(..), intent(inout) :: value
-        type(team_type), optional, intent(in) :: team
-        integer, optional, intent(in) :: team_number
-        integer, optional, intent(out) :: stat
-      end subroutine
-    ```
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation and shall be distinct from `PRIF_STAT_FAILED_IMAGE`,
+`PRIF_STAT_LOCKED_OTHER_IMAGE`, `PRIF_STAT_STOPPED_IMAGE`, `PRIF_STAT_UNLOCKED` and
+`PRIF_STAT_UNLOCKED_FAILED_IMAGE`.
 
- #### `caf_get_async`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_get_async(coarray_handle, coindices, mold, value, async_handle, team, team_number, stat)
-        implicit none
-        type(caf_co_handle_t),  intent(in) :: coarray_handle
-        integer, dimension(:),  intent(in) :: coindices
-        type(*), dimension(..), intent(in) :: mold
-        type(*), dimension(..), intent(inout) :: value
-        type(caf_async_handle_t), intent(out) :: async_handle
-        type(team_type), optional, intent(in) :: team
-        integer, optional, intent(in) :: team_number
-        integer, optional, intent(out) :: stat
-      end subroutine
-    ```
+#### `PRIF_STAT_LOCKED_OTHER_IMAGE`
 
- #### `caf_get_raw`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_get_raw(image_num, local_buffer, remote_ptr, size, stat)
-        implicit none
-        integer(kind=c_int), intent(in) :: image_num
-        type(c_ptr), intent(in) :: local_buffer
-        integer(kind=c_int64_t), intent(in) :: remote_ptr
-        integer(kind=c_size_t), intent(in) :: size
-        integer, optional, intent(out) :: stat
-      end subroutine
-    ```
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation and shall be distinct from `PRIF_STAT_FAILED_IMAGE`,
+`PRIF_STAT_LOCKED`, `PRIF_STAT_STOPPED_IMAGE`, `PRIF_STAT_UNLOCKED` and
+`PRIF_STAT_UNLOCKED_FAILED_IMAGE`.
 
- #### `caf_get_raw_strided`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_get_raw_strided(image_num, local_buffer, remote_ptr, size, extent, remote_ptr_stride, local_buffer_stride, stat)
-        implicit none
-        integer(kind=c_int), intent(in) :: image_num
-        type(c_ptr), intent(in) :: local_buffer
-        integer(kind=c_int64_t), intent(in) :: remote_ptr
-        integer(kind=c_size_t), intent(in) :: size
-        integer(kind=c_size_t) :: extent(:)
-        integer(kind=c_ptrdiff_t) :: remote_ptr_stride(:), local_buffer_stride(:)
-        integer, optional, intent(out) :: stat
-      end subroutine
-    ```
+#### `PRIF_STAT_STOPPED_IMAGE`
 
- #### `caf_base_pointer`
-  * **Description**: This procedure provides a pointer to the base of the coarray elements on a given image and may be used in conjunction with caf_get_raw
-  * **Procedure Interface**:
-    ```
-      function caf_base_pointer(coarray_handle, coindices, team, team_number, stat) result (raw_ptr_int)
-        implicit none
-        type(caf_co_handle_t), intent(in) :: coarray_handle
-        integer, intent(in) :: coindices(:)
-        integer(kind=c_int64_t) :: raw_ptr_int
-        type(team_type), optional, intent(in) :: team
-        integer, optional, intent(in) :: team_number
-        integer, optional, intent(out) :: stat
-      end function
-    ```
+This shall be a positive value of type `integer(c_int)` that is defined by the
+implementation and shall be distinct from `PRIF_STAT_FAILED_IMAGE`,
+`PRIF_STAT_LOCKED`, `PRIF_STAT_LOCKED_OTHER_IMAGE`, `PRIF_STAT_UNLOCKED` and
+`PRIF_STAT_UNLOCKED_FAILED_IMAGE`.
 
-###  Operation Synchronization
+#### `PRIF_STAT_UNLOCKED`
 
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation and shall be distinct from `PRIF_STAT_FAILED_IMAGE`,
+`PRIF_STAT_LOCKED`, `PRIF_STAT_LOCKED_OTHER_IMAGE`, `PRIF_STAT_STOPPED_IMAGE` and
+`PRIF_STAT_UNLOCKED_FAILED_IMAGE`.
 
- #### Common arguments
-  * **`async_handle`**
-    * Argument for [`caf_async_wait_for`](#caf_async_wait_for), [`caf_async_try_for`](#caf_async_try_for)
-    * scalar of type [`caf_async_handle_t`](#caf_async_handle_t)
-    * This argument is a handle used to track the asynchronous operation REMOVE_NOTE_TODO: reword and buff out this sentence
 
+#### `PRIF_STAT_UNLOCKED_FAILED_IMAGE`
 
- #### `caf_async_wait_for`
-  * **Description**: This procedure waits until (REMOVE_NOTE: asynchronous?) operation is complete and then consumes the async handle
-  * **Procedure Interface**:
-    ```
-      subroutine caf_async_wait_for(async_handle)
-        implicit none
-        type(caf_async_handle_t), intent(inout) :: async_handle
-      end subroutine
-    ```
+This shall be a value of type `integer(c_int)` that is defined by the
+implementation and shall be distinct from `PRIF_STAT_FAILED_IMAGE`,
+`PRIF_STAT_LOCKED`, `PRIF_STAT_LOCKED_OTHER_IMAGE`, `PRIF_STAT_STOPPED_IMAGE` and
+`PRIF_STAT_UNLOCKED`.
 
- #### `caf_async_try_for`
-  * **Description**: This procedure consumes the async handle if and only if the operation is complete
-  * **Procedure Interface**:
-    ```
-      subroutine caf_async_try_for(async_handle, finished)
-        implicit none
-        type(caf_async_handle_t), intent(inout) :: async_handle
-        logical, intent(out) :: finished
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-    * **`finished`**: This argument returns `true` if the asynchronous operation is complete
+### PRIF specific types
 
- #### `caf_sync_memory`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_sync_memory(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
-### Image Synchronization
-
-  Compilers need an "optimization fence",
-  inline assembly available in c/c++ to provide optimization fences
+These types are used to represent opaque "descriptors" that can be passed to and
+from the PRIF implementation between operations.
 
- #### `caf_sync_all`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_sync_all(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+#### `prif_coarray_handle`
 
- #### `caf_sync_images`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_sync_images(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+* A derived type provided by the PRIF implementation that is opaque to the
+  compiler. It represents a reference to a coarray variable and is used for
+  coarray operations.
+* It maintains some "context data" on a per-image basis, which the compiler may
+  use to support proper implementation of coarray arguments, especially with
+  respect to automatic deallocation of coarrays at an `end team` statement. This
+  is accessed/set with the provided procedures `prif_get_context_handle` and
+  `prif_set_context_handle`. PRIF does not interpret the contents of this context data in
+  any way, and it is only accessible on the current image. The context data is
+  a property of the allocated coarray object, and is thus shared between all
+  handles and aliases that refer to the same coarray allocation (i.e. those
+  created from a call to `prif_alias_create`).
 
- #### `caf_lock`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_lock(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+#### `prif_critical_type`
 
- #### `caf_unlock`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_unlock(fill in...)
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+* A derived type provided by the PRIF implementation that is opaque to the
+  compiler and is used for implementing `critical` blocks.
 
- #### `caf_critical`
-  * **Description**: For each critical construct, the compiler shall define a coarray that shall only be used to begin and end the critical block. The coarray shall be a scalar coarray of type `lock_type` and the associated coarray handle shall be passed to the procedure.
-  * **Procedure Interface**:
-    ```
-      subroutine caf_critical(critical_coarray, stat)
-        implicit none
-        type(caf_co_handle_t), intent(in) :: critical_coarray
-        integer, optional, intent(out) :: stat
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-    * **`critical_coarray`**:
+## Procedure descriptions
 
-#### `caf_end_critical`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_end_critical(critical_coarray)
-        implicit none
-        type(caf_co_handle_t), intent(in) :: critical_coarray
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-      * **`critical_id`**: shall be the same unique identifier for the critical construct that was passed to the runtime library during the corresponding call to `caf_critical`. REMOVE_NOTE_TODO: reword?
+The PRIF API provides implementations of parallel Fortran features, as specified
+in Fortran 2023. For any given `prif_*` procedure that corresponds to a Fortran
+procedure or statement of similar name, the constraints and semantics of the
+procedure, and of each of its arguments, match those of the analogous parallel
+Fortran feature and its corresponding argument, except where this document
+explicitly specifies otherwise. In particular, any required synchronization
+is performed by the PRIF implementation unless otherwise specified.
+
+Where possible, optional arguments are used for optional parts or different
+forms of statements or procedures. In some cases the different forms or presence
+of certain options change the return type or rank, and in those cases a generic
+interface with different specific procedures is used.
+
+### Common arguments
+
+* **`team`**
+  * a value of type `prif_team_type` that identifies a team that the
+    current image is a member of
+  * shall not be present with `team_number` except in a call to `prif_form_team`
+* **`team_number`**
+  * a value of type `integer(c_intmax_t)` that identifies a sibling team or,
+    in a call to `prif_form_team`, the team to become a member of
+  * shall not be present with `team` except in a call to `prif_form_team`
+* **`image_num`, any argument identifying an image**
+  * May identify the current image
+
+### Integer and Pointer Arguments
+
+There are several categories of arguments where the PRIF implementation will need
+pointers and/or integers. These fall broadly into the following categories.
+
+1. `integer(c_intptr_t)`: Anything containing a pointer representation where
+   the compiler might be expected to perform pointer arithmetic
+2. `type(c_ptr)` and `type(c_funptr)`: Anything containing a pointer to an
+   object/function where the compiler is expected only to pass it (back) to the
+   PRIF implementation
+3. `integer(c_size_t)`: Anything containing an object size, in units of bytes
+   or elements, i.e. shape, element_size, etc.
+4. `integer(c_ptrdiff_t)`: strides between elements for non-contiguous coarray
+   accesses
+5. `integer(c_int)`: Integer arguments corresponding to image index and
+   stat arguments. The most common arguments appearing in Fortran code are
+   expected to be of default integer kind, `c_int` is expected to correspond
+   to that kind, and there is no reason to expect these arguments to have
+   values that are not representable in this kind.
+6. `integer(c_intmax_t)`: Bounds, cobounds, indices, coindices, and any other
+   argument to an intrinsic procedure that accepts or returns an arbitrary
+   integer.
+
+The compiler is responsible for generating values and temporary variables as
+necessary to pass arguments of the correct type/size, and perform conversions
+when needed.
+
+#### sync-stat-list
+
+* **`stat`** : This `intent(out)` argument represents the presence and
+  type of any error that occurs. A value of zero indicates that no error occurred.
+  It is of type `integer(c_int)`, to minimize how often integer
+  conversions will be needed. If a different kind of integer is used as the
+  argument, it is the compiler's responsibility to use an intermediate variable
+  as the argument to the PRIF implementation procedure and convert its value to the
+  actual argument.
+* **`errmsg` or `errmsg_alloc`** : There are two optional arguments for this,
+  one which is allocatable and one which is not. It is the compiler's
+  responsibility to ensure the appropriate optional argument is passed.
+  If no error occurs, the definition status of the actual argument is unchanged.
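+
+The `stat` conversion described above can be sketched as follows. This is a
+minimal illustration and not part of the specification: `hypothetical_prif_call`
+is a stand-in stub rather than a PRIF procedure, and the main program plays the
+role of compiler-generated code for a user `stat` variable of kind `int64`.
+
+```fortran
+module stat_demo_m
+  use iso_c_binding, only: c_int
+  implicit none
+contains
+  ! Hypothetical stand-in for any PRIF procedure with a stat argument.
+  subroutine hypothetical_prif_call(stat)
+    integer(c_int), intent(out), optional :: stat
+    if (present(stat)) stat = 0_c_int  ! zero: no error occurred
+  end subroutine
+end module
+
+program stat_conversion
+  use iso_c_binding, only: c_int
+  use iso_fortran_env, only: int64
+  use stat_demo_m, only: hypothetical_prif_call
+  implicit none
+  integer(int64) :: user_stat  ! kind chosen in the user's source code
+  integer(c_int) :: tmp_stat   ! intermediate variable inserted by the compiler
+
+  call hypothetical_prif_call(stat=tmp_stat)
+  user_stat = int(tmp_stat, kind=int64)  ! convert back to the user's kind
+  print *, user_stat
+end program
+```
+
+The same pattern would apply in the other direction for `intent(in)` integer
+arguments whose kind in the source differs from the kind PRIF expects.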
 
-### Events
+### Program startup and shutdown
 
- #### `caf_event_post`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_event_post(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+For a program that uses parallel Fortran features, the compiler shall insert
+calls to `prif_init` and `prif_stop`. These procedures will initialize and
+terminate the parallel runtime. `prif_init` shall be called prior
+to any other calls to the PRIF implementation.
+
+#### `prif_init`
+
+* **Description**: This procedure will initialize the parallel environment.
+* **Procedure Interface**:
+```
+subroutine prif_init(exit_code)
+  integer(c_int), intent(out) :: exit_code
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`exit_code`**: a non-zero value indicates an error occurred during
+    initialization.
+
+#### `prif_stop`
+
+* **Description**: This procedure synchronizes all executing images, cleans up
+  the parallel runtime environment, and terminates the program.
+  Calls to this procedure do not return.
+* **Procedure Interface**:
+```
+subroutine prif_stop(quiet, stop_code_int, stop_code_char)
+  logical(c_bool), intent(in) :: quiet
+  integer(c_int), intent(in), optional :: stop_code_int
+  character(len=*), intent(in), optional :: stop_code_char
+end subroutine
+```
+* **Further argument descriptions**: At most one of the arguments
+  `stop_code_int` or `stop_code_char` shall be supplied.
+  * **`quiet`**: if this argument has the value `.true.`, no output of
+    signaling exceptions or stop code will be produced. Note that in the case
+    the statement does not contain this optional part, the compiler should
+    provide the value `.false.`.
+  * **`stop_code_int`**: is used as the process exit code if it is provided.
+    Otherwise, the process exit code is `0`.
+  * **`stop_code_char`**: is written to the unit identified by the named
+    constant `OUTPUT_UNIT` from the intrinsic module `ISO_FORTRAN_ENV` if
+    provided.
+
+#### `prif_error_stop`
+
+* **Description**: This procedure terminates all executing images.
+  Calls to this procedure do not return.
+* **Procedure Interface**:
+```
+subroutine prif_error_stop(quiet, stop_code_int, stop_code_char)
+  logical(c_bool), intent(in) :: quiet
+  integer(c_int), intent(in), optional :: stop_code_int
+  character(len=*), intent(in), optional :: stop_code_char
+end subroutine
+```
+* **Further argument descriptions**: At most one of the arguments
+  `stop_code_int` or `stop_code_char` shall be supplied.
+  * **`quiet`**: if this argument has the value `.true.`, no output of
+    signaling exceptions or stop code will be produced. Note that in the case
+    the statement does not contain this optional part, the compiler should
+    provide the value `.false.`.
+  * **`stop_code_int`**: is used as the process exit code if it is provided.
+    Otherwise, the process exit code is a non-zero value.
+  * **`stop_code_char`**: is written to the unit identified by the named
+    constant `ERROR_UNIT` from the intrinsic module `ISO_FORTRAN_ENV` if
+    provided.
+
+#### `prif_fail_image`
+
+* **Description**: Causes the executing image to cease participating in
+  program execution without initiating termination.
+  Calls to this procedure do not return.
+* **Procedure Interface**:
+```
+subroutine prif_fail_image()
+end subroutine
+```
 
- #### `caf_event_wait`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_event_wait(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+### Image Queries
 
- #### `caf_event_query`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_event_query(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+#### `prif_num_images`
+
+* **Description**: Query the number of images in the specified or current team.
+* **Procedure Interface**:
+```
+subroutine prif_num_images(team, team_number, image_count)
+  type(prif_team_type), intent(in), optional :: team
+  integer(c_intmax_t), intent(in), optional :: team_number
+  integer(c_int), intent(out) :: image_count
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`team` and `team_number`**: optional arguments that specify a team. They
+    shall not both be present in the same call.
+
+#### `prif_this_image`
+
+* **Description**: Determine the image index of the current image, or its
+  cosubscripts with respect to a given coarray, in a given team or the
+  current team.
+* **Procedure Interface**:
+```
+interface prif_this_image
+  subroutine prif_this_image_no_coarray(team, image_index)
+    type(prif_team_type), intent(in), optional :: team
+    integer(c_int), intent(out) :: image_index
+  end subroutine
+
+  subroutine prif_this_image_with_coarray( &
+      coarray_handle, team, cosubscripts)
+    type(prif_coarray_handle), intent(in) :: coarray_handle
+    type(prif_team_type), intent(in), optional :: team
+    integer(c_intmax_t), intent(out) :: cosubscripts(:)
+  end subroutine
+
+  subroutine prif_this_image_with_dim( &
+      coarray_handle, dim, team, cosubscript)
+    type(prif_coarray_handle), intent(in) :: coarray_handle
+    integer(c_int), intent(in) :: dim
+    type(prif_team_type), intent(in), optional :: team
+    integer(c_intmax_t), intent(out) :: cosubscript
+  end subroutine
+end interface
+```
+* **Further argument descriptions**:
+  * **`cosubscripts`**: the cosubscripts that would identify the current image
+    in the specified team when used as coindices for the specified coarray
+  * **`dim`**: identifies which of the elements from `cosubscripts` should be
+    returned as the `cosubscript` value
+  * **`cosubscript`**: the element identified by `dim` of the array
+    `cosubscripts` that would have been returned without the `dim` argument
+    present
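+
+To make the relationship between an image index and its cosubscripts concrete,
+the sketch below derives cosubscripts from cobounds using the column-major
+ordering of images that Fortran defines for coarrays. It is only an
+illustration under stated assumptions: `cosubscripts_of` is a hypothetical
+helper, not a PRIF procedure, and this document does not prescribe how a PRIF
+implementation computes the result.
+
+```fortran
+program cosubscript_sketch
+  use iso_c_binding, only: c_int, c_intmax_t
+  implicit none
+  integer(c_intmax_t) :: subs(2)
+
+  ! Coarray with codimensions [2:3, 0:*], queried for image index 4.
+  ! The last upper cobound is not needed by the computation.
+  call cosubscripts_of(4_c_int, [2_c_intmax_t, 0_c_intmax_t], &
+                       [3_c_intmax_t, 1_c_intmax_t], subs)
+  print *, subs  ! cosubscripts 3 and 1
+contains
+  ! Hypothetical helper: map a 1-based image index to cosubscripts.
+  subroutine cosubscripts_of(image_index, lcobounds, ucobounds, cosubscripts)
+    integer(c_int), intent(in) :: image_index
+    integer(c_intmax_t), intent(in) :: lcobounds(:), ucobounds(:)
+    integer(c_intmax_t), intent(out) :: cosubscripts(:)
+    integer(c_intmax_t) :: remaining, extent
+    integer :: i, corank
+
+    corank = size(lcobounds)
+    remaining = int(image_index, c_intmax_t) - 1  ! zero-based offset
+    do i = 1, corank - 1
+      extent = ucobounds(i) - lcobounds(i) + 1
+      cosubscripts(i) = lcobounds(i) + mod(remaining, extent)
+      remaining = remaining / extent
+    end do
+    ! The final codimension absorbs whatever offset is left.
+    cosubscripts(corank) = lcobounds(corank) + remaining
+  end subroutine
+end program
+```
+
+With a `dim` argument, the `cosubscript` result would correspond to
+`cosubscripts(dim)` from such a computation.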
+
+#### `prif_failed_images`
+
+* **Description**: Determine the image indices of known failed images, if any.
+* **Procedure Interface**:
+```
+subroutine prif_failed_images(team, failed_images)
+  type(prif_team_type), intent(in), optional :: team
+  integer(c_int), allocatable, intent(out) :: failed_images(:)
+end subroutine
+```
+
+#### `prif_stopped_images`
+
+* **Description**: Determine the image indices of images known to have initiated
+  normal termination, if any.
+* **Procedure Interface**:
+```
+subroutine prif_stopped_images(team, stopped_images)
+  type(prif_team_type), intent(in), optional :: team
+  integer(c_int), allocatable, intent(out) :: stopped_images(:)
+end subroutine
+```
+
+#### `prif_image_status`
+
+* **Description**: Determine the image execution state of an image.
+* **Procedure Interface**:
+```
+impure elemental subroutine prif_image_status(image, team, image_status)
+  integer(c_int), intent(in) :: image
+  type(prif_team_type), intent(in), optional :: team
+  integer(c_int), intent(out) :: image_status
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`image`**: the image index of the image in the given or current team for
+    which to return the execution status
+  * **`team`**: if provided, the team from which to identify the image
+  * **`image_status`**: has the value `PRIF_STAT_FAILED_IMAGE` if the identified
+    image has failed, `PRIF_STAT_STOPPED_IMAGE` if the identified image has initiated
+    normal termination, or zero otherwise.
+
+### Coarrays
+
+#### Common arguments
+
+* **`coarray_handle`**
+  * Argument for many of the coarray access procedures
+  * scalar of type [`prif_coarray_handle`](#prif_coarray_handle)
+  * is a handle for the established coarray
+  * represents the distributed object of the coarray in the team in which it
+    was established
+* **`coindices`**
+  * Argument for many of the coarray access procedures
+  * 1d assumed-shape array of type `integer`
+  * correspond to the coindices appearing in a coindexed object
+* **`value`** or **`local_buffer`**
+  * Argument for `put` and `get` operations
+  * assumed-rank array of `type(*)` or `type(c_ptr)`
+  * It is the value to be sent in a `put` operation, and is assigned the value
+    retrieved in the case of a `get` operation
+* **`image_num`**
+  * identifies the image to be communicated with
+  * is the image index in the initial team
+  * may be the current image
+
+#### Allocation and deallocation
+
+Calls to `prif_allocate` and `prif_deallocate` are collective operations, while
+other allocation/deallocation operations are not. Note that a call to
+`move_alloc` with coarray arguments is also a collective operation, as described
+in the section below.
+
+##### Static coarray allocation
+
+The compiler is responsible for generating code that collectively runs
+`prif_allocate` once for each static coarray and initializes them where applicable.
+
+##### `prif_allocate`
+
+* **Description**: This procedure allocates memory for a coarray. 
+  This call is collective over the current team.  Calls to
+  `prif_allocate` will be inserted by the compiler when there is an explicit
+  coarray allocation or at the beginning of a program to allocate space for
+  statically declared coarrays in the source code. The PRIF implementation will
+  store the coshape information in order to internally track it during the
+  lifetime of the coarray.
+* **Procedure Interface**:
+```
+subroutine prif_allocate( &
+    lcobounds, ucobounds, lbounds, ubounds, element_length, &
+    final_func, coarray_handle, allocated_memory, &
+    stat, errmsg, errmsg_alloc)
+  integer(kind=c_intmax_t), intent(in) :: lcobounds(:), ucobounds(:)
+  integer(kind=c_intmax_t), intent(in) :: lbounds(:), ubounds(:)
+  integer(kind=c_size_t), intent(in) :: element_length
+  type(c_funptr), intent(in) :: final_func
+  type(prif_coarray_handle), intent(out) :: coarray_handle
+  type(c_ptr), intent(out) :: allocated_memory
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`lcobounds` and `ucobounds`**: Shall be the lower and upper bounds of the
+    codimensions of the coarray being allocated. Shall be 1d arrays with the
+    same dimensions as each other. The cobounds shall be sufficient to have a
+    unique index for every image in the current team.
+    I.e. `product(coshape(coarray)) >= num_images`.
+  * **`lbounds` and `ubounds`**: Shall be the lower and upper bounds of the
+    local portion of the array. Shall be 1d arrays with the same dimensions as
+    each other.
+  * **`element_length`**: size of a single element of the array in bytes
+  * **`final_func`**: Shall be a function pointer to the final subroutine, if any,
+    for derived types. It is the responsibility of the compiler to generate
+    such a subroutine if necessary to clean up allocatable components, typically
+    with calls to `prif_deallocate_non_symmetric`. It may also be necessary to
+    modify the allocation status of the coarray variable, especially in the case
+    that it was allocated through a dummy argument. Its interface should be
+    equivalent to the following Fortran interface
+    ```
+    subroutine coarray_cleanup(handle, stat, errmsg) bind(C)
+      type(prif_coarray_handle), intent(in) :: handle
+      integer(c_int), intent(out) :: stat
+      character(len=:), intent(out), allocatable :: errmsg
+    end subroutine
+    ```
+    or to the following equivalent C prototype
+    ```
+    void coarray_cleanup(
+        prif_handle_t* handle, int* stat, CFI_cdesc_t* errmsg);
+    ```
+    The coarray handle can then be interrogated to determine the memory address
+    and size of the data in order to orchestrate calling any necessary final
+    subroutines or deallocation of any allocatable components, or the context
+    data to orchestrate modifying the allocation status of a local variable
+    portion of the coarray. It will be invoked once on each image, upon
+    deallocation of the coarray.
+  * **`coarray_handle`**: Represents the distributed object of the coarray on
+    the corresponding team. The handle is created by the PRIF implementation and the
+    compiler uses it for subsequent coindexed-object references of the
+    associated coarray and for deallocation of the associated coarray.
+  * **`allocated_memory`**: A pointer to the local block of allocated memory for
+    the Fortran object. The compiler is responsible for associating the local
+    Fortran object with this memory, and initializing it if necessary.
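+
+The cobound requirement above can be checked with ordinary integer arithmetic.
+The following is a minimal sketch, not part of PRIF: the program name, the
+example cobounds, and `num_images_assumed` (a stand-in for the number of images
+in the current team) are all hypothetical values chosen for illustration.
+
+```
+program check_cobounds
+  use, intrinsic :: iso_c_binding, only: c_intmax_t
+  implicit none
+  ! Hypothetical cobounds for a coarray declared with codimensions [1:2, 0:3]
+  integer(c_intmax_t), parameter :: lcobounds(2) = [1_c_intmax_t, 0_c_intmax_t]
+  integer(c_intmax_t), parameter :: ucobounds(2) = [2_c_intmax_t, 3_c_intmax_t]
+  ! Assumed team size; a real caller would obtain it from the runtime
+  integer(c_intmax_t), parameter :: num_images_assumed = 8
+  integer(c_intmax_t) :: cosizes(2)
+
+  ! Extent of each codimension: upper cobound - lower cobound + 1
+  cosizes = ucobounds - lcobounds + 1
+  if (product(cosizes) >= num_images_assumed) then
+    print *, "cobounds provide a unique index for every image"
+  else
+    print *, "cobounds are too small for the team"
+  end if
+end program
+```
+
+With these values the codimension extents are 2 and 4, so the cobounds cover
+exactly eight images.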
+
+##### `prif_allocate_non_symmetric`
+
+* **Description**: This procedure is used to allocate components of coarray
+  objects.
+* **Procedure Interface**:
+```
+subroutine prif_allocate_non_symmetric( &
+    size_in_bytes, allocated_memory, stat, errmsg, errmsg_alloc)
+  integer(kind=c_size_t), intent(in) :: size_in_bytes
+  type(c_ptr), intent(out) :: allocated_memory
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`size_in_bytes`**: The size, in bytes, of the object to be allocated.
+  * **`allocated_memory`**: A pointer to the block of allocated memory for the
+    Fortran object. The compiler is responsible for associating the Fortran
+    object with this memory, and initializing it if necessary.
+
+##### `prif_deallocate`
+
+* **Description**: This procedure releases memory previously allocated for all
+  of the coarrays associated with the handles in `coarray_handles`. This means
+  that any local objects associated with this memory become invalid. The
+  compiler will insert calls to this procedure when exiting a local scope where
+  implicit deallocation of a coarray is mandated by the standard and when a
+  coarray is explicitly deallocated through a `deallocate-stmt` in the source
+  code.
+  This call is collective over the current team, and the provided list of handles
+  must denote corresponding coarrays (in the same order on every image) that
+  were allocated by the current team using `prif_allocate` and not yet deallocated.
+  It will start with a synchronization over the current team, and then the final subroutine
+  for each coarray (if any) will be called. A synchronization will also occur
+  before control is returned from this procedure, after all deallocation has been
+  completed.
+* **Procedure Interface**:
+```
+subroutine prif_deallocate( &
+    coarray_handles, stat, errmsg, errmsg_alloc)
+  type(prif_coarray_handle), intent(in) :: coarray_handles(:)
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`coarray_handles`**: Is an array of all of the handles for the coarrays
+    that shall be deallocated.
+
+##### `prif_deallocate_non_symmetric`
+
+* **Description**: This procedure releases memory previously allocated by a call
+  to `prif_allocate_non_symmetric`.
+* **Procedure Interface**:
+```
+subroutine prif_deallocate_non_symmetric( &
+    mem, stat, errmsg, errmsg_alloc)
+  type(c_ptr), intent(in) :: mem
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`mem`**: Pointer to the block of memory to be released.
+
+##### `prif_alias_create`
+
+* **Description**: Create a new coarray handle for an existing coarray, such as
+  in a [`prif_change_team`](#prif_change_team) or to pass to a coarray dummy
+  argument (especially in the case that the cobounds are different)
+* **Procedure Interface**:
+```
+subroutine prif_alias_create( &
+    source_handle, alias_co_lbounds, alias_co_ubounds, alias_handle)
+  type(prif_coarray_handle), intent(in) :: source_handle
+  integer(c_intmax_t), intent(in) :: alias_co_lbounds(:)
+  integer(c_intmax_t), intent(in) :: alias_co_ubounds(:)
+  type(prif_coarray_handle), intent(out) :: alias_handle
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`source_handle`**: a handle (which may itself be an alias) to the existing
+    coarray for which an alias is to be created
+  * **`alias_co_lbounds` and `alias_co_ubounds`**: the cobounds to be used for
+    the new alias
+  * **`alias_handle`**: a new alias to the existing coarray
+
+##### `prif_alias_destroy`
+
+* **Description**: Delete an alias to a coarray
+* **Procedure Interface**:
+```
+subroutine prif_alias_destroy(alias_handle)
+  type(prif_coarray_handle), intent(in) :: alias_handle
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`alias_handle`**: the alias to be destroyed
+
+##### `move_alloc`
+
+This is not provided by PRIF, but should be easily implemented through
+manipulation of `prif_coarray_handle`s. Note that calls to
+`prif_set_context_data` will likely be required as part of the operation. Note
+that `move_alloc` with coarray arguments is an image control statement that
+requires synchronization, so the compiler should likely insert call(s) to
+`prif_sync_all` as part of the implementation.
+
+#### Queries
+
+##### `prif_set_context_data`
+
+* **Description**: This procedure stores a `c_ptr` associated with a coarray
+  handle for future retrieval. A typical usage would be to store a reference
+  to the actual variable whose allocation status must be changed in the case
+  that the coarray is deallocated.
+* **Procedure Interface**:
+```
+subroutine prif_set_context_data(coarray_handle, context_data)
+  type(prif_coarray_handle), intent(in) :: coarray_handle
+  type(c_ptr), intent(in) :: context_data
+end subroutine
+```
+
+##### `prif_get_context_data`
+
+* **Description**: This procedure returns the `c_ptr` provided in the most
+  recent call to [`prif_set_context_data`](#prif_set_context_data) with the
+  same coarray handle
+* **Procedure Interface**:
+```
+subroutine prif_get_context_data(coarray_handle, context_data)
+  type(prif_coarray_handle), intent(in) :: coarray_handle
+  type(c_ptr), intent(out) :: context_data
+end subroutine
+```
+
+##### `prif_base_pointer`
+
+* **Description**: This procedure returns a C pointer value referencing the base of the
+  coarray elements on a given image and may be used in conjunction with
+  various communication operations. Pointer arithmetic operations may be
+  performed with the value and the results provided as input to the
+  `get/put_*raw` or atomic procedures (none of which are guaranteed to perform
+  validity checks, e.g., to detect out-of-bounds access violations).
+  It is not valid to dereference the produced pointer value
+  or the result of any operations performed with it on any image except for
+  the identified image.
+* **Procedure Interface**:
+```
+subroutine prif_base_pointer( &
+    coarray_handle, coindices, team, team_number, ptr)
+  type(prif_coarray_handle), intent(in) :: coarray_handle
+  integer(c_intmax_t), intent(in) :: coindices(:)
+  type(prif_team_type), optional, intent(in) :: team
+  integer(c_intmax_t), optional, intent(in) :: team_number
+  integer(c_intptr_t), intent(out) :: ptr
+end subroutine
+```
+
+##### `prif_local_data_size`
+
+* **Description**: This procedure returns the size, in bytes, of the coarray data
+  associated with the current image. This will be equal to the following expression
+  of the arguments provided to [`prif_allocate`](#prif_allocate) at the time that
+  the coarray was allocated: `element_length * product(ubounds-lbounds+1)`
+* **Procedure Interface**:
+```
+subroutine prif_local_data_size(coarray_handle, data_size)
+  type(prif_coarray_handle), intent(in) :: coarray_handle
+  integer(c_size_t), intent(out) :: data_size
+end subroutine
+```
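+
+As an illustration of the size expression above, the sketch below evaluates it
+for a hypothetical coarray of 8-byte elements with local bounds `(1:10, -1:1)`.
+The bounds and element length are assumptions made for the example, and the
+program does not call any PRIF procedure.
+
+```
+program local_data_size_sketch
+  use, intrinsic :: iso_c_binding, only: c_intmax_t, c_size_t
+  implicit none
+  ! Hypothetical values as they would have been passed to prif_allocate
+  integer(c_intmax_t), parameter :: lbounds(2) = [1_c_intmax_t, -1_c_intmax_t]
+  integer(c_intmax_t), parameter :: ubounds(2) = [10_c_intmax_t, 1_c_intmax_t]
+  integer(c_size_t), parameter :: element_length = 8
+  integer(c_size_t) :: data_size
+
+  ! element_length * product(ubounds-lbounds+1): 8 bytes * (10 * 3) elements
+  data_size = element_length * int(product(ubounds - lbounds + 1), c_size_t)
+  print *, "local data size in bytes:", data_size
+end program
+```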
+
+##### `prif_lcobound`
+
+* **Description**: returns the lower cobound(s) of the coarray referred to by
+  the `coarray_handle`. It is the compiler's responsibility to convert to a
+  different kind if the `kind` argument appears.
+* **Procedure Interface**:
+```
+interface prif_lcobound
+  subroutine prif_lcobound_with_dim(coarray_handle, dim, lcobound)
+    type(prif_coarray_handle), intent(in) :: coarray_handle
+    integer(c_int), intent(in) :: dim
+    integer(c_intmax_t), intent(out) :: lcobound
+  end subroutine
+  subroutine prif_lcobound_no_dim(coarray_handle, lcobounds)
+    type(prif_coarray_handle), intent(in) :: coarray_handle
+    integer(c_intmax_t), intent(out) :: lcobounds(:)
+  end subroutine
+end interface
+```
+* **Further argument descriptions**:
+  * **`dim`**: which codimension of the coarray to report the lower cobound of
+  * **`lcobound`**: the lower cobound of the given dimension
+  * **`lcobounds`**: an array of the size of the corank of the coarray, returns
+    the lower cobounds of the given coarray
+
+##### `prif_ucobound`
+
+* **Description**: returns the upper cobound(s) of the coarray referred to by
+  the `coarray_handle`. It is the compiler's responsibility to convert to a
+  different kind if the `kind` argument appears.
+* **Procedure Interface**:
+```
+interface prif_ucobound
+  subroutine prif_ucobound_with_dim(coarray_handle, dim, ucobound)
+    type(prif_coarray_handle), intent(in) :: coarray_handle
+    integer(c_int), intent(in) :: dim
+    integer(c_intmax_t), intent(out) :: ucobound
+  end subroutine
+  subroutine prif_ucobound_no_dim(coarray_handle, ucobounds)
+    type(prif_coarray_handle), intent(in) :: coarray_handle
+    integer(c_intmax_t), intent(out) :: ucobounds(:)
+  end subroutine
+end interface
+```
+* **Further argument descriptions**:
+  * **`dim`**: which codimension of the coarray to report the upper cobound of
+  * **`ucobound`**: the upper cobound of the given dimension
+  * **`ucobounds`**: an array of the size of the corank of the coarray, returns
+    the upper cobounds of the given coarray
+
+##### `prif_coshape`
+
+* **Description**: returns the sizes of the codimensions of the coarray referred
+  to by the `coarray_handle`.
+* **Procedure Interface**:
+```
+subroutine prif_coshape(coarray_handle, sizes)
+  type(prif_coarray_handle), intent(in) :: coarray_handle
+  integer(c_size_t), intent(out) :: sizes(:)
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`sizes`**: an array of the size of the corank of the coarray, returns the
+    difference between the upper and lower cobounds + 1
+
+##### `prif_image_index`
+
+* **Description**: returns the index of the image identified by the coindices
+  provided in the `sub` argument with the given coarray on the identified team
+  or the current team if no team is identified
+* **Procedure Interface**:
+```
+subroutine prif_image_index( &
+    coarray_handle, sub, team, team_number, image_index)
+  type(prif_coarray_handle), intent(in) :: coarray_handle
+  integer(c_intmax_t), intent(in) :: sub(:)
+  type(prif_team_type), intent(in), optional :: team
+  integer(c_int), intent(in), optional :: team_number
+  integer(c_int), intent(out) :: image_index
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`team` and `team_number`**: optional arguments that specify a team. They
+    shall not both be present in the same call.
+  * **`sub`**: A list of integers that identify a specific image in the
+    identified or current team when interpreted as coindices for the provided
+    coarray.
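+
+The mapping from coindices to an image index can be sketched as follows. This
+assumes the usual Fortran column-major ordering of cosubscripts; the module and
+function names are hypothetical, and the sketch omits the validity checks and
+team handling a real implementation would need.
+
+```
+module image_index_sketch
+  use, intrinsic :: iso_c_binding, only: c_int, c_intmax_t
+  implicit none
+contains
+  ! Hypothetical helper: linearize coindices in column-major order.
+  pure function index_from_coindices(sub, lcobounds, ucobounds) result(idx)
+    integer(c_intmax_t), intent(in) :: sub(:), lcobounds(:), ucobounds(:)
+    integer(c_int) :: idx
+    integer(c_intmax_t) :: offset, multiplier
+    integer :: i
+
+    offset = 0
+    multiplier = 1
+    do i = 1, size(sub)
+      ! Zero-based position in codimension i, scaled by the extents before it
+      offset = offset + (sub(i) - lcobounds(i)) * multiplier
+      multiplier = multiplier * (ucobounds(i) - lcobounds(i) + 1)
+    end do
+    idx = int(offset + 1, c_int)  ! image indices start at 1
+  end function
+end module
+
+program image_index_demo
+  use image_index_sketch
+  implicit none
+  ! Coindices [2,2] of a coarray with cobounds [1:2, 1:3] identify image 4
+  print *, index_from_coindices([2_c_intmax_t, 2_c_intmax_t], &
+                                [1_c_intmax_t, 1_c_intmax_t], &
+                                [2_c_intmax_t, 3_c_intmax_t])
+end program
+```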
+
+#### Access
+
+Coarray accesses will maintain serial dependencies for the issuing image. Any
+data access ordering between images is defined only with respect to ordered
+segments. Note that for put operations, "local completion" means that the provided
+arguments are no longer needed (e.g. their memory can be freed once the procedure
+has returned).
+
+##### Common Arguments
+
+* **`notify_ptr`**: optional pointer on the identified image to the notify
+  variable that should be updated on completion of the put operation. The
+  referenced variable shall be of type `prif_notify_type`. If this
+  argument is not present, no notification is performed.
+
+##### `prif_put`
+
+* **Description**: This procedure assigns to the elements of a coarray, when the elements to be
+  assigned to are contiguous in linear memory on both sides.
+  The compiler can use this to implement assignment to a `coindexed-object`.
+  It need not call this procedure when the coarray reference is not a `coindexed-object`.
+  This procedure blocks on local completion.
+* **Procedure Interface**:
+```
+subroutine prif_put( &
+    coarray_handle, coindices, value, first_element_addr, &
+    team, team_number, notify_ptr, stat, errmsg, errmsg_alloc)
+  type(prif_coarray_handle), intent(in) :: coarray_handle
+  integer(c_intmax_t), intent(in) :: coindices(:)
+  type(*), dimension(..), intent(in), contiguous :: value
+  type(c_ptr), intent(in) :: first_element_addr
+  type(prif_team_type), optional, intent(in) :: team
+  integer(c_intmax_t), optional, intent(in) :: team_number
+  integer(c_intptr_t), optional, intent(in) :: notify_ptr
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`first_element_addr`**: The address of the local data in the coarray
+    corresponding to the first element to be assigned to on the identified image
+
+##### `prif_put_raw`
+
+* **Description**: Assign to `size` number of bytes on the given image, starting at
+  `remote_ptr`, copying from `local_buffer`.
+* **Procedure Interface**:
+```
+subroutine prif_put_raw( &
+    image_num, local_buffer, remote_ptr, notify_ptr, size, &
+    stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(in) :: image_num
+  type(c_ptr), intent(in) :: local_buffer
+  integer(c_intptr_t), intent(in) :: remote_ptr
+  integer(c_intptr_t), optional, intent(in) :: notify_ptr
+  integer(c_size_t), intent(in) :: size
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`image_num`**: identifies the image to be written to in the initial team
+  * **`local_buffer`**: pointer to the contiguous local data which should be copied to the
+    identified image. 
+  * **`remote_ptr`**: pointer to where on the identified image the data should be written
+  * **`size`**: how much data is to be transferred in bytes
+
+##### `prif_put_raw_strided`
+
+* **Description**: Assign to memory on the given image, starting at
+  `remote_ptr`, copying from `local_buffer`, progressing through `local_buffer`
+  in `local_buffer_stride` increments and through remote memory in `remote_ptr_stride`
+  increments, transferring `extent` number of elements in each dimension.
+* **Procedure Interface**:
+```
+subroutine prif_put_raw_strided( &
+    image_num, local_buffer, remote_ptr, element_size, extent, &
+    remote_ptr_stride, local_buffer_stride, notify_ptr, &
+    stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(in) :: image_num
+  type(c_ptr), intent(in) :: local_buffer
+  integer(c_intptr_t), intent(in) :: remote_ptr
+  integer(c_size_t), intent(in) :: element_size
+  integer(c_size_t), intent(in) :: extent(:)
+  integer(c_ptrdiff_t), intent(in) :: remote_ptr_stride(:)
+  integer(c_ptrdiff_t), intent(in) :: local_buffer_stride(:)
+  integer(c_intptr_t), optional, intent(in) :: notify_ptr
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * `remote_ptr_stride`, `local_buffer_stride` and `extent` must each have size
+    equal to the rank of the referenced coarray.
+  * **`image_num`**: identifies the image to be written to in the initial team
+  * **`local_buffer`**: pointer to the local data which should be copied to the
+    identified image.
+  * **`remote_ptr`**: pointer to where on the identified image the data should be written
+  * **`element_size`**: The size of each element in bytes
+  * **`extent`**: How many elements in each dimension should be transferred
+  * **`remote_ptr_stride`**: The stride (in units of bytes) between elements in
+    each dimension on the specified image. Each component of stride may
+    independently be positive or negative, but (together with `extent`) must
+    specify a region of distinct (non-overlapping) elements. The striding starts
+    at the `remote_ptr`.
+  * **`local_buffer_stride`**: The stride (in units of bytes) between elements in
+    each dimension in the local buffer. Each component of stride may independently be positive or
+    negative, but (together with `extent`) must specify a region of distinct
+    (non-overlapping) elements. The striding starts at the `local_buffer`.
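+
+To make the stride arguments concrete, the sketch below lists the byte offsets
+visited for one hypothetical transfer: the section `a(1:4:2, 1:3)` of a 4-byte
+element array declared `a(4,3)`. The array shape and element size are
+assumptions for the example; the program only computes offsets and does not
+perform any communication.
+
+```
+program strided_offsets_sketch
+  use, intrinsic :: iso_c_binding, only: c_size_t, c_ptrdiff_t
+  implicit none
+  ! Two elements along dimension 1 and three along dimension 2
+  integer(c_size_t), parameter :: extent(2) = [2_c_size_t, 3_c_size_t]
+  ! Byte strides: every other 4-byte element, then whole columns of 4 elements
+  integer(c_ptrdiff_t), parameter :: stride(2) = [8_c_ptrdiff_t, 16_c_ptrdiff_t]
+  integer(c_ptrdiff_t) :: i, j
+
+  ! Offsets are relative to the starting pointer; the first dimension varies fastest
+  do j = 0, int(extent(2), c_ptrdiff_t) - 1
+    do i = 0, int(extent(1), c_ptrdiff_t) - 1
+      print *, i * stride(1) + j * stride(2)
+    end do
+  end do
+end program
+```
+
+The six offsets are 0, 8, 16, 24, 32 and 40 bytes, which are distinct, as the
+non-overlap requirement demands.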
+
+##### `prif_get`
+
+* **Description**: This procedure fetches data in a coarray from a specified image,
+  when the elements are contiguous in linear memory on both sides.
+  The compiler can use this to implement reads from a `coindexed-object`.
+  It need not call this procedure when the coarray reference is not a `coindexed-object`.
+  This procedure blocks until the requested data has been successfully assigned
+  to the `value` argument.
+* **Procedure Interface**:
+```
+subroutine prif_get( &
+    coarray_handle, coindices, first_element_addr, value, team, team_number, &
+    stat, errmsg, errmsg_alloc)
+  type(prif_coarray_handle), intent(in) :: coarray_handle
+  integer(c_intmax_t), intent(in) :: coindices(:)
+  type(c_ptr), intent(in) :: first_element_addr
+  type(*), dimension(..), intent(out), contiguous :: value
+  type(prif_team_type), optional, intent(in) :: team
+  integer(c_intmax_t), optional, intent(in) :: team_number
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`first_element_addr`**: The address of the local data in the coarray
+    corresponding to the first element to be fetched from the identified image
+
+##### `prif_get_raw`
+
+* **Description**: Fetch `size` number of contiguous bytes from the given image, starting at
+  `remote_ptr`, copying into `local_buffer`.
+* **Procedure Interface**:
+```
+subroutine prif_get_raw( &
+    image_num, local_buffer, remote_ptr, size, &
+    stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(in) :: image_num
+  type(c_ptr), intent(in) :: local_buffer
+  integer(c_intptr_t), intent(in) :: remote_ptr
+  integer(c_size_t), intent(in) :: size
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`image_num`**: identifies the image from which the data should be fetched
+    in the initial team
+  * **`local_buffer`**: pointer to the contiguous local memory into which the retrieved
+    data should be written
+  * **`remote_ptr`**: pointer to where on the identified image the data begins
+  * **`size`**: how much data is to be transferred in bytes
+
+##### `prif_get_raw_strided`
+
+* **Description**: Copy from the given image, starting at `remote_ptr`, writing
+  into `local_buffer`, progressing through `local_buffer` in `local_buffer_stride`
+  increments and through remote memory in `remote_ptr_stride`
+  increments, transferring `extent` number of elements in each dimension.
+* **Procedure Interface**:
+```
+subroutine prif_get_raw_strided( &
+    image_num, local_buffer, remote_ptr, element_size, extent, &
+    remote_ptr_stride, local_buffer_stride, &
+    stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(in) :: image_num
+  type(c_ptr), intent(in) :: local_buffer
+  integer(c_intptr_t), intent(in) :: remote_ptr
+  integer(c_size_t), intent(in) :: element_size
+  integer(c_size_t), intent(in) :: extent(:)
+  integer(c_ptrdiff_t), intent(in) :: remote_ptr_stride(:)
+  integer(c_ptrdiff_t), intent(in) :: local_buffer_stride(:)
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * `remote_ptr_stride`, `local_buffer_stride` and `extent` must each have size
+    equal to the rank of the referenced coarray.
+  * **`image_num`**: identifies the image from which the data should be fetched
+    in the initial team
+  * **`local_buffer`**: pointer to the local memory into which the retrieved
+    data should be written
+  * **`remote_ptr`**: pointer to where on the identified image the data begins
+  * **`element_size`**: The size of each element in bytes
+  * **`extent`**: How many elements in each dimension should be transferred
+  * **`remote_ptr_stride`**: The stride (in units of bytes) between elements in
+    each dimension on the specified image. Each component of stride may
+    independently be positive or negative, but (together with `extent`) must
+    specify a region of distinct (non-overlapping) elements. The striding starts
+    at the `remote_ptr`.
+  * **`local_buffer_stride`**: The stride (in units of bytes) between elements in
+    each dimension in the local buffer. Each component of stride may independently be positive or
+    negative, but (together with `extent`) must specify a region of distinct
+    (non-overlapping) elements. The striding starts at the `local_buffer`.
+
+### Synchronization
+
+#### `prif_sync_memory`
+
+* **Description**: Ends one segment and begins another, waiting on pending
+  communication operations with other images.
+* **Procedure Interface**:
+```
+subroutine prif_sync_memory(stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+
+#### `prif_sync_all`
+
+* **Description**: Performs a synchronization of all images in the current team.
+* **Procedure Interface**:
+```
+subroutine prif_sync_all(stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+
+#### `prif_sync_images`
+
+* **Description**: Performs a synchronization with the listed images.
+* **Procedure Interface**:
+```
+subroutine prif_sync_images(image_set, stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(in), optional :: image_set(:)
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`image_set`**: The image indices of the images in the current team with
+    which to synchronize. Note, if a scalar appears, the compiler should pass
+    its value as a size 1 array, and if an asterisk (`*`) appears, the compiler
+    should not pass `image_set`.
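+
+The three forms of `image_set` described above might be lowered as in the
+following sketch. `stub_sync_images` is a hypothetical stand-in that only
+mirrors the `image_set` argument of `prif_sync_images` and prints what it
+receives; it performs no synchronization and is not part of PRIF.
+
+```
+module sync_images_stub
+  use, intrinsic :: iso_c_binding, only: c_int
+  implicit none
+contains
+  ! Hypothetical stand-in: reports how image_set was passed.
+  subroutine stub_sync_images(image_set)
+    integer(c_int), intent(in), optional :: image_set(:)
+    if (present(image_set)) then
+      print *, "synchronize with images:", image_set
+    else
+      print *, "synchronize with all images in the current team"
+    end if
+  end subroutine
+end module
+
+program lower_sync_images
+  use sync_images_stub
+  implicit none
+  call stub_sync_images([2_c_int, 3_c_int])  ! sync images ([2, 3])
+  call stub_sync_images([4_c_int])           ! sync images (4): scalar as size 1 array
+  call stub_sync_images()                    ! sync images (*): argument not passed
+end program
+```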
+
+#### `prif_sync_team`
+
+* **Description**: Performs a synchronization with the images of the identified
+  team.
+* **Procedure Interface**:
+```
+subroutine prif_sync_team(team, stat, errmsg, errmsg_alloc)
+  type(prif_team_type), intent(in) :: team
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`team`**: Identifies the team to synchronize.
+
+#### `prif_lock`
+
+* **Description**: If the `acquired_lock` argument is not present, waits until
+  the identified lock variable is unlocked and then locks it. If `acquired_lock`
+  is present, sets it to `.false.` if the identified lock variable was already
+  locked, or otherwise locks the identified lock variable and sets
+  `acquired_lock` to `.true.`. Note that an error condition occurs if the
+  identified lock variable was already locked by the current image.
+* **Procedure Interface**:
+```
+subroutine prif_lock( &
+    image_num, lock_var_ptr, acquired_lock, &
+    stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(in) :: image_num
+  integer(c_intptr_t), intent(in) :: lock_var_ptr
+  logical(c_bool), intent(out), optional :: acquired_lock
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`image_num`**: the image index in the initial team for the lock variable
+    to be locked
+  * **`lock_var_ptr`**: a pointer to the base address of the lock variable to
+    be locked on the identified image, typically obtained from a call to
+    `prif_base_pointer`
+  * **`acquired_lock`**: if present, is set to `.true.` if the lock was acquired
+    by the current image in this call, or to `.false.` otherwise
+
+#### `prif_unlock`
+
+* **Description**: Unlocks the identified lock variable. Note that if the
+  identified lock variable was not locked by the current image an error
+  condition occurs.
+* **Procedure Interface**:
+```
+subroutine prif_unlock( &
+    image_num, lock_var_ptr, stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(in) :: image_num
+  integer(c_intptr_t), intent(in) :: lock_var_ptr
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`image_num`**: the image index in the initial team for the lock variable
+    to be unlocked
+  * **`lock_var_ptr`**: a pointer to the base address of the lock variable to
+    be unlocked on the identified image, typically obtained from a call to
+    `prif_base_pointer`
+
+#### `prif_critical`
+
+* **Description**: The compiler shall define a coarray, and establish (allocate)
+  it in the initial team, that shall only be used to begin and end the critical
+  block. An efficient implementation will likely define one for each critical
+  block. The coarray shall be a scalar coarray of type `prif_critical_type` and
+  the associated coarray handle shall be passed to this procedure. This
+  procedure waits until any other image which has executed this procedure with
+  a corresponding coarray handle has subsequently executed `prif_end_critical`
+  with the same coarray handle an identical number of times.
+* **Procedure Interface**:
+```
+subroutine prif_critical( &
+    critical_coarray, stat, errmsg, errmsg_alloc)
+  type(prif_coarray_handle), intent(in) :: critical_coarray
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`critical_coarray`**: the handle for the `prif_critical_type` coarray
+    associated with a given critical construct
+
+#### `prif_end_critical`
+
+* **Description**: Completes execution of the critical construct associated with
+  the provided coarray handle.
+* **Procedure Interface**:
+```
+subroutine prif_end_critical(critical_coarray)
+  type(prif_coarray_handle), intent(in) :: critical_coarray
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`critical_coarray`**: the handle for the `prif_critical_type` coarray
+    associated with a given critical construct
+
+### Events and Notifications
+
+#### `prif_event_post`
+
+* **Description**: Atomically increment the count of the event variable by one.
+* **Procedure Interface**:
+```
+subroutine prif_event_post( &
+    image_num, event_var_ptr, stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(in) :: image_num
+  integer(c_intptr_t), intent(in) :: event_var_ptr
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`image_num`**: the image index in the initial team for the event variable
+    to be incremented
+  * **`event_var_ptr`**: a pointer to the base address of the event variable to
+    be incremented on the identified image, typically obtained from a call to
+    `prif_base_pointer`
+
+#### `prif_event_wait`
+
+* **Description**: Wait until the count of the provided event variable is greater
+  than or equal to `until_count`, and then atomically decrement the count by that
+  value. If `until_count` is not present, the value 1 is used.
+* **Procedure Interface**:
+```
+subroutine prif_event_wait( &
+    event_var_ptr, until_count, stat, errmsg, errmsg_alloc)
+  type(c_ptr), intent(in) :: event_var_ptr
+  integer(c_intmax_t), intent(in), optional :: until_count
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`event_var_ptr`**: a pointer to the event variable to be waited on
+  * **`until_count`**: the count of the given event variable to be waited for.
+    Has the value 1 if not provided.
+
+#### `prif_event_query`
+
+* **Description**: Query the count of an event.
+* **Procedure Interface**:
+```
+subroutine prif_event_query(event_var_ptr, count, stat)
+  type(c_ptr), intent(in) :: event_var_ptr
+  integer(c_intmax_t), intent(out) :: count
+  integer(c_int), intent(out), optional :: stat
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`event_var_ptr`**: a pointer to the event variable to be queried
+  * **`count`**: the current count of the given event variable.
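+
+The three event procedures above could be combined as in the following sketch.
+It assumes a module named `prif`, a remote event address already obtained by
+the compiler (for example from `prif_base_pointer`), and a local event variable
+passed as a `type(c_ptr)` value. The subroutine name `signal_and_wait` and its
+dummy argument names are hypothetical.
+
+```
+subroutine signal_and_wait(image_num, remote_event_ptr, local_event_ptr)
+  use iso_c_binding, only: c_int, c_intptr_t, c_intmax_t, c_ptr
+  ! Assumption: the PRIF interfaces are exported by a module named `prif`.
+  use prif, only: prif_event_post, prif_event_wait, prif_event_query
+  implicit none
+  integer(c_int), intent(in) :: image_num
+  integer(c_intptr_t), intent(in) :: remote_event_ptr
+  type(c_ptr), intent(in) :: local_event_ptr
+  integer(c_intmax_t) :: count
+
+  ! event post (ev[image_num]): adds one to the count on the remote image.
+  call prif_event_post(image_num, remote_event_ptr)
+
+  ! event wait (ev, until_count=2): blocks until the local count is at
+  ! least 2, then atomically subtracts 2 from it.
+  call prif_event_wait(local_event_ptr, until_count=2_c_intmax_t)
+
+  ! call event_query(ev, count): reads the current local count.
+  call prif_event_query(local_event_ptr, count)
+end subroutine
+```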
+
+#### `prif_notify_wait`
+
+* **Description**: Wait on notification of a put operation.
+* **Procedure Interface**:
+```
+subroutine prif_notify_wait( &
+    notify_var_ptr, until_count, stat, errmsg, errmsg_alloc)
+  type(c_ptr), intent(in) :: notify_var_ptr
+  integer(c_intmax_t), intent(in), optional :: until_count
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`notify_var_ptr`**: a pointer to the notify variable to be waited on. The
+    referenced variable shall be of type `prif_notify_type`.
+  * **`until_count`**: the count of the given notify variable to be waited for.
+    Has the value 1 if not provided.
 
 ### Teams
-  (REMOVE_NOTE_TODO: check the interface for caf_change_team and caf_form_team, currently are same as the procedures in Caffeine, but these interfaces have not yet been discussed and decided upon for the Coarray Fortran Parallel Runtime Library Interface. May need to add something? Change something?)
-  TODO: add general points related to our team design
-  The only time we create a new handle for an established coarray is in a change-team-stmt. The only times we create handles is `caf_allocate`, and a `change-team-stmt`.
 
- #### `caf_change_team`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_change_team(team)
-        implicit none
-        type(team_type), target, intent(in) :: team
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_end_team`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_end_team(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_form_team`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_form_team(num, team, new_index, stat, errmsg)
-        implicit none
-        integer,          intent(in)  :: num
-        type(team_type),  intent(out) :: team
-        integer,          intent(in),    optional :: new_index
-        integer,          intent(out),   optional :: stat
-        character(len=*), intent(inout), optional :: errmsg
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_sync_team`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_sync_team(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+Team creation forms a tree structure, where a given team may create multiple
+child teams. The initial team is created by the `prif_init` procedure. The
+parent of each subsequently created team is the team that was current when it
+was formed. Team membership is thus strictly hierarchical, following a single
+path along the tree formed by team creation.
+
+#### `prif_form_team`
+
+* **Description**: Create teams. Each image receives a team value denoting the
+  newly created team containing all images in the current team which specify the
+  same value for `team_number`.
+* **Procedure Interface**:
+```
+subroutine prif_form_team( &
+    team_number, team, new_index, stat, errmsg, errmsg_alloc)
+  integer(c_intmax_t), intent(in) :: team_number
+  type(prif_team_type), intent(out) :: team
+  integer(c_int), intent(in), optional :: new_index
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`new_index`**: the index that the current image will have in its new team
+
+#### `prif_get_team`
+
+* **Description**: Get the team value for the current or an ancestor team. It
+  returns the current team if `level` is not present or has the value
+  `PRIF_CURRENT_TEAM`, the parent team if `level` is present with the value
+  `PRIF_PARENT_TEAM`, or the initial team if `level` is present with the value
+  `PRIF_INITIAL_TEAM`.
+* **Procedure Interface**:
+```
+subroutine prif_get_team(level, team)
+  integer(c_int), intent(in), optional :: level
+  type(prif_team_type), intent(out) :: team
+end subroutine
+```
+* **Further argument descriptions**:
+  * **`level`**: identifies which team value is to be returned
+
+#### `prif_team_number`
+
+* **Description**: Return the `team_number` that was specified in the call to
+  `prif_form_team` for the specified team, or -1 if the team is the initial
+  team. If `team` is not present, the current team is used.
+* **Procedure Interface**:
+```
+subroutine prif_team_number(team, team_number)
+  type(prif_team_type), intent(in), optional :: team
+  integer(c_intmax_t), intent(out) :: team_number
+end subroutine
+```
+
+#### `prif_change_team`
+
+* **Description**: Changes the current team to the specified team. For any
+  associate names specified in the `CHANGE TEAM` statement the compiler should
+  follow a call to this procedure with calls to `prif_alias_create` to create
+  the alias coarray handle, and associate any non-coindexed references to the
+  associate name within the `CHANGE TEAM` construct with the selector.
+* **Procedure Interface**:
+```
+subroutine prif_change_team(team, stat, errmsg, errmsg_alloc)
+  type(prif_team_type), intent(in) :: team
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+
+#### `prif_end_team`
+
+* **Description**: Changes the current team to the parent team. During the
+  execution of `prif_end_team`, the PRIF implementation will deallocate any
+  coarrays allocated during the change team construct. Prior to invoking
+  `prif_end_team`, the compiler is responsible for invoking
+  `prif_alias_destroy` for any `prif_coarray_handle` handles created as part of
+  the `CHANGE TEAM` statement.
+* **Procedure Interface**:
+```
+subroutine prif_end_team(stat, errmsg, errmsg_alloc)
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
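+
+As an illustration of how the team procedures fit together, the sketch below
+splits the current team into two child teams, enters the new team, and returns
+to the parent. It assumes a module named `prif`. The subroutine name
+`split_into_two_teams` and the `my_image` argument (this image's index in the
+current team) are hypothetical, and error handling is omitted.
+
+```
+subroutine split_into_two_teams(my_image)
+  use iso_c_binding, only: c_int, c_intmax_t
+  ! Assumption: the PRIF interfaces are exported by a module named `prif`.
+  use prif, only: prif_team_type, prif_form_team, prif_change_team, &
+                  prif_end_team, prif_team_number
+  implicit none
+  integer(c_int), intent(in) :: my_image
+  type(prif_team_type) :: team
+  integer(c_intmax_t) :: team_number, reported
+
+  ! Odd-numbered images form team 1, even-numbered images form team 2.
+  team_number = merge(1_c_intmax_t, 2_c_intmax_t, mod(my_image, 2_c_int) == 1)
+  call prif_form_team(team_number, team)
+
+  ! Start of the change team construct: `team` becomes the current team.
+  call prif_change_team(team)
+
+  ! With `team` absent, the query refers to the current team, so `reported`
+  ! holds the value passed to prif_form_team above.
+  call prif_team_number(team_number=reported)
+
+  ! End of the construct: the parent team becomes current again.
+  call prif_end_team()
+end subroutine
+```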
 
- #### `caf_get_team`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_get_team(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+### Collectives
 
- #### `caf_team_number`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_team_number(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
+#### Common arguments
+
+* **`a`**
+  * Argument for all the collective subroutines: [`prif_co_broadcast`](#prif_co_broadcast),
+    [`prif_co_max`](#prif_co_max), [`prif_co_min`](#prif_co_min),
+    [`prif_co_reduce`](#prif_co_reduce), and [`prif_co_sum`](#prif_co_sum)
+  * may be any type for `co_broadcast` or `co_reduce`, any numeric type for
+    `co_sum`, and integer, real, or character for `co_min` or `co_max`
+  * is always `intent(inout)`
+  * for `co_max`, `co_min`, `co_reduce`, `co_sum` it is assigned the value
+    computed by the collective operation, if no error condition occurs and if
+    `result_image` is absent, or the executing image is the one identified by
+    `result_image`; otherwise `a` becomes undefined
+  * for `co_broadcast`, the value of the argument on the `source_image` is
+    assigned to the `a` argument on all other images
+
+* **`source_image` or `result_image`**
+  * These arguments are of type `integer(c_int)`, to minimize the frequency
+    that integer conversions will be needed.
+
+#### `prif_co_broadcast`
+
+* **Description**: Broadcast value to images
+* **Procedure Interface**:
+```
+subroutine prif_co_broadcast( &
+    a, source_image, stat, errmsg, errmsg_alloc)
+  type(*), intent(inout), contiguous, target :: a(..)
+  integer(c_int), intent(in) :: source_image
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+
+#### `prif_co_max`
+
+* **Description**: Compute maximum value across images
+* **Procedure Interface**:
+```
+subroutine prif_co_max( &
+    a, result_image, stat, errmsg, errmsg_alloc)
+  type(*), intent(inout), contiguous, target :: a(..)
+  integer(c_int), intent(in), optional :: result_image
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+
+#### `prif_co_min`
+
+* **Description**: Compute minimum value across images
+* **Procedure Interface**:
+```
+subroutine prif_co_min( &
+    a, result_image, stat, errmsg, errmsg_alloc)
+  type(*), intent(inout), contiguous, target :: a(..)
+  integer(c_int), intent(in), optional :: result_image
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+
+#### `prif_co_reduce`
+
+* **Description**: Generalized reduction across images
+* **Procedure Interface**:
+```
+subroutine prif_co_reduce( &
+    a, operation, result_image, stat, errmsg, errmsg_alloc)
+  type(*), intent(inout), contiguous, target :: a(..)
+  type(c_funptr), value :: operation
+  integer(c_int), intent(in), optional :: result_image
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
+
+#### `prif_co_sum`
+
+* **Description**: Compute sum across images
+* **Procedure Interface**:
+```
+subroutine prif_co_sum( &
+    a, result_image, stat, errmsg, errmsg_alloc)
+  type(*), intent(inout), contiguous, target :: a(..)
+  integer(c_int), intent(in), optional :: result_image
+  integer(c_int), intent(out), optional :: stat
+  character(len=*), intent(inout), optional :: errmsg
+  character(len=:), intent(inout), allocatable, optional :: errmsg_alloc
+end subroutine
+```
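+
+A short sketch of two collective calls as a compiler might emit them, assuming
+a module named `prif` and a zero `stat` value on success. The subroutine name
+`global_sum_and_share` and its dummy arguments are hypothetical.
+
+```
+subroutine global_sum_and_share(local_values, settings)
+  use iso_c_binding, only: c_int
+  ! Assumption: the PRIF interfaces are exported by a module named `prif`.
+  use prif, only: prif_co_sum, prif_co_broadcast
+  implicit none
+  real, intent(inout), contiguous, target :: local_values(:)
+  integer, intent(inout), target :: settings(3)
+  integer(c_int) :: stat
+
+  ! call co_sum(local_values, result_image=1, stat=stat)
+  ! Only image 1 receives the sums; on every other image the contents of
+  ! local_values become undefined.
+  call prif_co_sum(local_values, result_image=1_c_int, stat=stat)
+  if (stat /= 0) error stop "prif_co_sum reported an error"
+
+  ! call co_broadcast(settings, source_image=1)
+  ! Every other image receives the value held by image 1.
+  call prif_co_broadcast(settings, source_image=1_c_int)
+end subroutine
+```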
 
 ### Atomic Memory Operation
 
 All atomic operations are blocking operations.
 
- #### Common arguments
-  * **`target`**
-    * Argument for all of the atomics (REMOVE_NOTE_TODO_DECISION: have we decided to deal with atomics with the offset option or the target option?)
-    * assumed-rank array of `type(*)`
-    * The location of this argument is the relevant information, not its value. As such, the compiler needs to ensure that when codegen (REMOVE_NOTE: ?) occurs, this argument is pass by reference and there is no copy made. The location of `target` is needed to compute the offset when the atomic operations' `atom` dummy argument is part of a derived type.
-
-
- #### `caf_atomic_add`
-  * **Description**:
-  * **Procedure Interface**: REMOVE_NOTE_TODO_DECISION:
-  Option 1 with offset:
-    ```
-      subroutine caf_atomic_add(coarray_handle, coindicies, offset, value, stat)
-        implicit none
-        type(caf_co_handle_t) :: coarray_handle
-        integer, intent(in) :: coindices(:)
-        integer :: offset, value, stat
-      end subroutine
-    ```
-
-  Option 2 with target:
-    ```
-      subroutine caf_atomic_add(coarray_handle, coindicies, target, value, stat)
-        implicit none
-        type(caf_co_handle_t) :: coarray_handle
-        integer, intent(in) :: coindices(:) ! names image num
-        integer(kind=atomic_int_kind), intent(in) :: target !location of target is relevant, not the value of target, need this to compute the offset when the `atom` dummy argument to the intrinsic is part of a derived type
-        integer :: value, stat
-      end subroutine
-    ```
-
- #### `caf_atomic_and`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_atomic_and(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_atomic_cas`
-  * **Description**:
-  * **Procedure Interfaces**:
-    ```
-      subroutine caf_atomic_cas_int_raw(image_num, atom_remote_ptr, old, compare, new, stat)
-        implicit none
-        integer, intent(in) :: image_num
-        integer(kind=c_int64_t), intent(in) :: atom_remote_ptr
-        integer(kind=atomic_int_kind), intent(in)  :: compare, new
-        integer(kind=atomic_int_kind), intent(out) :: old
-        integer, optional, intent(out) :: stat
-      end subroutine
-
-      subroutine caf_atomic_cas_logical_raw(image_num, atom_remote_ptr, old, compare, new, stat)
-        implicit none
-        integer, intent(in) :: image_num
-        integer(kind=c_int64_t), intent(in) :: atom_remote_ptr
-        logical(kind=atomic_logical_kind), intent(in)  :: compare, new
-        logical(kind=atomic_logical_kind), intent(out) :: old
-        integer, optional, intent(out) :: stat
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_atomic_define`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_atomic_define(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_atomic_fetch_add`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_atomic_fetch_add_raw(image_num, atom_remote_ptr, value, old, stat)
-        implicit none
-        integer, intent(in) :: image_num
-        integer(kind=c_int64_t), intent(in) :: atom_remote_ptr
-        integer(kind=atomic_int_kind), intent(in)  :: value
-        integer(kind=atomic_int_kind), intent(out) :: old
-        integer, optional, intent(out) :: stat
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_atomic_fetch_and`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_atomic_fetch_and(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_atomic_fetch_or`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_atomic_fetch_or(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_atomic_fetch_xor`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_fetch_xor(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_atomic_or`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_atomic_or(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_atomic_ref`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_atomic_ref(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_atomic_xor`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_atomic_xor(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
-### Coarray Queries
-
- #### `caf_lcobound`
-  * **Description**: This procedure returns the lcobounds of the coarray referred to by the coarray_handle. The lcobounds will always be a 64bit integer and it is the compiler's responsibility to convert to a different kind, as needed by the user.
-  * **Procedure Interface**:
-    ```
-      interface caf_lcobound
-        module procedure caf_lcobound_with_dim
-        module procedure caf_lcobound_no_dim
-      end interface caf_lcobound
-
-      subroutine caf_lcobound_with_dim(coarray_handle, dim, lcobound)
-        type(caf_co_handle_t), intent(in) :: coarray_handle
-        integer, intent(in) :: dim
-        integer(kind=c_int64_t), intent(out) :: lcobound
-      end subroutine
-
-      subroutine caf_lcobound_no_dim(coarray_handle, lcobounds)
-        type(caf_co_handle_t), intent(in) :: coarray_handle
-        integer(kind=c_int64_t), intent(out) :: lcobounds(:)
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_ucobound`
-  * **Description**: This procedure returns the ucobounds of the coarray referred to by the coarray_handle. The ucobounds will always be a 64bit integer and it is the compiler's responsibility to convert to a different kind, as needed by the user.
-
-  * **Procedure Interface**:
-    ```
-      interface caf_ucobound
-        module procedure caf_ucobound_with_dim
-        module procedure caf_ucobound_no_dim
-      end interface
-
-      subroutine caf_ucobound_with_dim(coarray_handle, dim, ucobound)
-        type(caf_co_handle_t), intent(in) :: coarray_handle
-        integer, intent(in) :: dim
-        integer(kind=c_int64_t), intent(out) :: ucobound
-      end subroutine
-
-      subroutine caf_ucobound_no_dim(coarray_handle, ucobounds)
-        type(caf_co_handle_t), intent(in) :: coarray_handle
-        integer(kind=c_int64_t), intent(out) :: ucobounds(:)
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_coshape`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_coshape(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_image_index`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      function caf_image_index(coarray_handle, coindices)
-        implicit none
-        type(caf_co_handle_t), intent(in) :: coarray_handle
-        integer, intent(in) :: coindices(:)
-        integer(kind=c_int) :: caf_image_index
-      end function
-    ```
-  * **Further argument descriptions**:
-
-### Image Queries
-
- #### `caf_num_images`
-  * **Description**:
-  * **Procedure Interface**:   (REMOVE_NOTE_TODO: check the interface for caf_num_images, currently is same as the procedure in Caffeine, but this interface has not yet been discussed and decided upon for the Fortran Parallel Runtime Interface. May need to add something? Change something?)
-    ```
-      function caf_num_images(team, team_number) result(image_count)
-        implicit none
-        type(team_type), intent(in), optional :: team
-        integer, intent(in), optional :: team_number
-        integer :: image_count
-      end function
-    ```
-  * **Further argument descriptions**:
-    * **`team` and `team_number`**: optional arguments that specify a team. They shall not both be present in the same call.
-  * **Result**:
-
- #### `caf_this_image`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      function caf_this_image(fill in...)
-        implicit none
-      end function
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_failed_images`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_failed_images(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_stopped_images`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_stopped_images(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
- #### `caf_image_status`
-  * **Description**:
-  * **Procedure Interface**:
-    ```
-      subroutine caf_image_status(fill in...)
-        implicit none
-      end subroutine
-    ```
-  * **Further argument descriptions**:
-
-
-## Establish and initialize static coarrays prior to `main`
-
-  (REMOVE_NOTE: complete this section, potentially move to earlier in doc) Compiler will need to: call caf_init, call caf_allocate ... for each coarray and in the right order. And then copy any initializers.
-
-# Testing plan
-[tbd]
+#### Common arguments
+
+* **`atom_remote_ptr`**
+  * Argument for all of the atomic subroutines
+  * is type `integer(c_intptr_t)`
+  * is the location of the atomic variable on the identified image to be
+    operated on
+  * it is the responsibility of the compiler to perform the necessary operations
+    on the coarray or coindexed actual argument to get the relevant remote pointer
+* **`image_num`**
+  * identifies the image on which the atomic operation is to be performed
+  * is the image index in the initial team
+
+#### Non-fetching Atomic Operations
+
+Perform the specified operation on a variable in a coarray atomically.
+
+##### Common argument
+
+* **`value`**: value to perform the operation with
+
+##### `prif_atomic_add`, Addition
+
+```
+subroutine prif_atomic_add(atom_remote_ptr, image_num, value, stat)
+  integer(c_intptr_t), intent(in) :: atom_remote_ptr
+  integer(c_int), intent(in) :: image_num
+  integer(atomic_int_kind), intent(in) :: value
+  integer(c_int), intent(out), optional :: stat
+end subroutine
+```
+
+##### `prif_atomic_and`, Bitwise And
+
+```
+subroutine prif_atomic_and(atom_remote_ptr, image_num, value, stat)
+  integer(c_intptr_t), intent(in) :: atom_remote_ptr
+  integer(c_int), intent(in) :: image_num
+  integer(atomic_int_kind), intent(in) :: value
+  integer(c_int), intent(out), optional :: stat
+end subroutine
+```
+
+##### `prif_atomic_or`, Bitwise Or
+
+```
+subroutine prif_atomic_or(atom_remote_ptr, image_num, value, stat)
+  integer(c_intptr_t), intent(in) :: atom_remote_ptr
+  integer(c_int), intent(in) :: image_num
+  integer(atomic_int_kind), intent(in) :: value
+  integer(c_int), intent(out), optional :: stat
+end subroutine
+```
+
+##### `prif_atomic_xor`, Bitwise Xor
+
+```
+subroutine prif_atomic_xor(atom_remote_ptr, image_num, value, stat)
+  integer(c_intptr_t), intent(in) :: atom_remote_ptr
+  integer(c_int), intent(in) :: image_num
+  integer(atomic_int_kind), intent(in) :: value
+  integer(c_int), intent(out), optional :: stat
+end subroutine
+```
+
+#### Atomic Fetch Operations
+
+Perform the specified operation on a variable in a coarray atomically and save
+its original value.
+
+##### Common arguments
+
+* **`value`**: value to perform the operation with
+* **`old`**: is set to the initial value of the atomic variable
+
+##### `prif_atomic_fetch_add`, Addition
+
+```
+subroutine prif_atomic_fetch_add( &
+    atom_remote_ptr, image_num, value, old, stat)
+  integer(c_intptr_t), intent(in) :: atom_remote_ptr
+  integer(c_int), intent(in) :: image_num
+  integer(atomic_int_kind), intent(in) :: value
+  integer(atomic_int_kind), intent(out) :: old
+  integer(c_int), intent(out), optional :: stat
+end subroutine
+```
+
+##### `prif_atomic_fetch_and`, Bitwise And
+
+```
+subroutine prif_atomic_fetch_and( &
+    atom_remote_ptr, image_num, value, old, stat)
+  integer(c_intptr_t), intent(in) :: atom_remote_ptr
+  integer(c_int), intent(in) :: image_num
+  integer(atomic_int_kind), intent(in) :: value
+  integer(atomic_int_kind), intent(out) :: old
+  integer(c_int), intent(out), optional :: stat
+end subroutine
+```
+
+##### `prif_atomic_fetch_or`, Bitwise Or
+
+```
+subroutine prif_atomic_fetch_or( &
+    atom_remote_ptr, image_num, value, old, stat)
+  integer(c_intptr_t), intent(in) :: atom_remote_ptr
+  integer(c_int), intent(in) :: image_num
+  integer(atomic_int_kind), intent(in) :: value
+  integer(atomic_int_kind), intent(out) :: old
+  integer(c_int), intent(out), optional :: stat
+end subroutine
+```
+
+##### `prif_atomic_fetch_xor`, Bitwise Xor
+
+```
+subroutine prif_atomic_fetch_xor( &
+    atom_remote_ptr, image_num, value, old, stat)
+  integer(c_intptr_t), intent(in) :: atom_remote_ptr
+  integer(c_int), intent(in) :: image_num
+  integer(atomic_int_kind), intent(in) :: value
+  integer(atomic_int_kind), intent(out) :: old
+  integer(c_int), intent(out), optional :: stat
+end subroutine
+```
+
+#### Atomic Access
+
+Atomically set or retrieve the value of an atomic variable in a coarray.
+
+##### Common argument
+
+* **`value`**: the value to which the variable shall be set, or the value
+  retrieved from the variable
+
+##### `prif_atomic_define`, set variable's value
+
+```
+interface prif_atomic_define
+  subroutine prif_atomic_define_int( &
+      atom_remote_ptr, image_num, value, stat)
+    integer(c_intptr_t), intent(in) :: atom_remote_ptr
+    integer(c_int), intent(in) :: image_num
+    integer(atomic_int_kind), intent(in) :: value
+    integer(c_int), intent(out), optional :: stat
+  end subroutine
+
+  subroutine prif_atomic_define_logical( &
+      atom_remote_ptr, image_num, value, stat)
+    integer(c_intptr_t), intent(in) :: atom_remote_ptr
+    integer(c_int), intent(in) :: image_num
+    logical(atomic_logical_kind), intent(in) :: value
+    integer(c_int), intent(out), optional :: stat
+  end subroutine
+end interface
+```
+
+##### `prif_atomic_ref`, retrieve variable's value
+
+```
+interface prif_atomic_ref
+  subroutine prif_atomic_ref_int( &
+      value, atom_remote_ptr, image_num, stat)
+    integer(atomic_int_kind), intent(out) :: value
+    integer(c_intptr_t), intent(in) :: atom_remote_ptr
+    integer(c_int), intent(in) :: image_num
+    integer(c_int), intent(out), optional :: stat
+  end subroutine
+
+  subroutine prif_atomic_ref_logical( &
+      value, atom_remote_ptr, image_num, stat)
+    logical(atomic_logical_kind), intent(out) :: value
+    integer(c_intptr_t), intent(in) :: atom_remote_ptr
+    integer(c_int), intent(in) :: image_num
+    integer(c_int), intent(out), optional :: stat
+  end subroutine
+end interface
+```
+
+##### `prif_atomic_cas`, Compare and Swap
+
+If the value of the atomic variable is equal to the value of the `compare`
+argument, set it to the value of the `new` argument. The `old` argument is set
+to the initial value of the atomic variable.
+
+```
+interface prif_atomic_cas
+  subroutine prif_atomic_cas_int( &
+      atom_remote_ptr, image_num, old, compare, new, stat)
+    integer(c_intptr_t), intent(in) :: atom_remote_ptr
+    integer(c_int), intent(in) :: image_num
+    integer(atomic_int_kind), intent(out) :: old
+    integer(atomic_int_kind), intent(in) :: compare
+    integer(atomic_int_kind), intent(in) :: new
+    integer(c_int), intent(out), optional :: stat
+  end subroutine
+
+  subroutine prif_atomic_cas_logical( &
+      atom_remote_ptr, image_num, old, compare, new, stat)
+    integer(c_intptr_t), intent(in) :: atom_remote_ptr
+    integer(c_int), intent(in) :: image_num
+    logical(atomic_logical_kind), intent(out) :: old
+    logical(atomic_logical_kind), intent(in) :: compare
+    logical(atomic_logical_kind), intent(in) :: new
+    integer(c_int), intent(out), optional :: stat
+  end subroutine
+end interface
+```
+* **Further argument descriptions**:
+  * **`old`**: is set to the initial value of the atomic variable
+  * **`compare`**: the value with which to compare the atomic variable
+  * **`new`**: the value to set the atomic variable to if it is initially equal
+    to the `compare` argument
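+
+One way the compare-and-swap interface might be used is a simple retry loop
+that claims a flag held on another image. This is a sketch under several
+assumptions: a module named `prif`, a flag whose value 0 means "free" and 1
+means "taken", and a remote address already computed by the compiler. The
+subroutine name `acquire_flag` and the argument name `flag_ptr` are
+hypothetical.
+
+```
+subroutine acquire_flag(flag_ptr, image_num)
+  use iso_c_binding, only: c_int, c_intptr_t
+  use iso_fortran_env, only: atomic_int_kind
+  ! Assumption: the PRIF interfaces are exported by a module named `prif`.
+  use prif, only: prif_atomic_cas
+  implicit none
+  integer(c_intptr_t), intent(in) :: flag_ptr
+  integer(c_int), intent(in) :: image_num
+  integer(atomic_int_kind) :: old
+
+  do
+    ! Integer arguments select the prif_atomic_cas_int specific procedure.
+    call prif_atomic_cas(flag_ptr, image_num, old, &
+        compare=0_atomic_int_kind, new=1_atomic_int_kind)
+    ! `old` is the value seen before the operation; 0 means the flag was
+    ! free and has now been set to 1 by this image.
+    if (old == 0_atomic_int_kind) exit
+  end do
+end subroutine
+```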
+
+
+# Future Work
+
+At present all communication operations are semantically blocking on at least
+local completion. We acknowledge that this prohibits certain types of static
+optimization, namely the explicit overlap of communication with computation. In
+the future we intend to develop split-phased/asynchronous versions of various
+communication operations to enable more opportunities for static optimization of
+communication.
+
+# Acknowledgments
+
+This research is supported by the Exascale Computing Project (17-SC-20-SC), a
+collaborative effort of the U.S. Department of Energy Office of Science and the
+National Nuclear Security Administration.
+
+This research used resources of the National Energy Research Scientific
+Computing Center (NERSC), a U.S. Department of Energy Office of Science User
+Facility located at Lawrence Berkeley National Laboratory, operated under
+Contract No. DE-AC02-05CH11231.
+
+# Copyright
+
+This work is licensed under [CC BY-ND](https://creativecommons.org/licenses/by-nd/4.0/)
+
+This manuscript has been authored by authors at Lawrence Berkeley National Laboratory under Contract No.
+DE-AC02-05CH11231 with the U.S. Department of Energy. The U.S. Government retains, and the publisher,
+by accepting the article for publication, acknowledges, that the U.S. Government retains a non-exclusive,
+paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or
+allow others to do so, for U.S. Government purposes.
+
+# Legal Disclaimer
+
+This document was prepared as an account of work sponsored by the United States Government. While
+this document is believed to contain correct information, neither the United States Government nor any
+agency thereof, nor the Regents of the University of California, nor any of their employees, makes any
+warranty, express or implied, or assumes any legal responsibility for the accuracy, completeness, or usefulness
+of any information, apparatus, product, or process disclosed, or represents that its use would not infringe
+privately owned rights. Reference herein to any specific commercial product, process, or service by its trade
+name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement,
+recommendation, or favoring by the United States Government or any agency thereof, or the Regents of the
+University of California. The views and opinions of authors expressed herein do not necessarily state or reflect
+those of the United States Government or any agency thereof or the Regents of the University of California.
 
 [Caffeine]: https://go.lbl.gov/caffeine
 [GASNet-EX]: https://go.lbl.gov/gasnet


