[flang-commits] [flang] [flang][RFC] Adding a design document for assumed-rank objects (PR #71959)
Peter Klausler via flang-commits
flang-commits at lists.llvm.org
Fri Nov 10 09:27:21 PST 2023
================
@@ -0,0 +1,632 @@
+<!--===- docs/AssumedRank.md
+
+ Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+ See https://llvm.org/LICENSE.txt for license information.
+ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+
+-->
+# Assumed-Rank Objects
+
+An assumed-rank is a data object that takes its rank from its effective
+argument. It is a dummy, or the associated entity of a SELECT RANK in the `RANK
+DEFAULT` or `RANK(*)` blocks. Its rank is not known at compile time. The rank
+can be anything from 0 (scalar) to the maximum allowed rank in Fortran
+(currently 15 according to Fortran 2018 standard section 5.4.6 point 1).
+
+This document summarizes the contexts where assumed-rank objects can appear,
+and then describes how they are implemented and lowered to HLFIR and FIR. All
+section references are made to the Fortran 2018 standard.
+
+## Fortran Standard References
+
+Here is a list of sections and constraints from the Fortran standard involving
+assumed-ranks.
+
+- 7.3.2.2 TYPE
+ - C711
+- 7.5.6.1 FINAL statement
+ - C789
+- 8.5.7 CONTIGUOUS attribute
+ - C830
+- 8.5.8 DIMENSION attribute
+- 8.5.8.7 Assumed-rank entity
+ - C837
+ - C838
+ - C839
+- 11.1.10 SELECT RANK
+- 15.5.2.13 Restrictions on entities associated with dummy arguments
+ - 1 (3) (b) and (c)
+ - 1 (4) (b) and (c)
+- 15.5.2.4 Ordinary dummy variables - point 17
+- 18 Interoperability with C
+ - 18.3.6 point 2 (5)
+
+### Summary of the constraints:
+
+Assumed-rank can:
+- be pointers, allocatables (or have neither of those atttributes).
+- be monomorphic or polymorphic (both `TYPE(*)` and `CLASS(*)`)
+- have all the attributes, except VALUE and CODIMENSION (C837). Notably, they
+ can have the CONTIGUOUS or OPTIONAL attributes (C830).
+- appear as an actual argument of an assumed-rank dummy (C838)
+- appear as the selector of SELECT RANK (C838)
+- appear as the argument of C_LOC and C_SIZEOF from ISO_C_BINDING (C838)
+- appear as the first argument of inquiry intrinsic functions (C838). These
+ inquiry functions listed in table 16.1 are detailed in the "Assumed-rank
+ features" section below.
+- appear in BIND(C) and non BIND(C interface (18.1 point 3)
+- be finalized on entry as INTENT(OUT) under some conditions (C839)
+- be associated with any kind of scalars and arrays, including assumed size.
+
+Assumed-rank cannot:
+- be coarrays (C837)
+- have the VALUE attribute (C837)
+- be something that is not a named variable (they cannot be the result of a
+ function or a component reference)
+- appear in a designator other than the case listed above (C838). Notably, they
+ cannot be directly addressed, they cannot be used in elemental operations or
+ transformational intrinsics, they cannot be used in IO, they cannot be assigned
+ to....
+- be finalized on entry as INTENT(OUT) under certain weird conditions (C839).
+- be used in a reference to a procedure without an explicit interface
+ (15.4.2.2. point 3 (c)).
+
+With regard to aliasing, assumed-rank dummy objects follow the same rules as
+for assumed shapes, with the addition of 15.5.2.13 (c) which adds a rule when
+the actual is a scalar (adding that TARGET assumed-rank may alias if the actual
+argument is a scalar even if they have the CONTIGUOUS attribute, while it is OK
+to assume that CONTIGUOUS TARGET assumed shape do not alias with other
+dummies).
+
+---
+
+## Assumed-Rank Representations in Flang
+
+### Representation in Semantics
+In semantics (there is no concept of assumed-rank expression needed in
+`evaluate::Expr`). Such symbols have either `semantics::ObjectEntityDetails` (
+dummy data objects) with a `semantics::ArraySpec` that encodes the
+"assumed-rank-shape" (can be tested with IsAssumedRank()), or they have
+`semantics::AssocEntityDetails` (associated entity in the RANK DEFAULT case).
+
+Inside a select rank, a `semantics::Symbol` is created for the associated
+entity with `semantics::AssocEntityDetails` that points to the the selector
+and holds the rank outside of the RANK DEFAULT case.
+
+Assumed-rank dummies are also represented in the
+`evaluate::characteristics::TypeAndShape` (with the AssumedRank attribute) to
+represent assumed-rank in procedure characteristics.
+
+### Runtime Representation of Assumed-Ranks
+Assumed-ranks are implemented as CFI_cdesct_t (18.5.3) with the addition of an
+f18 specific addendum when required for the type. This is the usual f18
+descriptor, and no changes is required to represent assumed-ranks in this data
+structure. In fact, there is no difference between the runtime descriptor
+created for an assumed shape and the runtime descriptor created when the
+corresponding entity is passed as an assumed-rank.
+
+This means that any descriptor can be passed to an assumed-rank dummy (with
+care to ensure that the POINTER/ALLOCATABLE attribute match the dummy argument
+attributes as usual). Notably, any runtime interface that takes descriptor
+arguments of any ranks already work with assumed-rank entities without any
+changes or special cases.
+
+This also implies that the runtime cannot tell that an entity is an
+assumed-rank based on its descriptor, but there seems to be not need for this
+so far ("rank based" dispatching for user defined assignments and IO is not
+possible with assumed-ranks, and finalization is possible, but there is no need
+for the runtime to distinguish between finalization of an assumed-rank and
+finalization of other entities: only the runtime rank matters).
+
+The only difference and difficulty is that descriptor storage size of
+assumed-rank cannot be precisely known at compile time, and this impacts the
+way descriptor copies are generated in inline code. The size can still be
+maximized using the maximum rank, which the runtime code already does when
+creating temporary descriptor in many cases. Inline code also needs care if it
+needs to access the descriptor addendum (like the type descriptor), since its
+offset will not be a compile time constant as usual.
+
+Note that an alternative to maximizing the allocation of assumed-rank temporary
+descriptor could be to use automatic allocation based on the rank of the input
+descriptor, but this would make stack allocation analysis more complex (tools
+will likely not have the Fortran knowledge that this allocation size is bounded
+for instance) while the stack "over" allocation is likely reasonable (24 bytes
+per dimension). Hence the selection of the simple approach using static size
+allocation to the maximum rank.
+
+### Representation in FIR and HLFIR
+SSA values for assumed-rank entities have an MLIR type containing a
+`!fir.array<*xT>` sequence type wrapped in a `!fir.box` or `!fir.class` type
+(additionally wrapped in a `!fir.ref` type for pointers and allocatables).
+
+Examples:
+`INTEGER :: x(..)` -> `!fir.box<!fir.array<* x i32>>`
+`CLASS(*) :: x(..)` -> `!fir.class<!fir.array<* x none>>`
+`TYPE(*) :: x(..)` -> `!fir.box<!fir.array<* x none>>`
+`REAL, ALLOCATABLE :: x(..)` -> `!fir.ref<!fir.box<!fir.heap<!fir.array<* x f32>>>>`
+`TYPE(t), POINTER :: x(..)` -> `!fir.ref<!fir.box<!fir.ptr<!fir.array<* x !fir.type<t>>>>>`
+
+All these FIR types are implemented as the address of a CFI_cdesct_t in code
+generation.
+
+There is no need to allow assumed-rank "expression" in HLFIR (hlfir.expr) since
+assumed-rank cannot appear in expressions (except as the actual argument to an
+assumed-rank dummy). Assumed-rank are variables. Also, since they cannot have
+the VALUE attribute, there is no need to use the hlfir.as_expr +
+hlfir.associate idiom to make copies for them.
+
+FIR/HLFIR operation where assumed-rank may appear:
+- as `hlfir.declare` and `fir.declare` operand and result.
+- as `fir.convert` operand and/or result.
+- as `fir.load` operand and result (POINTER and ALLOCATABLE dereference).
+- as a block argument (dummy argument).
+- as `fir.rebox_assumed_rank` operand/result (new operation to change some
+ fields of assumed-rank descriptors).
+- as `fir.box_rank` operand (rank inquiry).
+- as `fir.box_dim` operand (brutal user inquiry about the bounds of an
+ assumed-rank in a compile time constant dimension).
+- as `fir.box_addr` operand (to get the base address in inlined code for
+ C_LOC).
+- as `fir.box_elesize` operand (to implement LEN and STORAGE_SIZE).
+- as `fir.absent` result (passing absent actual to OPTIONAL assumed-rank dummy)
+- as `fir.is_present` operand (PRESENT inquiry)
+- as `hlfir.copy_in` and `hlfir.copy_out` operand and result (copy in and
+ copy-out of assumed-rank)
+- as `fir.alloca` type and result (when creating an assumed-rank POINTER dummy
+ from a non POINTER dummy).
+- as `fir.store` operands (same case as `fir.alloca`).
+
+FIR/HLFIR Operations that should not need to accept assumed-ranks but where it
+could still be relevant:
+- `fir.box_tdesc` and `fir.box_typecode` (polymorphic assumed-rank cannot
+ appear in a SELECT TYPE directly without using a SELECT RANK). Given the
+ CFI_cdesct_t structure, no change would be needed for `fir.box_typecode` to
+ support assumed-ranks, but `fir.box_tdesc` would require change since the
+ position of the type descriptor pointer depends on the rank.
+- as `fir.allocmem` / `fir.global` result (assumed-ranks are never local/global
+ entities).
+- as `fir.embox` result (When creating descriptor for an explicit shape, the
+ descriptor can be created with the entity rank, and then casted via
+`fir.convert`).
+
+It is not expected for any other FIR or HLFIR operations to handle assumed-rank
+SSA values.
+
+#### Summary of the impact in FIR
+One new operation is needed, `fir.rebox_assumed_rank`, the rational being that
+fir.rebox codegen is already quite complex and not all the aspects of fir.rebox
+matters for assumed-ranks (only simple field changes are required with
+assumed-ranks). Also, this operation will be allowed to take an operand in
+memory to avoid expensive fir.load of pointer/allocatable inputs.
+
+It is proposed that the FIR descriptor inquiry operation (fir.box_addr,
+fir.box_rank, fir.box_dim, fir.box_elesize at least) be allowed to take
+fir.ref<fir.box> arguments (allocatable and pointer descriptors) directly
+instead of generating a fir.load first. A conditional "read" effect will be
+added in such case. Again, the purpose is to avoid generating descriptor copies
+for the sole purpose of satisfying the SSA IR constraints. This change will
+likely benefit the non assumed-rank case too (even though LLVM is quite good at
+removing pointless descriptor copies in these cases).
+
+It will be ensured that all the operation listed above accept assumed-rank
+operands (both the verifiers and coedgen). The codegen of `fir.load`,
+`fir.alloca`, `fir.store`, `hlfir.copy_in` and `hlfir.copy_out` will need
+special handling for assumed-ranks.
+
+### Representation in LLVM IR
+
+Assumed-rank descriptor types are lowered to the LLVM type of a CFI_cdesct_t
+descriptor with no dimension array field and no addendum. That way, any inline
+code attempt to directly access dimensions and addendum with constant offset
+will be invalid for more safety, but it will still be easy to generate LLVM GEP
+to address the first descriptor fields in LLVM (to get the base address, rank,
+type code and attributes).
+
+`!fir.box<!fir.array<* x i32>>` -> `!llvm.struct<(ptr, i64, i32, i8, i8, i8, i8>`
+
+## Assumed-rank Features
+
+This section list the different Fortran features where assumed-rank objects are
+involved and describes the related implementation design.
+
+### Assumed-rank in procedure references
+Assumed-rank arguments are implemented as being the address of a CFI_cdesct_t.
+
+When passing an actual argument to an assumed-rank dummy, the following points
+need special attention and are further described below:
+- Copy-in/copy-out when required
+- Creation of forwarding of the assumed-rank dummy descriptor
+- Finalization, deallocation, and initialization of INTENT(OUT) assumed-rank
+ dummy.
+
+OPTIONAL assumed-ranks are implemented like other non assumed-rank OPTIONAL
+objects passed by descriptor : an absent assumed-rank is represented by a null
+pointer to a CFI_cdesct_t.
+
+The passing interface for assumed-rank described above and below is compliant
+by default with the BIND(C) case, except for the assumed-rank dummy descriptor
+lower bounds, which are only set to zeros in BIND(C) interface because it
+implies in most of the cases to create a new descriptor.
+
+VALUE is forbidden for assumed-rank dummies, so there is nothing to be done for
+it (although since copy-in/copy-out is possible, the compiler must anyway deal
+with creating assumed-rank copies, so it would likely not be an issue to relax
+this constraint).
+
+#### Copy-in and Copy out
+Copy-in and copy-out is required when passing an actual that is not contiguous
+to a non POINTER CONTIGUOUS assumed-rank.
+
+When the actual argument is ranked, the copy-in/copy-out can be performed on
+the ranked actual argument where the dynamic type has been aligned with the
+dummy type if needed (passing CLASS(T) to TYPE(T)) as illustrated below.
+
+```Fortran
+module m
+type t
+ integer :: i
+end type
+contains
+subroutine foo(x)
+ class(t) :: x(:)
+ interface
+ subroutine bar(x)
+ import :: t
+ type(t), contiguous :: x(..)
+ end subroutine
+ end interface
+ ! copy-in and copy-out is required aroud bar
+ call bar(x)
+end
+end module
+```
+
+When the actual is also an assumed-rank special the same copy-in/copy-out need
+may arise, and the `hlfir.copy_in` and `hlfir.copy_out` are also used to cover
+this case. The `hlfir.copy_in`operation is implemented using the `IsContiguous`
+runtime (can be used as-is) and the `AssignTemporary` temporary runtime.
+
+The difference with the ranked case is that more care is needed to create the
+output descriptor passed to `AssignTemporary`: it must be allocated to the
+maximum rank with the same type as the input descriptor and only the descriptor
+fields prior to the array dimensions will be initialized to those of an
+unallocated descriptor prior to the runtime call (`AssignTemporary` copies the
+addendum if needed).
+
+```Fortran
+subroutine foo2(x)
+ class(t) :: x(..)
+ interface
+ subroutine bar(x)
+ import :: t
+ type(t), contiguous :: x(..)
+ end subroutine
+ end interface
+ ! copy-in and copy-out is required aroud bar
+ call bar(x)
+end
+```
+#### Creating the descriptor for assumed-rank dummies
+
+There are four cases to distinguish:
+1. Actual does not have a descriptor (and is therefore ranked)
+2. Actual has a descriptor that can be forwarded for the dummy
+3. Actual has a ranked descriptor that cannot be forwarded for the dummy
+4. Actual has an assumed-rank descriptor that cannot be forwarded for the dummy
+
+For the first case, a descriptor will be created for the dummy with `fir.embox`
+has if it has the rank of the actual argument. This is the same logic as when
+dealing with assumed shape or INTENT(IN) POINTER dummy arguments, except that
+an extra cast to the assumed-rank descriptor type is added (no-op at runtime).
+
+For the second case, a cast is added to assumed-rank descriptor type if it is
+not one already and the descriptor is forwarded.
+
+For the third case, a new ranked descriptor with the dummy attribute/lower
+bounds is created from the actual argument descriptor with `fir.rebox` as it is
+done when passing to an assume shape dummy, and a cast to the assumed-rank
+descriptor is added .
+
+The last case is the same as the third one, except the that the descriptor
+manipulation is more complex since the storage size of the descriptor is
+unknown. `fir.rebox` codegen is already quite complex since it deals with
+creating descriptor for descriptor based array sections and pointer remapping.
+Both of those are meaningless in this case where the output descriptor is the
+same as the input one, except for the lower bounds, attribute, and derived type
+pointer field that may need to be changed to match the values describing the
+dummy. A simpler `fir.rebox_assumed_rank` operation is added for this use case.
+Notably, this operation can take fir.ref<fir.box> inputs to avoid creating an
+expensive and useless fir.load of POINTER/ALLOCATABLE descriptors.
+
+Fortran requires the compiler to fall in the 3rd and 4th case and create
+descriptor temporary for the dummy a lot more than one would think and hope. An
+annex section below discusses cases that force the compiler to create a new
+descriptor for the dummy even if the actual already has a descriptor. These are
+the same situations than with non assumed-rank arguments, but when passing
+assumed-rank to assumed-ranks, the cost of this extra copy is higher.
+
+#### Intent(out) assumed-rank finalization, deallocation, initialization
+
+C839 is limiting the cases where assumed-rank may be finalizable, but I do not
+understand how that helps. An assumed-rank allocatable or pointer actual can
+still be passed to an INTENT(OUT) dummy with a type requiring
+finalization/initialization, and nothing prevents passing allocatable
+assumed-rank to INTENT(OUT) allocatables, which require conditional
+deallocation which may trigger finalization. Flang therefore needs to
+implement finalization, deallocation and initialization of INTENT(OUT) as
+usual. Non pointer non allocatable INTENT(OUT) finalization is done via a call
+to `Destroy` runtime API that takes a descriptor and can be directly used with
+an assumed-rank descriptor with no change. The initialization is done via a
+call to the `Initialize` runtime API that takes a descriptor and can also
+directly be used with an assumed descriptor. Conditional deallocation of
+INTENT(OUT) allocatable is done via an inline allocation status check and
+either an inline deallocate for intrinsic types, or a runtime call to
+`Deallocate` for the other cases. For assumed-ranks, the runtime call is always
+used regardless of the type to avoid inline descriptor manipulations.
+`Deallocate` runtime API also works with assumed-rank descriptor with no
+changes (like any runtime API taking descriptors of any rank).
+
+```Fortran
+subroutine foo(x)
+ class(*), allocatable :: x(..)
+ interface
+ subroutine bar(x)
+ class(*), intent(out) :: x(..)
+ end subroutine
+ end interface
+ ! x may require finalization and initialization on bar entry.
+ call bar(x)
+end
+subroutine bar(x)
+ class(*), intent(out) :: x(..)
+end subroutine
+```
+### Select Rank
+
+Select rank is implemented with a rank inquiry (and last extent for `RANK(*)`),
+followed by a jump in the related block where the selector descriptor is cast
+to a descriptor with the associated entity rank for the current block (or kept
+as-is for the `RANK DEFAULT` and `RANK(*)` case). The cast values are mapped to
+the associated entity symbol and lowering precede as usual. This is very
+similar to how Select Type is implemented. The `RANK(*)` is a bit odd, it
+detects assumed-ranks associated with an assumed-size arrays regardless of the
+rank, and takes precedence over any rank based matching.
+
+Note that `-1` is a magic extent number that encodes that a descriptor describe
----------------
klausler wrote:
This answers my question above.
https://github.com/llvm/llvm-project/pull/71959
More information about the flang-commits
mailing list