[flang-commits] [flang] 8fc07fe - [flang] Design document for runtime derived type descriptions (NFC)

Mon Nov 2 09:24:15 PST 2020

Author: peter klausler
Date: 2020-11-02T09:23:22-08:00
New Revision: 8fc07fed32b0f18bd496b7014cdf61619ba77316

URL: https://github.com/llvm/llvm-project/commit/8fc07fed32b0f18bd496b7014cdf61619ba77316
DIFF: https://github.com/llvm/llvm-project/commit/8fc07fed32b0f18bd496b7014cdf61619ba77316.diff

LOG: [flang] Design document for runtime derived type descriptions (NFC)

Differential revision: https://reviews.llvm.org/D90500

Added: 
    flang/docs/RuntimeTypeInfo.md

Modified: 
    

Removed: 
    


################################################################################
diff  --git a/flang/docs/RuntimeTypeInfo.md b/flang/docs/RuntimeTypeInfo.md
new file mode 100644
index 000000000000..2a511b208d0e

--- /dev/null
+++ b/flang/docs/RuntimeTypeInfo.md
@@ -0,0 +1,271 @@
+<!--===- docs/RuntimeTypeInfo.md 
+  
+   Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+   See https://llvm.org/LICENSE.txt for license information.
+   SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+  
+-->
+
+# The derived type runtime information table
+
+```eval_rst
+.. contents::
+   :local:
+```
+
+## Overview
+
+Many operations on derived types must be implemented, or can be
+implemented, with calls to the runtime support library rather than
+directly with generated code.
+Some operations might be initially implemented in the runtime library
+and then reimplemented later in generated code for compelling
+performance gains in optimized compilations.
+
+The runtime library uses *derived type description* tables to represent
+the relevant characteristics of derived types.
+This note summarizes the requirements for these descriptions.
+
+The semantics phase of the F18 frontend constructs derived type
+descriptions from its scoped symbol table after name resolution
+and semantic constraint checking have succeeded.
+The lowering phase then transfers the tables to the static
+read-only data section of the generated program by translating them into
+initialized objects.
+During execution, references to the tables occur by passing their addresses
+as arguments to relevant runtime library APIs and as pointers in
+the addenda of descriptors.
+
+## Requirements
+
+The following Fortran language features require, or may require, the use of
+derived type descriptions in the runtime library.
+
+### Components
+
+The components of a derived type need to be described in component
+order (7.4.7), but when there is a parent component, its components
+can be described by reference to the description of the type of the
+parent component.
+
+The ordered component descriptions are needed to implement
+* default initialization
+* `ALLOCATE`, with and without `SOURCE=`
+* intrinsic assignment of derived types with `ALLOCATABLE` and
+  automatic components
+* intrinsic I/O of derived type instances
+* `NAMELIST` I/O of derived type instances
+* "same type" tests
+
+The characteristics of data components include their names, types,
+offsets, bounds, cobounds, derived type descriptions when appropriate,
+default component initializers, and flags for `ALLOCATABLE`, `POINTER`,
+`PRIVATE`, and automatic components (implicit allocatables).
+Procedure pointer components require only their offsets and address(es).
+
+### Calls to type-bound procedures
+
+Only extensible derived types -- those without `SEQUENCE` or `BIND(C)`
+-- are allowed to have type-bound procedures.
+Calls to these bindings will be resolved at compilation time when
+the binding is `NON_OVERRIDABLE` or when an object is not polymorphic.
+Calls to overridable bindings of polymorphic objects require the
+use of a runtime table of procedure addresses.
+
+Each derived type (or instantiation of a parameterized derived type)
+will have a complete type-bound procedure table in which all of the
+bindings of its ancestor types appear first.
+(Specifically, the table offsets of any inherited bindings must be
+the same as they are in the ancestral type's table.)
+These ancestral bindings reflect their overrides, if any.
+
+The non-inherited bindings of a type then follow the inherited
+bindings, and they do so in alphabetical order of binding name.
+(This is an arbitrary choice -- we could also define them to
+appear in binding declaration order, I suppose -- but a consistent
+ordering should be used so that relocatables generated by distinct
+versions of the F18 compiler will have a better chance to interoperate.)
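+
+As a rough illustration of the ordering rule above, consider the
+following sketch.
+The type and procedure names are hypothetical, and the table layout
+shown in the comments is only what the rule would imply, not a
+description of an existing implementation.
+
+```fortran
+module binding_order_demo
+  implicit none
+  type :: base
+   contains
+    procedure :: name => base_name
+    procedure :: area => base_area
+  end type
+  type, extends(base) :: circle
+    real :: r = 1.0
+   contains
+    procedure :: area => circle_area      ! overrides the inherited slot
+    procedure :: scale => circle_scale    ! new binding
+    procedure :: diameter => circle_diameter  ! new binding
+  end type
+  ! Implied table for base:   [area -> base_area, name -> base_name]
+  ! Implied table for circle: [area -> circle_area, name -> base_name,
+  !                            diameter -> circle_diameter,
+  !                            scale -> circle_scale]
+  ! The first two offsets agree with base; the new bindings follow
+  ! in alphabetical order.
+ contains
+  function base_name(self) result(s)
+    class(base), intent(in) :: self
+    character(:), allocatable :: s
+    s = 'shape'
+  end function
+  real function base_area(self)
+    class(base), intent(in) :: self
+    base_area = 0.0
+  end function
+  real function circle_area(self)
+    class(circle), intent(in) :: self
+    circle_area = 3.14159265 * self%r**2
+  end function
+  subroutine circle_scale(self, factor)
+    class(circle), intent(inout) :: self
+    real, intent(in) :: factor
+    self%r = self%r * factor
+  end subroutine
+  real function circle_diameter(self)
+    class(circle), intent(in) :: self
+    circle_diameter = 2.0 * self%r
+  end function
+end module
+```
+
+A call such as `x%area()` on a `class(base)` object would then index
+the same offset in either table, whatever the dynamic type of `x`.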
+
+### Type parameter values and "same type" testing
+
+The values of the `KIND` and `LEN` parameters of a particular derived type
+instance can be obtained to implement type parameter inquiries without
+requiring derived type information tables.
+In the case of a `KIND` type parameter, it's a constant value known at
+compilation time, and in the case of a `LEN` type parameter, it's a
+member of the addendum to the object's descriptor.
+
+The runtime library will have an API (TBD) to be called as
+part of the implementation of `TYPE IS` and `CLASS IS` guards
+of the `SELECT TYPE` construct.
+This language support predicate returns a true result when
+an object's type matches a particular type specification and
+`KIND` (but not `LEN`) type parameter values.
+
+Note that this "is same type as" predicate is *not* the same as
+the one to be called to implement the `SAME_TYPE_AS()` intrinsic function,
+which is specified so as to *ignore* the values of `KIND` type
+parameters.
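+
+The distinction between the two predicates can be seen in a small
+example.
+The type `vec` and the procedure `classify` are hypothetical names
+chosen for illustration, and no particular output is claimed for the
+`SAME_TYPE_AS()` call; the comment only restates the rule described
+above.
+
+```fortran
+module kind_guard_demo
+  implicit none
+  type :: vec(k)
+    integer, kind :: k = 4
+    real(k) :: x
+  end type
+ contains
+  subroutine classify(obj)
+    class(*), intent(in) :: obj
+    ! Each guard matches on the type *and* its KIND parameter values.
+    select type (obj)
+    type is (vec(4))
+      print *, 'vec with k = 4'
+    type is (vec(8))
+      print *, 'vec with k = 8'
+    class default
+      print *, 'some other type'
+    end select
+  end subroutine
+end module
+
+program main
+  use kind_guard_demo
+  implicit none
+  type(vec(4)) :: a
+  type(vec(8)) :: b
+  call classify(a)
+  call classify(b)
+  ! As described above, SAME_TYPE_AS() is specified to ignore the
+  ! values of KIND type parameters, unlike the guards in classify.
+  print *, same_type_as(a, b)
+end program
+```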
+
+Subclause 7.5.2 defines what being the "same" derived type means
+in Fortran.
+In short, each definition of a derived type defines a distinct type,
+so type equality testing can usually compare addresses of derived
+type descriptions at runtime.
+The exceptions are `SEQUENCE` types and interoperable (`BIND(C)`)
+types.
+Independent definitions of each of these are considered to be the "same type"
+when these definitions match in terms of names, types, and attributes,
+both being either `SEQUENCE` or `BIND(C)`, and containing
+no `PRIVATE` components.
+These "sequence" derived types cannot have type parameters or type-bound
+procedures, cannot be empty, and cannot have components of derived types
+that are not themselves sequence types, so we can use a static hash code
+to implement their "same type" tests.
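+
+For example, the two independent definitions of the hypothetical
+`SEQUENCE` type `pair` below match in names, types, and attributes,
+so an object declared with one can be passed to a dummy argument
+declared with the other.
+This is only a sketch of the language rule; it says nothing about how
+a hash code would actually be computed.
+
+```fortran
+subroutine show(q)
+  implicit none
+  type :: pair        ! independent definition, same name and components
+    sequence
+    integer :: i
+    real :: x
+  end type
+  type(pair), intent(in) :: q
+  print *, q%i, q%x
+end subroutine
+
+program sequence_demo
+  implicit none
+  type :: pair
+    sequence
+    integer :: i
+    real :: x
+  end type
+  type(pair) :: p
+  p = pair(1, 2.0)
+  call show(p)        ! valid: both definitions denote the same type
+end program
+```
+
+Neither definition can see the other, so comparing addresses of
+derived type descriptions would not work here.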
+
+### FINAL subroutines
+
+When an instance of a derived type is deallocated or goes out of scope,
+one of its `FINAL` subroutines may be called.
+Subclause 7.5.6.3 defines when finalization occurs -- it doesn't happen
+in all situations.
+
+The subroutines named in a derived type's `FINAL` statements are not
+bindings, so their arguments are not passed object dummy arguments and
+do not have to satisfy the constraints of a passed object.
+Specifically, they can be arrays, and cannot be polymorphic.
+If a `FINAL` subroutine's dummy argument is an array, it may be
+assumed-shape or assumed-rank, but it could also be an explicit-shape
+or assumed-size argument.
+This means that it may or may not be passed by means of a descriptor.
+
+Note that a `FINAL` subroutine with a scalar argument does not define
+a finalizer for array objects unless the subroutine is elemental
+(and probably `IMPURE`).
+This seems to be a language pitfall and F18 will emit a
+warning when an array of a finalizable derived type is declared
+with a rank lacking a `FINAL` subroutine when other ranks do have one.
+
+So the necessary information in the derived type table for a `FINAL`
+subroutine comprises:
+* address(es) of the subroutine
+* rank of the argument, or whether it is assumed-rank
+* for rank 0, whether the subroutine is elemental
+* for rank > 0, whether the argument requires a descriptor
+
+This descriptor flag is needed to handle a difficult case with
+`FINAL` subroutines that most other implementations of Fortran
+fail to get right: a `FINAL` subroutine
+whose argument is an explicit-shape or assumed-size array may
+have to be called upon the parent component of an array of
+an extended derived type.
+
+```
+  module m
+    type :: parent
+      integer :: n
+     contains
+      final :: subr
+    end type
+    type, extends(parent) :: extended
+      integer :: m
+    end type
+   contains
+    subroutine subr(a)
+      type(parent) :: a(1)
+    end subroutine
+  end module
+  subroutine demo
+    use m
+    type(extended) :: arr(1)
+  end subroutine
+```
+
+If the `FINAL` subroutine doesn't use a descriptor -- and it
+will not if there are no `LEN` type parameters -- the runtime
+will have to allocate and populate a temporary array of copies
+of the elements of the parent component of the array so that it
+can be passed by reference to the `FINAL` subroutine.
+
+### Defined assignment
+
+A defined assignment subroutine for a derived type can be declared
+by means of a generic `INTERFACE ASSIGNMENT(=)` or by means of
+a generic type-bound procedure.
+Defined assignments with non-type-bound generic interfaces are
+resolved to specific subroutines at compilation time.
+Most cases of type-bound defined assignment are resolved to their
+bindings at compilation time as well (with possible runtime
+resolution of overridable bindings).
+
+Intrinsic assignment of derived types with components that have
+derived types with type-bound generic assignments is specified
+by subclause 10.2.1.3 paragraph 13 as invoking defined assignment
+subroutines, however.
+
+This seems to be the only case of defined assignment that may be of
+interest to the runtime library.
+If this is correct, then the requirements are somewhat constrained;
+we know that the rank of the target of the assignment must match
+the rank of the source, and that one of the dummy arguments of the
+bound subroutine is a passed object dummy argument and satisfies
+all of the constraints of one -- in particular, it's scalar and
+polymorphic.
+
+So the derived type information for a defined assignment needs to
+comprise:
+* address(es) of the subroutine
+* whether the first, second, or both arguments are descriptors
+* whether the subroutine is elemental
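+
+The case described above, in which an intrinsic assignment reaches a
+type-bound defined assignment through a component, might look like
+the following sketch.
+The types `counter` and `holder` and the subroutine `copy` are
+hypothetical names used only for illustration.
+
+```fortran
+module component_assign_demo
+  implicit none
+  type :: counter
+    integer :: n = 0
+   contains
+    procedure :: copy
+    generic :: assignment(=) => copy
+  end type
+  type :: holder
+    type(counter) :: c     ! component with a type-bound assignment
+  end type
+ contains
+  subroutine copy(lhs, rhs)
+    class(counter), intent(out) :: lhs   ! passed-object dummy argument
+    type(counter), intent(in) :: rhs
+    lhs%n = rhs%n + 1      ! visibly different from a plain copy
+  end subroutine
+end module
+
+program main
+  use component_assign_demo
+  implicit none
+  type(holder) :: a, b
+  b%c%n = 10
+  ! holder has no defined assignment of its own, so this is an
+  ! intrinsic assignment; component c is specified to be assigned
+  ! by calling copy.
+  a = b
+  print *, a%c%n
+end program
+```
+
+Nothing in the statement `a = b` names `copy`, which is why the
+runtime library may need to find it through the derived type
+description of `counter`.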
+
+### User defined derived type I/O
+
+Fortran programs can specify subroutines that implement formatted and
+unformatted `READ` and `WRITE` operations for derived types.
+These defined I/O subroutines may be specified with an explicit `INTERFACE`
+or with a type-bound generic.
+When specified with an `INTERFACE`, the first argument must not be
+polymorphic, but when specified with a type-bound generic, the first
+argument is a passed-object dummy argument and is therefore required
+to be polymorphic.
+In any case, the argument is scalar.
+
+Nearly all invocations of user defined derived type I/O subroutines
+are resolved at compilation time to specific procedures or to
+overridable bindings.
+(The I/O library APIs for acquiring their arguments remain to be
+designed, however.)
+The case that is of interest to the runtime library is that of
+`NAMELIST` I/O, which is specified to invoke user defined derived
+type I/O subroutines if they have been defined.
+
+The derived type information for a user defined derived type I/O
+subroutine comprises:
+* address(es) of the subroutine
+* whether it is for a read or a write
+* whether it is formatted or unformatted
+* whether the first argument is a descriptor (true if it is a
+  binding of the derived type, or has a `LEN` type parameter)
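+
+A minimal sketch of the `NAMELIST` case follows.
+The type `point` and the subroutine `write_point` are hypothetical;
+the dummy argument list is the one the language requires for a
+formatted write.
+
+```fortran
+module point_io_demo
+  implicit none
+  type :: point
+    integer :: x = 0, y = 0
+   contains
+    procedure :: write_point
+    generic :: write(formatted) => write_point
+  end type
+ contains
+  subroutine write_point(dtv, unit, iotype, v_list, iostat, iomsg)
+    class(point), intent(in) :: dtv      ! passed-object dummy argument
+    integer, intent(in) :: unit
+    character(*), intent(in) :: iotype   ! 'NAMELIST' for namelist output
+    integer, intent(in) :: v_list(:)
+    integer, intent(out) :: iostat
+    character(*), intent(inout) :: iomsg
+    write (unit, '(a,i0,a,i0,a)', iostat=iostat, iomsg=iomsg) &
+      '(', dtv%x, ',', dtv%y, ')'
+  end subroutine
+end module
+
+program main
+  use point_io_demo
+  implicit none
+  type(point) :: p
+  namelist /nml/ p
+  p = point(3, 4)
+  ! The WRITE statement names only the namelist group, so the I/O
+  ! library has to discover write_point from the type of p.
+  write (*, nml=nml)
+end program
+```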
+
+## Exporting derived type descriptions from module relocatables
+
+Subclause 7.5.2 requires that two objects be considered as having the
+same derived type if they are declared "with reference to the same
+derived type definition".
+For derived types that are defined in modules and accessed by means
+of use association, we need to be able to describe the type in the
+read-only static data section of the module and access the description
+as a link-time external.
+
+This is not always possible to achieve in the case of instantiations
+of parameterized derived types, however.
+Two identical instantiations in distinct compilation units of the same
+use associated parameterized derived type seem impractical to implement
+using the same address.
+(Perhaps some linkers would support unification of global objects
+with "mangled" names and identical contents, but this seems unportable.)
+
+Derived type descriptions therefore will contain pointers to
+their "uninstantiated" original derived types.
+For derived types with no `KIND` type parameters, these pointers
+will be null; for uninstantiated derived types, these pointers
+will point at themselves.