[flang-commits] [flang] 7875362 - [flang] Add the proposal document and rationale for the internal naming module that was previously added.

Eric Schweitz via flang-commits flang-commits at lists.llvm.org
Thu Apr 30 11:32:14 PDT 2020


Author: Eric Schweitz
Date: 2020-04-30T11:32:01-07:00
New Revision: 7875362986fcdb40230b8a142e8288eeb6d547eb

URL: https://github.com/llvm/llvm-project/commit/7875362986fcdb40230b8a142e8288eeb6d547eb
DIFF: https://github.com/llvm/llvm-project/commit/7875362986fcdb40230b8a142e8288eeb6d547eb.diff

LOG: [flang] Add the proposal document and rationale for the internal naming module that was previously added.

Summary:
This document describes how uniquing of internal names is done. This
name uniquing is done to support the constraints and invariants of the FIR
dialect of MLIR.

Reviewers: jeanPerier, mehdi_amini, DavidTruby, jdoerfert, sscalpone, kiranchandramohan

Reviewed By: jeanPerier, sscalpone, kiranchandramohan

Subscribers: tskeith, kiranchandramohan, rriddle, llvm-commits

Tags: #llvm, #flang

Differential Revision: https://reviews.llvm.org/D79089

Added: 
    flang/documentation/BijectiveInternalNameUniquing.md

Modified: 
    

Removed: 
    


################################################################################
diff  --git a/flang/documentation/BijectiveInternalNameUniquing.md b/flang/documentation/BijectiveInternalNameUniquing.md
new file mode 100644
index 000000000000..e23264aeb0b5
--- /dev/null
+++ b/flang/documentation/BijectiveInternalNameUniquing.md
@@ -0,0 +1,118 @@
+## Bijective Internal Name Uniquing
+
+FIR has a flat namespace: no two objects at the module level
+(functions, globals, etc.) may have the same name.
+This necessitates an encoding scheme that maps each front-end
+symbol to a unique name in FIR.
+
+The encoding must also be reversible (hence "bijective"): given a
+uniqued name, it must be possible to recover the associated symbol
+in the symbol table.
+
+Fortran is case-insensitive, which allows the compiler to convert the
+user's identifiers to all lower case.  After this universal conversion,
+all upper-case letters are available for use in uniquing.
+
+### Prefix `_Q`
+
+All uniqued names have the prefix sequence `_Q` to indicate the name has
+been uniqued.  (Q is chosen because it is a
+[low frequency letter](http://pi.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html)
+in English.)
+
+### Scope Building
+
+A symbol can be scoped by the module, submodule, or procedure that contains
+it.  After the `_Q` sigil, names are constructed from outermost to
+innermost scope as
+
+   * Module name prefixed with `M`
+   * Submodule name prefixed with `S`
+   * Procedure name (as an enclosing scope) prefixed with `F`
+
+Given:
+```
+    submodule (mod:s1mod) s2mod
+      ...
+      subroutine sub
+        ...
+      contains
+        function fun
+```
+
+The uniqued name of `fun` becomes:
+```
+    _QMmodSs1modSs2modFsubPfun
+```
+
+### Common blocks
+
+   * A common block name will be prefixed with `B`
+
+### Module scope global data
+
+   * A global data entity is prefixed with `E`
+   * A global entity that is constant (parameter) will be prefixed with `EC`
+
+### Procedures/Subprograms
+
+   * A procedure/subprogram is prefixed with `P`
+
+Given:
+```
+    subroutine sub
+```
+The uniqued name of `sub` becomes:
+```
+    _QPsub
+```
+
+### Derived types and related
+
+   * A derived type is prefixed with `T`
+   * If a derived type has KIND parameters, they are listed in a consistent
+     canonical order where each takes the form `Ki` and where _i_ is the
+     compile-time constant value. (All type parameters are integer.)  If _i_
+     is a negative value, the prefix `KN` will be used and _i_ will reflect
+     the magnitude of the value.
+
+Given:
+```
+    module mymodule
+      type mytype
+        integer :: member
+      end type
+      ...
+```
+The uniqued name of `mytype` becomes:
+```
+    _QMmymoduleTmytype
+```
+
+Given:
+```
+    type yourtype(k1,k2)
+      integer, kind :: k1, k2
+      real :: mem1
+      complex :: mem2
+    end type
+```
+
+The uniqued name of `yourtype` where `k1=4` and `k2=-6` (at compile time) becomes:
+```
+    _QTyourtypeK4KN6
+```
+
+   * A derived type dispatch table is prefixed with `D`.  The dispatch table
+     for `type t` would be `_QDTt`
+   * A type descriptor instance is prefixed with `C`.  Intrinsic types can
+     be encoded with their names and kinds.  The type descriptor for the
+     type `yourtype` above would be `_QCTyourtypeK4KN6`.  The type
+     descriptor for `REAL(4)` would be `_QCrealK4`.
+
+### Compiler generated names
+
+Compiler-generated names do not have to be mapped back to Fortran, as
+no symbol derived from the input source exists for them.  These names
+will be prefixed with `_QQ` and followed by a unique compiler-generated
+identifier.
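
The encoding rules in the document above can be sketched in a few lines. This is only an illustration written for this message, not the flang implementation and not part of the commit: the function names (`unique_name`, `encode_kinds`, `compiler_generated_name`) and the table layout are hypothetical. It covers the scope prefixes, the leaf prefixes `B`, `E`, `EC`, `P` and `T`, and the `K`/`KN` KIND suffixes. It leaves out the `D` and `C` prefixes, since the document does not say how they combine with scope prefixes.

```python
# Hypothetical sketch of the _Q uniquing scheme; not the flang implementation.

# Prefixes for enclosing scopes, applied from outermost to innermost.
SCOPE_PREFIXES = {"module": "M", "submodule": "S", "procedure": "F"}

# Prefixes for the entity being named.
LEAF_PREFIXES = {
    "common": "B",      # common block
    "data": "E",        # global data entity
    "constant": "EC",   # global constant (parameter)
    "procedure": "P",   # procedure/subprogram
    "type": "T",        # derived type
}


def encode_kinds(kinds):
    """Encode KIND parameter values: K<i>, or KN<magnitude> if negative."""
    return "".join(f"KN{-k}" if k < 0 else f"K{k}" for k in kinds)


def unique_name(leaf_kind, name, scopes=(), kinds=()):
    """Build a uniqued name.

    scopes is a sequence of (scope_kind, scope_name) pairs ordered from
    outermost to innermost; kinds is only meaningful for derived types.
    Identifiers are lowered first, which keeps upper-case letters free
    for use as prefixes.
    """
    parts = ["_Q"]
    for scope_kind, scope_name in scopes:
        parts.append(SCOPE_PREFIXES[scope_kind] + scope_name.lower())
    parts.append(LEAF_PREFIXES[leaf_kind] + name.lower() + encode_kinds(kinds))
    return "".join(parts)


def compiler_generated_name(ident):
    """Compiler-generated names use the _QQ prefix and are never decoded."""
    return "_QQ" + ident


if __name__ == "__main__":
    nested = [("module", "mod"), ("submodule", "s1mod"),
              ("submodule", "s2mod"), ("procedure", "sub")]
    print(unique_name("procedure", "fun", scopes=nested))
    # _QMmodSs1modSs2modFsubPfun
    print(unique_name("procedure", "sub"))
    # _QPsub
    print(unique_name("type", "mytype", scopes=[("module", "mymodule")]))
    # _QMmymoduleTmytype
    print(unique_name("type", "yourtype", kinds=(4, -6)))
    # _QTyourtypeK4KN6
```

The four printed names match the examples given in the document. Decoding is not shown. Because user identifiers contain no upper-case letters after lowering, each upper-case letter marks the start of a new component; a decoder would rely on that, with the two-letter prefixes `EC` and `KN` handled specially.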


        

