[clang] Revise the modules document for clarity (PR #90237)

via cfe-commits cfe-commits at lists.llvm.org
Fri Apr 26 10:17:08 PDT 2024


llvmbot wrote:


<!--LLVM PR SUMMARY COMMENT-->

@llvm/pr-subscribers-clang-modules

Author: Aaron Ballman (AaronBallman)

<details>
<summary>Changes</summary>

The intention isn't to add or change the information provided, but to improve clarity through some grammar fixes, improvements to the markdown, and so forth.

---

Patch is 84.98 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90237.diff


1 Files Affected:

- (modified) clang/docs/StandardCPlusPlusModules.rst (+561-572) 


``````````diff
diff --git a/clang/docs/StandardCPlusPlusModules.rst b/clang/docs/StandardCPlusPlusModules.rst
index ee57fb5da64857..e0ceb2582725c2 100644
--- a/clang/docs/StandardCPlusPlusModules.rst
+++ b/clang/docs/StandardCPlusPlusModules.rst
@@ -8,79 +8,60 @@ Standard C++ Modules
 Introduction
 ============
 
-The term ``modules`` has a lot of meanings. For the users of Clang, modules may
-refer to ``Objective-C Modules``, ``Clang C++ Modules`` (or ``Clang Header Modules``,
-etc.) or ``Standard C++ Modules``. The implementation of all these kinds of modules in Clang
-has a lot of shared code, but from the perspective of users, their semantics and
-command line interfaces are very different. This document focuses on
-an introduction of how to use standard C++ modules in Clang.
-
-There is already a detailed document about `Clang modules <Modules.html>`_, it
-should be helpful to read `Clang modules <Modules.html>`_ if you want to know
-more about the general idea of modules. Since standard C++ modules have different semantics
-(and work flows) from `Clang modules`, this page describes the background and use of
-Clang with standard C++ modules.
-
-Modules exist in two forms in the C++ Language Specification. They can refer to
-either "Named Modules" or to "Header Units". This document covers both forms.
+The term ``modules`` has a lot of meanings. For Clang users, modules may refer
+to ``Objective-C Modules``, `Clang Modules <Modules.html>`_ (also called
+``Clang Header Modules``, etc.) or ``C++20 Modules`` (or
+``Standard C++ Modules``). The implementation of all these kinds of modules in
+Clang shares a lot of code, but from the perspective of users, their semantics
+and command line interfaces are very different. This document focuses on an
+introduction of focusing on the use of C++20 modules in Clang. In the remainder
+of this document, the term ``modules`` will refer to Standard C++20 modules and
+the term ``Clang modules`` will refer to the Clang modules extension.
+
+Modules exist in two forms in the C++ Standard. They can refer to either
+"Named Modules" or "Header Units". This document covers both forms.
 
 Standard C++ Named modules
 ==========================
 
-This document was intended to be a manual first and foremost, however, we consider it helpful to
-introduce some language background here for readers who are not familiar with
-the new language feature. This document is not intended to be a language
-tutorial; it will only introduce necessary concepts about the
-structure and building of the project.
+In order to understand compiler behavior, it is helpful to introduce some
+language background here for readers who are not familiar with the C++ feature.
+This document is not a tutorial on C++; it only introduces necessary concepts
+to better understand use of modules for a project.
 
 Background and terminology
 --------------------------
 
-Modules
-~~~~~~~
-
-In this document, the term ``Modules``/``modules`` refers to standard C++ modules
-feature if it is not decorated by ``Clang``.
-
-Clang Modules
-~~~~~~~~~~~~~
-
-In this document, the term ``Clang Modules``/``Clang modules`` refer to Clang
-c++ modules extension. These are also known as ``Clang header modules``,
-``Clang module map modules`` or ``Clang c++ modules``.
-
 Module and module unit
 ~~~~~~~~~~~~~~~~~~~~~~
 
-A module consists of one or more module units. A module unit is a special
-translation unit. Every module unit must have a module declaration. The syntax
-of the module declaration is:
+A module consists of one or more module units. A module unit is a special kind
+of translation unit. Every module unit must have a module declaration. The
+syntax of the module declaration is:
 
 .. code-block:: c++
 
   [export] module module_name[:partition_name];
 
-Terms enclosed in ``[]`` are optional. The syntax of ``module_name`` and ``partition_name``
-in regex form corresponds to ``[a-zA-Z_][a-zA-Z_0-9\.]*``. In particular, a literal dot ``.``
-in the name has no semantic meaning (e.g. implying a hierarchy).
-
-In this document, module units are classified into:
-
-* Primary module interface unit.
-
-* Module implementation unit.
+Terms enclosed in ``[]`` are optional. ``module_name`` and ``partition_name``
+are typical C++ identifiers, except that they may contain a period (``.``).
+Note that a ``.`` in the name has no semantic meaning (e.g. implying a
+hierarchy).
 
-* Module interface partition unit.
+In this document, module units are classified as:
 
-* Internal module partition unit.
+* Primary module interface unit
+* Module implementation unit
+* Module interface partition unit
+* Internal module partition unit
 
 A primary module interface unit is a module unit whose module declaration is
-``export module module_name;``. The ``module_name`` here denotes the name of the
+``export module module_name;`` where ``module_name`` denotes the name of the
 module. A module should have one and only one primary module interface unit.
 
 A module implementation unit is a module unit whose module declaration is
-``module module_name;``. A module could have multiple module implementation
-units with the same declaration.
+``module module_name;``. Multiple module implementation units can be declared
+in the same translation unit.
 
 A module interface partition unit is a module unit whose module declaration is
 ``export module module_name:partition_name;``. The ``partition_name`` should be
@@ -95,22 +76,23 @@ In this document, we use the following umbrella terms:
 * A ``module interface unit`` refers to either a ``primary module interface unit``
   or a ``module interface partition unit``.
 
-* An ``importable module unit`` refers to either a ``module interface unit``
-  or a ``internal module partition unit``.
+* An ``importable module unit`` refers to either a ``module interface unit`` or
+  an ``internal module partition unit``.
 
 * A ``module partition unit`` refers to either a ``module interface partition unit``
-  or a ``internal module partition unit``.
+  or an ``internal module partition unit``.
 
-Built Module Interface file
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Binary Module Interface
+~~~~~~~~~~~~~~~~~~~~~~~
 
-A ``Built Module Interface file`` stands for the precompiled result of an importable module unit.
-It is also called the acronym ``BMI`` generally.
+A ``Binary Module Interface`` (or ``BMI``) is the precompiled result of an
+importable module unit.
 
 Global module fragment
 ~~~~~~~~~~~~~~~~~~~~~~
 
-In a module unit, the section from ``module;`` to the module declaration is called the global module fragment.
+The ``global module fragment`` (or ``GMF``) is the code between the ``module;``
+and the module declaration within a module unit.
 
 
 How to build projects using modules
@@ -138,7 +120,7 @@ Let's see a "hello world" example that uses modules.
     return 0;
   }
 
-Then we type:
+From the command line, enter:
 
 .. code-block:: console
 
@@ -148,9 +130,9 @@ Then we type:
   Hello World!
 
 In this example, we make and use a simple module ``Hello`` which contains only a
-primary module interface unit ``Hello.cppm``.
+primary module interface unit named ``Hello.cppm``.
 
-Then let's see a little bit more complex "hello world" example which uses the 4 kinds of module units.
+A more complex "hello world" example which uses the 4 kinds of module units is:
 
 .. code-block:: c++
 
@@ -192,7 +174,7 @@ Then let's see a little bit more complex "hello world" example which uses the 4
     return 0;
   }
 
-Then we are able to compile the example by the following command:
+Back on the command line, compile the example with:
 
 .. code-block:: console
 
@@ -216,51 +198,56 @@ We explain the options in the following sections.
 How to enable standard C++ modules
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Currently, standard C++ modules are enabled automatically
-if the language standard is ``-std=c++20`` or newer.
+Standard C++ modules are enabled automatically if the language standard is
+``-std=c++20`` or newer.
 
 How to produce a BMI
 ~~~~~~~~~~~~~~~~~~~~
 
-We can generate a BMI for an importable module unit by either ``--precompile``
-or ``-fmodule-output`` flags.
+To generate a BMI for an importable module unit, use either the ``--precompile``
+or ``-fmodule-output`` command line option.
 
-The ``--precompile`` option generates the BMI as the output of the compilation and the output path
-can be specified using the ``-o`` option.
+The ``--precompile`` option generates the BMI as the output of the compilation
+and the output path can be specified using the ``-o`` option.
 
-The ``-fmodule-output`` option generates the BMI as a by-product of the compilation.
-If ``-fmodule-output=`` is specified, the BMI will be emitted the specified location. Then if
-``-fmodule-output`` and ``-c`` are specified, the BMI will be emitted in the directory of the
-output file with the name of the input file with the new extension ``.pcm``. Otherwise, the BMI
-will be emitted in the working directory with the name of the input file with the new extension
+The ``-fmodule-output`` option generates the BMI as a by-product of the
+compilation. If ``-fmodule-output=`` is specified, the BMI will be emitted to
+the specified location. If ``-fmodule-output`` and ``-c`` are specified, the
+BMI will be emitted in the directory of the output file with the name of the
+input file with the extension ``.pcm``. Otherwise, the BMI will be emitted in
+the working directory with the name of the input file with the extension
 ``.pcm``.
 
-The style to generate BMIs by ``--precompile`` is called two-phase compilation since it takes
-2 steps to compile a source file to an object file. The style to generate BMIs by ``-fmodule-output``
-is called one-phase compilation respectively. The one-phase compilation model is simpler
-for build systems to implement and the two-phase compilation has the potential to compile faster due
-to higher parallelism. As an example, if there are two module units A and B, and B depends on A, the
-one-phase compilation model would need to compile them serially, whereas the two-phase compilation
-model may be able to compile them simultaneously if the compilation from A.pcm to A.o takes a long
-time.
-
-File name requirement
-~~~~~~~~~~~~~~~~~~~~~
+Generating BMIs with ``--precompile`` is called two-phase compilation because
+it takes two steps to compile a source file to an object file. Generating BMIs
+with ``-fmodule-output`` is called one-phase compilation. The one-phase
+compilation model is simpler for build systems to implement and the two-phase
+compilation has the potential to compile faster due to higher parallelism. As
+an example, if there are two module units ``A`` and ``B``, and ``B`` depends on
+``A``, the one-phase compilation model needs to compile them serially, whereas
+the two-phase compilation model may be able to compile them simultaneously if
+the compilation from ``A.pcm`` to ``A.o`` takes a long time.
+
+File name requirements
+~~~~~~~~~~~~~~~~~~~~~~
 
-The file name of an ``importable module unit`` should end with ``.cppm``
-(or ``.ccm``, ``.cxxm``, ``.c++m``). The file name of a ``module implementation unit``
-should end with ``.cpp`` (or ``.cc``, ``.cxx``, ``.c++``).
+By convention, ``importable module unit`` files should use ``.cppm`` (or
+``.ccm``, ``.cxxm``, or ``.c++m``) as a file extension.
+``Module implementation unit`` files should use ``.cpp`` (or ``.cc``, ``.cxx``,
+or ``.c++``) as a file extension.
 
-The file name of BMIs should end with ``.pcm``.
-The file name of the BMI of a ``primary module interface unit`` should be ``module_name.pcm``.
-The file name of BMIs of ``module partition unit`` should be ``module_name-partition_name.pcm``.
+A BMI should use ``.pcm`` as a file extension. The file name of the BMI for a
+``primary module interface unit`` should be ``module_name.pcm``. The file name
+of a BMI for a ``module partition unit`` should be
+``module_name-partition_name.pcm``.
 
-If the file names use different extensions, Clang may fail to build the module.
-For example, if the filename of an ``importable module unit`` ends with ``.cpp`` instead of ``.cppm``,
-then we can't generate a BMI for the ``importable module unit`` by ``--precompile`` option
-since ``--precompile`` option now would only run preprocessor, which is equal to `-E` now.
-If we want the filename of an ``importable module unit`` ends with other suffixes instead of ``.cppm``,
-we could put ``-x c++-module`` in front of the file. For example,
+Clang may fail to build the module if different extensions are used. For
+example, if the filename of an ``importable module unit`` ends with ``.cpp``
+instead of ``.cppm``, then Clang cannot generate a BMI for the
+``importable module unit`` with the ``--precompile`` option because the
+``--precompile`` option would only run the preprocessor (``-E``). If using a
+different extension than the conventional one for an ``importable module unit``
+you can specify ``-x c++-module`` before the file. For example,
 
 .. code-block:: c++
 
@@ -279,8 +266,9 @@ we could put ``-x c++-module`` in front of the file. For example,
     return 0;
   }
 
-Now the filename of the ``module interface`` ends with ``.cpp`` instead of ``.cppm``,
-we can't compile them by the original command lines. But we are still able to do it by:
+In this example, the extension used by the ``module interface`` is ``.cpp``
+instead of ``.cppm``, so is cannot be compiled with the command lines from the
+previous example, but it can be compiled with:
 
 .. code-block:: console
 
@@ -289,12 +277,12 @@ we can't compile them by the original command lines. But we are still able to do
   $ ./Hello.out
   Hello World!
 
-Module name requirement
-~~~~~~~~~~~~~~~~~~~~~~~
+Module name requirements
+~~~~~~~~~~~~~~~~~~~~~~~~
 
-[module.unit]p1 says:
+..
 
-.. code-block:: text
+  [module.unit]p1:
 
   All module-names either beginning with an identifier consisting of std followed by zero
   or more digits or containing a reserved identifier ([lex.name]) are reserved and shall not
@@ -302,7 +290,7 @@ Module name requirement
   module-name is a reserved identifier, the module name is reserved for use by C++ implementations;
   otherwise it is reserved for future standardization.
 
-So all of the following name is not valid by default:
+So none of the following names are valid by default:
 
 .. code-block:: text
 
@@ -312,75 +300,76 @@ So all of the following name is not valid by default:
     __test
     // and so on ...
 
-If you still want to use the reserved module names for any reason, use
-``-Wno-reserved-module-identifier`` to suppress the warning.
+Using a reserved module name is strongly discouraged, but
+``-Wno-reserved-module-identifier`` can be used to suppress the warning.
 
-How to specify the dependent BMIs
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Specifying dependent BMIs
+~~~~~~~~~~~~~~~~~~~~~~~~~
 
-There are 3 methods to specify the dependent BMIs:
+There are 3 ways to specify a dependent BMI:
 
-* (1) ``-fprebuilt-module-path=<path/to/directory>``.
-* (2) ``-fmodule-file=<path/to/BMI>`` (Deprecated).
-* (3) ``-fmodule-file=<module-name>=<path/to/BMI>``.
+1. ``-fprebuilt-module-path=<path/to/directory>``.
+2. ``-fmodule-file=<path/to/BMI>`` (Deprecated).
+3. ``-fmodule-file=<module-name>=<path/to/BMI>``.
 
-The option ``-fprebuilt-module-path`` tells the compiler the path where to search for dependent BMIs.
-It may be used multiple times just like ``-I`` for specifying paths for header files. The look up rule here is:
+The ``-fprebuilt-module-path`` option specifies the path to search for
+dependent BMIs. Multiple paths may be specified, similar to using ``-I`` to
+specify a search path for header files. When importing a module ``M``, the
+compiler looks for ``M.pcm`` in the directories specified by
+``-fprebuilt-module-path``. Similarly,  When importing a partition module unit
+``M:P``, the compiler looks for ``M-P.pcm`` in the directories specified by
+``-fprebuilt-module-path``.
 
-* (1) When we import module M. The compiler would look up M.pcm in the directories specified
-  by ``-fprebuilt-module-path``.
-* (2) When we import partition module unit M:P. The compiler would look up M-P.pcm in the
-  directories specified by ``-fprebuilt-module-path``.
-
-The option ``-fmodule-file=<path/to/BMI>`` tells the compiler to load the specified BMI directly.
-The option ``-fmodule-file=<module-name>=<path/to/BMI>`` tells the compiler to load the specified BMI
-for the module specified by ``<module-name>`` when necessary. The main difference is that
+The ``-fmodule-file=<path/to/BMI>`` option causes the compiler to load the
+specified BMI directly. The ``-fmodule-file=<module-name>=<path/to/BMI>``
+option causes the compiler to load the specified BMI for the module specified
+by ``<module-name>`` when necessary. The main difference is that
 ``-fmodule-file=<path/to/BMI>`` will load the BMI eagerly, whereas
-``-fmodule-file=<module-name>=<path/to/BMI>`` will only load the BMI lazily, which is similar
-with ``-fprebuilt-module-path``. The option ``-fmodule-file=<path/to/BMI>`` for named modules is deprecated
-and is planning to be removed in future versions.
+``-fmodule-file=<module-name>=<path/to/BMI>`` will only load the BMI lazily,
+which is similar to ``-fprebuilt-module-path``. The
+``-fmodule-file=<path/to/BMI>`` option for named modules is deprecated and will
+be removed in a future version of Clang.
 
-In case all ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>`` and
-``-fmodule-file=<module-name>=<path/to/BMI>`` exist, the ``-fmodule-file=<path/to/BMI>`` option
-takes highest precedence and ``-fmodule-file=<module-name>=<path/to/BMI>`` will take the second
-highest precedence.
+When these options are specified in the same invocation of the compiler, the
+``-fmodule-file=<path/to/BMI>`` option takes precedence over
+``-fmodule-file=<module-name>=<path/to/BMI>``, which takes precedence over
+``-fprebuilt-module-path=<path/to/directory>``.
 
-We need to specify all the dependent (directly and indirectly) BMIs.
-See https://github.com/llvm/llvm-project/issues/62707 for detail.
+Note: you must specify all the (directly or indirectly) dependent BMIs
+explicitly. See https://github.com/llvm/llvm-project/issues/62707 for details.
 
-When we compile a ``module implementation unit``, we must specify the BMI of the corresponding
-``primary module interface unit``.
-Since the language specification says a module implementation unit implicitly imports
-the primary module interface unit.
+When compiling a ``module implementation unit``, the BMI of the corresponding
+``primary module interface unit`` must be specified. This is because a module
+implementation unit implicitly imports the primary module interface unit.
 
   [module.unit]p8
 
   A module-declaration that contains neither an export-keyword nor a module-partition implicitly
   imports the primary module interface unit of the module as if by a module-import-declaration.
 
-All of the 3 options ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>``
-and ``-fmodule-file=<module-name>=<path/to/BMI>`` may occur multiple times.
-For example, the command line to compile ``M.cppm`` in
-the above example could be rewritten into:
+The ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>``, 
+and ``-fmodule-file=<module-name>=<path/to/BMI>`` options may be specified
+multiple times. For example, the command line to compile ``M.cppm`` in
+the previous example could be rewritten as:
 
 .. code-block:: console
 
   $ clang++ -std=c++20 M.cppm --precompile -fmodule-file=M:interface_part=M-interface_part.pcm -fmodule-file=M:impl_part=M-impl_part.pcm -o M.pcm
 
 When there are multiple ``-fmodule-file=<module-name>=`` options for the same
-``<module-name>``, the last ``-fmodule-file=<module-name>=`` will override the previous
-``-fmodule-file=<module-name>=`` options.
+``<module-name>``, the last ``-fmodule-file=<module-name>=`` overrides the
+previous ``-fmodule-file=<module-name>=`` option.
 
-``-fprebuilt-module-path`` is more convenient and ``-fmodule-file`` is faster since
-it saves time for file lookup.
+``-fprebuilt-module-path`` is more convenient while ``-fmodule-file`` is faster
+because it saves time for file lookup.
 
 Remember that module units still have an object counterpart to the BMI
 ~~~~~~~~~~~~...
[truncated]

``````````

</details>


https://github.com/llvm/llvm-project/pull/90237


More information about the cfe-commits mailing list