[clang] 9263318 - Revise the modules document for clarity (#90237)
via cfe-commits
cfe-commits at lists.llvm.org
Wed May 8 09:01:26 PDT 2024
Author: Aaron Ballman
Date: 2024-05-08T12:01:22-04:00
New Revision: 9263318f9da22e6632f1aae3e85717ed88fde64f
URL: https://github.com/llvm/llvm-project/commit/9263318f9da22e6632f1aae3e85717ed88fde64f
DIFF: https://github.com/llvm/llvm-project/commit/9263318f9da22e6632f1aae3e85717ed88fde64f.diff
LOG: Revise the modules document for clarity (#90237)
The intention isn't to add or change the information provided, but to
improve clarity through some grammar fixes, improvements to the
markdown, and so forth.
Added:
Modified:
clang/docs/StandardCPlusPlusModules.rst
Removed:
################################################################################
diff --git a/clang/docs/StandardCPlusPlusModules.rst b/clang/docs/StandardCPlusPlusModules.rst
index ee57fb5da648..1c3c4d319c0e 100644
--- a/clang/docs/StandardCPlusPlusModules.rst
+++ b/clang/docs/StandardCPlusPlusModules.rst
@@ -8,109 +8,92 @@ Standard C++ Modules
Introduction
============
-The term ``modules`` has a lot of meanings. For the users of Clang, modules may
-refer to ``Objective-C Modules``, ``Clang C++ Modules`` (or ``Clang Header Modules``,
-etc.) or ``Standard C++ Modules``. The implementation of all these kinds of modules in Clang
-has a lot of shared code, but from the perspective of users, their semantics and
-command line interfaces are very different. This document focuses on
-an introduction of how to use standard C++ modules in Clang.
-
-There is already a detailed document about `Clang modules <Modules.html>`_, it
-should be helpful to read `Clang modules <Modules.html>`_ if you want to know
-more about the general idea of modules. Since standard C++ modules have different semantics
-(and work flows) from `Clang modules`, this page describes the background and use of
-Clang with standard C++ modules.
-
-Modules exist in two forms in the C++ Language Specification. They can refer to
-either "Named Modules" or to "Header Units". This document covers both forms.
+The term ``module`` is ambiguous, as it is used to mean multiple things in
+Clang. For Clang users, a module may refer to an ``Objective-C Module``,
+`Clang Module <Modules.html>`_ (also called a ``Clang Header Module``) or a
+``C++20 Module`` (or a ``Standard C++ Module``). The implementation of all
+these kinds of modules in Clang shares a lot of code, but from the perspective
+of users their semantics and command line interfaces are very different. This
+document is an introduction to the use of C++20 modules in Clang. In the
+remainder of this document, the term ``module`` will refer to Standard C++20
+modules and the term ``Clang module`` will refer to the Clang Modules
+extension.
+
+In terms of the C++ Standard, modules consist of two components: "Named
+Modules" and "Header Units". This document covers both.
Standard C++ Named modules
==========================
-This document was intended to be a manual first and foremost, however, we consider it helpful to
-introduce some language background here for readers who are not familiar with
-the new language feature. This document is not intended to be a language
-tutorial; it will only introduce necessary concepts about the
-structure and building of the project.
+In order to better understand the compiler's behavior, it is helpful to
+understand some terms and definitions for readers who are not familiar with the
+C++ feature. This document is not a tutorial on C++; it only introduces
+necessary concepts to better understand use of modules in a project.
Background and terminology
--------------------------
-Modules
-~~~~~~~
-
-In this document, the term ``Modules``/``modules`` refers to standard C++ modules
-feature if it is not decorated by ``Clang``.
-
-Clang Modules
-~~~~~~~~~~~~~
-
-In this document, the term ``Clang Modules``/``Clang modules`` refer to Clang
-c++ modules extension. These are also known as ``Clang header modules``,
-``Clang module map modules`` or ``Clang c++ modules``.
-
Module and module unit
~~~~~~~~~~~~~~~~~~~~~~
-A module consists of one or more module units. A module unit is a special
-translation unit. Every module unit must have a module declaration. The syntax
-of the module declaration is:
+A module consists of one or more module units. A module unit is a special kind
+of translation unit. A module unit should almost always start with a module
+declaration. The syntax of the module declaration is:
.. code-block:: c++
[export] module module_name[:partition_name];
-Terms enclosed in ``[]`` are optional. The syntax of ``module_name`` and ``partition_name``
-in regex form corresponds to ``[a-zA-Z_][a-zA-Z_0-9\.]*``. In particular, a literal dot ``.``
-in the name has no semantic meaning (e.g. implying a hierarchy).
+Terms enclosed in ``[]`` are optional. ``module_name`` and ``partition_name``
+follow the rules for a C++ identifier, except that they may contain one or more
+period (``.``) characters. Note that a ``.`` in the name has no semantic
+meaning and does not imply any hierarchy.
-In this document, module units are classified into:
+In this document, module units are classified as:
-* Primary module interface unit.
-
-* Module implementation unit.
-
-* Module interface partition unit.
-
-* Internal module partition unit.
+* Primary module interface unit
+* Module implementation unit
+* Module partition interface unit
+* Internal module partition unit
A primary module interface unit is a module unit whose module declaration is
-``export module module_name;``. The ``module_name`` here denotes the name of the
+``export module module_name;`` where ``module_name`` denotes the name of the
module. A module should have one and only one primary module interface unit.
A module implementation unit is a module unit whose module declaration is
-``module module_name;``. A module could have multiple module implementation
-units with the same declaration.
+``module module_name;``. Multiple module implementation units can be declared
+in the same module.
-A module interface partition unit is a module unit whose module declaration is
+A module partition interface unit is a module unit whose module declaration is
``export module module_name:partition_name;``. The ``partition_name`` should be
unique within any given module.
-An internal module partition unit is a module unit whose module declaration
-is ``module module_name:partition_name;``. The ``partition_name`` should be
-unique within any given module.
+An internal module partition unit is a module unit whose module
+declaration is ``module module_name:partition_name;``. The ``partition_name``
+should be unique within any given module.
-In this document, we use the following umbrella terms:
+In this document, we use the following terms:
* A ``module interface unit`` refers to either a ``primary module interface unit``
- or a ``module interface partition unit``.
+ or a ``module partition interface unit``.
-* An ``importable module unit`` refers to either a ``module interface unit``
- or a ``internal module partition unit``.
+* An ``importable module unit`` refers to either a ``module interface unit`` or
+ an ``internal module partition unit``.
-* A ``module partition unit`` refers to either a ``module interface partition unit``
- or a ``internal module partition unit``.
+* A ``module partition unit`` refers to either a ``module partition interface unit``
+ or an ``internal module partition unit``.
-Built Module Interface file
-~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Built Module Interface
+~~~~~~~~~~~~~~~~~~~~~~
-A ``Built Module Interface file`` stands for the precompiled result of an importable module unit.
-It is also called the acronym ``BMI`` generally.
+A ``Built Module Interface`` (or ``BMI``) is the precompiled result of an
+importable module unit.
Global module fragment
~~~~~~~~~~~~~~~~~~~~~~
-In a module unit, the section from ``module;`` to the module declaration is called the global module fragment.
+The ``global module fragment`` (or ``GMF``) is the code between the ``module;``
+and the module declaration within a module unit.
How to build projects using modules
@@ -138,7 +121,7 @@ Let's see a "hello world" example that uses modules.
return 0;
}
-Then we type:
+Then, on the command line, invoke Clang like:
.. code-block:: console
@@ -148,9 +131,9 @@ Then we type:
Hello World!
In this example, we make and use a simple module ``Hello`` which contains only a
-primary module interface unit ``Hello.cppm``.
+primary module interface unit named ``Hello.cppm``.
-Then let's see a little bit more complex "hello world" example which uses the 4 kinds of module units.
+A more complex "hello world" example which uses the 4 kinds of module units is:
.. code-block:: c++
@@ -192,7 +175,7 @@ Then let's see a little bit more complex "hello world" example which uses the 4
return 0;
}
-Then we are able to compile the example by the following command:
+Then, back on the command line, invoke Clang with:
.. code-block:: console
@@ -216,51 +199,57 @@ We explain the options in the following sections.
How to enable standard C++ modules
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Currently, standard C++ modules are enabled automatically
-if the language standard is ``-std=c++20`` or newer.
+Standard C++ modules are enabled automatically when the language standard mode
+is ``-std=c++20`` or newer.
How to produce a BMI
~~~~~~~~~~~~~~~~~~~~
-We can generate a BMI for an importable module unit by either ``--precompile``
-or ``-fmodule-output`` flags.
+To generate a BMI for an importable module unit, use either the ``--precompile``
+or ``-fmodule-output`` command line options.
-The ``--precompile`` option generates the BMI as the output of the compilation and the output path
-can be specified using the ``-o`` option.
+The ``--precompile`` option generates the BMI as the output of the compilation
+with the output path specified using the ``-o`` option.
-The ``-fmodule-output`` option generates the BMI as a by-product of the compilation.
-If ``-fmodule-output=`` is specified, the BMI will be emitted the specified location. Then if
-``-fmodule-output`` and ``-c`` are specified, the BMI will be emitted in the directory of the
-output file with the name of the input file with the new extension ``.pcm``. Otherwise, the BMI
-will be emitted in the working directory with the name of the input file with the new extension
+The ``-fmodule-output`` option generates the BMI as a by-product of the
+compilation. If ``-fmodule-output=`` is specified, the BMI will be emitted to
+the specified location. If ``-fmodule-output`` and ``-c`` are specified, the
+BMI will be emitted in the directory of the output file with the name of the
+input file with the extension ``.pcm``. Otherwise, the BMI will be emitted in
+the working directory with the name of the input file with the extension
``.pcm``.
-The style to generate BMIs by ``--precompile`` is called two-phase compilation since it takes
-2 steps to compile a source file to an object file. The style to generate BMIs by ``-fmodule-output``
-is called one-phase compilation respectively. The one-phase compilation model is simpler
-for build systems to implement and the two-phase compilation has the potential to compile faster due
-to higher parallelism. As an example, if there are two module units A and B, and B depends on A, the
-one-phase compilation model would need to compile them serially, whereas the two-phase compilation
-model may be able to compile them simultaneously if the compilation from A.pcm to A.o takes a long
-time.
-
-File name requirement
-~~~~~~~~~~~~~~~~~~~~~
-
-The file name of an ``importable module unit`` should end with ``.cppm``
-(or ``.ccm``, ``.cxxm``, ``.c++m``). The file name of a ``module implementation unit``
-should end with ``.cpp`` (or ``.cc``, ``.cxx``, ``.c++``).
-
-The file name of BMIs should end with ``.pcm``.
-The file name of the BMI of a ``primary module interface unit`` should be ``module_name.pcm``.
-The file name of BMIs of ``module partition unit`` should be ``module_name-partition_name.pcm``.
-
-If the file names use different extensions, Clang may fail to build the module.
-For example, if the filename of an ``importable module unit`` ends with ``.cpp`` instead of ``.cppm``,
-then we can't generate a BMI for the ``importable module unit`` by ``--precompile`` option
-since ``--precompile`` option now would only run preprocessor, which is equal to `-E` now.
-If we want the filename of an ``importable module unit`` ends with other suffixes instead of ``.cppm``,
-we could put ``-x c++-module`` in front of the file. For example,
+Generating BMIs with ``--precompile`` is referred to as two-phase compilation
+because it takes two steps to compile a source file to an object file.
+Generating BMIs with ``-fmodule-output`` is called one-phase compilation. The
+one-phase compilation model is simpler for build systems to implement while the
+two-phase compilation has the potential to compile faster due to higher
+parallelism. As an example, if there are two module units ``A`` and ``B``, and
+``B`` depends on ``A``, the one-phase compilation model needs to compile them
+serially, whereas with the two-phase compilation model ``B`` can be compiled as
+soon as ``A.pcm`` is available, and thus simultaneously with the ``A.pcm`` to
+``A.o`` compilation step.
+
+File name requirements
+~~~~~~~~~~~~~~~~~~~~~~
+
+By convention, ``importable module unit`` files should use ``.cppm`` (or
+``.ccm``, ``.cxxm``, or ``.c++m``) as a file extension.
+``Module implementation unit`` files should use ``.cpp`` (or ``.cc``, ``.cxx``,
+or ``.c++``) as a file extension.
+
+A BMI should use ``.pcm`` as a file extension. The file name of the BMI for a
+``primary module interface unit`` should be ``module_name.pcm``. The file name
+of a BMI for a ``module partition unit`` should be
+``module_name-partition_name.pcm``.
+
+Clang may fail to build the module if different extensions are used. For
+example, if the filename of an ``importable module unit`` ends with ``.cpp``
+instead of ``.cppm``, then Clang cannot generate a BMI for the
+``importable module unit`` with the ``--precompile`` option because the
+``--precompile`` option would only run the preprocessor (``-E``). If using a
+different extension than the conventional one for an ``importable module unit``
+you can specify ``-x c++-module`` before the file. For example,
.. code-block:: c++
@@ -279,8 +268,9 @@ we could put ``-x c++-module`` in front of the file. For example,
return 0;
}
-Now the filename of the ``module interface`` ends with ``.cpp`` instead of ``.cppm``,
-we can't compile them by the original command lines. But we are still able to do it by:
+In this example, the extension used by the ``module interface`` is ``.cpp``
+instead of ``.cppm``, so it cannot be compiled like the previous example, but
+it can be compiled with:
.. code-block:: console
@@ -289,12 +279,12 @@ we can't compile them by the original command lines. But we are still able to do
$ ./Hello.out
Hello World!
-Module name requirement
-~~~~~~~~~~~~~~~~~~~~~~~
+Module name requirements
+~~~~~~~~~~~~~~~~~~~~~~~~
-[module.unit]p1 says:
+..
-.. code-block:: text
+ [module.unit]p1:
All module-names either beginning with an identifier consisting of std followed by zero
or more digits or containing a reserved identifier ([lex.name]) are reserved and shall not
@@ -302,7 +292,7 @@ Module name requirement
module-name is a reserved identifier, the module name is reserved for use by C++ implementations;
otherwise it is reserved for future standardization.
-So all of the following name is not valid by default:
+Therefore, none of the following names are valid by default:
.. code-block:: text
@@ -312,75 +302,74 @@ So all of the following name is not valid by default:
__test
// and so on ...
-If you still want to use the reserved module names for any reason, use
-``-Wno-reserved-module-identifier`` to suppress the warning.
+Using a reserved module name is strongly discouraged, but
+``-Wno-reserved-module-identifier`` can be used to suppress the warning.
-How to specify the dependent BMIs
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Specifying dependent BMIs
+~~~~~~~~~~~~~~~~~~~~~~~~~
-There are 3 methods to specify the dependent BMIs:
+There are 3 ways to specify a dependent BMI:
-* (1) ``-fprebuilt-module-path=<path/to/directory>``.
-* (2) ``-fmodule-file=<path/to/BMI>`` (Deprecated).
-* (3) ``-fmodule-file=<module-name>=<path/to/BMI>``.
+1. ``-fprebuilt-module-path=<path/to/directory>``.
+2. ``-fmodule-file=<path/to/BMI>`` (Deprecated).
+3. ``-fmodule-file=<module-name>=<path/to/BMI>``.
-The option ``-fprebuilt-module-path`` tells the compiler the path where to search for dependent BMIs.
-It may be used multiple times just like ``-I`` for specifying paths for header files. The look up rule here is:
+The ``-fprebuilt-module-path`` option specifies the path to search for
+dependent BMIs. Multiple paths may be specified, similar to using ``-I`` to
+specify a search path for header files. When importing a module ``M``, the
+compiler looks for ``M.pcm`` in the directories specified by
+``-fprebuilt-module-path``. Similarly, when importing a partition module unit
+``M:P``, the compiler looks for ``M-P.pcm`` in the directories specified by
+``-fprebuilt-module-path``.
-* (1) When we import module M. The compiler would look up M.pcm in the directories specified
- by ``-fprebuilt-module-path``.
-* (2) When we import partition module unit M:P. The compiler would look up M-P.pcm in the
- directories specified by ``-fprebuilt-module-path``.
-
-The option ``-fmodule-file=<path/to/BMI>`` tells the compiler to load the specified BMI directly.
-The option ``-fmodule-file=<module-name>=<path/to/BMI>`` tells the compiler to load the specified BMI
-for the module specified by ``<module-name>`` when necessary. The main difference is that
+The ``-fmodule-file=<path/to/BMI>`` option causes the compiler to load the
+specified BMI directly. The ``-fmodule-file=<module-name>=<path/to/BMI>``
+option causes the compiler to load the specified BMI for the module specified
+by ``<module-name>`` when necessary. The main difference is that
``-fmodule-file=<path/to/BMI>`` will load the BMI eagerly, whereas
-``-fmodule-file=<module-name>=<path/to/BMI>`` will only load the BMI lazily, which is similar
-with ``-fprebuilt-module-path``. The option ``-fmodule-file=<path/to/BMI>`` for named modules is deprecated
-and is planning to be removed in future versions.
+``-fmodule-file=<module-name>=<path/to/BMI>`` will only load the BMI lazily,
+as will ``-fprebuilt-module-path``. The ``-fmodule-file=<path/to/BMI>`` option
+for named modules is deprecated and will be removed in a future version of
+Clang.
-In case all ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>`` and
-``-fmodule-file=<module-name>=<path/to/BMI>`` exist, the ``-fmodule-file=<path/to/BMI>`` option
-takes highest precedence and ``-fmodule-file=<module-name>=<path/to/BMI>`` will take the second
-highest precedence.
+When these options are specified in the same invocation of the compiler, the
+``-fmodule-file=<path/to/BMI>`` option takes precedence over
+``-fmodule-file=<module-name>=<path/to/BMI>``, which takes precedence over
+``-fprebuilt-module-path=<path/to/directory>``.
-We need to specify all the dependent (directly and indirectly) BMIs.
-See https://github.com/llvm/llvm-project/issues/62707 for detail.
+Note: all dependent BMIs, whether directly or indirectly dependent, must be
+specified explicitly. See
+https://github.com/llvm/llvm-project/issues/62707 for details.
-When we compile a ``module implementation unit``, we must specify the BMI of the corresponding
-``primary module interface unit``.
-Since the language specification says a module implementation unit implicitly imports
-the primary module interface unit.
+When compiling a ``module implementation unit``, the BMI of the corresponding
+``primary module interface unit`` must be specified because a module
+implementation unit implicitly imports the primary module interface unit.
[module.unit]p8
A module-declaration that contains neither an export-keyword nor a module-partition implicitly
imports the primary module interface unit of the module as if by a module-import-declaration.
-All of the 3 options ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>``
-and ``-fmodule-file=<module-name>=<path/to/BMI>`` may occur multiple times.
-For example, the command line to compile ``M.cppm`` in
-the above example could be rewritten into:
+The ``-fprebuilt-module-path=<path/to/directory>``, ``-fmodule-file=<path/to/BMI>``,
+and ``-fmodule-file=<module-name>=<path/to/BMI>`` options may be specified
+multiple times. For example, the command line to compile ``M.cppm`` in
+the previous example could be rewritten as:
.. code-block:: console
$ clang++ -std=c++20 M.cppm --precompile -fmodule-file=M:interface_part=M-interface_part.pcm -fmodule-file=M:impl_part=M-impl_part.pcm -o M.pcm
When there are multiple ``-fmodule-file=<module-name>=`` options for the same
-``<module-name>``, the last ``-fmodule-file=<module-name>=`` will override the previous
-``-fmodule-file=<module-name>=`` options.
-
-``-fprebuilt-module-path`` is more convenient and ``-fmodule-file`` is faster since
-it saves time for file lookup.
+``<module-name>``, the last ``-fmodule-file=<module-name>=`` overrides the
+previous ``-fmodule-file=<module-name>=`` option.
Remember that module units still have an object counterpart to the BMI
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-It is easy to forget to compile BMIs at first since we may envision module interfaces like headers.
-However, this is not true.
-Module units are translation units. We need to compile them to object files
-and link the object files like the example shows.
+While module interfaces resemble traditional header files, they still require
+compilation. Module units are translation units, and need to be compiled to
+object files, which then need to be linked together as the following examples
+show.
For example, the traditional compilation processes for headers are like:
@@ -400,24 +389,27 @@ And the compilation process for module units are like:
mod1.cppm -> clang++ mod1.cppm ... -> mod1.pcm --,--> clang++ mod1.pcm ... -> mod1.o -+
src2.cpp ----------------------------------------+> clang++ src2.cpp -------> src2.o -'
-As the diagrams show, we need to compile the BMI from module units to object files and link the object files.
-(But we can't do this for the BMI from header units. See the later section for the definition of header units)
+As the diagrams show, we need to compile the BMI from module units to object
+files and then link the object files. (However, this cannot be done for the BMI
+from header units. See the section on :ref:`header units <header-units>` for
+more details.)
-If we want to create a module library, we can't just ship the BMIs in an archive.
-We must compile these BMIs(``*.pcm``) into object files(``*.o``) and add those object files to the archive instead.
+BMIs cannot be shipped in an archive to create a module library. Instead, the
+BMIs (``*.pcm``) are compiled into object files (``*.o``) and those object
+files are added to the archive.
-Consistency Requirement
-~~~~~~~~~~~~~~~~~~~~~~~
+Consistency Requirements
+~~~~~~~~~~~~~~~~~~~~~~~~
-If we envision modules as a cache to speed up compilation, then - as with other caching techniques -
-it is important to keep cache consistency.
-So **currently** Clang will do very strict check for consistency.
+Modules can be viewed as a kind of cache to speed up compilation. Thus, like
+other caching techniques, it is important to maintain cache consistency which
+is why Clang does very strict checking for consistency.
Options consistency
^^^^^^^^^^^^^^^^^^^
-The language option of module units and their non-module-unit users should be consistent.
-The following example is not allowed:
+Compiler options related to the language dialect for a module unit and its
+non-module-unit uses need to be consistent. Consider the following example:
.. code-block:: c++
@@ -432,9 +424,8 @@ The following example is not allowed:
$ clang++ -std=c++20 M.cppm --precompile -o M.pcm
$ clang++ -std=c++23 Use.cpp -fprebuilt-module-path=.
-The compiler would reject the example due to the inconsistent language options.
-Not all options are language options.
-For example, the following example is allowed:
+Clang rejects the example due to the inconsistent language standard modes. Not
+all compiler options are language dialect options, though. For example:
.. code-block:: console
@@ -444,9 +435,12 @@ For example, the following example is allowed:
# Inconsistent debugging level.
$ clang++ -std=c++20 -g Use.cpp -fprebuilt-module-path=.
-Although the two examples have inconsistent optimization and debugging level, both of them are accepted.
+Although the optimization and debugging levels are inconsistent, these
+compilations are accepted because the compiler options do not impact the
+language dialect.
-Note that **currently** the compiler doesn't consider inconsistent macro definition a problem. For example:
+Note that the compiler **currently** doesn't reject inconsistent macro
+definitions (this may change in the future). For example:
.. code-block:: console
@@ -454,43 +448,43 @@ Note that **currently** the compiler doesn't consider inconsistent macro definit
# Inconsistent optimization level.
$ clang++ -std=c++20 -O3 -DNDEBUG Use.cpp -fprebuilt-module-path=.
-Currently Clang would accept the above example. But it may produce surprising results if the
-debugging code depends on consistent use of ``NDEBUG`` also in other translation units.
+Currently, Clang accepts the above example, though it may produce surprising
+results if the debugging code depends on consistent use of ``NDEBUG`` in other
+translation units.
-Definitions consistency
-^^^^^^^^^^^^^^^^^^^^^^^
+Object definition consistency
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+The C++ language requires that declarations of the same entity in different
+translation units have the same definition, which is known as the One
+Definition Rule (ODR). Without modules, the compiler cannot perform strong ODR
+violation checking because it only sees one translation unit at a time. With
+the use of modules, the compiler can perform checks for ODR violations across
+translation units.
-The C++ language defines that same declarations in different translation units should have
-the same definition, as known as ODR (One Definition Rule). Prior to modules, the translation
-units don't dependent on each other and the compiler itself can't perform a strong
-ODR violation check. With the introduction of modules, now the compiler have
-the chance to perform ODR violations with language semantics across translation units.
-
-However, in the practice, we found the existing ODR checking mechanism is not stable
-enough. Many people suffers from the false positive ODR violation diagnostics, AKA,
-the compiler are complaining two identical declarations have different definitions
-incorrectly. Also the true positive ODR violations are rarely reported.
-Also we learned that MSVC don't perform ODR check for declarations in the global module
-fragment.
-
-So in order to get better user experience, save the time checking ODR and keep consistent
-behavior with MSVC, we disabled the ODR check for the declarations in the global module
-fragment by default. Users who want more strict check can still use the
-``-Xclang -fno-skip-odr-check-in-gmf`` flag to get the ODR check enabled. It is also
-encouraged to report issues if users find false positive ODR violations or false negative ODR
-violations with the flag enabled.
+However, the current ODR checking mechanisms are not perfect. There are a
+significant number of false positive ODR violation diagnostics, where the
+compiler incorrectly diagnoses two identical declarations as having different
+definitions. Further, true positive ODR violations are not always reported.
+
+To give a better user experience, improve compilation performance, and for
+consistency with MSVC, ODR checking of declarations in the global module
+fragment is disabled by default. These checks can be enabled by specifying
+``-Xclang -fno-skip-odr-check-in-gmf`` when compiling. If the check is enabled
+and you encounter incorrect or missing diagnostics, please report them via the
+`community issue tracker <https://github.com/llvm/llvm-project/issues/>`_.
ABI Impacts
-----------
-This section describes the new ABI changes brought by modules.
-
-Only Itanium C++ ABI related change are mentioned
+This section describes the new ABI changes brought by modules. Only changes to
+the Itanium C++ ABI are covered.
-Mangling Names
-~~~~~~~~~~~~~~
+Name Mangling
+~~~~~~~~~~~~~
-The declarations in a module unit which are not in the global module fragment have new linkage names.
+The declarations in a module unit which are not in the global module fragment
+have new linkage names.
For example,
@@ -501,22 +495,24 @@ For example,
export int foo();
}
-The linkage name of ``NS::foo()`` would be ``_ZN2NSW1M3fooEv``.
-This couldn't be demangled by previous versions of the debugger or demangler.
-As of LLVM 15.x, users can utilize ``llvm-cxxfilt`` to demangle this:
+The linkage name of ``NS::foo()`` is ``_ZN2NSW1M3fooEv``. This couldn't be
+demangled by previous versions of the debugger or demangler. As of LLVM 15.x,
+``llvm-cxxfilt`` can be used to demangle this:
.. code-block:: console
$ llvm-cxxfilt _ZN2NSW1M3fooEv
+ NS::foo@M()
-The result would be ``NS::foo@M()``, which reads as ``NS::foo()`` in module ``M``.
+The result should be read as ``NS::foo()`` in module ``M``.
-The ABI implies that we can't declare something in a module unit and define it in a non-module unit (or vice-versa),
-as this would result in linking errors.
+The ABI implies that something cannot be declared in a module unit and defined
+in a non-module unit (or vice-versa), as this would result in linking errors.
-If we still want to implement declarations within the compatible ABI in module unit,
-we can use the language-linkage specifier. Since the declarations in the language-linkage specifier
-is attached to the global module fragments. For example:
+Despite this, it is possible to implement declarations with a compatible ABI in
+a module unit by using a language linkage specifier because the declarations in
+the language linkage specifier are attached to the global module fragment. For
+example:
.. code-block:: c++
@@ -530,43 +526,47 @@ Now the linkage name of ``NS::foo()`` will be ``_ZN2NS3fooEv``.
Module Initializers
~~~~~~~~~~~~~~~~~~~
-All the importable module units are required to emit an initializer function.
-The initializer function should contain calls to importing modules first and
-all the dynamic-initializers in the current module unit then.
+All importable module units are required to emit an initializer function to
+handle the dynamic initialization of non-inline variables in the module unit.
+The importable module unit has to emit the initializer even if there is no
+dynamic initialization; otherwise, the importer may call a nonexistent
+function. The initializer function emits calls to imported modules first
+followed by calls to all of the dynamic initializers in the current module
+unit.
-Translation units explicitly or implicitly importing named modules must call
-the initializer functions of the imported named modules within the sequence of
-the dynamic-initializers in the TU. Initializations of entities at namespace
-scope are appearance-ordered. This (recursively) extends into imported modules
-at the point of appearance of the import declaration.
+Translation units that explicitly or implicitly import a named module must call
+the initializer functions of the imported named module within the sequence of
+the dynamic initializers in the translation unit. Initializations of entities
+at namespace scope are appearance-ordered. This (recursively) extends to
+imported modules at the point of appearance of the import declaration.
-It is allowed to omit calls to importing modules if it is known empty.
-
-It is allowed to omit calls to importing modules for which is known to be called.
+If the imported module is known to be empty, the call to its initializer may be
+omitted. Additionally, if the imported module is known to have already been
+imported, the call to its initializer may be omitted.
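As a rough illustration of the appearance ordering described above, the following sketch records the order of dynamic initialization within a single translation unit. It deliberately does not use modules, and the names ``demo``, ``record``, and ``check_appearance_order`` are hypothetical, so it mirrors only the ordering rule and not the initializer function that Clang emits.

```cpp
#include <cassert>
#include <vector>

namespace demo {

// Log of initializer ids; a function-local static guarantees the log
// itself is constructed before its first use.
std::vector<int>& init_log() {
    static std::vector<int> log;
    return log;
}

// Called from each dynamic initializer to record that it ran.
int record(int id) {
    init_log().push_back(id);
    return id;
}

// Non-inline namespace-scope variables with dynamic initialization.
// Within one translation unit they are initialized in order of appearance.
int first = record(1);
int second = record(2);
int third = record(3);

// Hypothetical helper: verifies that the recorded order matches the
// order in which the variables appear above.
void check_appearance_order() {
    assert(init_log().size() == 3);
    assert(init_log()[0] == 1);
    assert(init_log()[1] == 2);
    assert(init_log()[2] == 3);
}

} // namespace demo
```

Calling ``demo::check_appearance_order()`` from ``main`` should pass, because the three variables are initialized before ``main`` runs and in the order they are defined.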
Reduced BMI
-----------
-To support the 2 phase compilation model, Clang chose to put everything needed to
-produce an object into the BMI. But every consumer of the BMI, except itself, doesn't
-need such informations. It makes the BMI to larger and so may introduce unnecessary
-dependencies into the BMI. To mitigate the problem, we decided to reduce the information
-contained in the BMI.
-
-To be clear, we call the default BMI as Full BMI and the new introduced BMI as Reduced
-BMI.
+To support the two-phase compilation model, Clang puts everything needed to
+produce an object into the BMI. However, other consumers of the BMI generally
+don't need that information. This makes the BMI larger and may introduce
+unnecessary dependencies for the BMI. To mitigate the problem, Clang has a
+compiler option to reduce the information contained in the BMI. These two
+formats are known as Full BMI and Reduced BMI, respectively.
-Users can use ``-fexperimental-modules-reduced-bmi`` flag to enable the Reduced BMI.
+Users can use the ``-fexperimental-modules-reduced-bmi`` option to produce a
+Reduced BMI.
-For one phase compilation model (CMake implements this model), with
-``-fexperimental-modules-reduced-bmi``, the generated BMI will be Reduced BMI automatically.
-(The output path of the BMI is specified by ``-fmodule-output=`` as usual one phase
-compilation model).
+For the one-phase compilation model (CMake implements this model), with
+``-fexperimental-modules-reduced-bmi``, the generated BMI will be a Reduced
+BMI automatically. (The output path of the BMI is specified by
+``-fmodule-output=`` as usual with the one-phase compilation model).
-It is still possible to support Reduced BMI in two phase compilation model. With
-``-fexperimental-modules-reduced-bmi``, ``--precompile`` and ``-fmodule-output=`` specified,
-the generated BMI specified by ``-o`` will be full BMI and the BMI specified by
-``-fmodule-output=`` will be Reduced BMI. The dependency graph may be:
+It is also possible to produce a Reduced BMI with the two-phase compilation
+model. When ``-fexperimental-modules-reduced-bmi``, ``--precompile``, and
+``-fmodule-output=`` are specified, the generated BMI specified by ``-o`` will
+be a full BMI and the BMI specified by ``-fmodule-output=`` will be a Reduced
+BMI. The dependency graph in this case would look like:
.. code-block:: none
@@ -577,15 +577,16 @@ the generated BMI specified by ``-o`` will be full BMI and the BMI specified by
-> ...
-> consumer_n.cpp
-We don't emit diagnostics if ``-fexperimental-modules-reduced-bmi`` is used with a non-module
-unit. This design helps the end users of one phase compilation model to perform experiments
-early without asking for the help of build systems. The users of build systems which supports
-two phase compilation model still need helps from build systems.
+Clang does not emit diagnostics when ``-fexperimental-modules-reduced-bmi`` is
+used with a non-module unit. This design permits users of the one-phase
+compilation model to try using Reduced BMIs without needing to modify the build
+system. The two-phase compilation model requires build system support.
-Within Reduced BMI, we won't write unreachable entities from GMF, definitions of non-inline
-functions and non-inline variables. This may not be a transparent change.
-`[module.global.frag]ex2 <https://eel.is/c++draft/module.global.frag#example-2>`_ may be a good
-example:
+In a Reduced BMI, Clang does not emit unreachable entities from the global
+module fragment, or definitions of non-inline functions and non-inline
+variables. This may not be a transparent change.
+
+Consider the following example:
.. code-block:: c++
@@ -633,22 +634,23 @@ example:
// module M's interface, so is discarded
int c = use_h<int>(); // OK
-In the above example, the function definition of ``N::g`` is elided from the Reduced
-BMI of ``M.cppm``. Then the use of ``use_g<int>`` in ``M-impl.cpp`` fails
-to instantiate. For such issues, users can add references to ``N::g`` in the module purview
-of ``M.cppm`` to make sure it is reachable, e.g., ``using N::g;``.
+In the above example, the function definition of ``N::g`` is elided from the
+Reduced BMI of ``M.cppm``. Then the use of ``use_g<int>`` in ``M-impl.cpp``
+fails to instantiate. For such issues, users can add references to ``N::g`` in
+the `module purview <https://eel.is/c++draft/module.unit#5>`_ of ``M.cppm`` to
+ensure it is reachable, e.g. ``using N::g;``.
-We think the Reduced BMI is the correct direction. But given it is a drastic change,
-we'd like to make it experimental first to avoid breaking existing users. The roadmap
-of Reduced BMI may be:
+Support for Reduced BMIs is still experimental, but it may become the default
+in the future. The expected roadmap for Reduced BMIs as of Clang 19.x is:
-1. ``-fexperimental-modules-reduced-bmi`` is opt in for 1~2 releases. The period depends
-on testing feedbacks.
-2. We would announce Reduced BMI is not experimental and introduce ``-fmodules-reduced-bmi``.
-and suggest users to enable this mode. This may takes 1~2 releases too.
-3. Finally we will enable this by default. When that time comes, the term BMI will refer to
-the reduced BMI today and the Full BMI will only be meaningful to build systems which
-loves to support two phase compilations.
+1. ``-fexperimental-modules-reduced-bmi`` is opt-in for 1~2 releases. The period depends
+ on user feedback and may be extended.
+2. Announce that Reduced BMIs are no longer experimental and introduce
+ ``-fmodules-reduced-bmi`` as a new option, and recommend use of the new
+ option. This transition is expected to take 1~2 additional releases as well.
+3. Finally, ``-fmodules-reduced-bmi`` will be the default. When that time
+ comes, the term BMI will refer to the Reduced BMI and the Full BMI will only
+ be meaningful to build systems which elect to support two-phase compilation.
Performance Tips
----------------
@@ -656,13 +658,11 @@ Performance Tips
Reduce duplications
~~~~~~~~~~~~~~~~~~~
-While it is legal to have duplicated declarations in the global module fragments
-of different module units, it is not free for clang to deal with the duplicated
-declarations. In other word, for a translation unit, it will compile slower if the
-translation unit itself and its importing module units contains a lot duplicated
-declarations.
-
-For example,
+While it is valid to have duplicated declarations in the global module fragments
+of different module units, it is not free for Clang to deal with the duplicated
+declarations. A translation unit will compile more slowly if there are a lot of
+duplicated declarations between the translation unit and modules it imports.
+For example:
.. code-block:: c++
@@ -698,9 +698,9 @@ For example,
import M;
... // use declarations from module M.
-When ``big.header.h`` is big enough and there are a lot of partitions,
-the compilation of ``use.cpp`` may be slower than
-the following style significantly:
+When ``big.header.h`` is big enough and there are a lot of partitions, the
+compilation of ``use.cpp`` may be significantly slower than the following
+approach:
.. code-block:: c++
@@ -738,22 +738,21 @@ the following style significantly:
import M;
... // use declarations from module M.
-The key part of the tip is to reduce the duplications from the text includes.
-
-Ideas for converting to modules
--------------------------------
+Reducing the duplication from textual includes is what improves compile-time
+performance.
-For new libraries, we encourage them to use modules completely from day one if possible.
-This will be pretty helpful to make the whole ecosystems to get ready.
+Transitioning to modules
+------------------------
-For many existing libraries, it may be a breaking change to refactor themselves
-into modules completely. So that many existing libraries need to provide headers and module
-interfaces for a while to not break existing users.
-Here we provide some ideas to ease the transition process for existing libraries.
-**Note that the this section is only about helping ideas instead of requirement from clang**.
+It is best for new code and libraries to use modules from the start if
+possible. However, it may be a breaking change for existing code or libraries
+to switch to modules. As a result, many existing libraries need to provide
+both headers and module interfaces for a while to not break existing users.
-Let's start with the case that there is no dependency or no dependent libraries providing
-modules for your library.
+This section offers some suggestions on how to ease the transition process
+for existing libraries. **Note that this information is only intended as
+guidance, rather than as requirements to use modules in Clang.** It presumes
+the project is starting with no module-based dependencies.
ABI non-breaking styles
~~~~~~~~~~~~~~~~~~~~~~~
@@ -776,9 +775,9 @@ export-using style
using decl_n;
}
-As the example shows, you need to include all the headers containing declarations needs
-to be exported and `using` such declarations in an `export` block. Then, basically,
-we're done.
+This example shows how to include all the headers containing declarations which
+need to be exported, and uses `using` declarations in an `export` block to
+produce the module interface.
export extern-C++ style
^^^^^^^^^^^^^^^^^^^^^^^
@@ -799,7 +798,7 @@ export extern-C++ style
#include "header_n.h"
}
-Then in your headers (from ``header_1.h`` to ``header_n.h``), you need to define the macro:
+Headers (from ``header_1.h`` to ``header_n.h``) need to define the macro:
.. code-block:: c++
@@ -809,9 +808,10 @@ Then in your headers (from ``header_1.h`` to ``header_n.h``), you need to define
#define EXPORT
#endif
-And you should put ``EXPORT`` to the beginning of the declarations you want to export.
+and put ``EXPORT`` on the declarations you want to export.
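A minimal sketch of a header that follows this pattern is shown below. The macro name ``MYLIB_IN_MODULE_INTERFACE`` and the ``mylib::add`` function are hypothetical; the sketch assumes the module interface unit defines the macro before including the header, so that ``EXPORT`` expands to ``export`` there and to nothing in ordinary header builds.

```cpp
// MYLIB_IN_MODULE_INTERFACE is a hypothetical macro (an assumption of this
// sketch) that only the module interface unit would define.
#ifdef MYLIB_IN_MODULE_INTERFACE
#define EXPORT export
#else
#define EXPORT
#endif

namespace mylib {

// In a plain header build EXPORT expands to nothing, so this is an
// ordinary declaration; in the module interface it becomes exported.
EXPORT constexpr int add(int a, int b) { return a + b; }

} // namespace mylib
```

Compiled as a normal header (without the macro defined), the declaration behaves exactly as it did before the library offered a module.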
-Also it is suggested to refactor your headers to include thirdparty headers conditionally:
+Also, it is recommended to refactor headers to include third-party headers
+conditionally:
.. code-block:: c++
@@ -823,26 +823,25 @@ Also it is suggested to refactor your headers to include thirdparty headers cond
...
-This may be helpful to get better diagnostic messages if you forgot to update your module
-interface unit file during maintaining.
+This can be helpful because it gives better diagnostic messages if the module
+interface unit is not properly updated when modifying code.
-The reasoning for the practice is that the declarations in the language linkage are considered
-to be attached to the global module. So the ABI of your library in the modular version
-wouldn't change.
+This approach works because the declarations with language linkage are attached
+to the global module. Thus, the ABI of the modular form of the library does not
+change.
-While this style looks not as convenient as the export-using style, it is easier to convert
-to other styles.
+While this style is more involved than the export-using style, it makes it
+easier to further refactor the library to other styles.
ABI breaking style
~~~~~~~~~~~~~~~~~~
-The term ``ABI breaking`` sounds terrifying generally. But you may want it here if you want
-to force your users to introduce your library in a consistent way. E.g., they either include
-your headers all the way or import your modules all the way.
-The style prevents the users to include your headers and import your modules at the same time
-in the same repo.
+The term ``ABI breaking`` may sound like a bad approach. However, this style
+forces consumers of the library to use it in a consistent way, e.g., either
+always include headers for the library or always import modules. The style
+prevents mixing includes and imports for the library.
-The pattern for ABI breaking style is similar with export extern-C++ style.
+The pattern for ABI breaking style is similar to the export extern-C++ style.
.. code-block:: c++
@@ -865,7 +864,7 @@ The pattern for ABI breaking style is similar with export extern-C++ style.
...
#include "source_n.cpp"
#else // the number of .cpp files in your project are a lot
- // Using all the declarations from thirdparty libraries which are
+ // Using all the declarations from third-party libraries which are
// used in the .cpp files.
namespace third_party_namespace {
using third_party_decl_used_in_cpp_1;
@@ -875,11 +874,11 @@ The pattern for ABI breaking style is similar with export extern-C++ style.
}
#endif
-(And add `EXPORT` and conditional include to the headers as suggested in the export
-extern-C++ style section)
+(And add `EXPORT` and conditional include to the headers as suggested in the
+export extern-C++ style section.)
-Remember that the ABI get changed and we need to compile our source files into the
-new ABI format. This is the job of the additional part of the interface unit:
+The ABI with modules is different and thus we need to compile the source files
+into the new ABI. This is done by an additional part of the interface unit:
.. code-block:: c++
@@ -890,7 +889,7 @@ new ABI format. This is the job of the additional part of the interface unit:
...
#include "source_n.cpp"
#else // the number of .cpp files in your project are a lot
- // Using all the declarations from thirdparty libraries which are
+ // Using all the declarations from third-party libraries which are
// used in the .cpp files.
namespace third_party_namespace {
using third_party_decl_used_in_cpp_1;
@@ -900,16 +899,17 @@ new ABI format. This is the job of the additional part of the interface unit:
}
#endif
-In case the number of your source files are small, we may put everything in the private
-module fragment directly. (it is suggested to add conditional include to the source
-files too). But it will make the compilation of the module interface unit to be slow
-when the number of the source files are not small enough.
+If the number of source files is small, everything can be put in the private
+module fragment directly (it is recommended to add conditional includes to the
+source files as well). However, compile time performance will be bad if there
+are a lot of source files to compile.
-**Note that the private module fragment can only be in the primary module interface unit
-and the primary module interface unit containing private module fragment should be the only
-module unit of the corresponding module.**
+**Note that the private module fragment can only be in the primary module
+interface unit and the primary module interface unit containing the private
+module fragment should be the only module unit of the corresponding module.**
-In that case, you need to convert your source files (.cpp files) to module implementation units:
+In this case, source files (.cpp files) must be converted to module
+implementation units:
.. code-block:: c++
@@ -925,45 +925,40 @@ In that case, you need to convert your source files (.cpp files) to module imple
// Following off should be unchanged.
...
-The module implementation unit will import the primary module implicitly.
-We don't include any headers in the module implementation units
-here since we want to avoid duplicated declarations between translation units.
-This is the reason why we add non-exported using declarations from the third
-party libraries in the primary module interface unit.
-
-And if you provide your library as ``libyour_library.so``, you probably need to
-provide a modular one ``libyour_library_modules.so`` since you changed the ABI.
-
-What if there are headers only inclued by the source files
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+The module implementation unit will import the primary module implicitly. Do
+not include any headers in the module implementation units; this avoids
+duplicated declarations between translation units. This is why non-exported
+using declarations for third-party libraries should be added in the primary
+module interface unit.
-The above practice may be problematic if there are headers only included by the source
-files. If you're using private module fragment, you may solve the issue by including them
-in the private module fragment. While it is OK to solve it by including the implementation
-headers in the module purview if you're using implementation module units, it may be
-suboptimal since the primary module interface units now containing entities not belongs
-to the interface.
+If the library is provided as ``libyour_library.so``, a modular library (e.g.,
+``libyour_library_modules.so``) may also need to be provided for ABI
+compatibility.
-If you're a perfectionist, maybe you can improve it by introducing internal module partition unit.
+What if there are headers only included by the source files
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-The internal module partition unit is an importable module unit which is internal
-to the module itself. The concept just meets the headers only included by the source files.
+The above practice may be problematic if there are headers only included by the
+source files. When using a private module fragment, this issue may be solved by
+including those headers in the private module fragment. While it is OK to solve
+it by including the implementation headers in the module purview when using
+implementation module units, it may be suboptimal because the primary module
+interface units now contain entities that do not belong to the interface.
-We don't show code snippet since it may be too verbose or not good or not general.
-But it may not be too hard if you can understand the points of the section.
+This can potentially be improved by introducing a module partition
+implementation unit. An internal module partition unit is an importable
+module unit which is internal to the module itself.
Providing a header to skip parsing redundant headers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-It is a problem for clang to handle redeclarations between translation units.
-Also there is a long standing issue in clang (`problematic include after import <https://github.com/llvm/llvm-project/issues/61465>`_).
-But even if the issue get fixed in clang someday, the users may still get slower compilation speed
-and larger BMI size. So it is suggested to not include headers after importing the corresponding
-library.
-
-However, it is not easy for users if your library are included by other dependencies.
-
-So the users may have to write codes like:
+Many redeclarations shared between translation units cause Clang to have
+slower compile-time performance. Further, there are known issues with
+`include after import <https://github.com/llvm/llvm-project/issues/61465>`_.
+Even when that issue is resolved, users may still get slower compilation speed
+and larger BMIs. For these reasons, it is recommended to not include headers
+after importing the corresponding module. However, it is not always easy if the
+library is included by other dependencies, as in:
.. code-block:: c++
@@ -977,9 +972,9 @@ or
import your_library;
#include "third_party/A.h" // #include "your_library/a_header.h"
-For such cases, we suggest the libraries providing modules and the headers at the same time
-to provide a header to skip parsing all the headers in your libraries. So the users can
-import your library as the following style to skip redundant handling:
+For such cases, it is best if the library providing both module and header
+interfaces also provides a header which skips parsing so that the library can
+be imported with the following approach that skips redundant redeclarations:
.. code-block:: c++
@@ -987,9 +982,9 @@ import your library as the following style to skip redundant handling:
#include "your_library_imported.h"
#include "third_party/A.h" // #include "your_library/a_header.h" but got skipped
-The implementation of ``your_library_imported.h`` can be a set of controlling macros or
-an overall controlling macro if you're using `#pragma once`. So you can convert your
-headers to:
+The implementation of ``your_library_imported.h`` can be a set of controlling
+macros or an overall controlling macro if using `#pragma once`. Then headers
+can be refactored to:
.. code-block:: c++
@@ -998,25 +993,24 @@ headers to:
...
#endif
-If the modules imported by your library provides such headers too, remember to add them to
-your ``your_library_imported.h`` too.
+If the modules imported by the library provide such headers, remember to add
+them to ``your_library_imported.h`` too.
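The "set of controlling macros" variant can be sketched in a single file as follows. The guard name ``YOUR_LIBRARY_A_HEADER_H`` and the marker macro ``YOUR_LIBRARY_A_HEADER_PARSED`` are assumptions made for illustration; a real library would pre-define whichever include guards its headers already use.

```cpp
// --- your_library_imported.h (hypothetical contents) ---
// Pre-define the include guard of each library header so that any later
// textual include of that header is skipped by the preprocessor.
#define YOUR_LIBRARY_A_HEADER_H

// --- your_library/a_header.h, pasted inline to keep the sketch in one file ---
#ifndef YOUR_LIBRARY_A_HEADER_H
#define YOUR_LIBRARY_A_HEADER_H
#define YOUR_LIBRARY_A_HEADER_PARSED 1  // marker: the header body was seen
int declared_by_a_header();
#endif

// Expose the outcome as a constant so it can be checked at compile time.
#ifdef YOUR_LIBRARY_A_HEADER_PARSED
constexpr bool a_header_was_parsed = true;
#else
constexpr bool a_header_was_parsed = false;
#endif
```

Because the guard is already defined when the header body is reached, the body is skipped and ``a_header_was_parsed`` is ``false``. This is the effect the skip header is meant to have after ``import your_library;``.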
Importing modules
~~~~~~~~~~~~~~~~~
-When there are dependent libraries providing modules, we suggest you to import that in
-your module.
-
-Most of the existing libraries would fall into this catagory once the std module gets available.
+When there are dependent libraries providing modules, they should be imported
+in your module as well. Many existing libraries will fall into this category
+once the ``std`` module is more widely available.
All dependent libraries providing modules
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-Life gets easier if all the dependent libraries providing modules.
+Of course, most of the complexity disappears if all the dependent libraries
+provide modules.
-You need to convert your headers to include thirdparty headers conditionally.
-
-Then for export-using style:
+Headers need to be converted to include third-party headers conditionally. Then,
+for the export-using style:
.. code-block:: c++
@@ -1035,7 +1029,7 @@ Then for export-using style:
using decl_n;
}
-For export extern-C++ style:
+or, for the export extern-C++ style:
.. code-block:: c++
@@ -1049,7 +1043,7 @@ For export extern-C++ style:
#include "header_n.h"
}
-For ABI breaking style,
+or, for the ABI-breaking style,
.. code-block:: c++
@@ -1069,35 +1063,39 @@ For ABI breaking style,
#include "source_n.cpp"
#endif
-We don't need the non-exported using declarations if we're using implementation module
-units now. We can import thirdparty modules directly in the implementation module
-units.
+Non-exported ``using`` declarations are unnecessary if using implementation
+module units. Instead, third-party modules can be imported directly in
+implementation module units.
Partial dependent libraries providing modules
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-In this case, we have to mix the use of ``include`` and ``import`` in the module of our
-library. The key point here is still to remove duplicated declarations in translation
-units as much as possible. If the imported modules provide headers to skip parsing their
-headers, we should include that after the including. If the imported modules don't provide
-the headers, we can make it ourselves if we still want to optimize it.
-
-Known Problems
---------------
-
-The following describes issues in the current implementation of modules.
-Please see https://github.com/llvm/llvm-project/labels/clang%3Amodules for more issues
-or file a new issue if you don't find an existing one.
-If you're going to create a new issue for standard C++ modules,
-please start the title with ``[C++20] [Modules]`` (or ``[C++23] [Modules]``, etc)
-and add the label ``clang:modules`` (if you have permissions for that).
-
-For higher level support for proposals, you could visit https://clang.llvm.org/cxx_status.html.
-
-Including headers after import is problematic
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+If the library has to mix the use of ``include`` and ``import`` in its module,
+the primary goal is still the removal of duplicated declarations in translation
+units as much as possible. If the imported modules provide headers to skip
+parsing their headers, those should be included after the import. If the
+imported modules don't provide such a header, one can be made manually for
+improved compile time performance.
+
+Known Issues
+------------
+
+The following describes issues in the current implementation of modules. Please
+see
+`the issues list for modules <https://github.com/llvm/llvm-project/labels/clang%3Amodules>`_
+for a list of issues or to file a new issue if you don't find an existing one.
+When creating a new issue for standard C++ modules, please start the title with
+``[C++20] [Modules]`` (or ``[C++23] [Modules]``, etc) and add the label
+``clang:modules`` if possible.
+
+A high-level overview of support for standards features, including modules, can
+be found on the `C++ Feature Status <https://clang.llvm.org/cxx_status.html>`_
+page.
+
+Including headers after import is not well-supported
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-For example, the following example can be accept:
+The following example is accepted:
.. code-block:: c++
@@ -1110,8 +1108,8 @@ For example, the following example can be accept:
return 0;
}
-but it will get rejected if we reverse the order of ``#include <iostream>`` and
-``import foo;``:
+but if the order of ``#include <iostream>`` and ``import foo;`` is reversed,
+then the code is currently rejected:
.. code-block:: c++
@@ -1126,33 +1124,31 @@ but it will get rejected if we reverse the order of ``#include <iostream>`` and
Both of the above examples should be accepted.
-This is a limitation in the implementation. In the first example,
-the compiler will see and parse <iostream> first then the compiler will see the import.
-So the ODR Checking and declarations merging will happen in the deserializer.
-In the second example, the compiler will see the import first and the include second.
-As a result, the ODR Checking and declarations merging will happen in the semantic analyzer.
+This is a limitation of the implementation. In the first example, the compiler
+will see and parse ``<iostream>`` first, then it will see the ``import``. In
+this case, ODR checking and declaration merging will happen in the
+deserializer. In the second example, the compiler will see the ``import`` first
+and the ``#include`` second, which results in ODR checking and declaration
+merging happening in the semantic analyzer. This is due to a divergence in the
+implementation path. This is tracked by
+`#61465 <https://github.com/llvm/llvm-project/issues/61465>`_.
-So there is divergence in the implementation path. It might be understandable that why
-the orders matter here in the case.
-(Note that "understandable" is different from "makes sense").
-
-This is tracked in: https://github.com/llvm/llvm-project/issues/61465
-
-Ignored PreferredName Attribute
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Due to a tricky problem, when Clang writes BMIs, Clang will ignore the ``preferred_name`` attribute, if any.
-This implies that the ``preferred_name`` wouldn't show in debugger or dumping.
+Ignored ``preferred_name`` Attribute
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-This is tracked in: https://github.com/llvm/llvm-project/issues/56490
+When Clang writes BMIs, it will ignore the ``preferred_name`` attribute on
+declarations which use it. Thus, the preferred name will not be displayed in
+the debugger as expected. This is tracked by
+`#56490 <https://github.com/llvm/llvm-project/issues/56490>`_.
Don't emit macros about module declaration
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-This is covered by P1857R3. We mention it again here since users may abuse it before we implement it.
+This is covered by `P1857R3 <https://wg21.link/P1857R3>`_. It is mentioned here
+because we want users to be aware that we don't yet implement it.
-Someone may want to write code which could be compiled both by modules or non-modules.
-A direct idea would be use macros like:
+A direct approach to write code that can be compiled by both modules and
+non-module builds may look like:
.. code-block:: c++
@@ -1162,39 +1158,37 @@ A direct idea would be use macros like:
IMPORT header_name
EXPORT ...
-So this file could be triggered like a module unit or a non-module unit depending on the definition
-of some macros.
-However, this kind of usage is forbidden by P1857R3 but we haven't implemented P1857R3 yet.
-This means that is possible to write illegal modules code now, and obviously this will stop working
-once P1857R3 is implemented.
-A simple suggestion would be "Don't play macro tricks with module declarations".
+The intent of this is that this file can be compiled like a module unit or a
+non-module unit depending on the definition of some macros. However, this usage
+is forbidden by P1857R3 which is not yet implemented in Clang. This means that
+it is possible to write invalid modules which will no longer be accepted once
+P1857R3 is implemented. This is tracked by
+`#56917 <https://github.com/llvm/llvm-project/issues/56917>`_.
+
+Until then, it is recommended not to mix macros with module declarations.
-This is tracked in: https://github.com/llvm/llvm-project/issues/56917
Inconsistent filename suffix requirement for importable module units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Currently, clang requires the file name of an ``importable module unit`` should end with ``.cppm``
-(or ``.ccm``, ``.cxxm``, ``.c++m``). However, the behavior is inconsistent with other compilers.
-
-This is tracked in: https://github.com/llvm/llvm-project/issues/57416
-
-clang-cl is not compatible with the standard C++ modules
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Now we can't use the `/clang:-fmodule-file` or `/clang:-fprebuilt-module-path` to specify
-the BMI within ``clang-cl.exe``.
+Currently, Clang requires the file name of an ``importable module unit`` to
+have ``.cppm`` (or ``.ccm``, ``.cxxm``, ``.c++m``) as the file extension.
+However, the behavior is inconsistent with other compilers. This is tracked by
+`#57416 <https://github.com/llvm/llvm-project/issues/57416>`_.
-This is tracked in: https://github.com/llvm/llvm-project/issues/64118
+clang-cl is not compatible with standard C++ modules
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-false positive ODR violation diagnostic due to using inconsistent qualified but the same type
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+``/clang:-fmodule-file`` and ``/clang:-fprebuilt-module-path`` cannot be used
+to specify the BMI with ``clang-cl.exe``. This is tracked by
+`#64118 <https://github.com/llvm/llvm-project/issues/64118>`_.
-ODR violation is a pretty common issue when using modules.
-Sometimes the program violated the One Definition Rule actually.
-But sometimes it shows the compiler gives false positive diagnostics.
+Incorrect ODR violation diagnostics
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-One often reported example is:
+ODR violations are a common issue when using modules. Clang sometimes produces
+false-positive diagnostics or fails to produce true-positive diagnostics of the
+One Definition Rule. One often-reported example is:
.. code-block:: c++
@@ -1222,51 +1216,49 @@ One often reported example is:
export module repro;
export import :part;
-Currently the compiler complains about the inconsistent definition of `fun()` in
-2 module units. This is incorrect. Since both definitions of `fun()` has the same
-spelling and `T` refers to the same type entity finally. So the program should be
-fine.
-
-This is tracked in https://github.com/llvm/llvm-project/issues/78850.
+Currently the compiler incorrectly diagnoses the inconsistent definition of
+``fun()`` in two module units. Because both definitions of ``fun()`` have the
+same spelling and ``T`` refers to the same type entity, there is no ODR
+violation. This is tracked by
+`#78850 <https://github.com/llvm/llvm-project/issues/78850>`_.
Using TU-local entity in other units
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Module units are translation units. So the entities which should only be local to the
-module unit itself shouldn't be used by other units in any means.
+Module units are translation units, so the entities which should be local to
+the module unit itself should never be used by other units.
-In the language side, to address the idea formally, the language specification defines
-the concept of ``TU-local`` and ``exposure`` in
+The C++ standard defines the concept of ``TU-local`` and ``exposure`` in
`basic.link/p14 <https://eel.is/c++draft/basic.link#14>`_,
`basic.link/p15 <https://eel.is/c++draft/basic.link#15>`_,
`basic.link/p16 <https://eel.is/c++draft/basic.link#16>`_,
-`basic.link/p17 <https://eel.is/c++draft/basic.link#17>`_ and
+`basic.link/p17 <https://eel.is/c++draft/basic.link#17>`_, and
`basic.link/p18 <https://eel.is/c++draft/basic.link#18>`_.
-However, the compiler doesn't support these 2 ideas formally.
-This results in unclear and confusing diagnostic messages.
-And it is worse that the compiler may import TU-local entities to other units without any
-diagnostics.
+However, Clang doesn't formally support these two concepts. This results in
+unclear or confusing diagnostic messages. Further, Clang may import
+``TU-local`` entities to other units without any diagnostics. This is tracked
+by `#78173 <https://github.com/llvm/llvm-project/issues/78173>`_.
-This is tracked in https://github.com/llvm/llvm-project/issues/78173.
+.. _header-units:
Header Units
============
-How to build projects using header unit
----------------------------------------
+How to build projects using header units
+----------------------------------------
.. warning::
- The user interfaces of header units is highly experimental. There are still
- many unanswered question about how tools should interact with header units.
- The user interfaces described here may change after we have progress on how
- tools should support for header units.
+ The support for header units, including related command line options, is
+ experimental. There are still many unanswered questions about how tools
+ should interact with header units. The details described here may change in
+ the future.
Quick Start
~~~~~~~~~~~
-For the following example,
+The following example:
.. code-block:: c++
@@ -1275,7 +1267,7 @@ For the following example,
std::cout << "Hello World.\n";
}
-we could compile it as
+could be compiled with:
.. code-block:: console
@@ -1285,14 +1277,14 @@ we could compile it as
How to produce BMIs
~~~~~~~~~~~~~~~~~~~
-Similar to named modules, we could use ``--precompile`` to produce the BMI.
-But we need to specify that the input file is a header by ``-xc++-system-header`` or ``-xc++-user-header``.
+Similar to named modules, ``--precompile`` can be used to produce a BMI.
+However, that requires specifying that the input file is a header by using
+``-xc++-system-header`` or ``-xc++-user-header``.
-Also we could use `-fmodule-header={user,system}` option to produce the BMI for header units
-which has suffix like `.h` or `.hh`.
-The value of `-fmodule-header` means the user search path or the system search path.
-The default value for `-fmodule-header` is `user`.
-For example,
+The ``-fmodule-header={user,system}`` option can also be used to produce a BMI
+for header units which have a file extension like `.h` or `.hh`. The argument to
+``-fmodule-header`` specifies either the user search path or the system search
+path. The default value for ``-fmodule-header`` is ``user``. For example:
.. code-block:: c++
@@ -1308,16 +1300,16 @@ For example,
Hello();
}
-We could compile it as:
+could be compiled with:
.. code-block:: console
$ clang++ -std=c++20 -fmodule-header foo.h -o foo.pcm
$ clang++ -std=c++20 -fmodule-file=foo.pcm use.cpp
-For headers which don't have a suffix, we need to pass ``-xc++-header``
-(or ``-xc++-system-header`` or ``-xc++-user-header``) to mark it as a header.
-For example,
+For headers which do not have a file extension, ``-xc++-header`` (or
+``-xc++-system-header``, ``-xc++-user-header``) must be used to specify the
+file as a header. For example:
.. code-block:: c++
@@ -1332,23 +1324,25 @@ For example,
$ clang++ -std=c++20 -fmodule-header=system -xc++-header iostream -o iostream.pcm
$ clang++ -std=c++20 -fmodule-file=iostream.pcm use.cpp
-How to specify the dependent BMIs
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+How to specify dependent BMIs
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-We could use ``-fmodule-file`` to specify the BMIs, and this option may occur multiple times as well.
+``-fmodule-file`` can be used to specify a dependent BMI (or multiple times for
+more than one dependent BMI).
-With the existing implementation ``-fprebuilt-module-path`` cannot be used for header units
-(since they are nominally anonymous).
-For header units, use ``-fmodule-file`` to include the relevant PCM file for each header unit.
+With the existing implementation, ``-fprebuilt-module-path`` cannot be used for
+header units (because they are nominally anonymous). For header units, use
+``-fmodule-file`` to include the relevant PCM file for each header unit.
-This is expect to be solved in future editions of the compiler either by the tooling finding and specifying
-the -fmodule-file or by the use of a module-mapper that understands how to map the header name to their PCMs.
+This is expected to be solved in a future version of Clang either by the compiler
+finding and specifying ``-fmodule-file`` automatically, or by the use of a
+module-mapper that understands how to map header names to their PCMs.
-Don't compile the BMI
-~~~~~~~~~~~~~~~~~~~~~
+Compiling a header unit to an object file
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Another difference with modules is that we can't compile the BMI from a header unit.
-For example:
+A header unit cannot be compiled to an object file due to the semantics of
+header units. For example:
.. code-block:: console
@@ -1356,15 +1350,13 @@ For example:
# This is not allowed!
$ clang++ iostream.pcm -c -o iostream.o
-It makes sense due to the semantics of header units, which are just like headers.
-
Include translation
~~~~~~~~~~~~~~~~~~~
-The C++ spec allows the vendors to convert ``#include header-name`` to ``import header-name;`` when possible.
-Currently, Clang would do this translation for the ``#include`` in the global module fragment.
-
-For example, the following two examples are the same:
+The C++ standard allows vendors to convert ``#include header-name`` to
+``import header-name;`` when possible. Currently, Clang does this translation
+for the ``#include`` in the global module fragment. For example, the following
+example:
.. code-block:: c++
@@ -1375,7 +1367,7 @@ For example, the following two examples are the same:
std::cout << "Hello.\n";
}
-with the following one:
+is the same as this example:
.. code-block:: c++
@@ -1391,17 +1383,17 @@ with the following one:
$ clang++ -std=c++20 -xc++-system-header --precompile iostream -o iostream.pcm
$ clang++ -std=c++20 -fmodule-file=iostream.pcm --precompile M.cppm -o M.pcm
-In the latter example, the Clang could find the BMI for the ``<iostream>``
-so it would try to replace the ``#include <iostream>`` to ``import <iostream>;`` automatically.
+In the latter example, Clang can find the BMI for ``<iostream>`` and so it
+tries to replace the ``#include <iostream>`` with ``import <iostream>;``
+automatically.
-Relationships between Clang modules
------------------------------------
+Differences between Clang modules and header units
+--------------------------------------------------
-Header units have pretty similar semantics with Clang modules.
-The semantics of both of them are like headers.
-
-In fact, we could even "mimic" the sytle of header units by Clang modules:
+Header units have similar semantics to Clang modules. The semantics of both are
+like headers. Therefore, header units can be mimicked by Clang modules as in
+the following example:
.. code-block:: c++
@@ -1414,46 +1406,45 @@ In fact, we could even "mimic" the sytle of header units by Clang modules:
$ clang++ -std=c++20 -fimplicit-modules -fmodule-map-file=.modulemap main.cpp
-It would be simpler if we are using libcxx:
+This example is simplified when using libc++:
.. code-block:: console
$ clang++ -std=c++20 main.cpp -fimplicit-modules -fimplicit-module-maps
-Since there is already one
-`module map <https://github.com/llvm/llvm-project/blob/main/libcxx/include/module.modulemap.in>`_
-in the source of libcxx.
-
-Then immediately leads to the question: why don't we implement header units through Clang header modules?
+because libc++ already supplies a
+`module map <https://github.com/llvm/llvm-project/blob/main/libcxx/include/module.modulemap.in>`_.
-The main reason for this is that Clang modules have more semantics like hierarchy or
-wrapping multiple headers together as a big module.
-However, these things are not part of Standard C++ Header units,
-and we want to avoid the impression that these additional semantics get interpreted as Standard C++ behavior.
+This raises the question: why are header units not implemented through Clang
+modules?
-Another reason is that there are proposals to introduce module mappers to the C++ standard
-(for example, https://wg21.link/p1184r2).
-If we decide to reuse Clang's modulemap, we may get in trouble once we need to introduce another module mapper.
+This is primarily because Clang modules have more hierarchical semantics when
+wrapping multiple headers together as one module, which is not supported by
+Standard C++ Header units. We want to avoid the impression that these
+additional semantics get interpreted as Standard C++ behavior.
-So the final answer for why we don't reuse the interface of Clang modules for header units is that
-there are some differences between header units and Clang modules and that ignoring those
-differences now would likely become a problem in the future.
+Another reason is that there are proposals to introduce module mappers to the
+C++ standard (for example, https://wg21.link/p1184r2). Reusing Clang's
+``modulemap`` may be more difficult if we need to introduce another module
+mapper.
-Discover Dependencies
-=====================
+Discovering Dependencies
+========================
-Prior to modules, all the translation units can be compiled parallelly.
-But it is not true for the module units. The presence of module units requires
-us to compile the translation units in a (topological) order.
+Without use of modules, all the translation units in a project can be compiled
+in parallel. However, the presence of module units requires compiling the
+translation units in a topological order.
-The clang-scan-deps scanner implemented
-`P1689 paper <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1689r5.html>`_
-to describe the order. Only named modules are supported now.
+The ``clang-scan-deps`` tool can extract dependency information and produce a
+JSON file conforming to the specification described in
+`P1689 <https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2022/p1689r5.html>`_.
+Only named modules are supported currently.
-We need a compilation database to use clang-scan-deps. See
+A compilation database is needed when using ``clang-scan-deps``. See
`JSON Compilation Database Format Specification <JSONCompilationDatabase.html>`_
-for example. Note that the ``output`` entry is necessary for clang-scan-deps
-to scan P1689 format. Here is an example:
+for more information about compilation databases. Note that the ``output``
+JSON attribute is necessary for ``clang-scan-deps`` to scan using the P1689
+format. For example:
.. code-block:: c++
@@ -1533,13 +1524,13 @@ And here is the compilation database:
}
]
-And we can get the dependency information in P1689 format by:
+To get the dependency information in P1689 format, use:
.. code-block:: console
$ clang-scan-deps -format=p1689 -compilation-database P1689.json
-And we will get:
+to get:
.. code-block:: text
@@ -1619,14 +1610,14 @@ And we will get:
See the P1689 paper for the meaning of the fields.
-And if the user want a finer-grained control for any reason, e.g., to scan the generated source files,
-the user can choose to get the dependency information per file. For example:
+Getting dependency information per file with finer-grained control (such as
+scanning generated source files) is possible. For example:
.. code-block:: console
$ clang-scan-deps -format=p1689 -- <path-to-compiler-executable>/clang++ -std=c++20 impl_part.cppm -c -o impl_part.o
-And we'll get:
+will produce:
.. code-block:: text
@@ -1652,22 +1643,23 @@ And we'll get:
"version": 1
}
-In this way, we can pass the single command line options after the ``--``.
-Then clang-scan-deps will extract the necessary information from the options.
-Note that we need to specify the path to the compiler executable instead of saying
-``clang++`` simply.
+Individual command line options can be specified after ``--``.
+``clang-scan-deps`` will extract the necessary information from the specified
+options. Note that the path to the compiler executable needs to be specified
+explicitly instead of using ``clang++`` directly.
-The users may want the scanner to get the transitional dependency information for headers.
-Otherwise, the users have to scan twice for the project, once for headers and once for modules.
-To address the requirement, clang-scan-deps will recognize the specified preprocessor options
-in the given command line and generate the corresponding dependency information. For example,
+Users may want the scanner to get the transitive dependency information for
+headers. Otherwise, the project has to be scanned twice, once for headers and
+once for modules. To address this, ``clang-scan-deps`` will recognize the
+specified preprocessor options in the given command line and generate the
+corresponding dependency information. For example:
.. code-block:: console
$ clang-scan-deps -format=p1689 -- ../bin/clang++ -std=c++20 impl_part.cppm -c -o impl_part.o -MD -MT impl_part.ddi -MF impl_part.dep
$ cat impl_part.dep
-We will get:
+will produce:
.. code-block:: text
@@ -1679,41 +1671,41 @@ We will get:
/usr/include/bits/types/__locale_t.h \
...
-When clang-scan-deps detects ``-MF`` option, clang-scan-deps will try to write the
+When ``clang-scan-deps`` detects the ``-MF`` option, it will try to write the
dependency information for headers to the file specified by ``-MF``.
Possible Issues: Failed to find system headers
----------------------------------------------
-In case the users encounter errors like ``fatal error: 'stddef.h' file not found``,
-probably the specified ``<path-to-compiler-executable>/clang++`` refers to a symlink
-instead a real binary. There are 4 potential solutions to the problem:
-
-* (1) End users can resolve the issue by pointing the specified compiler executable to
- the real binary instead of the symlink.
-* (2) End users can invoke ``<path-to-compiler-executable>/clang++ -print-resource-dir``
- to get the corresponding resource directory for your compiler and add that directory
- to the include search paths manually in the build scripts.
-* (3) Build systems that use a compilation database as the input for clang-scan-deps
- scanner, the build system can add the flag ``--resource-dir-recipe invoke-compiler`` to
- the clang-scan-deps scanner to calculate the resources directory dynamically.
- The calculation happens only once for a unique ``<path-to-compiler-executable>/clang++``.
-* (4) For build systems that invokes the clang-scan-deps scanner per file, repeatedly
- calculating the resource directory may be inefficient. In such cases, the build
- system can cache the resource directory by itself and pass ``-resource-dir <resource-dir>``
- explicitly in the command line options:
+If encountering an error like ``fatal error: 'stddef.h' file not found``,
+the specified ``<path-to-compiler-executable>/clang++`` probably refers to a
+symlink instead of a real binary. There are four potential solutions to the
+problem:
-.. code-block:: console
+1. Point the specified compiler executable to the real binary instead of the
+ symlink.
+2. Invoke ``<path-to-compiler-executable>/clang++ -print-resource-dir`` to get
+ the corresponding resource directory for your compiler and add that
+ directory to the include search paths manually in the build scripts.
+3. For build systems that use a compilation database as the input for
+ ``clang-scan-deps``, the build system can add the
+ ``--resource-dir-recipe invoke-compiler`` option when executing
+ ``clang-scan-deps`` to calculate the resource directory dynamically.
+ The calculation happens only once for a unique ``<path-to-compiler-executable>/clang++``.
+4. For build systems that invoke ``clang-scan-deps`` per file, repeatedly
+ calculating the resource directory may be inefficient. In such cases, the
+ build system can cache the resource directory and specify
+ ``-resource-dir <resource-dir>`` explicitly, as in:
+
+ .. code-block:: console
- $ clang-scan-deps -format=p1689 -- <path-to-compiler-executable>/clang++ -std=c++20 -resource-dir <resource-dir> mod.cppm -c -o mod.o
+ $ clang-scan-deps -format=p1689 -- <path-to-compiler-executable>/clang++ -std=c++20 -resource-dir <resource-dir> mod.cppm -c -o mod.o
Import modules with clang-repl
==============================
-We're able to import C++20 named modules with clang-repl.
-
-Let's start with a simple example:
+``clang-repl`` supports importing C++20 named modules. For example:
.. code-block:: c++
@@ -1723,7 +1715,7 @@ Let's start with a simple example:
return "Hello Interpreter for Modules!";
}
-We still need to compile the named module in ahead.
+The named module still needs to be compiled ahead of time.
.. code-block:: console
@@ -1731,10 +1723,9 @@ We still need to compile the named module in ahead.
$ clang++ M.pcm -c -o M.o
$ clang++ -shared M.o -o libM.so
-Note that we need to compile the module unit into a dynamic library so that the clang-repl
-can load the object files of the module units.
-
-Then we are able to import module ``M`` in clang-repl.
+Note that the module unit needs to be compiled as a dynamic library so that
+``clang-repl`` can load the object files of the module units. Then it is
+possible to import module ``M`` in clang-repl.
.. code-block:: console
@@ -1753,17 +1744,18 @@ Possible Questions
How modules speed up compilation
--------------------------------
-A classic theory for the reason why modules speed up the compilation is:
-if there are ``n`` headers and ``m`` source files and each header is included by each source file,
-then the complexity of the compilation is ``O(n*m)``;
-But if there are ``n`` module interfaces and ``m`` source files, the complexity of the compilation is
-``O(n+m)``. So, using modules would be a big win when scaling.
-In a simpler word, we could get rid of many redundant compilations by using modules.
+A classic theory for the reason why modules speed up the compilation is: if
+there are ``n`` headers and ``m`` source files and each header is included by
+each source file, then the complexity of the compilation is ``O(n*m)``.
+However, if there are ``n`` module interfaces and ``m`` source files, the
+complexity of the compilation is ``O(n+m)``. Therefore, using modules would be
+a significant improvement at scale. More simply, use of modules causes many of
+the redundant compilations to no longer be necessary.
-Roughly, this theory is correct. But the problem is that it is too rough.
-The behavior depends on the optimization level, as we will illustrate below.
+While this is accurate at a high level, this depends greatly on the
+optimization level, as illustrated below.
-First is ``O0``. The compilation process is described in the following graph.
+First is ``-O0``. The compilation process is described in the following graph.
.. code-block:: none
@@ -1771,13 +1763,13 @@ First is ``O0``. The compilation process is described in the following graph.
│ │ │ │
└---parsing----sema----codegen--┴----- transformations ---- codegen ----┴---- codegen --┘
- ┌---------------------------------------------------------------------------------------┐
+ ├---------------------------------------------------------------------------------------┐
| │
| source file │
| │
└---------------------------------------------------------------------------------------┘
- ┌--------┐
+ ├--------┐
│ │
│imported│
│ │
@@ -1785,18 +1777,17 @@ First is ``O0``. The compilation process is described in the following graph.
│ │
└--------┘
-Here we can see that the source file (could be a non-module unit or a module unit) would get processed by the
-whole pipeline.
-But the imported code would only get involved in semantic analysis, which is mainly about name lookup,
-overload resolution and template instantiation.
-All of these processes are fast relative to the whole compilation process.
-More importantly, the imported code only needs to be processed once in frontend code generation,
-as well as the whole middle end and backend.
-So we could get a big win for the compilation time in O0.
+In this case, the source file (which could be a non-module unit or a module
+unit) would get processed by the entire pipeline. However, the imported code
+would only get involved in semantic analysis, which, for the most part, is name
+lookup, overload resolution, and template instantiation. All of these processes
+are fast relative to the whole compilation process. More importantly, the
+imported code only needs to be processed once during frontend code generation,
+as well as the whole middle end and backend. So we could get a big win for the
+compilation time in ``-O0``.
-But with optimizations, things are different:
-
-(we omit ``code generation`` part for each end due to the limited space)
+But with optimizations, things are different (the ``code generation`` part for
+each end is omitted due to limited space):
.. code-block:: none
@@ -1804,12 +1795,12 @@ But with optimizations, things are different:
│ │ │ │
└--- parsing ---- sema -----┴--- optimizations --- IPO ---- optimizations---┴--- optimizations -┘
- ┌-----------------------------------------------------------------------------------------------┐
+ ├-----------------------------------------------------------------------------------------------┐
│ │
│ source file │
│ │
└-----------------------------------------------------------------------------------------------┘
- ┌---------------------------------------┐
+ ├---------------------------------------┐
│ │
│ │
│ imported code │
@@ -1817,27 +1808,29 @@ But with optimizations, things are different:
│ │
└---------------------------------------┘
-It would be very unfortunate if we end up with worse performance after using modules.
-The main concern is that when we compile a source file, the compiler needs to see the function body
-of imported module units so that it can perform IPO (InterProcedural Optimization, primarily inlining
-in practice) to optimize functions in current source file with the help of the information provided by
-the imported module units.
-In other words, the imported code would be processed again and again in importee units
-by optimizations (including IPO itself).
-The optimizations before IPO and the IPO itself are the most time-consuming part in whole compilation process.
-So from this perspective, we might not be able to get the improvements described in the theory.
-But we could still save the time for optimizations after IPO and the whole backend.
-
-Overall, at ``O0`` the implementations of functions defined in a module will not impact module users,
-but at higher optimization levels the definitions of such functions are provided to user compilations for the
-purposes of optimization (but definitions of these functions are still not included in the use's object file)-
-this means the build speedup at higher optimization levels may be lower than expected given ``O0`` experience,
-but does provide by more optimization opportunities.
+It would be very unfortunate if we end up with worse performance when using
+modules. The main concern is that when a source file is compiled, the compiler
+needs to see the body of imported module units so that it can perform IPO
+(InterProcedural Optimization, primarily inlining in practice) to optimize
+functions in the current source file with the help of the information provided
+by the imported module units. In other words, the imported code would be
+processed again and again in importer units by optimizations (including IPO
+itself). The optimizations before IPO and IPO itself are the most time-consuming
+part of the whole compilation process. So from this perspective, it might not be
+possible to get the compile time improvements described, but there could be
+time savings for optimizations after IPO and the whole backend.
+
+Overall, at ``-O0`` the implementations of functions defined in a module will
+not impact module users, but at higher optimization levels the definitions of
+such functions are provided to user compilations for the purposes of
+optimization (but definitions of these functions are still not included in the
+user's object file). This means the build speedup at higher optimization levels
+may be lower than expected given ``-O0`` experience, but does provide more
+optimization opportunities.
Interoperability with Clang Modules
-----------------------------------
-We **wish** to support clang modules and standard c++ modules at the same time,
-but the mixed using form is not well used/tested yet.
-
-Please file new github issues as you find interoperability problems.
+We **wish** to support Clang modules and standard C++ modules at the same time,
+but mixing them together is not well used/tested yet. Please file new
+GitHub issues as you find interoperability problems.