[clang] a5aeba7 - [Docs] Multilib design

Michael Platings via cfe-commits cfe-commits at lists.llvm.org
Tue Jun 13 22:47:27 PDT 2023


Author: Michael Platings
Date: 2023-06-14T06:46:42+01:00
New Revision: a5aeba737694f56daacfbe8f78601cbd14f5d359

URL: https://github.com/llvm/llvm-project/commit/a5aeba737694f56daacfbe8f78601cbd14f5d359
DIFF: https://github.com/llvm/llvm-project/commit/a5aeba737694f56daacfbe8f78601cbd14f5d359.diff

LOG: [Docs] Multilib design

Reviewed By: peter.smith, MaskRay

Differential Revision: https://reviews.llvm.org/D143587

Added: 
    clang/docs/Multilib.rst

Modified: 
    clang/docs/index.rst

Removed: 
    


################################################################################
diff  --git a/clang/docs/Multilib.rst b/clang/docs/Multilib.rst
new file mode 100644
index 0000000000000..78a300a60f8bc
--- /dev/null
+++ b/clang/docs/Multilib.rst
@@ -0,0 +1,311 @@
+========
+Multilib
+========
+
+Introduction
+============
+
+This document describes how multilib is implemented in Clang.
+
+What is multilib and why might you care?
+If you're :doc:`cross compiling<CrossCompilation>` then you can't use native
+system headers and libraries. To address this, you can use a combination of
+``--sysroot``, ``-isystem`` and ``-L`` options to point Clang at suitable
+directories for your target.
+However, when there are many possible directories to choose from, it's not
+necessarily obvious which one to pick.
+Multilib allows a toolchain designer to imbue the toolchain with the ability to
+pick a suitable directory automatically, based on the options the user provides
+to Clang. For example, if the user specifies
+``--target=arm-none-eabi -mcpu=cortex-m4`` the toolchain can choose a directory
+containing headers and libraries suitable for Armv7E-M, because it knows that's
+a suitable architecture for Arm Cortex-M4.
+Multilib can also choose between libraries for the same architecture based on
+other options. For example, if the user specifies ``-fno-exceptions`` then a
+toolchain could select libraries built without exception support, thereby
+reducing the size of the resulting binary.
+
+Design
+======
+
+Clang supports GCC's ``-print-multi-lib`` and ``-print-multi-directory``
+options. These are described in
+`GCC Developer Options <https://gcc.gnu.org/onlinedocs/gcc-12.2.0/gcc/Developer-Options.html>`_.
+
+There are two ways to configure multilib in Clang: hard-coded or via a
+configuration file.
+
+Hard-coded Multilib
+===================
+
+The available libraries can be hard-coded in Clang. Typically this is done
+using the ``MultilibBuilder`` interface in
+``clang/include/clang/Driver/MultilibBuilder.h``.
+There are many examples of this in ``lib/Driver/ToolChains/Gnu.cpp``.
+The remainder of this document will not focus on this type of multilib.
+
+EXPERIMENTAL Multilib via configuration file
+============================================
+
+Some Clang toolchains support loading multilib configuration from a
+``multilib.yaml`` configuration file.
+
+A ``multilib.yaml`` configuration file specifies which multilib variants are
+available, their relative location, what compilation options were used to build
+them, and the criteria by which they are selected.
+
+Multilib processing
+===================
+
+Clang goes through the following steps to use multilib from a configuration
+file:
+
+#. Normalize command line options. Clang can accept the same
+   information via different options - for example,
+   ``--target=arm-none-eabi -march=armv7-m`` and
+   ``--target=armv7m-none-eabi`` are equivalent.
+   Clang normalizes the command line options before passing them to the
+   multilib system.
+   To see what flags are emitted for a given set of command line options, use
+   the ``-print-multi-flags-experimental`` command line option
+   along with the rest of the options you want to use.
+#. Load ``multilib.yaml`` from sysroot.
+#. Generate additional flags. ``multilib.yaml`` contains a ``Mappings`` section,
+   which specifies how to generate additional flags based on the flags derived
+   from command line options. Flags are matched using regular expressions.
+   These regular expressions shall use the POSIX extended regular expression
+   syntax.
+#. Match flags against multilib variants. If the generated flags are a superset
+   of the flags specified for a multilib variant then the variant is considered
+   a match.
+   If more than one variant matches then a toolchain may opt to either use only
+   the *last* matching multilib variant, or may use all matching variants,
+   thereby :ref:`layering<Multilib layering>` them.
+#. Generate ``-isystem`` and ``-L`` options. Iterate in reverse order over
+   the matching multilib variants, and generate ``-isystem`` and ``-L``
+   options based on the multilib variant's directory.
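+
+As an illustration of steps 3 to 5, the sketch below traces one selection by
+hand. The starting flag set is an assumption made for the sake of the example,
+not the output of a real Clang invocation; the ``Match`` expression and the
+``thumb/v7-m`` variant are taken from the ``multilib.yaml`` example later in
+this document.
+
+.. code-block:: yaml
+
+  # Assumed flags after normalization (step 1), for illustration only:
+  #   --target=thumbv8m.main-none-unknown-eabi
+  #   -mfpu=fpv4-sp-d16
+
+  MultilibVersion: 1.0
+
+  Mappings:
+  # Step 3: the whole flag "--target=thumbv8m.main-none-unknown-eabi" matches
+  # this expression, so "--target=thumbv7m-none-eabi" is added to the flag set.
+  - Match: --target=thumbv([7-9]|[1-9][0-9]+).*
+    Flags: [--target=thumbv7m-none-eabi]
+
+  Variants:
+  # Step 4: the flag set now contains both flags listed here, so it is a
+  # superset of this variant's flags and the variant matches.
+  # Step 5: -isystem and -L options are then generated from "thumb/v7-m".
+  - Dir: thumb/v7-m
+    Flags: [--target=thumbv7m-none-eabi, -mfpu=fpv4-sp-d16]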
+
+Multilib layering
+=================
+
+When Clang selects multilib variants, it may find that more than one variant
+matches.
+
+It is up to the ToolChain subclass to decide what to do in this case.
+There are two options permitted:
+
+#. Use only the *last* matching multilib variant. This option exists primarily
+   for compatibility with the previous multilib design.
+#. Use all matching variants, thereby layering them.
+
+This decision is hard-coded per ToolChain subclass. The latter option is
+preferred for ToolChain subclasses without backwards compatibility
+requirements.
+
+If the latter option is chosen then ``-isystem`` and ``-L`` options will be
+generated for each matching multilib variant, in reverse order.
+
+This means that the compiler or linker will find files in the last matching
+multilib variant that has the given file.
+This behaviour permits multilib variants with only a partial set of files.
+This means a toolchain can be distributed with one base multilib variant
+containing all system headers and libraries, and more specialised multilib
+variants containing only files that are different to those in the base variant.
+
+For example, a multilib variant could be compiled with ``-fno-exceptions``.
+This option doesn't affect the content of header files, nor does it affect the
+C libraries. Therefore if multilib layering is supported by the ToolChain
+subclass and a suitable base multilib variant is present then the
+``-fno-exceptions`` multilib variant need only contain C++ libraries.
+
+It is the responsibility of layered multilib authors to ensure that headers and
+libraries in each layer are complete enough to mask any incompatibilities.
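+
+A minimal sketch of such a layered configuration follows. The directory names
+are hypothetical, and the sketch assumes both that Clang emits an
+``-fno-exceptions`` multilib flag and that the ToolChain subclass supports
+layering; neither is guaranteed by this document.
+
+.. code-block:: yaml
+
+  MultilibVersion: 1.0
+
+  Variants:
+  # Base variant (hypothetical directory): complete headers and libraries.
+  - Dir: thumb/v7-m
+    Flags: [--target=thumbv7m-none-eabi]
+
+  # Specialised variant (hypothetical directory): only the C++ libraries
+  # built with -fno-exceptions. It is listed last so that, when both
+  # variants match, its directories are searched before the base variant's.
+  - Dir: thumb/v7-m/noexcept
+    Flags: [--target=thumbv7m-none-eabi, -fno-exceptions]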
+
+Stability
+=========
+
+Multilib via configuration file shall be considered an experimental feature
+until LLVM 18, at which point ``-print-multi-flags-experimental``
+should be renamed to ``-print-multi-flags``.
+A toolchain can opt in to using this feature by including a ``multilib.yaml``
+file in its distribution, once support for it is added in relevant ToolChain
+subclasses.
+Once stability is reached, flags emitted by ``-print-multi-flags``
+should not be removed or changed, although new flags may be added.
+
+Restrictions
+============
+
+Despite the name, multilib is used to locate both ``include`` and ``lib``
+directories. Therefore it is important that consistent options are passed to
+the Clang driver when both compiling and linking. Otherwise inconsistent
+``include`` and ``lib`` directories may be used, and the results will be
+undefined.
+
+EXPERIMENTAL multilib.yaml
+==========================
+
+The example below serves as a small example of a possible multilib
+configuration, and documents the available options.
+
+For a more comprehensive example see
+``clang/test/Driver/baremetal-multilib.yaml`` in the ``llvm-project`` sources.
+
+.. code-block:: yaml
+
+  # multilib.yaml
+
+  # This format is experimental and is likely to change!
+
+  # Syntax is YAML 1.2
+
+  # This required field defines the version of the multilib.yaml format.
+  # Clang will emit an error if this number is greater than its current multilib
+  # version or if its major version differs, but will accept lesser minor
+  # versions.
+  MultilibVersion: 1.0
+
+  # The rest of this file is in two parts:
+  # 1. A list of multilib variants.
+  # 2. A list of regular expressions that may match flags generated from
+  #    command line options, and further flags that shall be added if the
+  #    regular expression matches.
+  # It is acceptable for the file to contain properties not documented here,
+  # and these will be ignored by Clang.
+
+  # List of multilib variants. Required.
+  # The ordering of items in the variants list is important if more than one
+  # variant can match the same set of flags. See the docs on multilib layering
+  # for more info.
+  Variants:
+
+  # Example of a multilib variant targeting Arm v6-M.
+  # Dir is the relative location of the directory containing the headers
+  # and/or libraries.
+  # Exactly how Dir is used is left up to the ToolChain subclass to define, but
+  # typically it will be joined to the sysroot.
+  - Dir: thumb/v6-m
+    # List of one or more normalized command line options, as generated by Clang
+    # from the command line options or from Mappings below.
+    # Here, if the flags are a superset of {--target=thumbv6m-none-unknown-eabi}
+    # then this multilib variant will be considered a match.
+    Flags: [--target=thumbv6m-none-unknown-eabi]
+
+  # Similarly, a multilib variant targeting Arm v7-M with an FPU (floating
+  # point unit).
+  - Dir: thumb/v7-m
+    # Here, the flags generated by Clang must be a superset of
+    # {--target=thumbv7m-none-eabi, -mfpu=fpv4-sp-d16} for this multilib variant
+    # to be a match.
+    Flags: [--target=thumbv7m-none-eabi, -mfpu=fpv4-sp-d16]
+
+
+  # The second section of the file is a list of regular expressions that are
+  # used to map from flags generated from command line options to custom flags.
+  # This is optional.
+  # Each regular expression must match a whole flag string.
+  # Flags in the "Flags" list will be added if any flag generated from command
+  # line options matches the regular expression.
+  Mappings:
+
+  # Set a "--target=thumbv7m-none-eabi" flag if the regular expression matches
+  # any of the flags generated from the command line options.
+  # Match is a POSIX extended regular expression string.
+  - Match: --target=thumbv([7-9]|[1-9][0-9]+).*
+    # Flags is a list of one or more strings.
+    Flags: [--target=thumbv7m-none-eabi]
+
+Design principles
+=================
+
+Stable interface
+----------------
+
+``multilib.yaml`` and ``-print-multi-flags-experimental`` are new
+interfaces to Clang. In order for them to be usable over time and across LLVM
+versions their interfaces should be stable.
+The new multilib system will be considered experimental in LLVM 17, but in
+LLVM 18 it will be stable. Stability matters in particular for the multilib
+selection flags Clang generates from command line options. Once a flag is
+generated by a released version of Clang it may be used in ``multilib.yaml``
+files that exist independently of the LLVM release cycle, and therefore
+ceasing to generate the flag would be a breaking change and should be
+avoided.
+
+However, an exception is the normalization of ``-march``.
+``-march`` for Arm architectures contains a list of enabled and disabled
+extensions and this list is likely to grow. Therefore ``-march`` flags are
+unstable.
+
+Incomplete interface
+--------------------
+
+The new multilib system does multilib selection based on only a limited set of
+command line options, and limits which flags can be used for multilib
+selection. This is in order to avoid committing to too large an interface.
+Later LLVM versions can add support for multilib selection from more command
+line options as needed.
+
+Extensible
+----------
+
+It is likely that the configuration format will need to evolve in future to
+adapt to new requirements.
+Using a format like YAML that supports key-value pairs helps here as it's
+trivial to add new keys alongside existing ones.
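+
+For instance, a toolchain author could add a key of their own alongside the
+documented ones. The ``ToolchainNotes`` key below is hypothetical and is not
+part of the format; according to the comments in the ``multilib.yaml`` example
+above, properties that are not documented there are ignored by Clang.
+
+.. code-block:: yaml
+
+  MultilibVersion: 1.0
+
+  # Hypothetical extra key, shown only to illustrate extensibility.
+  ToolchainNotes: Built by the example toolchain team
+
+  Variants:
+  - Dir: thumb/v6-m
+    Flags: [--target=thumbv6m-none-unknown-eabi]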
+
+Backwards compatibility
+-----------------------
+
+New versions of Clang should be able to use configuration written for earlier
+Clang versions.
+To avoid behaving in a way that may be subtly incorrect, Clang should be able
+to detect if the configuration is too new and emit an error.
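+
+As a worked example of this check, suppose a Clang release whose current
+multilib version is 1.2 (a hypothetical number chosen for illustration). Under
+the rule stated in the ``multilib.yaml`` example above, it would treat
+``MultilibVersion`` values as follows:
+
+.. code-block:: yaml
+
+  # Accepted: same major version, minor version not greater than 1.2.
+  MultilibVersion: 1.0
+
+  # Other values, shown as comments because a file has only one version field:
+  #   1.2  accepted (equal to Clang's version)
+  #   1.3  rejected (greater than Clang's version)
+  #   2.0  rejected (greater, and the major version differs)
+  #   0.9  rejected (the major version differs)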
+
+Forwards compatibility
+----------------------
+
+As an author of a multilib configuration, it should be possible to design the
+configuration in such a way that it is likely to work well with future Clang
+versions. For example, if a future version of Clang is likely to add support
+for newer versions of an architecture and the architecture is known to be
+designed for backwards compatibility then it should be possible to express
+compatibility for such architecture versions in the multilib configuration.
+
+Not GNU spec files
+------------------
+
+The GNU spec files standard is large and complex and there's little desire to
+import that complexity to LLVM. It's also heavily oriented towards processing
+command line argument strings which is hard to do correctly, hence the large
+amount of logic dedicated to that task in the Clang driver. While compatibility
+with GNU would bring benefits, the cost in this case is deemed too high.
+
+Avoid re-inventing feature detection in the configuration
+---------------------------------------------------------
+
+A large amount of logic in the Clang driver is dedicated to inferring which
+architectural features are available based on the given command line options.
+It is neither desirable nor practical to repeat such logic in each multilib
+configuration. Instead the configuration should be able to benefit from the
+heavy lifting Clang already does to detect features.
+
+Low maintenance
+---------------
+
+Multilib is a relatively small feature in the scheme of things so supporting it
+should accordingly take little time. Where possible this should be achieved by
+implementing it in terms of existing features in the LLVM codebase.
+
+Minimal additional API surface
+------------------------------
+
+The greater the API surface, the greater the difficulty of keeping it stable.
+Where possible the additional API surface should be kept small by defining it
+in relation to existing APIs. An example of this is keeping a simple
+relationship between flag names and command line options where possible.
+Since the command line options are part of a stable API they are unlikely
+to change, and therefore the flag names get the same stability.
+
+Low compile-time overhead
+-------------------------
+
+If the process of selecting multilib directories must be done on every
+invocation of the Clang driver then it must have a negligible impact on
+overall compile time.

diff  --git a/clang/docs/index.rst b/clang/docs/index.rst
index e572f706c01f5..9fc7c6fff701e 100644
--- a/clang/docs/index.rst
+++ b/clang/docs/index.rst
@@ -100,6 +100,7 @@ Design Documents
    CodeOwners
    InternalsManual
    DriverInternals
+   Multilib
    OffloadingDesign
    PCHInternals
    ItaniumMangleAbiTags


        

