[clang] [Docs] Some updates to the Clang user's manual (PR #151702)
Aaron Ballman via cfe-commits
cfe-commits at lists.llvm.org
Fri Aug 1 08:07:30 PDT 2025
https://github.com/AaronBallman updated https://github.com/llvm/llvm-project/pull/151702
>From ccd6021fc9bf523e3dafbf7dff80027c47410204 Mon Sep 17 00:00:00 2001
From: Aaron Ballman <aaron at aaronballman.com>
Date: Fri, 1 Aug 2025 10:08:35 -0400
Subject: [PATCH 1/7] [Docs] Some updates to the Clang user's manual
* Fills out the terminology section
* Removes the basic usage section (we should bring it back someday
though!)
* Updates the list of supported language versions
* Adds information about what versions of Clang are officially supported
* Moves some extensions into the intentionally unsupported extensions
section.
There are likely far more updates that could be done, but this seemed
worth posting just to get things moving.
---
clang/docs/UsersManual.rst | 105 +++++++++++++++++++++----------------
1 file changed, 59 insertions(+), 46 deletions(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index af0a8746d45e7..06a867ef38fa4 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -36,7 +36,7 @@ language-specific information, please see the corresponding language
specific section:
- :ref:`C Language <c>`: K&R C, ANSI C89, ISO C90, ISO C94 (C89+AMD1), ISO
- C99 (+TC1, TC2, TC3).
+ C99 (+TC1, TC2, TC3), C11, C17, C23, and C2y.
- :ref:`Objective-C Language <objc>`: ObjC 1, ObjC 2, ObjC 2.1, plus
variants depending on base language.
- :ref:`C++ Language <cxx>`
@@ -60,29 +60,46 @@ features that depend on what CPU architecture or operating system is
being compiled for. Please see the :ref:`Target-Specific Features and
Limitations <target_features>` section for more details.
-The rest of the introduction introduces some basic :ref:`compiler
-terminology <terminology>` that is used throughout this manual and
-contains a basic :ref:`introduction to using Clang <basicusage>` as a
-command line compiler.
-
.. _terminology:
Terminology
-----------
+* Lexer -- the part of the compiler responsible for converting source code into
+ abstract representations called tokens.
+* Preprocessor -- the part of the compiler responsible for in-place textual
+ replacement of source constructs. When the lexer is required to produce a
+ token, it will run the preprocessor while determining which token to produce.
+ In other words, when the lexer encounters something like `#include` or a macro
+ name, the preprocessor will be used to perform the inclusion or expand the
+ macro name into its replacement list, and return the resulting non-preprocessor
+ token.
+* Parser -- the part of the compiler responsible for determining syntactic
+ correctness of the source code. The parser will request tokens from the lexer
+ and after performing semantic analysis of the production, generates an
+ abstract representation of the source called an AST.
+* Diagnostic -- a message to the user about properties of the source code. For
+ example, errors or warnings and their associated notes.
+* Undefined behavior -- behavior for which the standard imposes no requirements
+ on how the code behaves. Generally speaking, undefined behavior is a bug in
+ the user's code. However, it can also be a place for the compiler to define
+ the behavior, called an extension.
+* Optimizer -- the part of the compiler responsible for transforming user code
+ into faster user code, without changing the semantics of how the code behaves.
+ Note, the optimizer assumes the code has no undefined behavior, so if the code
+ does contain undefined behavior, it will often behave differently depending on
+ which optimization level is enabled.
+* Front end -- the Lexer, Preprocessor, Parser, semantic analysis, and LLVM IR
+ code generation parts of the compiler.
+* Backend -- the parts of the compiler which run after LLVM IR code generation,
+ such as the optimizer.
+
+Support
+-------
+Clang releases happen roughly `every six months <https://llvm.org/docs/HowToReleaseLLVM.html#annual-release-schedule>`_.
+Only the current public release is officially supported. Bug-fix releases for
+the current release will be produced on an as-needed basis, but bug fixes are
+not backported to releases older than the current one.
-Front end, parser, backend, preprocessor, undefined behavior,
-diagnostic, optimizer
-
-.. _basicusage:
-
-Basic Usage
------------
-
-Intro to how to use a C compiler for newbies.
-
-compile + link compile then link debug info enabling optimizations
-picking a language to use, defaults to C17 by default. Autosenses based
-on extension. using a makefile
Command Line Options
====================
@@ -3797,8 +3814,8 @@ This environment variable does not affect the options added by the config files.
C Language Features
===================
-The support for standard C in clang is feature-complete except for the
-C99 floating-point pragmas.
+The support for standard C in Clang is mostly feature-complete; see the `C
+status page <https://clang.llvm.org/c_status.html>`_ for more details.
Extensions supported by clang
-----------------------------
@@ -3883,23 +3900,10 @@ GCC extensions not implemented yet
----------------------------------
clang tries to be compatible with gcc as much as possible, but some gcc
-extensions are not implemented yet:
+extensions are not implemented:
- clang does not support decimal floating point types (``_Decimal32`` and
friends) yet.
-- clang does not support nested functions; this is a complex feature
- which is infrequently used, so it is unlikely to be implemented
- anytime soon. In C++11 it can be emulated by assigning lambda
- functions to local variables, e.g:
-
- .. code-block:: cpp
-
- auto const local_function = [&](int parameter) {
- // Do something
- };
- ...
- local_function(1);
-
- clang only supports global register variables when the register specified
is non-allocatable (e.g. the stack pointer). Support for general global
register variables is unlikely to be implemented soon because it requires
@@ -3914,18 +3918,13 @@ extensions are not implemented yet:
that because clang pretends to be like GCC 4.2, and this extension
was introduced in 4.3, the glibc headers will not try to use this
extension with clang at the moment.
-- clang does not support the gcc extension for forward-declaring
- function parameters; this has not shown up in any real-world code
- yet, though, so it might never be implemented.
This is not a complete list; if you find an unsupported extension
-missing from this list, please send an e-mail to cfe-dev. This list
-currently excludes C++; see :ref:`C++ Language Features <cxx>`. Also, this
-list does not include bugs in mostly-implemented features; please see
-the `bug
-tracker <https://bugs.llvm.org/buglist.cgi?quicksearch=product%3Aclang+component%3A-New%2BBugs%2CAST%2CBasic%2CDriver%2CHeaders%2CLLVM%2BCodeGen%2Cparser%2Cpreprocessor%2CSemantic%2BAnalyzer>`_
-for known existing bugs (FIXME: Is there a section for bug-reporting
-guidelines somewhere?).
+missing from this list, please file a `feature request <https://github.com/llvm/llvm-project/issues/>`_.
+This list currently excludes C++; see :ref:`C++ Language Features <cxx>`. Also,
+this list does not include bugs in mostly-implemented features; please see the
+`issues list <https://github.com/llvm/llvm-project/issues/>`_ for known existing
+bugs.
Intentionally unsupported GCC extensions
----------------------------------------
@@ -3944,6 +3943,20 @@ Intentionally unsupported GCC extensions
variable) will likely never be accepted by Clang.
- clang does not support ``__builtin_apply`` and friends; this extension
is extremely obscure and difficult to implement reliably.
+- clang does not support the gcc extension for forward-declaring
+ function parameters.
+- clang does not support nested functions; this is a complex feature which is
+ infrequently used, so it is unlikely to be implemented. In C++11 it can be
+ emulated by assigning lambda functions to local variables, e.g.:
+
+ .. code-block:: cpp
+
+ auto const local_function = [&](int parameter) {
+ // Do something
+ };
+ ...
+ local_function(1);
+
.. _c_ms:
@@ -3983,7 +3996,7 @@ C++ Language Features
clang fully implements all of standard C++98 except for exported
templates (which were removed in C++11), all of standard C++11,
-C++14, and C++17, and most of C++20.
+C++14, and C++17, and most of C++20 and C++23.
See the `C++ support in Clang <https://clang.llvm.org/cxx_status.html>`_ page
for detailed information on C++ feature support across Clang versions.
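
Note on the nested-function entry moved in this patch: the documentation snippet elides its surrounding function with ``...``. A fuller version might look like the sketch below. It is only an illustration of the lambda emulation the text describes; the names ``sum_above`` and ``add_if_above`` are hypothetical and do not appear in the manual.

```cpp
#include <vector>

// Sums the elements of `values` that are strictly greater than `threshold`.
// Hypothetical helper, used only to illustrate the lambda emulation.
int sum_above(const std::vector<int> &values, int threshold) {
  int total = 0;

  // Plays the role a GCC nested function would play in C: it is local to
  // sum_above and reads or modifies the enclosing locals, here via
  // by-reference capture.
  auto const add_if_above = [&](int value) {
    if (value > threshold)
      total += value;
  };

  for (int value : values)
    add_if_above(value);
  return total;
}
```

Capturing with ``[&]`` is what gives the lambda access to ``total`` and ``threshold``. The lambda must not outlive the enclosing function call, since the captured references would dangle.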
>From 4da6e09b3555daab4fcfc92667f52d6c1c241e4e Mon Sep 17 00:00:00 2001
From: Aaron Ballman <aaron at aaronballman.com>
Date: Fri, 1 Aug 2025 10:27:01 -0400
Subject: [PATCH 2/7] Update based on review feedback
---
clang/docs/UsersManual.rst | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 06a867ef38fa4..5abd11abef1c9 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -77,6 +77,9 @@ Terminology
correctness of the source code. The parser will request tokens from the lexer
and after performing semantic analysis of the production, generates an
abstract representation of the source called an AST.
+* Sema -- the part of the compiler responsible for determining semantic
+ correctness of the source code. It is closely related to the parser and is
+ where many diagnostics are produced.
* Diagnostic -- a message to the user about properties of the source code. For
example, errors or warnings and their associated notes.
* Undefined behavior -- behavior for which the standard imposes no requirements
@@ -88,11 +91,14 @@ Terminology
Note, the optimizer assumes the code has no undefined behavior, so if the code
does contain undefined behavior, it will often behave differently depending on
which optimization level is enabled.
-* Front end -- the Lexer, Preprocessor, Parser, semantic analysis, and LLVM IR
- code generation parts of the compiler.
+* Frontend -- the Lexer, Preprocessor, Parser, and Sema parts of the compiler.
+* Middle-end -- converts the AST into LLVM IR, adds debug information, etc.
* Backend -- the parts of the compiler which run after LLVM IR code generation,
such as the optimizer.
+See the :doc:`InternalsManual` for more details about the internal construction
+of the compiler.
+
Support
-------
Clang releases happen roughly `every six months <https://llvm.org/docs/HowToReleaseLLVM.html#annual-release-schedule>`_.
>From b206c19f9ebf63bd38ca5b496261cbbf3a6e7d7c Mon Sep 17 00:00:00 2001
From: Aaron Ballman <aaron at aaronballman.com>
Date: Fri, 1 Aug 2025 10:34:47 -0400
Subject: [PATCH 3/7] Update the optimizer terminology based on review feedback
---
clang/docs/UsersManual.rst | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index 5abd11abef1c9..ecce39b8c0a30 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -86,11 +86,11 @@ Terminology
on how the code behaves. Generally speaking, undefined behavior is a bug in
the user's code. However, it can also be a place for the compiler to define
the behavior, called an extension.
-* Optimizer -- the part of the compiler responsible for transforming user code
- into faster user code, without changing the semantics of how the code behaves.
- Note, the optimizer assumes the code has no undefined behavior, so if the code
- does contain undefined behavior, it will often behave differently depending on
- which optimization level is enabled.
+* Optimizer -- the part of the compiler responsible for transforming code to
+ have better performance characteristics without changing the semantics of how
+ the code behaves. Note, the optimizer assumes the code has no undefined
+ behavior, so if the code does contain undefined behavior, it will often behave
+ differently depending on which optimization level is enabled.
* Frontend -- the Lexer, Preprocessor, Parser, and Sema parts of the compiler.
* Middle-end -- converts the AST into LLVM IR, adds debug information, etc.
* Backend -- the parts of the compiler which run after LLVM IR code generation,
>From 03feb55dc759c425f53c1f6981f262df80cfc896 Mon Sep 17 00:00:00 2001
From: Aaron Ballman <aaron at aaronballman.com>
Date: Fri, 1 Aug 2025 10:35:49 -0400
Subject: [PATCH 4/7] Add C++ standards
---
clang/docs/UsersManual.rst | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index ecce39b8c0a30..cfdacaf229d56 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -39,7 +39,8 @@ specific section:
C99 (+TC1, TC2, TC3), C11, C17, C23, and C2y.
- :ref:`Objective-C Language <objc>`: ObjC 1, ObjC 2, ObjC 2.1, plus
variants depending on base language.
-- :ref:`C++ Language <cxx>`
+- :ref:`C++ Language <cxx>`: C++98, C++03, C++11, C++14, C++17, C++20, C++23,
+ and C++2c.
- :ref:`Objective C++ Language <objcxx>`
- :ref:`OpenCL Kernel Language <opencl>`: OpenCL C 1.0, 1.1, 1.2, 2.0, 3.0,
and C++ for OpenCL 1.0 and 2021.
>From 3792dde486f0ecccf9feb48f09805951fb115cb4 Mon Sep 17 00:00:00 2001
From: Aaron Ballman <aaron at aaronballman.com>
Date: Fri, 1 Aug 2025 10:37:11 -0400
Subject: [PATCH 5/7] Remove some repeated mentions of ISO
---
clang/docs/UsersManual.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index cfdacaf229d56..ed51246555285 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -35,8 +35,8 @@ which includes :ref:`C <c>`, :ref:`Objective-C <objc>`, :ref:`C++ <cxx>`, and
language-specific information, please see the corresponding language
specific section:
-- :ref:`C Language <c>`: K&R C, ANSI C89, ISO C90, ISO C94 (C89+AMD1), ISO
- C99 (+TC1, TC2, TC3), C11, C17, C23, and C2y.
+- :ref:`C Language <c>`: K&R C, ANSI C89, ISO C90, C94 (C89+AMD1), C99 (+TC1,
+ TC2, TC3), C11, C17, C23, and C2y.
- :ref:`Objective-C Language <objc>`: ObjC 1, ObjC 2, ObjC 2.1, plus
variants depending on base language.
- :ref:`C++ Language <cxx>`: C++98, C++03, C++11, C++14, C++17, C++20, C++23,
>From af7feb144393aca7934a0ca038fa95b2ca037f51 Mon Sep 17 00:00:00 2001
From: Aaron Ballman <aaron at aaronballman.com>
Date: Fri, 1 Aug 2025 11:05:59 -0400
Subject: [PATCH 6/7] Update based on review feedback
---
clang/docs/UsersManual.rst | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index ed51246555285..d51d6fc39e002 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -40,7 +40,7 @@ specific section:
- :ref:`Objective-C Language <objc>`: ObjC 1, ObjC 2, ObjC 2.1, plus
variants depending on base language.
- :ref:`C++ Language <cxx>`: C++98, C++03, C++11, C++14, C++17, C++20, C++23,
- and C++2c.
+ and C++26.
- :ref:`Objective C++ Language <objcxx>`
- :ref:`OpenCL Kernel Language <opencl>`: OpenCL C 1.0, 1.1, 1.2, 2.0, 3.0,
and C++ for OpenCL 1.0 and 2021.
@@ -95,7 +95,7 @@ Terminology
* Frontend -- the Lexer, Preprocessor, Parser, and Sema parts of the compiler.
* Middle-end -- converts the AST into LLVM IR, adds debug information, etc.
* Backend -- the parts of the compiler which run after LLVM IR code generation,
- such as the optimizer.
+ such as the optimizer and generation of assembly code.
See the :doc:`InternalsManual` for more details about the internal construction
of the compiler.
>From b654a02731d737466dfdf2de9a2e7f56e4bb1aa0 Mon Sep 17 00:00:00 2001
From: Aaron Ballman <aaron at aaronballman.com>
Date: Fri, 1 Aug 2025 11:07:02 -0400
Subject: [PATCH 7/7] More updates based on review feedback
---
clang/docs/UsersManual.rst | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/clang/docs/UsersManual.rst b/clang/docs/UsersManual.rst
index d51d6fc39e002..c7039290ec6d5 100644
--- a/clang/docs/UsersManual.rst
+++ b/clang/docs/UsersManual.rst
@@ -92,8 +92,11 @@ Terminology
the code behaves. Note, the optimizer assumes the code has no undefined
behavior, so if the code does contain undefined behavior, it will often behave
differently depending on which optimization level is enabled.
-* Frontend -- the Lexer, Preprocessor, Parser, and Sema parts of the compiler.
-* Middle-end -- converts the AST into LLVM IR, adds debug information, etc.
+* Frontend -- the Lexer, Preprocessor, Parser, Sema, and LLVM IR code generation
+ parts of the compiler.
+* Middle-end -- a term used for the subset of the backend that does
+ (typically not target specific) optimizations prior to assembly code
+ generation.
* Backend -- the parts of the compiler which run after LLVM IR code generation,
such as the optimizer and generation of assembly code.
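
Note on the Optimizer entry: the remark that code with undefined behavior "will often behave differently depending on which optimization level is enabled" can be illustrated with signed integer overflow. This is a minimal sketch, not text from the manual, and the function names are hypothetical. What a given compiler actually does with the first function at ``INT_MAX`` is not guaranteed.

```cpp
#include <climits>

// Has undefined behavior when x == INT_MAX, because x + 1 overflows a
// signed int. An optimizer is allowed to assume that never happens and
// may simplify the comparison to `true`; an unoptimized build may instead
// wrap around and return false. Neither result is guaranteed.
bool next_is_larger_ub(int x) { return x + 1 > x; }

// Well-defined for every input: asks the same question without ever
// performing the overflowing addition.
bool next_is_larger(int x) { return x != INT_MAX; }
```

Only ``next_is_larger`` is safe to call with ``INT_MAX``. The two functions agree on every other input.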