[clang] [Docs] Some updates to the Clang user's manual (PR #151702)
Erich Keane via cfe-commits
cfe-commits at lists.llvm.org
Fri Aug 1 07:53:27 PDT 2025
================
@@ -60,29 +61,52 @@ features that depend on what CPU architecture or operating system is
being compiled for. Please see the :ref:`Target-Specific Features and
Limitations <target_features>` section for more details.
-The rest of the introduction introduces some basic :ref:`compiler
-terminology <terminology>` that is used throughout this manual and
-contains a basic :ref:`introduction to using Clang <basicusage>` as a
-command line compiler.
-
.. _terminology:
Terminology
-----------
+* Lexer -- the part of the compiler responsible for converting source code into
+ abstract representations called tokens.
+* Preprocessor -- the part of the compiler responsible for in-place textual
+ replacement of source constructs. When the lexer is required to produce a
+ token, it will run the preprocessor while determining which token to produce.
+ In other words, when the lexer encounters something like `#include` or a macro
+ name, the preprocessor will be used to perform the inclusion or expand the
+ macro name into its replacement list, and return the resulting non-preprocessor
+ token.
+* Parser -- the part of the compiler responsible for determining syntactic
+ correctness of the source code. The parser will request tokens from the lexer
+ and after performing semantic analysis of the production, generates an
+ abstract representation of the source called an AST.
+* Sema -- the part of the compiler responsible for determining semantic
+ correctness of the source code. It is closely related to the parser and is
+ where many diagnostics are produced.
+* Diagnostic -- a message to the user about properties of the source code. For
+ example, errors or warnings and their associated notes.
+* Undefined behavior -- behavior for which the standard imposes no requirements
+ on how the code behaves. Generally speaking, undefined behavior is a bug in
+ the user's code. However, it can also be a place for the compiler to define
+ the behavior, called an extension.
+* Optimizer -- the part of the compiler responsible for transforming code to
+ have better performance characteristics without changing the semantics of how
+ the code behaves. Note, the optimizer assumes the code has no undefined
+ behavior, so if the code does contain undefined behavior, it will often behave
+ differently depending on which optimization level is enabled.
+* Frontend -- the Lexer, Preprocessor, Parser, and Sema parts of the compiler.
+* Middle-end -- converts the AST into LLVM IR, adds debug information, etc.
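
As an illustration of the "Undefined behavior" and "Optimizer" entries in the hunk above (this is not part of the patch), here is a minimal C++ sketch. The function names are hypothetical. The first function relies on signed integer overflow, which is undefined behavior in C++, so its result for `INT_MAX` may differ between optimization levels. The second expresses the same intent without overflowing.

```cpp
#include <climits>

// Hypothetical example: when x == INT_MAX, `x + 1` overflows a signed int,
// which is undefined behavior. An optimizer is allowed to assume this never
// happens and may fold the whole comparison to `true`.
bool next_is_larger_ub(int x) { return x + 1 > x; }

// Well-defined alternative: compare against the limit instead of overflowing.
// This returns the same answer at every optimization level.
bool next_is_larger(int x) { return x < INT_MAX; }
```

Only `next_is_larger` has a guaranteed result for every input; calling `next_is_larger_ub(INT_MAX)` has no behavior the standard requires.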
----------------
erichkeane wrote:
> Ok, so for ‘middle-end’ I don’t think spelling it as ‘middleend’ is quite established yet (and personally it looks a bit weird because of the ‘ee’); English spelling and consistency...
Heh, I don't disagree :D It's just the strange inconsistency. It's probably better as an inconsistency, as I see it now.
https://github.com/llvm/llvm-project/pull/151702