[llvm-branch-commits] [clang] release/21.x: [NFC][Clang][Docs] Update Pointer Authentication documentation (#152596) (PR #154240)
via llvm-branch-commits (llvm-branch-commits at lists.llvm.org)
Mon Aug 18 17:51:41 PDT 2025

llvmbot wrote:
@llvm/pr-subscribers-clang
Author: None (llvmbot)
<details>
<summary>Changes</summary>
Backport 62d2a8e6823de0310ba3a8b014ddcb2db356a1bb
Requested by: @EugeneZelenko
---
Patch is 60.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/154240.diff
1 Files Affected:
- (modified) clang/docs/PointerAuthentication.rst (+1163-32) 
``````````diff
diff --git a/clang/docs/PointerAuthentication.rst b/clang/docs/PointerAuthentication.rst
index 913291c954447..96eb498bc48b6 100644
--- a/clang/docs/PointerAuthentication.rst
+++ b/clang/docs/PointerAuthentication.rst
@@ -47,16 +47,16 @@ This document serves four purposes:
 - It documents several language extensions that are useful on targets using
   pointer authentication.
 
-- It will eventually present a theory of operation for the security mitigation,
-  describing the basic requirements for correctness, various weaknesses in the
-  mechanism, and ways in which programmers can strengthen its protections
-  (including recommendations for language implementors).
+- It presents a theory of operation for the security mitigation, describing the
+  basic requirements for correctness, various weaknesses in the mechanism, and
+  ways in which programmers can strengthen its protections (including
+  recommendations for language implementors).
 
-- It will eventually document the language ABIs currently used for C, C++,
-  Objective-C, and Swift on arm64e, although these are not yet stable on any
-  target.
+- It documents the stable ABI of the C, C++, and Objective-C languages on arm64e
+  platforms.
 
-Basic Concepts
+
+Basic concepts
 --------------
 
 The simple address of an object or function is a **raw pointer**.  A raw
@@ -125,7 +125,7 @@ independently for I and D keys.)
   interfaces or as primitives in a compiler IR because they expose raw
   pointers.  Raw pointers require special attention in the language
   implementation to avoid the accidental creation of exploitable code
-  sequences.
+  sequences; see the section on `Attackable code sequences`_.
 
 The following details are all implementation-defined:
 
@@ -167,10 +167,15 @@ a cryptographic signature, other implementations may be possible.  See
     signing key, and stores it in the high bits as the signature. ``auth``
     removes the signature, computes the same hash, and compares the result with
     the stored signature.  ``strip`` removes the signature without
-    authenticating it.  While ``aut*`` instructions do not themselves trap on
-    failure in Armv8.3 PAuth, they do with the later optional FPAC extension.
-    An implementation can also choose to emulate this trapping behavior by
-    emitting additional instructions around ``aut*``.
+    authenticating it.  The ``aut`` instructions in the baseline Armv8.3 PAuth
+    feature do not guarantee to trap on authentication failure; instead, they
+    simply corrupt the pointer so that later uses will likely trap. Unless the
+    "later use" follows immediately and cannot be recovered from (e.g. with a
+    signal handler), this does not provide adequate protection against
+    `authentication oracles`_, so implementations must emit additional
+    instructions to force an immediate trap. This is unnecessary if the
+    processor provides the optional ``FPAC`` extension, which guarantees an
+    immediate trap.
 
   - ``sign_generic`` corresponds to the ``pacga`` instruction, which takes two
     64-bit values and produces a 64-bit cryptographic hash. Implementations of
@@ -234,7 +239,7 @@ implementation-defined.
 
 .. _Signing schemas:
 
-Signing Schemas
+Signing schemas
 ~~~~~~~~~~~~~~~
 
 Correct use of pointer authentication requires the signing code and the
@@ -255,33 +260,172 @@ signing schema breaks down even more simply:
 It is important that the signing schema be independently derived at all signing
 and authentication sites.  Preferably, the schema should be hard-coded
 everywhere it is needed, but at the very least, it must not be derived by
-inspecting information stored along with the pointer.
+inspecting information stored along with the pointer.  See the section on
+`Attacks on pointer authentication`_ for more information.
 
-Language Features
------------------
 
-There is currently one main pointer authentication language feature:
+Language features
+-----------------
 
-- The language provides the ``<ptrauth.h>`` intrinsic interface for manually
-  signing and authenticating pointers in code.  These can be used in
+There are three levels of the pointer authentication language feature:
+
+- The language implementation automatically signs and authenticates function
+  pointers (and certain data pointers) across a variety of standard situations,
+  including return addresses, function pointers, and C++ virtual functions. The
+  intent is for all pointers to code in program memory to be signed in some way
+  and for all branches to code in program text to authenticate those
+  signatures. In addition to the code pointers themselves, we also use pointer
+  authentication to protect data values that directly or indirectly influence
+  control flow or program integrity, or can provide attackers with some other
+  powerful program compromise.
+
+- The language also provides extensions to override the default rules used by
+  the language implementation.  For example, the ``__ptrauth`` type qualifier
+  can be used to change how pointers or pointer sized integers are signed when
+  they are stored in a particular variable or field; this provides much stronger
+  protection than is guaranteed by the default rules for C function and data
+  pointers.
+
+- Finally, the language provides the ``<ptrauth.h>`` intrinsic interface for
+  manually signing and authenticating pointers in code.  These can be used in
   circumstances where very specific behavior is required.
 
+Language implementation
+~~~~~~~~~~~~~~~~~~~~~~~
+
+For the most part, pointer authentication is an unobserved detail of the
+implementation of the programming language.  Any element of the language
+implementation that would perform an indirect branch to a pointer is implicitly
+altered so that the pointer is signed when first constructed and authenticated
+when the branch is performed.  This includes:
+
+- indirect-call features in the programming language, such as C function
+  pointers, C++ virtual functions, C++ member function pointers, the "blocks"
+  C extension, and so on;
+
+- returning from a function, no matter how it is called; and
+
+- indirect calls introduced by the implementation, such as branches through the
+  global offset table (GOT) used to implement direct calls to functions defined
+  outside of the current shared object.
+
+For more information about this, see the `Language ABI`_ section.
+
+However, some aspects of the implementation are observable by the programmer or
+otherwise require special notice.
+
+C data pointers
+^^^^^^^^^^^^^^^
+
+The current implementation in Clang does not sign pointers to ordinary data by
+default. For a partial explanation of the reasoning behind this, see the
+`Theory of Operation`_ section.
+
+A specific data pointer which is more security-sensitive than most can be
+signed using the `__ptrauth qualifier`_ or using the ``<ptrauth.h>``
+intrinsics.
+
+C function pointers
+^^^^^^^^^^^^^^^^^^^
+
+The C standard imposes restrictions on the representation and semantics of
+function pointer types which make it difficult to achieve satisfactory
+signature diversity in the default language rules.  See `Attacks on pointer
+authentication`_ for more information about signature diversity.  Programmers
+should strongly consider using the ``__ptrauth`` qualifier to improve the
+protections for important function pointers, such as the components of
+a hand-rolled "v-table"; see the section on the `__ptrauth qualifier`_ for
+details.
+
+The value of a pointer to a C function includes a signature, even when the
+value is cast to a non-function-pointer type like ``void*`` or ``intptr_t``. On
+implementations that use high bits to store the signature, this means that
+relational comparisons and hashes will vary according to the exact signature
+value, which is likely to change between executions of a program.  In some
+implementations, it may also vary based on the exact function pointer type.
+
+Null pointers
+^^^^^^^^^^^^^
+
+In principle, an implementation could derive the signed null pointer value
+simply by applying the standard signing algorithm to the raw null pointer
+value. However, for likely signing algorithms, this would mean that the signed
+null pointer value would no longer be statically known, which would have many
+negative consequences.  For one, it would become substantially more expensive
+to emit null pointer values or to perform null-pointer checks.  For another,
+the pervasive (even if technically unportable) assumption that null pointers
+are bitwise zero would be invalidated, making it substantially more difficult
+to adopt pointer authentication, as well as weakening common optimizations for
+zero-initialized memory such as the use of ``.bss`` sections.  Therefore it is
+beneficial to treat null pointers specially by giving them their usual
+representation.  On AArch64, this requires additional code when working with
+possibly-null pointers, such as when copying a pointer field that has been
+signed with address diversity.
+
+While this representation of nulls is the safest option for the general case,
+there are some situations in which a null pointer may have important semantic
+or security impact. For that purpose Clang has the concept of a pointer
+authentication schema that signs and authenticates null values.
+
+Return addresses
+^^^^^^^^^^^^^^^^
+
+The current implementation in Clang implicitly signs the return addresses in
+function calls.  While the value of the return address is technically an
+implementation detail of a function, there are some important libraries and
+development tools which rely on manually walking the chain of stack frames.
+These tools must be updated to correctly account for pointer authentication,
+either by stripping signatures (if security is not important for the tool, e.g.
+if it is capturing a stack trace during a crash) or properly authenticating
+them.  More information about how these values are signed is available in the
+`Language ABI`_ section.
+
+C++ virtual functions
+^^^^^^^^^^^^^^^^^^^^^
+
+The current implementation in Clang signs virtual function pointers with
+a discriminator derived from the full signature of the overridden method,
+including the method name and parameter types.  It is possible to write C++
+code that relies on v-table layout remaining constant despite changes to
+a method signature; for example, a parameter might be a ``typedef`` that
+resolves to a different type based on a build setting.  Such code violates
+C++'s One Definition Rule (ODR), but that violation is not normally detected;
+however, pointer authentication will detect it.
 
-Language Extensions
+Language extensions
 ~~~~~~~~~~~~~~~~~~~
 
-Feature Testing
+Feature testing
 ^^^^^^^^^^^^^^^
 
 Whether the current target uses pointer authentication can be tested for with
 a number of different tests.
 
-- ``__has_feature(ptrauth_intrinsics)`` is true if ``<ptrauth.h>`` provides its
-  normal interface.  This may be true even on targets where pointer
-  authentication is not enabled by default.
+- The ``__PTRAUTH__`` macro is defined if ``<ptrauth.h>`` provides its normal
+  interface. This implies support for the pointer authentication intrinsics
+  and the ``__ptrauth`` qualifier.
 
-__ptrauth Qualifier
-^^^^^^^^^^^^^^^^^^^
+- ``__has_feature(ptrauth_returns)`` is true if the target uses pointer
+  authentication to protect return addresses.
+
+- ``__has_feature(ptrauth_calls)`` is true if the target uses pointer
+  authentication to protect indirect branches.  On arm64e this implies
+  ``__has_feature(ptrauth_returns)``, ``__has_feature(ptrauth_intrinsics)``,
+  and the ``__PTRAUTH__`` macro.
+
+- For backwards compatibility purposes ``__has_feature(ptrauth_intrinsics)``
+  and ``__has_feature(ptrauth_qualifier)`` are available on arm64e targets.
+  These features are synonymous with each other, and are equivalent to testing
+  for the ``__PTRAUTH__`` macro definition. Use of these features should be
+  restricted to cases where backwards compatibility is required, and should be
+  paired with ``defined(__PTRAUTH__)``.
+
+
+Clang provides several other tests only for historical purposes; for current
+purposes they are all equivalent to ``ptrauth_calls``.
+
+``__ptrauth`` qualifier
+^^^^^^^^^^^^^^^^^^^^^^^
 
 ``__ptrauth(key, address, discriminator)`` is an extended type
 qualifier which causes so-qualified objects to hold pointers or pointer sized
@@ -293,6 +437,11 @@ type, either to a function or to an object, or a pointer sized integer.  It
 currently cannot be an Objective-C pointer type, a C++ reference type, or a
 block pointer type; these restrictions may be lifted in the future.
 
+The current implementation in Clang is known to not provide adequate safety
+guarantees against the creation of `signing oracles`_ when assigning data
+pointers to ``__ptrauth``-qualified gl-values.  See the section on `safe
+derivation`_ for more information.
+
 The qualifier's operands are as follows:
 
 - ``key`` - an expression evaluating to a key value from ``<ptrauth.h>``; must
@@ -327,6 +476,57 @@ a discriminator determined as follows:
   is ``ptrauth_blend_discriminator(&x, discriminator)``; see
   `ptrauth_blend_discriminator`_.
 
+Non-triviality from address diversity
++++++++++++++++++++++++++++++++++++++
+
+Address diversity must impose additional restrictions in order to allow the
+implementation to correctly copy values.  In C++, a type qualified with address
+diversity is treated like a class type with non-trivial copy/move constructors
+and assignment operators, with the usual effect on containing classes and
+unions.  C does not have a standard concept of non-triviality, and so we must
+describe the basic rules here, with the intention of imitating the emergent
+rules of C++:
+
+- A type may be **non-trivial to copy**.
+
+- A type may also be **illegal to copy**. Types that are illegal to copy are
+  always non-trivial to copy.
+
+- A type may also be **address-sensitive**. This includes types that use self
+  referencing pointers, data protected by address diversified pointer
+  authentication, or other similar concepts.
+
+- A type qualified with a ``__ptrauth`` qualifier or implicit authentication
+  schema that requires address diversity is non-trivial to copy and
+  address-sensitive.
+
+- An array type is illegal to copy, non-trivial to copy, or address-sensitive
+  if its element type is illegal to copy, non-trivial to copy, or
+  address-sensitive, respectively.
+
+- A struct type is illegal to copy, non-trivial to copy, or address-sensitive
+  if it has a field whose type is illegal to copy, non-trivial to copy, or
+  address-sensitive, respectively.
+
+- A union type is both illegal and non-trivial to copy if it has a field whose
+  type is non-trivial or illegal to copy.
+
+- A union type is address-sensitive if it has a field whose type is
+  address-sensitive.
+
+- A program is ill-formed if it uses a type that is illegal to copy as
+  a function parameter, argument, or return type.
+
+- A program is ill-formed if an expression requires a type to be copied that is
+  illegal to copy.
+
+- Otherwise, copying a type that is non-trivial to copy correctly copies its
+  subobjects.
+
+- Types that are address-sensitive must always be passed and returned
+  indirectly. Thus, changing the address-sensitivity of a type may be
+  ABI-breaking even if its size and alignment do not change.
+
 ``<ptrauth.h>``
 ~~~~~~~~~~~~~~~
 
@@ -433,7 +633,7 @@ Produce a signed pointer for the given raw pointer without applying any
 authentication or extra treatment.  This operation is not required to have the
 same behavior on a null pointer that the language implementation would.
 
-This is a treacherous operation that can easily result in signing oracles.
+This is a treacherous operation that can easily result in `signing oracles`_.
 Programs should use it seldom and carefully.
 
 ``ptrauth_auth_and_resign``
@@ -454,7 +654,29 @@ a null pointer that the language implementation would.
 The code sequence produced for this operation must not be directly attackable.
 However, if the discriminator values are not constant integers, their
 computations may still be attackable.  In the future, Clang should be enhanced
-to guaranteed non-attackability if these expressions are safely-derived.
+to guarantee non-attackability if these expressions are
+:ref:`safely-derived<Safe derivation>`.
+
+``ptrauth_auth_function``
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+.. code-block:: c
+
+  ptrauth_auth_function(pointer, key, discriminator)
+
+Authenticate that ``pointer`` is signed with ``key`` and ``discriminator`` and
+re-sign it to the standard schema for a function pointer of its type.
+
+``pointer`` must have function pointer type.  The result will have the same
+type as ``pointer``.  This operation is not required to have the same behavior
+on a null pointer that the language implementation would.
+
+This operation makes the same attackability guarantees as
+``ptrauth_auth_and_resign``.
+
+If this operation appears syntactically as the function operand of a call,
+Clang guarantees that the call will directly authenticate the function value
+using the given schema rather than re-signing to the standard schema.
 
 ``ptrauth_auth_data``
 ^^^^^^^^^^^^^^^^^^^^^
@@ -500,12 +722,921 @@ type.  Implementations are not required to make all bits of the result equally
 significant; in particular, some implementations are known to not leave
 meaningful data in the low bits.
 
+Standard ``__ptrauth`` qualifiers
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+``<ptrauth.h>`` additionally provides several macros which expand to
+``__ptrauth`` qualifiers for common ABI situations.
+
+For convenience, these macros expand to nothing when pointer authentication is
+disabled.
+
+These macros can be found in the header; some details of these macros may be
+unstable or implementation-specific.
+
+
+Theory of operation
+-------------------
+
+The threat model of pointer authentication is as follows:
+
+- The attacker has the ability to read and write to a certain range of
+  addresses, possibly the entire address space.  However, they are constrained
+  by the normal rules of the process: for example, they cannot write to memory
+  that is mapped read-only, and if they access unmapped memory it will trigger
+  a trap.
+
+- The attacker has no ability to add arbitrary executable code to the program.
+  For example, the program does not include malicious code to begin with, and
+  the attacker cannot alter existing instructions, load a malicious shared
+  library, or remap writable pages as executable.  If the attacker wants to get
+  the process to perform a specific sequence of actions, they must somehow
+  subvert the normal control flow of the process.
+
+In both of the above paragraphs, it is merely assumed that the attacker's
+*current* capabilities are restricted; that is, their current exploit does not
+directly give them the power to do these things.  The attacker's immediate goal
+may well be to leverage their exploit to gain these capabilities, e.g. to load
+a malicious dynamic library into the process, even though the process does not
+directly contain code to do so.
+
+Note that any bug that fits the above threat model can be immediately exploited
+as a denial-of-service attack by simply performing an illegal access and
+crashing the program.  Pointer authentication cannot protect against this.
+While denial-of-service attacks are unfortunate, they are also unquestionably
+the best possible result of a bug this severe. Therefore, pointer authentication
+enthusiastically embraces the idea of halting the program on a pointer
+authentication failure rather than continuing in a possibly-compromised state.
+
+Pointer authentication is a form of control-flow integrity (CFI) enforcement.
+The basic security hypothesis behind CFI enforcement is that many bugs can only
+be usefully exploited (other than as a denial-of-service) by leveraging them to
+subvert the control flow of the program.  If this is true, then by inhibiting or
+limiting that subversion, it may be possible to largely mitigate the security
+consequences of those bugs by rendering them impractical (or, ideally,
+impossible) to exploit.
+
+Every indirect branch in a program has a purpose.  Using human intelligence, a
+programmer can describe where a particular branch *should* go according to this
+pu...
[truncated]
``````````
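
For readers of the patch, the feature tests and the ``__ptrauth(key, address, discriminator)`` qualifier described in the diff might combine roughly as in the sketch below. It is not taken from the patch. The names `OP_AUTH`, `op_table`, `op_table_init`, `apply_ops`, and `indirect_calls_are_authenticated`, and the discriminator `0x4f50`, are hypothetical and chosen only for illustration. The sketch assumes that `__PTRAUTH__` being defined is sufficient to use the qualifier, as the updated text states, and it falls back to an empty macro elsewhere so the code still builds on targets without pointer authentication.

```c
#include <stdbool.h>

/* Compilers that lack __has_feature see every feature as absent. */
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if defined(__PTRAUTH__)
#include <ptrauth.h>
/* Hypothetical schema: function-pointer key, address diversity enabled,
   and an arbitrary constant discriminator picked for this example. */
#define OP_AUTH __ptrauth(ptrauth_key_function_pointer, 1, 0x4f50)
#else
/* No pointer authentication: the qualifier expands to nothing. */
#define OP_AUTH
#endif

typedef int (*binary_op)(int, int);

/* A hand-rolled "v-table" whose slots use the stronger schema above. */
struct op_table {
  binary_op OP_AUTH add;
  binary_op OP_AUTH mul;
};

int add_ints(int a, int b) { return a + b; }
int mul_ints(int a, int b) { return a * b; }

/* Each store signs the pointer for its own slot when OP_AUTH is active. */
void op_table_init(struct op_table *t) {
  t->add = add_ints;
  t->mul = mul_ints;
}

/* Each load authenticates the slot before the indirect call. */
int apply_ops(const struct op_table *t, int a, int b) {
  return t->mul(t->add(a, b), b);
}

/* Reports whether the target protects indirect branches by default. */
bool indirect_calls_are_authenticated(void) {
#if __has_feature(ptrauth_calls)
  return true;
#else
  return false;
#endif
}
```

Because the slots use address diversity, `struct op_table` becomes non-trivial to copy under the rules the patch adds, so the sketch only ever passes it by pointer.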
</details>
https://github.com/llvm/llvm-project/pull/154240