[clang] [AArch64][PAC] Support ptrauth builtins and -fptrauth-intrinsics. (PR #65996)

David Spickett via cfe-commits cfe-commits at lists.llvm.org
Tue Sep 12 03:00:04 PDT 2023


================
@@ -0,0 +1,548 @@
+Pointer Authentication
+======================
+
+.. contents::
+   :local:
+
+Introduction
+------------
+
+Pointer authentication is a technology which offers strong probabilistic protection against exploiting a broad class of memory bugs to take control of program execution.  When adopted consistently in a language ABI, it provides a form of relatively fine-grained control flow integrity (CFI) check that resists both return-oriented programming (ROP) and jump-oriented programming (JOP) attacks.
+
+While pointer authentication can be implemented purely in software, direct hardware support (e.g. as provided by ARMv8.3) can dramatically lower the execution speed and code size costs.  Similarly, while pointer authentication can be implemented on any architecture, taking advantage of the (typically) excess addressing range of a target with 64-bit pointers minimizes the impact on memory performance and can allow interoperation with existing code (by disabling pointer authentication dynamically).  This document will generally attempt to present the pointer authentication feature independent of any hardware implementation or ABI.  Considerations that are implementation-specific are clearly identified throughout.
+
+Note that there are several different terms in use:
+
+- **Pointer authentication** is a target-independent language technology.
+
+- **ARMv8.3** is an AArch64 architecture revision that provides hardware support for pointer authentication.  It is implemented on several shipping processors, including the Apple A12 and later.
+
+- **arm64e** is a specific (not yet fully stable) ABI for implementing pointer authentication on ARMv8.3 on certain Apple operating systems.
+
+This document serves four purposes:
+
+- It describes the basic ideas of pointer authentication.
+
+- It documents several language extensions that are useful on targets using pointer authentication.
+
+- It presents a theory of operation for the security mitigation, describing the basic requirements for correctness, various weaknesses in the mechanism, and ways in which programmers can strengthen its protections (including recommendations for language implementors).
+
+- It will eventually document the language ABIs currently used for C, C++, Objective-C, and Swift on arm64e, although these are not yet stable on any target.
+
+Basic Concepts
+--------------
+
+The simple address of an object or function is a **raw pointer**.  A raw pointer can be **signed** to produce a **signed pointer**.  A signed pointer can then be **authenticated** in order to verify that it was **validly signed** and extract the original raw pointer.  These terms reflect the most likely implementation technique: computing and storing a cryptographic signature along with the pointer.  The security of pointer authentication does not rely on attackers not being able to separately overwrite the signature.
+
+An **abstract signing key** is a name which refers to a secret key which can be used to sign and authenticate pointers.  The key value for a particular name is consistent throughout a process.
+
+A **discriminator** is an arbitrary value used to **diversify** signed pointers so that one validly-signed pointer cannot simply be copied over another.  A discriminator is simply opaque data of some implementation-defined size that is included in the signature as a salt.
+
+Nearly all aspects of pointer authentication use just these two primary operations:
+
+- ``sign(raw_pointer, key, discriminator)`` produces a signed pointer given a raw pointer, an abstract signing key, and a discriminator.
+
+- ``auth(signed_pointer, key, discriminator)`` produces a raw pointer given a signed pointer, an abstract signing key, and a discriminator.
+
+``auth(sign(raw_pointer, key, discriminator), key, discriminator)`` must succeed and produce ``raw_pointer``.  ``auth`` applied to a value that was ultimately produced in any other way is expected to immediately halt the program.  However, it is permitted for ``auth`` to fail to detect that a signed pointer was not produced in this way, in which case it may return anything; this is what makes pointer authentication a probabilistic mitigation rather than a perfect one.
+
+There are two secondary operations which are required only to implement certain intrinsics in ``<ptrauth.h>``:
+
+- ``strip(signed_pointer, key)`` produces a raw pointer given a signed pointer and a key it was presumptively signed with.  This is useful for certain kinds of tooling, such as crash backtraces; it should generally not be used in the basic language ABI except in very careful ways.
+
+- ``sign_generic(value)`` produces a cryptographic signature for arbitrary data, not necessarily a pointer.  This is useful for efficiently verifying that non-pointer data has not been tampered with.
+
+Whenever any of these operations is called for, the key value must be known statically.  This is because the layout of a signed pointer may vary according to the signing key.  (For example, in ARMv8.3, the layout of a signed pointer depends on whether TBI is enabled, which can be set independently for code and data pointers.)
----------------
DavidSpickett wrote:

Also I think you might be mixing up concerns here. TBI on or off changes the number of bits the signature can use in any given pointer. The choice of data or instruction pointer determines whether you use the D or I instruction variant (https://developer.arm.com/documentation/dui0801/g/A64-General-Instructions/PACDA--PACDZA).

You could equally use PACDB to sign your data pointers, it's up to your ABI. So yes, a combination of ABI and platform ABI will change pointer layout, but I'm not sure the Arm example really makes much sense as it stands if you know the details.

My point is basically that the signing key doesn't tell you if TBI is enabled. Your platform ABI would tell you that. I see why it's passed to these operations, but I think the TBI note somewhat muddles the explanation.

You could make it more obviously two different things:
```
There are other factors that can influence the layout, for example on ARMv8.3 whether Top Byte Ignore (TBI) is enabled or not. These are platform ABI choices....
```
And while TBI being possible to enable independently for data and code is nice, it's not needed to make the point that the pointer layout can be changed by it.

https://github.com/llvm/llvm-project/pull/65996