[libcxx-commits] [libcxx] [libc++][hardening] Finish documenting hardening. (PR #92021)
Louis Dionne via libcxx-commits
libcxx-commits at lists.llvm.org
Tue Jun 4 11:50:03 PDT 2024
================
@@ -72,17 +75,340 @@ to control the level by passing **one** of the following options to the compiler
Notes for vendors
-----------------
-Vendors can set the default hardening mode by providing ``LIBCXX_HARDENING_MODE``
-as a configuration option, with the possible values of ``none``, ``fast``,
-``extensive`` and ``debug``. The default value is ``none`` which doesn't enable
-any hardening checks (this mode is sometimes called the ``unchecked`` mode).
+Vendors can set the default hardening mode by providing
+``LIBCXX_HARDENING_MODE`` as a configuration option, with the possible values of
+``none``, ``fast``, ``extensive`` and ``debug``. The default value is ``none``
+which doesn't enable any hardening checks (this mode is sometimes called the
+``unchecked`` mode).
This option controls both the hardening mode that the precompiled library is
built with and the default hardening mode that users will build with. If set to
``none``, the precompiled library will not contain any assertions, and user code
will default to building without assertions.
-Iterator bounds checking
-------------------------
+Vendors can also override the termination handler by :ref:`providing a custom
+header <override-assertion-handler>`.
-TODO(hardening)
+Assertion categories
+====================
+
+Inside the library, individual assertions are grouped into different
+*categories*. Each hardening mode enables a different set of assertion
+categories; categories provide an additional layer of abstraction that makes it
+easier to reason about the high-level semantics of a hardening mode.
+
+- ``valid-element-access`` -- checks that any attempts to access a container
+ element, whether through the container object or through an iterator, are
+ valid and do not attempt to go out of bounds or otherwise access
+ a non-existent element. This also includes operations that set up an imminent
+ invalid access (e.g. incrementing an end iterator). For iterator checks to
+ work, bounded iterators must be enabled in the ABI. Types like
+ ``std::optional`` and ``std::function`` are considered containers (with at
+ most one element) for the purposes of this check.
+
+- ``valid-input-range`` -- checks that ranges (whether expressed as an iterator
+ pair, an iterator and a sentinel, an iterator and a count, or
+ a ``std::range``) given as input to library functions are valid:
+ - the sentinel is reachable from the begin iterator;
+ - TODO(hardening): both iterators refer to the same container.
+
+ ("input" here refers to "an input given to an algorithm", not to an iterator
+ category)
+
+ Violating assertions in this category leads to an out-of-bounds access.
+
+- ``non-null`` -- checks that the pointer being dereferenced is not null. On
+ most modern platforms, the zero address does not refer to an actual location
+ in memory, so a null pointer dereference would not compromise the memory
+ security of a program (however, it is still undefined behavior that can result
+ in strange errors due to compiler optimizations).
+
+- ``non-overlapping-ranges`` -- for functions that take several ranges as
+ arguments, checks that those ranges do not overlap.
+
+- ``valid-deallocation`` -- checks that an attempt to deallocate memory is valid
+ (e.g. the given object was allocated by the given allocator). Violating this
+ category typically results in a memory leak.
+
+- ``valid-external-api-call`` -- checks that a call to an external API doesn't
+ fail in an unexpected manner. This includes triggering documented cases of
+ undefined behavior in an external library (like attempting to unlock an
+ unlocked mutex in pthreads). Any API external to the library falls under this
+ category (from system calls to compiler intrinsics). We generally don't expect
+ these failures to compromise memory safety or otherwise create an immediate
+ security issue.
+
+- ``compatible-allocator`` -- checks any operations that exchange nodes between
+ containers to make sure the containers have compatible allocators.
+
+- ``argument-within-domain`` -- checks that the given argument is within the
+ domain of valid arguments for the function. Violating this typically produces
+ an incorrect result (e.g. ``std::clamp`` returns the original value without
+ clamping it due to incorrect functors) or puts an object into an invalid state
+ (e.g. a string view where only a subset of elements is accessible). This
+ category is for assertions violating which doesn't cause any immediate issues
+ in the library -- whatever the consequences are, they will happen in the user
+ code.
+
+- ``pedantic`` -- checks preconditions that are imposed by the Standard, but
+ violating which happens to be benign in our implementation.
+
+- ``semantic-requirement`` -- checks that the given argument satisfies the
+ semantic requirements imposed by the Standard. Typically, there is no simple
+ way to completely prove that a semantic requirement is satisfied; thus, this
+ would often be a heuristic check and it might be quite expensive.
+
+- ``internal`` -- checks that internal invariants of the library hold. These
+ assertions don't depend on user input.
+
+- ``uncategorized`` -- for assertions that haven't been properly classified yet.
+ This category is an escape hatch used for some existing assertions in the
+ library; all new code should have its assertions properly classified.
+
+Mapping between the hardening modes and the assertion categories
+================================================================
+
+.. list-table::
+ :header-rows: 1
+ :widths: auto
+
+ * - Category name
+ - ``fast``
+ - ``extensive``
+ - ``debug``
+ * - ``valid-element-access``
+ - ✅
+ - ✅
+ - ✅
+ * - ``valid-input-range``
+ - ✅
+ - ✅
+ - ✅
+ * - ``non-null``
+ - ❌
+ - ✅
+ - ✅
+ * - ``non-overlapping-ranges``
+ - ❌
+ - ✅
+ - ✅
+ * - ``valid-deallocation``
+ - ❌
+ - ✅
+ - ✅
+ * - ``valid-external-api-call``
+ - ❌
+ - ✅
+ - ✅
+ * - ``compatible-allocator``
+ - ❌
+ - ✅
+ - ✅
+ * - ``argument-within-domain``
+ - ❌
+ - ✅
+ - ✅
+ * - ``pedantic``
+ - ❌
+ - ✅
+ - ✅
+ * - ``semantic-requirement``
+ - ❌
+ - ❌
+ - ✅
+ * - ``internal``
+ - ❌
+ - ❌
+ - ✅
+ * - ``uncategorized``
+ - ❌
+ - ✅
+ - ✅
+
+.. note::
+
+ At the moment, each subsequent hardening mode is a strict superset of the
+ previous one (in other words, each subsequent mode only enables additional
+ assertion categories without disabling any), but this won't necessarily be
+ true for any hardening modes that might be added in the future.
+
+Hardening assertion failure
+===========================
+
+In production modes (``fast`` and ``extensive``), a hardening assertion failure
+immediately ``_traps <https://llvm.org/docs/LangRef.html#llvm-trap-intrinsic>``
+the program. This is the safest approach that also minimizes the code size
+penalty as the failure handler maps to a single instruction. The downside is
+that the failure provides no additional details other than the stack trace
+(which might also be affected by optimizations).
+
+TODO(hardening): describe ``__builtin_verbose_trap`` once we can use it.
+
+In the ``debug`` mode, an assertion failure terminates the program in an
+unspecified manner and also outputs the associated error message to the error
+output. This is less secure and increases the size of the binary (among other
+things, it has to store the error message strings) but makes the failure easier
+to debug. It also allows us to test the error messages in our test suite.
+
+.. _override-assertion-handler:
+
+Overriding the assertion failure handler
+----------------------------------------
+
+Vendors can override the default termination handler mechanism by following
+these steps:
+
+- create a header file that provides a definition of a macro called
+ ``_LIBCPP_ASSERTION_HANDLER``. The macro will be invoked when a hardening
+ assertion fails, with a single parameter containing a null-terminated string
+ with the error message.
+- when configuring the library, provide the path to custom header (relative to
+ the root of the repository) via the CMake variable
+ ``LIBCXX_ASSERTION_HANDLER_FILE``.
+
+There is no existing mechanism for users to override the termination handler.
+
+ABI
+===
+
+Setting a hardening mode does **not** affect the ABI. Each mode uses the subset
+of checks available in the current ABI configuration which is determined by the
+platform.
+
+It is important to stress that whether a particular check is enabled depends on
+the combination of the selected hardening mode and the hardening-related ABI
+options. Some checks require changing the ABI from the "default" to store
+additional information in the library classes -- e.g. checking whether an
+iterator is valid upon dereference generally requires storing data about bounds
+inside the iterator object. Using ``std::span`` as an example, setting the
+hardening mode to ``fast`` will always enable the ``valid-element-access``
+checks when accessing elements via a ``std::span`` object, but whether
+dereferencing a ``std::span`` iterator does the equivalent check depends on the
+ABI configuration.
+
+ABI options
+-----------
+
+Vendors can use the following ABI options to enable additional hardening checks:
+
+- ``_LIBCPP_ABI_BOUNDED_ITERATORS`` -- changes the iterator type of select
+ containers (see below) to a bounded iterator that keeps track of whether it's
+ within the bounds of the original container and asserts valid bounds on every
+ dereference.
+
+ ABI impact: changes the iterator type of the relevant containers.
+
+ Supported containers:
+ - ``span``;
+ - ``string_view``.
+
+ABI tags
+--------
+
+We use ABI tags to allow translation units built with different hardening modes
+to interact with each other without causing ODR violations. Knowing how
+hardening modes are encoded into the ABI tags might be useful to examine
+a binary and determine whether it was built with hardening enabled.
+
+.. warning::
+ We don't commit to the encoding scheme used by the ABI tags being stable
+ between different releases of libc++. The tags themselves are never stable, by
+ design -- new releases increase the version number. The following describes
+ the state of the latest release and is for informational purposes only.
+
+The first character of an ABI tag encodes the hardening mode:
+
+- ``f`` -- [f]ast mode;
+- ``s`` -- extensive ("[s]afe") mode;
+- ``d`` -- [d]ebug mode;
+- ``n`` -- [n]one mode.
+
+Hardened containers status
+==========================
+
+.. list-table::
+ :header-rows: 1
+ :widths: auto
+
+ * - Name
+ - Member functions
+ - Iterators (ABI-dependent)
+ * - ``span``
+ - ✅
+ - ✅
+ * - ``string_view``
+ - ✅
+ - ✅
+ * - ``array``
+ - ✅
+ - ❌
+ * - ``vector``
+ - ✅
+ - ❌
+ * - ``string``
+ - ✅
+ - ❌
+ * - ``list``
+ - ✅
+ - ❌
+ * - ``forward_list``
+ - ❌
+ - ❌
+ * - ``deque``
+ - ✅
+ - ❌
+ * - ``mdspan``
+ - ✅
+ - ❌
+ * - ``optional``
+ - ✅
+ - N/A
+
+TODO(hardening): make this table exhaustive.
+
+Testing
----------------
ldionne wrote:
I think this should move to the `Contributing` documentation somewhere. This doesn't really belong in the user-facing documentation.
https://github.com/llvm/llvm-project/pull/92021
More information about the libcxx-commits
mailing list