[libcxx-commits] [libcxx] [libc++][hardening] Finish documenting hardening. (PR #92021)
    Konstantin Varlamov via libcxx-commits 
    libcxx-commits at lists.llvm.org
       
    Wed Jun  5 18:28:49 PDT 2024
    
    
  
================
@@ -72,17 +75,340 @@ to control the level by passing **one** of the following options to the compiler
 Notes for vendors
 -----------------
 
-Vendors can set the default hardening mode by providing ``LIBCXX_HARDENING_MODE``
-as a configuration option, with the possible values of ``none``, ``fast``,
-``extensive`` and ``debug``. The default value is ``none`` which doesn't enable
-any hardening checks (this mode is sometimes called the ``unchecked`` mode).
+Vendors can set the default hardening mode by providing
+``LIBCXX_HARDENING_MODE`` as a configuration option, with the possible values of
+``none``, ``fast``, ``extensive`` and ``debug``. The default value is ``none``
+which doesn't enable any hardening checks (this mode is sometimes called the
+``unchecked`` mode).
 
 This option controls both the hardening mode that the precompiled library is
 built with and the default hardening mode that users will build with. If set to
 ``none``, the precompiled library will not contain any assertions, and user code
 will default to building without assertions.
 
-Iterator bounds checking
-------------------------
+Vendors can also override the termination handler by :ref:`providing a custom
+header <override-assertion-handler>`.
 
-TODO(hardening)
+Assertion categories
+====================
+
+Inside the library, individual assertions are grouped into different
+*categories*. Each hardening mode enables a different set of assertion
+categories; categories provide an additional layer of abstraction that makes it
+easier to reason about the high-level semantics of a hardening mode.
+
+- ``valid-element-access`` -- checks that any attempts to access a container
+  element, whether through the container object or through an iterator, are
+  valid and do not attempt to go out of bounds or otherwise access
+  a non-existent element. This also includes operations that set up an imminent
+  invalid access (e.g. incrementing an end iterator). For iterator checks to
+  work, bounded iterators must be enabled in the ABI. Types like
+  ``std::optional`` and ``std::function`` are considered containers (with at
+  most one element) for the purposes of this check.
+
+- ``valid-input-range`` -- checks that ranges (whether expressed as an iterator
+  pair, an iterator and a sentinel, an iterator and a count, or
+  a ``std::ranges`` range) given as input to library functions are valid:
+  - the sentinel is reachable from the begin iterator;
+  - TODO(hardening): both iterators refer to the same container.
+
+  ("input" here refers to "an input given to an algorithm", not to an iterator
+  category)
+
+  Violating assertions in this category leads to an out-of-bounds access.
+
+- ``non-null`` -- checks that the pointer being dereferenced is not null. On
+  most modern platforms, the zero address does not refer to an actual location
+  in memory, so a null pointer dereference would not compromise the memory
+  security of a program (however, it is still undefined behavior that can result
+  in strange errors due to compiler optimizations).
+
+- ``non-overlapping-ranges`` -- for functions that take several ranges as
+  arguments, checks that those ranges do not overlap.
+
+- ``valid-deallocation`` -- checks that an attempt to deallocate memory is valid
+  (e.g. the given object was allocated by the given allocator). Violating this
+  category typically results in a memory leak.
+
+- ``valid-external-api-call`` -- checks that a call to an external API doesn't
+  fail in an unexpected manner. This includes triggering documented cases of
+  undefined behavior in an external library (like attempting to unlock an
+  unlocked mutex in pthreads). Any API external to the library falls under this
+  category (from system calls to compiler intrinsics). We generally don't expect
+  these failures to compromise memory safety or otherwise create an immediate
+  security issue.
+
+- ``compatible-allocator`` -- checks any operations that exchange nodes between
+  containers to make sure the containers have compatible allocators.
+
+- ``argument-within-domain`` -- checks that the given argument is within the
+  domain of valid arguments for the function. Violating this typically produces
+  an incorrect result (e.g. ``std::clamp`` returns the original value without
+  clamping it due to incorrect functors) or puts an object into an invalid state
+  (e.g. a string view where only a subset of elements is accessible). This
+  category is for assertions whose violation doesn't cause any immediate issues
+  in the library -- whatever the consequences are, they will happen in the user
+  code.
+
+- ``pedantic`` -- checks preconditions that are imposed by the Standard, but
+  whose violation happens to be benign in our implementation.
+
+- ``semantic-requirement`` -- checks that the given argument satisfies the
+  semantic requirements imposed by the Standard. Typically, there is no simple
+  way to completely prove that a semantic requirement is satisfied; thus, this
+  would often be a heuristic check and it might be quite expensive.
+
+- ``internal`` -- checks that internal invariants of the library hold. These
+  assertions don't depend on user input.
+
+- ``uncategorized`` -- for assertions that haven't been properly classified yet.
+  This category is an escape hatch used for some existing assertions in the
+  library; all new code should have its assertions properly classified.
+
+Mapping between the hardening modes and the assertion categories
+================================================================
+
+.. list-table::
+    :header-rows: 1
+    :widths: auto
+
+    * - Category name
+      - ``fast``
+      - ``extensive``
+      - ``debug``
+    * - ``valid-element-access``
+      - ✅
+      - ✅
+      - ✅
+    * - ``valid-input-range``
+      - ✅
+      - ✅
+      - ✅
+    * - ``non-null``
+      - ❌
+      - ✅
+      - ✅
+    * - ``non-overlapping-ranges``
+      - ❌
+      - ✅
+      - ✅
+    * - ``valid-deallocation``
+      - ❌
+      - ✅
+      - ✅
+    * - ``valid-external-api-call``
+      - ❌
+      - ✅
+      - ✅
+    * - ``compatible-allocator``
+      - ❌
+      - ✅
+      - ✅
+    * - ``argument-within-domain``
+      - ❌
+      - ✅
+      - ✅
+    * - ``pedantic``
+      - ❌
+      - ✅
+      - ✅
+    * - ``semantic-requirement``
+      - ❌
+      - ❌
+      - ✅
+    * - ``internal``
+      - ❌
+      - ❌
+      - ✅
+    * - ``uncategorized``
+      - ❌
+      - ✅
+      - ✅
+
+.. note::
+
+  At the moment, each subsequent hardening mode is a strict superset of the
+  previous one (in other words, each subsequent mode only enables additional
+  assertion categories without disabling any), but this won't necessarily be
+  true for any hardening modes that might be added in the future.
+
+Hardening assertion failure
+===========================
+
+In production modes (``fast`` and ``extensive``), a hardening assertion failure
+immediately `traps <https://llvm.org/docs/LangRef.html#llvm-trap-intrinsic>`_
+the program. This is the safest approach that also minimizes the code size
+penalty as the failure handler maps to a single instruction. The downside is
+that the failure provides no additional details other than the stack trace
+(which might also be affected by optimizations).
+
+TODO(hardening): describe ``__builtin_verbose_trap`` once we can use it.
+
+In the ``debug`` mode, an assertion failure terminates the program in an
+unspecified manner and also outputs the associated error message to the error
+output. This is less secure and increases the size of the binary (among other
+things, it has to store the error message strings) but makes the failure easier
+to debug. It also allows us to test the error messages in our test suite.
+
+.. _override-assertion-handler:
+
+Overriding the assertion failure handler
+----------------------------------------
+
+Vendors can override the default termination handler mechanism by following
+these steps:
+
+- create a header file that provides a definition of a macro called
+  ``_LIBCPP_ASSERTION_HANDLER``. The macro will be invoked when a hardening
+  assertion fails, with a single parameter containing a null-terminated string
+  with the error message.
+- when configuring the library, provide the path to the custom header (relative
+  to the root of the repository) via the CMake variable
+  ``LIBCXX_ASSERTION_HANDLER_FILE``.
+
+There is no existing mechanism for users to override the termination handler.
+
+ABI
+===
+
+Setting a hardening mode does **not** affect the ABI. Each mode uses the subset
+of checks available in the current ABI configuration which is determined by the
+platform.
+
+It is important to stress that whether a particular check is enabled depends on
+the combination of the selected hardening mode and the hardening-related ABI
+options. Some checks require changing the ABI from the "default" to store
+additional information in the library classes -- e.g. checking whether an
+iterator is valid upon dereference generally requires storing data about bounds
+inside the iterator object. Using ``std::span`` as an example, setting the
+hardening mode to ``fast`` will always enable the ``valid-element-access``
+checks when accessing elements via a ``std::span`` object, but whether
+dereferencing a ``std::span`` iterator does the equivalent check depends on the
+ABI configuration.
+
+ABI options
+-----------
+
+Vendors can use the following ABI options to enable additional hardening checks:
+
+- ``_LIBCPP_ABI_BOUNDED_ITERATORS`` -- changes the iterator type of select
+  containers (see below) to a bounded iterator that keeps track of whether it's
+  within the bounds of the original container and asserts valid bounds on every
+  dereference.
+
+  ABI impact: changes the iterator type of the relevant containers.
+
+  Supported containers:
+  - ``span``;
+  - ``string_view``.
+
+ABI tags
+--------
+
+We use ABI tags to allow translation units built with different hardening modes
+to interact with each other without causing ODR violations. Knowing how
+hardening modes are encoded into the ABI tags might be useful to examine
+a binary and determine whether it was built with hardening enabled.
+
+.. warning::
+  We don't commit to the encoding scheme used by the ABI tags being stable
+  between different releases of libc++. The tags themselves are never stable, by
+  design -- new releases increase the version number. The following describes
+  the state of the latest release and is for informational purposes only.
+
+The first character of an ABI tag encodes the hardening mode:
+
+- ``f`` -- [f]ast mode;
+- ``s`` -- extensive ("[s]afe") mode;
+- ``d`` -- [d]ebug mode;
+- ``n`` -- [n]one mode.
+
+Hardened containers status
+==========================
+
+.. list-table::
+    :header-rows: 1
+    :widths: auto
+
+    * - Name
+      - Member functions
+      - Iterators (ABI-dependent)
+    * - ``span``
+      - ✅
+      - ✅
+    * - ``string_view``
+      - ✅
+      - ✅
+    * - ``array``
+      - ✅
+      - ❌
+    * - ``vector``
+      - ✅
+      - ❌
+    * - ``string``
+      - ✅
+      - ❌
+    * - ``list``
+      - ✅
+      - ❌
+    * - ``forward_list``
+      - ❌
+      - ❌
+    * - ``deque``
+      - ✅
+      - ❌
+    * - ``mdspan``
+      - ✅
+      - ❌
+    * - ``optional``
+      - ✅
+      - N/A
+
+TODO(hardening): make this table exhaustive.
----------------
var-const wrote:
Done.
Some of these are a little tricky:
- We do have some hardening in `valarray`, I can't immediately say if it covers everything or not. Went with a conservative "partial";
- I'm not sure how much hardening we need in tree-based containers since they don't provide direct access to memory. Once again, I conservatively state that they are not hardened (in the sense that there are no interesting hardening checks there)
- re. `any` and `variant` -- is it possible to use those incorrectly without getting a guaranteed exception?
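
On the last point, here is a minimal sketch of what the Standard specifies for mismatched access to `any` and `variant`. It assumes C++17 with exceptions enabled, and the helper names are hypothetical and exist only for illustration. The value-returning accessors throw, and the pointer-returning forms yield a null pointer, so the only unchecked step left is the caller dereferencing that pointer in user code. By contrast, `optional::operator*` on an empty object is undefined behavior, so a hardening check has something to catch there.

```cpp
#include <any>
#include <cassert>
#include <optional>
#include <variant>

// Hypothetical helper: value-returning std::any_cast throws on a type mismatch.
inline bool any_cast_throws_on_mismatch() {
  std::any a = 42;  // holds an int
  try {
    (void)std::any_cast<double>(a);
  } catch (const std::bad_any_cast&) {
    return true;
  }
  return false;
}

// Hypothetical helper: pointer-form std::any_cast returns nullptr on a mismatch.
inline bool any_cast_pointer_is_null_on_mismatch() {
  std::any a = 42;
  return std::any_cast<double>(&a) == nullptr;
}

// Hypothetical helper: std::get throws when the requested alternative is inactive.
inline bool variant_get_throws_on_mismatch() {
  std::variant<int, double> v = 42;  // active alternative is int
  try {
    (void)std::get<double>(v);
  } catch (const std::bad_variant_access&) {
    return true;
  }
  return false;
}

// Hypothetical helper: std::get_if returns nullptr for an inactive alternative.
inline bool variant_get_if_is_null_on_mismatch() {
  std::variant<int, double> v = 42;
  return std::get_if<double>(&v) == nullptr;
}

// Hypothetical helper: optional::value() is the checked accessor and throws when
// empty. operator* has no such guarantee, so it is deliberately not called here.
inline bool optional_value_throws_when_empty() {
  std::optional<int> o;
  try {
    (void)o.value();
  } catch (const std::bad_optional_access&) {
    return true;
  }
  return false;
}

// Convenience wrapper that asserts every expectation above.
inline void check_all() {
  assert(any_cast_throws_on_mismatch());
  assert(any_cast_pointer_is_null_on_mismatch());
  assert(variant_get_throws_on_mismatch());
  assert(variant_get_if_is_null_on_mismatch());
  assert(optional_value_throws_when_empty());
}
```

Calling `check_all()` from a test exercises only the guaranteed paths. It says nothing about what libc++ does when exceptions are disabled.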
https://github.com/llvm/llvm-project/pull/92021
    
    