[libcxx-commits] [libcxx] [libcxx] adds a size-based representation for `vector`'s unstable ABI (PR #155330)

Louis Dionne via libcxx-commits libcxx-commits at lists.llvm.org
Fri Feb 6 07:08:13 PST 2026


================
@@ -0,0 +1,443 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___VECTOR_LAYOUT_H
+#define _LIBCPP___VECTOR_LAYOUT_H
+
+#include <__assert>
+#include <__config>
+#include <__memory/allocator_traits.h>
+#include <__memory/compressed_pair.h>
+#include <__memory/swap_allocator.h>
+#include <__split_buffer>
+#include <__type_traits/is_nothrow_constructible.h>
+#include <__utility/move.h>
+#include <__utility/swap.h>
+
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#  pragma GCC system_header
+#endif
+
+_LIBCPP_PUSH_MACROS
+#include <__undef_macros>
+
+_LIBCPP_BEGIN_NAMESPACE_STD
+
+/// Defines `std::vector`'s storage layout and any operations that are affected by a change in the
+/// layout.
+///
+/// Dynamically-sized arrays like `std::vector` have several different representations. libc++
+/// supports two different layouts for `std::vector`:
+///
+///   * pointer-based layout
+///   * size-based layout
+///
+/// We describe these layouts below. All vector representations have a pointer that points to where
+/// the memory is allocated (called `__begin_`).
+///
+/// **Pointer-based layout**
+///
+/// The pointer-based layout uses two more pointers in addition to `__begin_`. The second pointer
+/// (called `__end_`) points past the end of the part of the buffer that holds valid elements. The
+/// third pointer (called `__capacity_`) points past the end of the allocated buffer. This is the
+/// default representation for libc++ due to historical reasons.
+///
+/// The second pointer has three primary use-cases:
+///   * to compute the size of the vector; and
+///   * to construct the past-the-end iterator; and
+///   * to indicate where the next element should be appended.
+///
+/// The third pointer is used to compute the capacity of the vector, which lets the vector know how
+/// many elements can be added to the vector before a reallocation is necessary.
+///
+///    __begin_ = 0xE4FD0, __end_ = 0xE4FF0, __capacity_ = 0xE5000
+///                 0xE4FD0                             0xE4FF0           0xE5000
+///                    v                                   v                 v
+///    +---------------+--------+--------+--------+--------+--------+--------+---------------------+
+///    | ????????????? |   3174 |   5656 |    648 |    489 | ------ | ------ | ??????????????????? |
+///    +---------------+--------+--------+--------+--------+--------+--------+---------------------+
+///                    ^                                   ^                 ^
+///                __begin_                             __end_          __capacity_
+///
+///    Figure 1: A visual representation of a pointer-based `std::vector<short>`. This vector has
+///    four elements, with the capacity to store six.
+///
+/// This is the default layout for libc++.
+///
+/// **Size-based layout**
+///
+/// The size-based layout uses integers to track its size and capacity, and computes pointers to
+/// past-the-end of the valid range and the whole buffer only when it's necessary. This layout is
+/// opt-in, but yields a significant performance boost relative to the pointer-based layout (see
+/// below).
+///
+///    __begin_ = 0xE4FD0, __size_ = 4, __capacity_ = 6
+///                 0xE4FD0
+///                    v
+///    +---------------+--------+--------+--------+--------+--------+--------+---------------------+
+///    | ????????????? |   3174 |   5656 |    648 |    489 | ------ | ------ | ??????????????????? |
+///    +---------------+--------+--------+--------+--------+--------+--------+---------------------+
+///                    ^
+///                __begin_
+///
+///    Figure 2: A visual representation of this a pointer-based layout. Blank boxes are not a part
+///    of the vector's allocated buffer. Boxes with numbers are valid elements within the vector,
+///    and boxes with `xx` have been allocated, but aren't being used as elements right now.
----------------
ldionne wrote:

```suggestion
///    Figure 2: A visual representation of a size-based layout. Blank boxes are not a part
///    of the vector's allocated buffer. Boxes with numbers are valid elements within the vector,
///    and boxes with `---` have been allocated, but aren't being used as elements right now.
```

We should actually move this part to the caption of the first figure instead, since it explains how to read both diagrams:

> Blank boxes are not a part of the vector's allocated buffer. Boxes with numbers are valid elements within the vector, and boxes with `---` have been allocated, but aren't being used as elements right now.

That way, you read the first figure and read the caption, then when you get to the second figure we use the same conventions and there's no need to re-explain them.
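
For readers following the thread without the full patch, here is a minimal sketch of how the two layouts in the figures relate. The struct and function names (`PointerLayout`, `SizeLayout`, `layouts_agree`) are hypothetical and are not the types the PR adds; the sketch only assumes what the doc comment states, namely that one layout stores `__end_` and `__capacity_` as pointers and the other stores them as integers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; not libc++'s actual implementation.

// Pointer-based layout: three pointers, size and capacity are computed.
template <class T>
struct PointerLayout {
  T* begin_;     // start of the allocated buffer
  T* end_;       // one past the last valid element
  T* capacity_;  // one past the end of the allocated buffer

  std::size_t size() const { return static_cast<std::size_t>(end_ - begin_); }
  std::size_t capacity() const { return static_cast<std::size_t>(capacity_ - begin_); }
  T* end() const { return end_; }
};

// Size-based layout: one pointer and two integers, end is computed.
template <class T>
struct SizeLayout {
  T* begin_;              // start of the allocated buffer
  std::size_t size_;      // number of valid elements
  std::size_t capacity_;  // number of elements the buffer can hold

  std::size_t size() const { return size_; }
  std::size_t capacity() const { return capacity_; }
  T* end() const { return begin_ + size_; }
};

// Describes the buffer from the figures both ways and checks that the two
// descriptions report the same size, capacity, and past-the-end pointer.
inline bool layouts_agree() {
  short buffer[6] = {3174, 5656, 648, 489, 0, 0};
  PointerLayout<short> p{buffer, buffer + 4, buffer + 6};
  SizeLayout<short> s{buffer, 4, 6};
  return p.size() == s.size() && p.capacity() == s.capacity() && p.end() == s.end();
}
```

With the values from the figures, both structs describe four valid elements in a buffer that can hold six; the only difference is which quantities are stored and which are derived.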

https://github.com/llvm/llvm-project/pull/155330