[libcxx-commits] [libcxx] [libc++] Take advantage of trivial relocation in std::vector::erase (PR #116268)

Louis Dionne via libcxx-commits libcxx-commits at lists.llvm.org
Thu Nov 14 10:29:07 PST 2024


https://github.com/ldionne created https://github.com/llvm/llvm-project/pull/116268

In vector::erase(iter) and vector::erase(iter, iter), we can take
advantage of a type being trivially relocatable to open up a gap
in the vector and then relocate the tail of the vector into that gap.

The benefit is that relocating an object is often more efficient than
move-assigning and then destroying the original object. For types that
can be relocated trivially but that are complicated enough for the
compiler not to optimize by itself (like std::string), this provides
around a 2x performance speedup in vector::erase (see below).

This optimization requires stopping the usage of Clang's __is_trivially_relocatable
builtin, which doesn't currently honour assignment operators like
is_trivially_copyable does and can lead us to perform incorrect
optimizations.

It is also worth noting that __uninitialized_allocator_relocate has to
be modified so that we can relocate into an overlapping range. This
has an unfortunate impact on its exception safety guarantees, which
needs to be investigated further.

Previous implementation
```
--------------------------------------------------------------------------------------
Benchmark                                            Time             CPU   Iterations
--------------------------------------------------------------------------------------
BM_erase_iter_in_middle/vector_int/1024           24.9 ns         24.9 ns     28042962
BM_erase_iter_in_middle/vector_int/4096            107 ns          107 ns      6590592
BM_erase_iter_in_middle/vector_int/10240           271 ns          265 ns      2733478
BM_erase_iter_in_middle/vector_string/1024         349 ns          349 ns      2005886
BM_erase_iter_in_middle/vector_string/4096        1410 ns         1406 ns       498355
BM_erase_iter_in_middle/vector_string/10240       3449 ns         3449 ns       201989
BM_erase_iter_at_start/vector_int/1024            47.1 ns         47.1 ns     14836261
BM_erase_iter_at_start/vector_int/4096             204 ns          204 ns      3430414
BM_erase_iter_at_start/vector_int/10240            504 ns          504 ns      1391373
BM_erase_iter_at_start/vector_string/1024          684 ns          684 ns      1025160
BM_erase_iter_at_start/vector_string/4096         2855 ns         2806 ns       254080
BM_erase_iter_at_start/vector_string/10240        7060 ns         7060 ns        94134
```

New implementation
```
--------------------------------------------------------------------------------------
Benchmark                                            Time             CPU   Iterations
--------------------------------------------------------------------------------------
BM_erase_iter_in_middle/vector_int/1024           26.0 ns         25.9 ns     27127367
BM_erase_iter_in_middle/vector_int/4096            105 ns          105 ns      6515204
BM_erase_iter_in_middle/vector_int/10240           259 ns          258 ns      2800795
BM_erase_iter_in_middle/vector_string/1024         148 ns          147 ns      4725706
BM_erase_iter_in_middle/vector_string/4096         608 ns          606 ns      1168205
BM_erase_iter_in_middle/vector_string/10240       1523 ns         1520 ns       459909
BM_erase_iter_at_start/vector_int/1024            47.1 ns         47.1 ns     14762513
BM_erase_iter_at_start/vector_int/4096             205 ns          205 ns      3403130
BM_erase_iter_at_start/vector_int/10240            507 ns          507 ns      1382716
BM_erase_iter_at_start/vector_string/1024          300 ns          300 ns      2327546
BM_erase_iter_at_start/vector_string/4096         1205 ns         1205 ns       580855
BM_erase_iter_at_start/vector_string/10240        4296 ns         4296 ns       162956
```

>From 7f8c872ee48111fe2d7e174c5e30fc2fd48cc6b9 Mon Sep 17 00:00:00 2001
From: Louis Dionne <ldionne.2 at gmail.com>
Date: Thu, 14 Nov 2024 17:26:24 +0100
Subject: [PATCH 1/2] [libc++] Improve the tests for vector::erase

In particular, test everything with both a normal and a min_allocator,
add tests for a few corner cases and add tests with types that are
trivially relocatable. Also add tests that count the number of assignments
performed by vector::erase, since that is mandated by the Standard.

This patch is a preparation for optimizing vector::erase.
---
 libcxx/test/benchmarks/ContainerBenchmarks.h  |  30 ++
 libcxx/test/benchmarks/deque.bench.cpp        |  11 +
 .../benchmarks/vector_operations.bench.cpp    |  10 +
 .../vector/vector.modifiers/common.h          |  86 ++++++
 .../vector.modifiers/erase_iter.pass.cpp      | 196 ++++++-------
 .../vector.modifiers/erase_iter_iter.pass.cpp | 269 ++++++++++--------
 6 files changed, 361 insertions(+), 241 deletions(-)
 create mode 100644 libcxx/test/std/containers/sequences/vector/vector.modifiers/common.h

diff --git a/libcxx/test/benchmarks/ContainerBenchmarks.h b/libcxx/test/benchmarks/ContainerBenchmarks.h
index 742c848328604c..38e11777f488b7 100644
--- a/libcxx/test/benchmarks/ContainerBenchmarks.h
+++ b/libcxx/test/benchmarks/ContainerBenchmarks.h
@@ -11,6 +11,8 @@
 #define BENCHMARK_CONTAINER_BENCHMARKS_H
 
 #include <cassert>
+#include <iterator>
+#include <utility>
 
 #include "Utilities.h"
 #include "benchmark/benchmark.h"
@@ -149,6 +151,34 @@ void BM_EmplaceDuplicate(benchmark::State& st, Container c, GenInputs gen) {
   }
 }
 
+template <class Container, class GenInputs>
+void BM_erase_iter_in_middle(benchmark::State& st, Container, GenInputs gen) {
+  auto in = gen(st.range(0));
+  Container c(in.begin(), in.end());
+  assert(c.size() > 2);
+  for (auto _ : st) {
+    auto mid    = std::next(c.begin(), c.size() / 2);
+    auto tmp    = *mid;
+    auto result = c.erase(mid); // erase an element in the middle
+    benchmark::DoNotOptimize(result);
+    c.push_back(std::move(tmp)); // and then push it back at the end to avoid needing a new container
+  }
+}
+
+template <class Container, class GenInputs>
+void BM_erase_iter_at_start(benchmark::State& st, Container, GenInputs gen) {
+  auto in = gen(st.range(0));
+  Container c(in.begin(), in.end());
+  assert(c.size() > 2);
+  for (auto _ : st) {
+    auto it     = c.begin();
+    auto tmp    = *it;
+    auto result = c.erase(it); // erase the first element
+    benchmark::DoNotOptimize(result);
+    c.push_back(std::move(tmp)); // and then push it back at the end to avoid needing a new container
+  }
+}
+
 template <class Container, class GenInputs>
 void BM_Find(benchmark::State& st, Container c, GenInputs gen) {
   auto in = gen(st.range(0));
diff --git a/libcxx/test/benchmarks/deque.bench.cpp b/libcxx/test/benchmarks/deque.bench.cpp
index b8f3b76dd27ee6..ab0ba96b12ffca 100644
--- a/libcxx/test/benchmarks/deque.bench.cpp
+++ b/libcxx/test/benchmarks/deque.bench.cpp
@@ -9,6 +9,7 @@
 // UNSUPPORTED: c++03, c++11, c++14, c++17, c++20
 
 #include <deque>
+#include <string>
 
 #include "benchmark/benchmark.h"
 
@@ -41,4 +42,14 @@ BENCHMARK_CAPTURE(BM_ConstructFromRange, deque_size_t, std::deque<size_t>{}, get
 BENCHMARK_CAPTURE(BM_ConstructFromRange, deque_string, std::deque<std::string>{}, getRandomStringInputs)
     ->Arg(TestNumInputs);
 
+BENCHMARK_CAPTURE(BM_erase_iter_in_middle, deque_int, std::deque<int>{}, getRandomIntegerInputs<int>)
+    ->Range(TestNumInputs, TestNumInputs * 10);
+BENCHMARK_CAPTURE(BM_erase_iter_in_middle, deque_string, std::deque<std::string>{}, getRandomStringInputs)
+    ->Range(TestNumInputs, TestNumInputs * 10);
+
+BENCHMARK_CAPTURE(BM_erase_iter_at_start, deque_int, std::deque<int>{}, getRandomIntegerInputs<int>)
+    ->Range(TestNumInputs, TestNumInputs * 10);
+BENCHMARK_CAPTURE(BM_erase_iter_at_start, deque_string, std::deque<std::string>{}, getRandomStringInputs)
+    ->Range(TestNumInputs, TestNumInputs * 10);
+
 BENCHMARK_MAIN();
diff --git a/libcxx/test/benchmarks/vector_operations.bench.cpp b/libcxx/test/benchmarks/vector_operations.bench.cpp
index ce8ab233fc9817..1855861263324d 100644
--- a/libcxx/test/benchmarks/vector_operations.bench.cpp
+++ b/libcxx/test/benchmarks/vector_operations.bench.cpp
@@ -54,6 +54,16 @@ BENCHMARK_CAPTURE(BM_ConstructFromRange, vector_string, std::vector<std::string>
 
 BENCHMARK_CAPTURE(BM_Pushback_no_grow, vector_int, std::vector<int>{})->Arg(TestNumInputs);
 
+BENCHMARK_CAPTURE(BM_erase_iter_in_middle, vector_int, std::vector<int>{}, getRandomIntegerInputs<int>)
+    ->Range(TestNumInputs, TestNumInputs * 10);
+BENCHMARK_CAPTURE(BM_erase_iter_in_middle, vector_string, std::vector<std::string>{}, getRandomStringInputs)
+    ->Range(TestNumInputs, TestNumInputs * 10);
+
+BENCHMARK_CAPTURE(BM_erase_iter_at_start, vector_int, std::vector<int>{}, getRandomIntegerInputs<int>)
+    ->Range(TestNumInputs, TestNumInputs * 10);
+BENCHMARK_CAPTURE(BM_erase_iter_at_start, vector_string, std::vector<std::string>{}, getRandomStringInputs)
+    ->Range(TestNumInputs, TestNumInputs * 10);
+
 template <class T>
 void bm_grow(benchmark::State& state) {
   for (auto _ : state) {
diff --git a/libcxx/test/std/containers/sequences/vector/vector.modifiers/common.h b/libcxx/test/std/containers/sequences/vector/vector.modifiers/common.h
new file mode 100644
index 00000000000000..72cd47a50b2c0d
--- /dev/null
+++ b/libcxx/test/std/containers/sequences/vector/vector.modifiers/common.h
@@ -0,0 +1,86 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef TEST_STD_CONTAINERS_SEQUENCES_VECTOR_VECTOR_MODIFIERS_COMMON_H
+#define TEST_STD_CONTAINERS_SEQUENCES_VECTOR_VECTOR_MODIFIERS_COMMON_H
+
+#include "test_macros.h"
+
+#include <type_traits> // for __libcpp_is_trivially_relocatable
+
+#ifndef TEST_HAS_NO_EXCEPTIONS
+struct Throws {
+  Throws() : v_(0) {}
+  Throws(int v) : v_(v) {}
+  Throws(const Throws& rhs) : v_(rhs.v_) {
+    if (sThrows)
+      throw 1;
+  }
+  Throws(Throws&& rhs) : v_(rhs.v_) {
+    if (sThrows)
+      throw 1;
+  }
+  Throws& operator=(const Throws& rhs) {
+    v_ = rhs.v_;
+    return *this;
+  }
+  Throws& operator=(Throws&& rhs) {
+    v_ = rhs.v_;
+    return *this;
+  }
+  int v_;
+  static bool sThrows;
+};
+
+bool Throws::sThrows = false;
+#endif
+
+struct Tracker {
+  int copy_assignments = 0;
+  int move_assignments = 0;
+};
+
+struct TrackedAssignment {
+  Tracker* tracker_;
+  TEST_CONSTEXPR_CXX14 explicit TrackedAssignment(Tracker* tracker) : tracker_(tracker) {}
+
+  TrackedAssignment(TrackedAssignment const&) = default;
+  TrackedAssignment(TrackedAssignment&&)      = default;
+
+  TEST_CONSTEXPR_CXX14 TrackedAssignment& operator=(TrackedAssignment const&) {
+    tracker_->copy_assignments++;
+    return *this;
+  }
+  TEST_CONSTEXPR_CXX14 TrackedAssignment& operator=(TrackedAssignment&&) {
+    tracker_->move_assignments++;
+    return *this;
+  }
+};
+
+struct NonTriviallyRelocatable {
+  int value_;
+  TEST_CONSTEXPR NonTriviallyRelocatable() : value_(0) {}
+  TEST_CONSTEXPR explicit NonTriviallyRelocatable(int v) : value_(v) {}
+  TEST_CONSTEXPR NonTriviallyRelocatable(NonTriviallyRelocatable const& other) : value_(other.value_) {}
+  TEST_CONSTEXPR NonTriviallyRelocatable(NonTriviallyRelocatable&& other) : value_(other.value_) {}
+  TEST_CONSTEXPR_CXX14 NonTriviallyRelocatable& operator=(NonTriviallyRelocatable const& other) {
+    value_ = other.value_;
+    return *this;
+  }
+  TEST_CONSTEXPR_CXX14 NonTriviallyRelocatable& operator=(NonTriviallyRelocatable&& other) {
+    value_ = other.value_;
+    return *this;
+  }
+
+  TEST_CONSTEXPR_CXX14 friend bool operator==(NonTriviallyRelocatable const& a, NonTriviallyRelocatable const& b) {
+    return a.value_ == b.value_;
+  }
+};
+LIBCPP_STATIC_ASSERT(!std::__libcpp_is_trivially_relocatable<NonTriviallyRelocatable>::value, "");
+
+#endif // TEST_STD_CONTAINERS_SEQUENCES_VECTOR_VECTOR_MODIFIERS_COMMON_H
diff --git a/libcxx/test/std/containers/sequences/vector/vector.modifiers/erase_iter.pass.cpp b/libcxx/test/std/containers/sequences/vector/vector.modifiers/erase_iter.pass.cpp
index 549f29a8f7ba11..4a089fd7a4c4fc 100644
--- a/libcxx/test/std/containers/sequences/vector/vector.modifiers/erase_iter.pass.cpp
+++ b/libcxx/test/std/containers/sequences/vector/vector.modifiers/erase_iter.pass.cpp
@@ -11,135 +11,80 @@
 // iterator erase(const_iterator position);
 
 #include <vector>
-#include <iterator>
 #include <cassert>
+#include <memory>
 
 #include "asan_testing.h"
+#include "common.h"
 #include "min_allocator.h"
 #include "MoveOnly.h"
 #include "test_macros.h"
 
-#ifndef TEST_HAS_NO_EXCEPTIONS
-struct Throws {
-  Throws() : v_(0) {}
-  Throws(int v) : v_(v) {}
-  Throws(const Throws& rhs) : v_(rhs.v_) {
-    if (sThrows)
-      throw 1;
-  }
-  Throws(Throws&& rhs) : v_(rhs.v_) {
-    if (sThrows)
-      throw 1;
-  }
-  Throws& operator=(const Throws& rhs) {
-    v_ = rhs.v_;
-    return *this;
-  }
-  Throws& operator=(Throws&& rhs) {
-    v_ = rhs.v_;
-    return *this;
-  }
-  int v_;
-  static bool sThrows;
-};
-
-bool Throws::sThrows = false;
-#endif
-
-TEST_CONSTEXPR_CXX20 bool tests() {
-  {
-    int a1[] = {1, 2, 3, 4, 5};
-    std::vector<int> l1(a1, a1 + 5);
-    l1.erase(l1.begin());
-    assert(is_contiguous_container_asan_correct(l1));
-    assert(l1 == std::vector<int>(a1 + 1, a1 + 5));
-  }
+template <template <class> class Allocator, class T>
+TEST_CONSTEXPR_CXX20 void tests() {
   {
-    int a1[] = {1, 2, 3, 4, 5};
-    int e1[] = {1, 3, 4, 5};
-    std::vector<int> l1(a1, a1 + 5);
-    l1.erase(l1.begin() + 1);
-    assert(is_contiguous_container_asan_correct(l1));
-    assert(l1 == std::vector<int>(e1, e1 + 4));
-  }
-  {
-    int a1[] = {1, 2, 3};
-    std::vector<int> l1(a1, a1 + 3);
-    std::vector<int>::const_iterator i = l1.begin();
-    assert(is_contiguous_container_asan_correct(l1));
-    ++i;
-    std::vector<int>::iterator j = l1.erase(i);
-    assert(l1.size() == 2);
-    assert(std::distance(l1.begin(), l1.end()) == 2);
-    assert(*j == 3);
-    assert(*l1.begin() == 1);
-    assert(*std::next(l1.begin()) == 3);
-    assert(is_contiguous_container_asan_correct(l1));
-    j = l1.erase(j);
-    assert(j == l1.end());
-    assert(l1.size() == 1);
-    assert(std::distance(l1.begin(), l1.end()) == 1);
-    assert(*l1.begin() == 1);
-    assert(is_contiguous_container_asan_correct(l1));
-    j = l1.erase(l1.begin());
-    assert(j == l1.end());
-    assert(l1.size() == 0);
-    assert(std::distance(l1.begin(), l1.end()) == 0);
-    assert(is_contiguous_container_asan_correct(l1));
-  }
+    T arr[]             = {T(1), T(2), T(3), T(4), T(5)};
+    using Vector        = std::vector<T, Allocator<T> >;
+    using Iterator      = typename Vector::iterator;
+    using ConstIterator = typename Vector::const_iterator;
 
-  // Make sure vector::erase works with move-only types
-  // When non-trivial
-  {
-    std::vector<MoveOnly> v;
-    v.emplace_back(1);
-    v.emplace_back(2);
-    v.emplace_back(3);
-    v.erase(v.begin());
-    assert(v.size() == 2);
-    assert(v[0] == MoveOnly(2));
-    assert(v[1] == MoveOnly(3));
-  }
-  // When trivial
-  {
-    std::vector<TrivialMoveOnly> v;
-    v.emplace_back(1);
-    v.emplace_back(2);
-    v.emplace_back(3);
-    v.erase(v.begin());
-    assert(v.size() == 2);
-    assert(v[0] == TrivialMoveOnly(2));
-    assert(v[1] == TrivialMoveOnly(3));
+    {
+      Vector v(arr, arr + 5);
+      Iterator it = v.erase(v.cbegin());
+      assert(v == Vector(arr + 1, arr + 5));
+      assert(it == v.begin());
+      assert(is_contiguous_container_asan_correct(v));
+    }
+    {
+      T expected[] = {T(1), T(3), T(4), T(5)};
+      Vector v(arr, arr + 5);
+      Iterator it = v.erase(v.cbegin() + 1);
+      assert(v == Vector(expected, expected + 4));
+      assert(it == v.begin() + 1);
+      assert(is_contiguous_container_asan_correct(v));
+    }
+    {
+      T expected[] = {T(1), T(2), T(3), T(4)};
+      Vector v(arr, arr + 5);
+      Iterator it = v.erase(v.cbegin() + 4);
+      assert(v == Vector(expected, expected + 4));
+      assert(it == v.end());
+      assert(is_contiguous_container_asan_correct(v));
+    }
   }
 
-#if TEST_STD_VER >= 11
+  // Make sure vector::erase works with move-only types
   {
-    int a1[] = {1, 2, 3};
-    std::vector<int, min_allocator<int>> l1(a1, a1 + 3);
-    std::vector<int, min_allocator<int>>::const_iterator i = l1.begin();
-    assert(is_contiguous_container_asan_correct(l1));
-    ++i;
-    std::vector<int, min_allocator<int>>::iterator j = l1.erase(i);
-    assert(l1.size() == 2);
-    assert(std::distance(l1.begin(), l1.end()) == 2);
-    assert(*j == 3);
-    assert(*l1.begin() == 1);
-    assert(*std::next(l1.begin()) == 3);
-    assert(is_contiguous_container_asan_correct(l1));
-    j = l1.erase(j);
-    assert(j == l1.end());
-    assert(l1.size() == 1);
-    assert(std::distance(l1.begin(), l1.end()) == 1);
-    assert(*l1.begin() == 1);
-    assert(is_contiguous_container_asan_correct(l1));
-    j = l1.erase(l1.begin());
-    assert(j == l1.end());
-    assert(l1.size() == 0);
-    assert(std::distance(l1.begin(), l1.end()) == 0);
-    assert(is_contiguous_container_asan_correct(l1));
+    // When non-trivial
+    {
+      std::vector<MoveOnly, Allocator<MoveOnly> > v;
+      v.emplace_back(1);
+      v.emplace_back(2);
+      v.emplace_back(3);
+      v.erase(v.begin());
+      assert(v.size() == 2);
+      assert(v[0] == MoveOnly(2));
+      assert(v[1] == MoveOnly(3));
+    }
+    // When trivial
+    {
+      std::vector<TrivialMoveOnly, Allocator<TrivialMoveOnly> > v;
+      v.emplace_back(1);
+      v.emplace_back(2);
+      v.emplace_back(3);
+      v.erase(v.begin());
+      assert(v.size() == 2);
+      assert(v[0] == TrivialMoveOnly(2));
+      assert(v[1] == TrivialMoveOnly(3));
+    }
   }
-#endif
+}
 
+TEST_CONSTEXPR_CXX20 bool tests() {
+  tests<std::allocator, int>();
+  tests<std::allocator, NonTriviallyRelocatable>();
+  tests<min_allocator, int>();
+  tests<min_allocator, NonTriviallyRelocatable>();
   return true;
 }
 
@@ -163,5 +108,26 @@ int main(int, char**) {
   }
 #endif
 
+  // Make sure we satisfy the complexity requirement in terms of the number of times the assignment
+  // operator is called.
+  {
+    Tracker tracker;
+    std::vector<TrackedAssignment> v;
+
+    // Set up the vector with 5 elements.
+    for (int i = 0; i != 5; ++i) {
+      v.emplace_back(&tracker);
+    }
+    assert(tracker.copy_assignments == 0);
+    assert(tracker.move_assignments == 0);
+
+    // Erase element [1] from it. Elements [2] [3] [4] should be shifted, so we should
+    // see 3 move assignments (and nothing else).
+    v.erase(v.begin() + 1);
+    assert(v.size() == 4);
+    assert(tracker.copy_assignments == 0);
+    assert(tracker.move_assignments == 3);
+  }
+
   return 0;
 }
diff --git a/libcxx/test/std/containers/sequences/vector/vector.modifiers/erase_iter_iter.pass.cpp b/libcxx/test/std/containers/sequences/vector/vector.modifiers/erase_iter_iter.pass.cpp
index 4091e71d814e38..f972034681ef38 100644
--- a/libcxx/test/std/containers/sequences/vector/vector.modifiers/erase_iter_iter.pass.cpp
+++ b/libcxx/test/std/containers/sequences/vector/vector.modifiers/erase_iter_iter.pass.cpp
@@ -11,86 +11,107 @@
 // iterator erase(const_iterator first, const_iterator last);
 
 #include <vector>
-#include <iterator>
 #include <cassert>
+#include <memory>
+#include <string>
 
 #include "asan_testing.h"
+#include "common.h"
 #include "min_allocator.h"
 #include "MoveOnly.h"
 #include "test_macros.h"
 
-#ifndef TEST_HAS_NO_EXCEPTIONS
-struct Throws {
-  Throws() : v_(0) {}
-  Throws(int v) : v_(v) {}
-  Throws(const Throws& rhs) : v_(rhs.v_) {
-    if (sThrows)
-      throw 1;
-  }
-  Throws(Throws&& rhs) : v_(rhs.v_) {
-    if (sThrows)
-      throw 1;
-  }
-  Throws& operator=(const Throws& rhs) {
-    v_ = rhs.v_;
-    return *this;
-  }
-  Throws& operator=(Throws&& rhs) {
-    v_ = rhs.v_;
-    return *this;
-  }
-  int v_;
-  static bool sThrows;
-};
+template <template <class> class Allocator, class T>
+TEST_CONSTEXPR_CXX20 void tests() {
+  {
+    T arr[]             = {T(1), T(2), T(3)};
+    using Vector        = std::vector<T, Allocator<T> >;
+    using Iterator      = typename Vector::iterator;
+    using ConstIterator = typename Vector::const_iterator;
 
-bool Throws::sThrows = false;
-#endif
+    // Erase an empty range [first, last): last should be returned
+    {
+      {
+        Vector v;
+        Iterator i = v.erase(v.end(), v.end());
+        assert(v.empty());
+        assert(i == v.end());
+        assert(is_contiguous_container_asan_correct(v));
+      }
+      {
+        Vector v(arr, arr + 3);
+        ConstIterator first = v.cbegin(), last = v.cbegin();
+        Iterator i = v.erase(first, last);
+        assert(v == Vector(arr, arr + 3));
+        assert(i == last);
+        assert(is_contiguous_container_asan_correct(v));
+      }
+      {
+        Vector v(arr, arr + 3);
+        ConstIterator first = v.cbegin() + 1, last = v.cbegin() + 1;
+        Iterator i = v.erase(first, last);
+        assert(v == Vector(arr, arr + 3));
+        assert(i == last);
+        assert(is_contiguous_container_asan_correct(v));
+      }
+      {
+        Vector v(arr, arr + 3);
+        ConstIterator first = v.cbegin(), last = v.cbegin();
+        Iterator i = v.erase(first, last);
+        assert(v == Vector(arr, arr + 3));
+        assert(i == last);
+        assert(is_contiguous_container_asan_correct(v));
+      }
+    }
 
-TEST_CONSTEXPR_CXX20 bool tests() {
-  int a1[] = {1, 2, 3};
-  {
-    std::vector<int> l1(a1, a1 + 3);
-    assert(is_contiguous_container_asan_correct(l1));
-    std::vector<int>::iterator i = l1.erase(l1.cbegin(), l1.cbegin());
-    assert(l1.size() == 3);
-    assert(std::distance(l1.cbegin(), l1.cend()) == 3);
-    assert(i == l1.begin());
-    assert(is_contiguous_container_asan_correct(l1));
-  }
-  {
-    std::vector<int> l1(a1, a1 + 3);
-    assert(is_contiguous_container_asan_correct(l1));
-    std::vector<int>::iterator i = l1.erase(l1.cbegin(), std::next(l1.cbegin()));
-    assert(l1.size() == 2);
-    assert(std::distance(l1.cbegin(), l1.cend()) == 2);
-    assert(i == l1.begin());
-    assert(l1 == std::vector<int>(a1 + 1, a1 + 3));
-    assert(is_contiguous_container_asan_correct(l1));
-  }
-  {
-    std::vector<int> l1(a1, a1 + 3);
-    assert(is_contiguous_container_asan_correct(l1));
-    std::vector<int>::iterator i = l1.erase(l1.cbegin(), std::next(l1.cbegin(), 2));
-    assert(l1.size() == 1);
-    assert(std::distance(l1.cbegin(), l1.cend()) == 1);
-    assert(i == l1.begin());
-    assert(l1 == std::vector<int>(a1 + 2, a1 + 3));
-    assert(is_contiguous_container_asan_correct(l1));
-  }
-  {
-    std::vector<int> l1(a1, a1 + 3);
-    assert(is_contiguous_container_asan_correct(l1));
-    std::vector<int>::iterator i = l1.erase(l1.cbegin(), std::next(l1.cbegin(), 3));
-    assert(l1.size() == 0);
-    assert(std::distance(l1.cbegin(), l1.cend()) == 0);
-    assert(i == l1.begin());
-    assert(is_contiguous_container_asan_correct(l1));
+    // Erase non-empty ranges
+    {
+      // Starting at begin()
+      {
+        {
+          Vector v(arr, arr + 3);
+          Iterator i = v.erase(v.cbegin(), v.cbegin() + 1);
+          assert(v == Vector(arr + 1, arr + 3));
+          assert(i == v.begin());
+          assert(is_contiguous_container_asan_correct(v));
+        }
+        {
+          Vector v(arr, arr + 3);
+          Iterator i = v.erase(v.cbegin(), v.cbegin() + 2);
+          assert(v == Vector(arr + 2, arr + 3));
+          assert(i == v.begin());
+          assert(is_contiguous_container_asan_correct(v));
+        }
+        {
+          Vector v(arr, arr + 3);
+          Iterator i = v.erase(v.cbegin(), v.end());
+          assert(v.size() == 0);
+          assert(i == v.begin());
+          assert(is_contiguous_container_asan_correct(v));
+        }
+      }
+      {
+        Vector v(arr, arr + 3);
+        Iterator i = v.erase(v.cbegin() + 1, v.cbegin() + 2);
+        assert(v.size() == 2);
+        assert(v[0] == arr[0]);
+        assert(v[1] == arr[2]);
+        assert(i == v.begin() + 1);
+        assert(is_contiguous_container_asan_correct(v));
+      }
+      {
+        Vector v(arr, arr + 3);
+        Iterator i = v.erase(v.cbegin() + 1, v.cend());
+        assert(v == Vector(arr, arr + 1));
+        assert(i == v.begin() + 1);
+        assert(is_contiguous_container_asan_correct(v));
+      }
+    }
   }
   {
-    std::vector<std::vector<int> > outer(2, std::vector<int>(1));
-    assert(is_contiguous_container_asan_correct(outer));
-    assert(is_contiguous_container_asan_correct(outer[0]));
-    assert(is_contiguous_container_asan_correct(outer[1]));
+    using InnerVector = std::vector<T, Allocator<T> >;
+    using Vector      = std::vector<InnerVector, Allocator<InnerVector> >;
+    Vector outer(2, InnerVector(1));
     outer.erase(outer.begin(), outer.begin());
     assert(outer.size() == 2);
     assert(outer[0].size() == 1);
@@ -99,11 +120,12 @@ TEST_CONSTEXPR_CXX20 bool tests() {
     assert(is_contiguous_container_asan_correct(outer[0]));
     assert(is_contiguous_container_asan_correct(outer[1]));
   }
+
   // Make sure vector::erase works with move-only types
   {
     // When non-trivial
     {
-      std::vector<MoveOnly> v;
+      std::vector<MoveOnly, Allocator<MoveOnly> > v;
       v.emplace_back(1);
       v.emplace_back(2);
       v.emplace_back(3);
@@ -113,7 +135,7 @@ TEST_CONSTEXPR_CXX20 bool tests() {
     }
     // When trivial
     {
-      std::vector<TrivialMoveOnly> v;
+      std::vector<TrivialMoveOnly, Allocator<TrivialMoveOnly> > v;
       v.emplace_back(1);
       v.emplace_back(2);
       v.emplace_back(3);
@@ -122,67 +144,19 @@ TEST_CONSTEXPR_CXX20 bool tests() {
       assert(v[0] == TrivialMoveOnly(3));
     }
   }
-#if TEST_STD_VER >= 11
-  {
-    std::vector<int, min_allocator<int>> l1(a1, a1 + 3);
-    assert(is_contiguous_container_asan_correct(l1));
-    std::vector<int, min_allocator<int>>::iterator i = l1.erase(l1.cbegin(), l1.cbegin());
-    assert(l1.size() == 3);
-    assert(std::distance(l1.cbegin(), l1.cend()) == 3);
-    assert(i == l1.begin());
-    assert(is_contiguous_container_asan_correct(l1));
-  }
-  {
-    std::vector<int, min_allocator<int>> l1(a1, a1 + 3);
-    assert(is_contiguous_container_asan_correct(l1));
-    std::vector<int, min_allocator<int>>::iterator i = l1.erase(l1.cbegin(), std::next(l1.cbegin()));
-    assert(l1.size() == 2);
-    assert(std::distance(l1.cbegin(), l1.cend()) == 2);
-    assert(i == l1.begin());
-    assert((l1 == std::vector<int, min_allocator<int>>(a1 + 1, a1 + 3)));
-    assert(is_contiguous_container_asan_correct(l1));
-  }
-  {
-    std::vector<int, min_allocator<int>> l1(a1, a1 + 3);
-    assert(is_contiguous_container_asan_correct(l1));
-    std::vector<int, min_allocator<int>>::iterator i = l1.erase(l1.cbegin(), std::next(l1.cbegin(), 2));
-    assert(l1.size() == 1);
-    assert(std::distance(l1.cbegin(), l1.cend()) == 1);
-    assert(i == l1.begin());
-    assert((l1 == std::vector<int, min_allocator<int>>(a1 + 2, a1 + 3)));
-    assert(is_contiguous_container_asan_correct(l1));
-  }
-  {
-    std::vector<int, min_allocator<int>> l1(a1, a1 + 3);
-    assert(is_contiguous_container_asan_correct(l1));
-    std::vector<int, min_allocator<int>>::iterator i = l1.erase(l1.cbegin(), std::next(l1.cbegin(), 3));
-    assert(l1.size() == 0);
-    assert(std::distance(l1.cbegin(), l1.cend()) == 0);
-    assert(i == l1.begin());
-    assert(is_contiguous_container_asan_correct(l1));
-  }
-  {
-    std::vector<std::vector<int, min_allocator<int>>, min_allocator<std::vector<int, min_allocator<int>>>> outer(
-        2, std::vector<int, min_allocator<int>>(1));
-    assert(is_contiguous_container_asan_correct(outer));
-    assert(is_contiguous_container_asan_correct(outer[0]));
-    assert(is_contiguous_container_asan_correct(outer[1]));
-    outer.erase(outer.begin(), outer.begin());
-    assert(outer.size() == 2);
-    assert(outer[0].size() == 1);
-    assert(outer[1].size() == 1);
-    assert(is_contiguous_container_asan_correct(outer));
-    assert(is_contiguous_container_asan_correct(outer[0]));
-    assert(is_contiguous_container_asan_correct(outer[1]));
-  }
-#endif
+}
 
+TEST_CONSTEXPR_CXX20 bool tests() {
+  tests<std::allocator, int>();
+  tests<std::allocator, NonTriviallyRelocatable>();
+  tests<min_allocator, int>();
+  tests<min_allocator, NonTriviallyRelocatable>();
   return true;
 }
 
 int main(int, char**) {
   tests();
-#if TEST_STD_VER > 17
+#if TEST_STD_VER >= 20
   static_assert(tests());
 #endif
 
@@ -200,5 +174,48 @@ int main(int, char**) {
   }
 #endif
 
+  // Real world example with std::string, mostly intended to test trivial relocation
+  {
+    std::vector<std::string> v;
+
+    // fill the vector with half short string and half long strings
+    std::string short_string = "short";
+    std::string long_string(256, 'x');
+    for (int i = 0; i != 10; ++i) {
+      v.push_back(i % 2 == 0 ? short_string : long_string);
+    }
+
+    std::vector<std::string> original = v;
+
+    auto it = v.erase(v.cbegin() + 2, v.cbegin() + 8);
+    assert(v.size() == 4);
+    assert(v[0] == original[0]);
+    assert(v[1] == original[1]);
+    assert(v[2] == original[8]);
+    assert(v[3] == original[9]);
+    assert(it == v.begin() + 2);
+  }
+
+  // Make sure we satisfy the complexity requirement in terms of the number of times the assignment
+  // operator is called.
+  {
+    Tracker tracker;
+    std::vector<TrackedAssignment> v;
+
+    // Set up the vector with 5 elements.
+    for (int i = 0; i != 5; ++i) {
+      v.emplace_back(&tracker);
+    }
+    assert(tracker.copy_assignments == 0);
+    assert(tracker.move_assignments == 0);
+
+    // Erase elements [1] and [2] from it. Elements [3] [4] should be shifted, so we should
+    // see 2 move assignments (and nothing else).
+    v.erase(v.begin() + 1, v.begin() + 3);
+    assert(v.size() == 3);
+    assert(tracker.copy_assignments == 0);
+    assert(tracker.move_assignments == 2);
+  }
+
   return 0;
 }

>From 062af36a6ced756c1f83b72fb26647ad5e8b7109 Mon Sep 17 00:00:00 2001
From: Louis Dionne <ldionne.2 at gmail.com>
Date: Thu, 14 Nov 2024 13:54:18 +0100
Subject: [PATCH 2/2] [libc++] Take advantage of trivial relocation in
 std::vector::erase

In vector::erase(iter) and vector::erase(iter, iter), we can take
advantage of a type being trivially relocatable to open up a gap
in the vector and then relocate the tail of the vector into that gap.

The benefit is that relocating an object is often more efficient than
move-assigning and then destroying the original object. For types that
can be relocated trivially but that are complicated enough for the
compiler not to optimize by itself (like std::string), this provides
around a 2x performance speedup in vector::erase (see below).

This optimization requires stopping the usage of Clang's __is_trivially_relocatable
builtin, which doesn't currently honour assignment operators like
is_trivially_copyable does and can lead us to perform incorrect
optimizations.

It is also worth noting that __uninitialized_allocator_relocate has to
be modified so that we can relocate into an overlapping range. This
has an unfortunate impact on its exception safety guarantees, which
needs to be investigated further.

   Previous implementation
   --------------------------------------------------------------------------------------
   Benchmark                                            Time             CPU   Iterations
   --------------------------------------------------------------------------------------
   BM_erase_iter_in_middle/vector_int/1024           24.9 ns         24.9 ns     28042962
   BM_erase_iter_in_middle/vector_int/4096            107 ns          107 ns      6590592
   BM_erase_iter_in_middle/vector_int/10240           271 ns          265 ns      2733478
   BM_erase_iter_in_middle/vector_string/1024         349 ns          349 ns      2005886
   BM_erase_iter_in_middle/vector_string/4096        1410 ns         1406 ns       498355
   BM_erase_iter_in_middle/vector_string/10240       3449 ns         3449 ns       201989
   BM_erase_iter_at_start/vector_int/1024            47.1 ns         47.1 ns     14836261
   BM_erase_iter_at_start/vector_int/4096             204 ns          204 ns      3430414
   BM_erase_iter_at_start/vector_int/10240            504 ns          504 ns      1391373
   BM_erase_iter_at_start/vector_string/1024          684 ns          684 ns      1025160
   BM_erase_iter_at_start/vector_string/4096         2855 ns         2806 ns       254080
   BM_erase_iter_at_start/vector_string/10240        7060 ns         7060 ns        94134

   New implementation
   --------------------------------------------------------------------------------------
   Benchmark                                            Time             CPU   Iterations
   --------------------------------------------------------------------------------------
   BM_erase_iter_in_middle/vector_int/1024           26.0 ns         25.9 ns     27127367
   BM_erase_iter_in_middle/vector_int/4096            105 ns          105 ns      6515204
   BM_erase_iter_in_middle/vector_int/10240           259 ns          258 ns      2800795
   BM_erase_iter_in_middle/vector_string/1024         148 ns          147 ns      4725706
   BM_erase_iter_in_middle/vector_string/4096         608 ns          606 ns      1168205
   BM_erase_iter_in_middle/vector_string/10240       1523 ns         1520 ns       459909
   BM_erase_iter_at_start/vector_int/1024            47.1 ns         47.1 ns     14762513
   BM_erase_iter_at_start/vector_int/4096             205 ns          205 ns      3403130
   BM_erase_iter_at_start/vector_int/10240            507 ns          507 ns      1382716
   BM_erase_iter_at_start/vector_string/1024          300 ns          300 ns      2327546
   BM_erase_iter_at_start/vector_string/4096         1205 ns         1205 ns       580855
   BM_erase_iter_at_start/vector_string/10240        4296 ns         4296 ns       162956
---
 libcxx/docs/ReleaseNotes/20.rst               |  3 +
 libcxx/include/CMakeLists.txt                 |  1 +
 .../is_trivially_allocator_relocatable.h      | 49 ++++++++++++++++
 .../__memory/uninitialized_algorithms.h       | 53 ++++++------------
 .../__type_traits/is_trivially_relocatable.h  | 10 ++--
 libcxx/include/__vector/vector.h              | 56 +++++++++++--------
 libcxx/include/module.modulemap               |  1 +
 7 files changed, 106 insertions(+), 67 deletions(-)
 create mode 100644 libcxx/include/__memory/is_trivially_allocator_relocatable.h

diff --git a/libcxx/docs/ReleaseNotes/20.rst b/libcxx/docs/ReleaseNotes/20.rst
index 9039c6f046445b..fdb862aa69da9f 100644
--- a/libcxx/docs/ReleaseNotes/20.rst
+++ b/libcxx/docs/ReleaseNotes/20.rst
@@ -52,6 +52,9 @@ Improvements and New Features
 - The ``lexicographical_compare`` and ``ranges::lexicographical_compare`` algorithms have been optimized for trivially
   equality comparable types, resulting in a performance improvement of up to 40x.
 
+- The ``std::vector::erase`` function has been optimized for types that can be relocated trivially (such as ``std::string``),
+  yielding speed ups witnessed to be around 2x for these types (but subject to the use case).
+
 - The ``_LIBCPP_ENABLE_CXX20_REMOVED_TEMPORARY_BUFFER`` macro has been added to make ``std::get_temporary_buffer`` and
   ``std::return_temporary_buffer`` available.
 
diff --git a/libcxx/include/CMakeLists.txt b/libcxx/include/CMakeLists.txt
index 1610d1ee848a5f..b076a3ed00781b 100644
--- a/libcxx/include/CMakeLists.txt
+++ b/libcxx/include/CMakeLists.txt
@@ -551,6 +551,7 @@ set(files
   __memory/construct_at.h
   __memory/destruct_n.h
   __memory/inout_ptr.h
+  __memory/is_trivially_allocator_relocatable.h
   __memory/noexcept_move_assign_container.h
   __memory/out_ptr.h
   __memory/pointer_traits.h
diff --git a/libcxx/include/__memory/is_trivially_allocator_relocatable.h b/libcxx/include/__memory/is_trivially_allocator_relocatable.h
new file mode 100644
index 00000000000000..c8f46661993bc5
--- /dev/null
+++ b/libcxx/include/__memory/is_trivially_allocator_relocatable.h
@@ -0,0 +1,49 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef _LIBCPP___MEMORY_IS_TRIVIALLY_ALLOCATOR_RELOCATABLE_H
+#define _LIBCPP___MEMORY_IS_TRIVIALLY_ALLOCATOR_RELOCATABLE_H
+
+#include <__config>
+#include <__memory/allocator_traits.h>
+#include <__type_traits/integral_constant.h>
+#include <__type_traits/is_trivially_relocatable.h>
+#include <__type_traits/negation.h>
+
+#if !defined(_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER)
+#  pragma GCC system_header
+#endif
+
+_LIBCPP_BEGIN_NAMESPACE_STD
+
+// A type is trivially allocator relocatable if the allocator's move construction and destruction
+// don't do anything beyond calling the type's move constructor and destructor, and if the type
+// itself is trivially relocatable.
+
+template <class _Alloc, class _Type>
+struct __allocator_has_trivial_move_construct : _Not<__has_construct<_Alloc, _Type*, _Type&&> > {};
+
+template <class _Type>
+struct __allocator_has_trivial_move_construct<allocator<_Type>, _Type> : true_type {};
+
+template <class _Alloc, class _Tp>
+struct __allocator_has_trivial_destroy : _Not<__has_destroy<_Alloc, _Tp*> > {};
+
+template <class _Tp, class _Up>
+struct __allocator_has_trivial_destroy<allocator<_Tp>, _Up> : true_type {};
+
+template <class _Alloc, class _Tp>
+struct __is_trivially_allocator_relocatable
+    : integral_constant<bool,
+                        __allocator_has_trivial_move_construct<_Alloc, _Tp>::value &&
+                            __allocator_has_trivial_destroy<_Alloc, _Tp>::value &&
+                            __libcpp_is_trivially_relocatable<_Tp>::value> {};
+
+_LIBCPP_END_NAMESPACE_STD
+
+#endif // _LIBCPP___MEMORY_IS_TRIVIALLY_ALLOCATOR_RELOCATABLE_H
diff --git a/libcxx/include/__memory/uninitialized_algorithms.h b/libcxx/include/__memory/uninitialized_algorithms.h
index 71c7ed94fec13e..5e7bb314aa03ce 100644
--- a/libcxx/include/__memory/uninitialized_algorithms.h
+++ b/libcxx/include/__memory/uninitialized_algorithms.h
@@ -20,6 +20,7 @@
 #include <__memory/addressof.h>
 #include <__memory/allocator_traits.h>
 #include <__memory/construct_at.h>
+#include <__memory/is_trivially_allocator_relocatable.h>
 #include <__memory/pointer_traits.h>
 #include <__type_traits/enable_if.h>
 #include <__type_traits/extent.h>
@@ -591,30 +592,19 @@ __uninitialized_allocator_copy(_Alloc& __alloc, _Iter1 __first1, _Sent1 __last1,
   return std::__rewrap_iter(__first2, __result);
 }
 
-template <class _Alloc, class _Type>
-struct __allocator_has_trivial_move_construct : _Not<__has_construct<_Alloc, _Type*, _Type&&> > {};
-
-template <class _Type>
-struct __allocator_has_trivial_move_construct<allocator<_Type>, _Type> : true_type {};
-
-template <class _Alloc, class _Tp>
-struct __allocator_has_trivial_destroy : _Not<__has_destroy<_Alloc, _Tp*> > {};
-
-template <class _Tp, class _Up>
-struct __allocator_has_trivial_destroy<allocator<_Tp>, _Up> : true_type {};
-
 // __uninitialized_allocator_relocate relocates the objects in [__first, __last) into __result.
+//
 // Relocation means that the objects in [__first, __last) are placed into __result as-if by move-construct and destroy,
 // except that the move constructor and destructor may never be called if they are known to be equivalent to a memcpy.
 //
-// Preconditions:  __result doesn't contain any objects and [__first, __last) contains objects
+// This algorithm works even if part of the resulting range overlaps with [__first, __last), as long as __result itself
+// is not in [__first, last).
+//
+// Preconditions:
+//  - __result doesn't contain any objects and [__first, __last) contains objects
+//  - __result is not in [__first, __last)  sd
 // Postconditions: __result contains the objects from [__first, __last) and
 //                 [__first, __last) doesn't contain any objects
-//
-// The strong exception guarantee is provided if any of the following are true:
-// - is_nothrow_move_constructible<_ValueType>
-// - is_copy_constructible<_ValueType>
-// - __libcpp_is_trivially_relocatable<_ValueType>
 template <class _Alloc, class _ContiguousIterator>
 _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 void __uninitialized_allocator_relocate(
     _Alloc& __alloc, _ContiguousIterator __first, _ContiguousIterator __last, _ContiguousIterator __result) {
@@ -622,29 +612,18 @@ _LIBCPP_HIDE_FROM_ABI _LIBCPP_CONSTEXPR_SINCE_CXX14 void __uninitialized_allocat
   using _ValueType = typename iterator_traits<_ContiguousIterator>::value_type;
   static_assert(__is_cpp17_move_insertable<_Alloc>::value,
                 "The specified type does not meet the requirements of Cpp17MoveInsertable");
-  if (__libcpp_is_constant_evaluated() || !__libcpp_is_trivially_relocatable<_ValueType>::value ||
-      !__allocator_has_trivial_move_construct<_Alloc, _ValueType>::value ||
-      !__allocator_has_trivial_destroy<_Alloc, _ValueType>::value) {
-    auto __destruct_first = __result;
-    auto __guard          = std::__make_exception_guard(
-        _AllocatorDestroyRangeReverse<_Alloc, _ContiguousIterator>(__alloc, __destruct_first, __result));
-    auto __iter = __first;
-    while (__iter != __last) {
-#if _LIBCPP_HAS_EXCEPTIONS
-      allocator_traits<_Alloc>::construct(__alloc, std::__to_address(__result), std::move_if_noexcept(*__iter));
-#else
-      allocator_traits<_Alloc>::construct(__alloc, std::__to_address(__result), std::move(*__iter));
-#endif
-      ++__iter;
+  if (__libcpp_is_constant_evaluated() || !__is_trivially_allocator_relocatable<_Alloc, _ValueType>::value) {
+    while (__first != __last) {
+      allocator_traits<_Alloc>::construct(__alloc, std::__to_address(__result), std::move(*__first));
+      allocator_traits<_Alloc>::destroy(__alloc, std::__to_address(__first));
+      ++__first;
       ++__result;
     }
-    __guard.__complete();
-    std::__allocator_destroy(__alloc, __first, __last);
   } else {
     // Casting to void* to suppress clang complaining that this is technically UB.
-    __builtin_memcpy(static_cast<void*>(std::__to_address(__result)),
-                     std::__to_address(__first),
-                     sizeof(_ValueType) * (__last - __first));
+    __builtin_memmove(static_cast<void*>(std::__to_address(__result)),
+                      std::__to_address(__first),
+                      sizeof(_ValueType) * (__last - __first));
   }
 }
 
diff --git a/libcxx/include/__type_traits/is_trivially_relocatable.h b/libcxx/include/__type_traits/is_trivially_relocatable.h
index c0871731cc0016..8130ba3b96ba5c 100644
--- a/libcxx/include/__type_traits/is_trivially_relocatable.h
+++ b/libcxx/include/__type_traits/is_trivially_relocatable.h
@@ -23,14 +23,12 @@ _LIBCPP_BEGIN_NAMESPACE_STD
 
 // A type is trivially relocatable if a move construct + destroy of the original object is equivalent to
 // `memcpy(dst, src, sizeof(T))`.
-
-#if __has_builtin(__is_trivially_relocatable)
-template <class _Tp, class = void>
-struct __libcpp_is_trivially_relocatable : integral_constant<bool, __is_trivially_relocatable(_Tp)> {};
-#else
+//
+// Note that we don't use Clang's __is_trivially_relocatable builtin because it doesn't honor the presence
+// of non-trivial special members like assignment operators, or even a copy constructor, making it possible
+// to incorrectly optimize operations that should call user-provided operations instead.
 template <class _Tp, class = void>
 struct __libcpp_is_trivially_relocatable : is_trivially_copyable<_Tp> {};
-#endif
 
 template <class _Tp>
 struct __libcpp_is_trivially_relocatable<_Tp,
diff --git a/libcxx/include/__vector/vector.h b/libcxx/include/__vector/vector.h
index 0e1b90e53064b8..e1f62ff8dc4495 100644
--- a/libcxx/include/__vector/vector.h
+++ b/libcxx/include/__vector/vector.h
@@ -34,6 +34,7 @@
 #include <__memory/allocator.h>
 #include <__memory/allocator_traits.h>
 #include <__memory/compressed_pair.h>
+#include <__memory/is_trivially_allocator_relocatable.h>
 #include <__memory/noexcept_move_assign_container.h>
 #include <__memory/pointer_traits.h>
 #include <__memory/swap_allocator.h>
@@ -515,8 +516,37 @@ class _LIBCPP_TEMPLATE_VIS vector {
   }
 #endif
 
-  _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI iterator erase(const_iterator __position);
-  _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI iterator erase(const_iterator __first, const_iterator __last);
+  _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI iterator erase(const_iterator __position) {
+    _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
+        __position != end(), "vector::erase(iterator) called with a non-dereferenceable iterator");
+    return erase(__position, __position + 1);
+  }
+  _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI iterator erase(const_iterator __cfirst, const_iterator __clast) {
+    _LIBCPP_ASSERT_VALID_INPUT_RANGE(__cfirst <= __clast, "vector::erase(first, last) called with invalid range");
+
+    iterator __first = begin() + std::distance(cbegin(), __cfirst);
+    iterator __last  = begin() + std::distance(cbegin(), __clast);
+    if (__first == __last)
+      return __last;
+
+    auto __n = std::distance(__first, __last);
+
+    // When the value_type is trivially relocatable, we know that move-assignment followed by a destruction
+    // is equivalent to a memcpy, and we can elide calls to the move-assignment operator (which are mandated
+    // by the Standard) under the as-if rule. So instead, we destroy the range being erased and we relocate the
+    // tail of the vector into the created gap.
+    if (__is_trivially_allocator_relocatable<_Allocator, value_type>::value) {
+      std::__allocator_destroy(this->__alloc_, __first, __last);
+      std::__uninitialized_allocator_relocate(this->__alloc_, __last, end(), __first);
+    } else {
+      auto __new_end = std::move(__last, end(), __first);
+      std::__allocator_destroy(this->__alloc_, __new_end, end());
+    }
+
+    this->__end_ -= __n;
+    __annotate_shrink(size() + __n);
+    return __first;
+  }
 
   _LIBCPP_CONSTEXPR_SINCE_CXX20 _LIBCPP_HIDE_FROM_ABI void clear() _NOEXCEPT {
     size_type __old_size = size();
@@ -1125,28 +1155,6 @@ _LIBCPP_CONSTEXPR_SINCE_CXX20 inline
 #endif
 }
 
-template <class _Tp, class _Allocator>
-_LIBCPP_CONSTEXPR_SINCE_CXX20 inline _LIBCPP_HIDE_FROM_ABI typename vector<_Tp, _Allocator>::iterator
-vector<_Tp, _Allocator>::erase(const_iterator __position) {
-  _LIBCPP_ASSERT_VALID_ELEMENT_ACCESS(
-      __position != end(), "vector::erase(iterator) called with a non-dereferenceable iterator");
-  difference_type __ps = __position - cbegin();
-  pointer __p          = this->__begin_ + __ps;
-  this->__destruct_at_end(std::move(__p + 1, this->__end_, __p));
-  return __make_iter(__p);
-}
-
-template <class _Tp, class _Allocator>
-_LIBCPP_CONSTEXPR_SINCE_CXX20 typename vector<_Tp, _Allocator>::iterator
-vector<_Tp, _Allocator>::erase(const_iterator __first, const_iterator __last) {
-  _LIBCPP_ASSERT_VALID_INPUT_RANGE(__first <= __last, "vector::erase(first, last) called with invalid range");
-  pointer __p = this->__begin_ + (__first - begin());
-  if (__first != __last) {
-    this->__destruct_at_end(std::move(__p + (__last - __first), this->__end_, __p));
-  }
-  return __make_iter(__p);
-}
-
 template <class _Tp, class _Allocator>
 _LIBCPP_CONSTEXPR_SINCE_CXX20 void
 vector<_Tp, _Allocator>::__move_range(pointer __from_s, pointer __from_e, pointer __to) {
diff --git a/libcxx/include/module.modulemap b/libcxx/include/module.modulemap
index cd08b2810e437b..6b63c36cb2cab6 100644
--- a/libcxx/include/module.modulemap
+++ b/libcxx/include/module.modulemap
@@ -1533,6 +1533,7 @@ module std [system] {
     module destruct_n                         { header "__memory/destruct_n.h" }
     module fwd                                { header "__fwd/memory.h" }
     module inout_ptr                          { header "__memory/inout_ptr.h" }
+    module is_trivially_allocator_relocatable { header "__memory/is_trivially_allocator_relocatable.h" }
     module noexcept_move_assign_container     { header "__memory/noexcept_move_assign_container.h" }
     module out_ptr                            { header "__memory/out_ptr.h" }
     module pointer_traits                     { header "__memory/pointer_traits.h" }



More information about the libcxx-commits mailing list