[libcxx-commits] [libcxx] [libc++] Refactor the sequence container benchmarks (PR #119763)

Peng Liu via libcxx-commits libcxx-commits at lists.llvm.org
Fri Jan 17 06:21:13 PST 2025


================
@@ -0,0 +1,608 @@
+// -*- C++ -*-
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef TEST_BENCHMARKS_CONTAINERS_CONTAINER_BENCHMARKS_H
+#define TEST_BENCHMARKS_CONTAINERS_CONTAINER_BENCHMARKS_H
+
+#include <algorithm>
+#include <cstddef>
+#include <iterator>
+#include <ranges> // for std::from_range
+#include <string>
+#include <type_traits>
+#include <vector>
+
+#include "benchmark/benchmark.h"
+#include "test_iterators.h"
+#include "test_macros.h"
+#include "../GenerateInput.h"
+
+namespace ContainerBenchmarks {
+
+template <class Container>
+void DoNotOptimizeData(Container& c) {
+  if constexpr (requires { c.data(); }) {
+    benchmark::DoNotOptimize(c.data());
+  } else {
+    benchmark::DoNotOptimize(&c);
+  }
+}
+
+//
+// Sequence container operations
+//
+template <class Container>
+void BM_ctor_size(benchmark::State& st) {
+  auto size = st.range(0);
+
+  for (auto _ : st) {
+    Container c(size); // we assume the destructor doesn't dominate the benchmark
+    DoNotOptimizeData(c);
+  }
+}
+
+template <class Container, class Generator>
+void BM_ctor_size_value(benchmark::State& st, Generator gen) {
+  using ValueType = typename Container::value_type;
+  const auto size = st.range(0);
+  ValueType value = gen();
+  benchmark::DoNotOptimize(value);
+
+  for (auto _ : st) {
+    Container c(size, value); // we assume the destructor doesn't dominate the benchmark
+    DoNotOptimizeData(c);
+  }
+}
+
+template <class Container, class Generator>
+void BM_ctor_iter_iter(benchmark::State& st, Generator gen) {
+  using ValueType = typename Container::value_type;
+  const auto size = st.range(0);
+  std::vector<ValueType> in;
+  std::generate_n(std::back_inserter(in), size, gen);
+  const auto begin = in.begin();
+  const auto end   = in.end();
+  benchmark::DoNotOptimize(in);
+
+  for (auto _ : st) {
+    Container c(begin, end); // we assume the destructor doesn't dominate the benchmark
+    DoNotOptimizeData(c);
+  }
+}
+
+#if TEST_STD_VER >= 23
+template <class Container, class Generator>
+void BM_ctor_from_range(benchmark::State& st, Generator gen) {
+  using ValueType = typename Container::value_type;
+  const auto size = st.range(0);
+  std::vector<ValueType> in;
+  std::generate_n(std::back_inserter(in), size, gen);
+  benchmark::DoNotOptimize(in);
+
+  for (auto _ : st) {
+    Container c(std::from_range, in); // we assume the destructor doesn't dominate the benchmark
+    DoNotOptimizeData(c);
+  }
+}
+#endif
+
+template <class Container, class Generator>
+void BM_ctor_copy(benchmark::State& st, Generator gen) {
+  auto size = st.range(0);
+  Container in;
+  std::generate_n(std::back_inserter(in), size, gen);
+  DoNotOptimizeData(in);
+
+  for (auto _ : st) {
+    Container c(in); // we assume the destructor doesn't dominate the benchmark
+    DoNotOptimizeData(c);
+    DoNotOptimizeData(in);
+  }
+}
+
+template <class Container, class Generator>
+void BM_assignment(benchmark::State& st, Generator gen) {
+  auto size = st.range(0);
+  Container in1, in2;
+  std::generate_n(std::back_inserter(in1), size, gen);
+  std::generate_n(std::back_inserter(in2), size, gen);
+  DoNotOptimizeData(in1);
+  DoNotOptimizeData(in2);
+
+  // Assign from one of two containers in succession to avoid
+  // hitting a self-assignment corner-case
+  Container c(in1);
+  bool toggle = false;
+  for (auto _ : st) {
+    c      = toggle ? in1 : in2;
+    toggle = !toggle;
+    DoNotOptimizeData(c);
+    DoNotOptimizeData(in1);
+    DoNotOptimizeData(in2);
+  }
+}
+
+// Benchmark Container::assign(input-iter, input-iter) when the container already contains
+// the same number of elements that we're assigning. The intent is to check whether the
+// implementation basically creates a new container from scratch or manages to reuse the
+// pre-existing storage.
+template <typename Container, class Generator>
+void BM_assign_input_iter_full(benchmark::State& st, Generator gen) {
+  using ValueType = typename Container::value_type;
+  auto size       = st.range(0);
+  std::vector<ValueType> in1, in2;
+  std::generate_n(std::back_inserter(in1), size, gen);
+  std::generate_n(std::back_inserter(in2), size, gen);
+  DoNotOptimizeData(in1);
+  DoNotOptimizeData(in2);
+
+  Container c(in1.begin(), in1.end());
+  bool toggle = false;
+  for (auto _ : st) {
+    std::vector<ValueType>& in = toggle ? in1 : in2;
+    auto first                 = in.data();
+    auto last                  = in.data() + in.size();
+    c.assign(cpp17_input_iterator(first), cpp17_input_iterator(last));
+    toggle = !toggle;
+    DoNotOptimizeData(c);
+  }
+}
+
+template <class Container, class Generator>
+void BM_insert_begin(benchmark::State& st, Generator gen) {
+  using ValueType = typename Container::value_type;
+  const int size  = st.range(0);
+  std::vector<ValueType> in;
+  std::generate_n(std::back_inserter(in), size, gen);
+  DoNotOptimizeData(in);
+
+  Container c(in.begin(), in.end());
+  DoNotOptimizeData(c);
+
+  ValueType value = gen();
+  benchmark::DoNotOptimize(value);
+
+  for (auto _ : st) {
+    c.insert(c.begin(), value);
+    DoNotOptimizeData(c);
+
+    c.erase(std::prev(c.end())); // avoid growing indefinitely
+  }
+}
+
+template <class Container, class Generator>
+  requires std::random_access_iterator<typename Container::iterator>
+void BM_insert_middle(benchmark::State& st, Generator gen) {
+  using ValueType = typename Container::value_type;
+  const int size  = st.range(0);
+  std::vector<ValueType> in;
+  std::generate_n(std::back_inserter(in), size, gen);
+  DoNotOptimizeData(in);
+
+  Container c(in.begin(), in.end());
+  DoNotOptimizeData(c);
+
+  ValueType value = gen();
+  benchmark::DoNotOptimize(value);
+
+  for (auto _ : st) {
+    auto mid = c.begin() + (size / 2); // requires random-access iterators in order to make sense
+    c.insert(mid, value);
+    DoNotOptimizeData(c);
+
+    c.erase(c.end() - 1); // avoid growing indefinitely
+  }
+}
+
+// Insert at the start of a vector in a scenario where the vector already
+// has enough capacity to hold all the elements we are inserting.
+template <class Container, class Generator>
+void BM_insert_begin_input_iter_with_reserve_no_realloc(benchmark::State& st, Generator gen) {
+  using ValueType = typename Container::value_type;
+  const int size  = st.range(0);
+  std::vector<ValueType> in;
+  std::generate_n(std::back_inserter(in), size, gen);
+  DoNotOptimizeData(in);
+  auto first = in.data();
+  auto last  = in.data() + in.size();
+
+  const int small = 100; // arbitrary
+  Container c;
+  c.reserve(size + small); // ensure no reallocation
+  std::generate_n(std::back_inserter(c), small, gen);
+
+  for (auto _ : st) {
+    c.insert(c.begin(), cpp17_input_iterator(first), cpp17_input_iterator(last));
+    DoNotOptimizeData(c);
+
+    st.PauseTiming();
+    c.erase(c.begin() + small, c.end()); // avoid growing indefinitely
+    st.ResumeTiming();
+  }
+}
+
+// Insert at the start of a vector in a scenario where the vector already
+// has almost enough capacity to hold all the elements we are inserting,
+// but does need to reallocate.
+template <class Container, class Generator>
+void BM_insert_begin_input_iter_with_reserve_almost_no_realloc(benchmark::State& st, Generator gen) {
+  using ValueType = typename Container::value_type;
+  const int size  = st.range(0);
+  std::vector<ValueType> in;
+  std::generate_n(std::back_inserter(in), size, gen);
+  DoNotOptimizeData(in);
+  auto first = in.data();
+  auto last  = in.data() + in.size();
+
+  const int overflow = size / 10; // 10% of elements won't fit in the vector when we insert
+  Container c;
+  c.reserve(size);
+  std::generate_n(std::back_inserter(c), overflow, gen);
+
+  for (auto _ : st) {
+    c.insert(c.begin(), cpp17_input_iterator(first), cpp17_input_iterator(last));
+    DoNotOptimizeData(c);
+
+    st.PauseTiming();
+    c.erase(c.begin() + overflow, c.end()); // avoid growing indefinitely
+    st.ResumeTiming();
+  }
+}
+
+// Insert at the start of a vector in a scenario where the vector can fit a few
+// more elements, but needs to reallocate almost immediately to fit the remaining
+// elements.
+template <class Container, class Generator>
+void BM_insert_begin_input_iter_with_reserve_near_full(benchmark::State& st, Generator gen) {
+  using ValueType = typename Container::value_type;
+  const int size  = st.range(0);
+  std::vector<ValueType> in;
+  std::generate_n(std::back_inserter(in), size, gen);
+  DoNotOptimizeData(in);
+  auto first = in.data();
+  auto last  = in.data() + in.size();
+
+  const int overflow = 9 * (size / 10); // 90% of elements won't fit in the vector when we insert
+  Container c;
+  c.reserve(size);
+  std::generate_n(std::back_inserter(c), overflow, gen);
+
+  for (auto _ : st) {
+    c.insert(c.begin(), cpp17_input_iterator(first), cpp17_input_iterator(last));
+    DoNotOptimizeData(c);
+
+    st.PauseTiming();
+    c.erase(c.begin() + overflow, c.end()); // avoid growing indefinitely
+    st.ResumeTiming();
+  }
----------------
winner245 wrote:

I like the idea of supporting a generator everywhere. I also like the detailed comments you added for these three tests, which clearly state their purpose. The `reserve_no_realloc` test LGTM.

For the other two tests, `reserve_almost_no_realloc` and `reserve_near_full`, destroying the whole container on each iteration is effectively avoided, which is great. IMO, both tests should work well for `deque`, but may not behave as expected for `vector`.

For `deque`, these two tests operate as described in the comments. Each insertion is performed immediately after a deletion, and deletion in `deque` usually shrinks its capacity, so the insertion that follows reallocates every time, which serves the reallocation purpose of these tests. Meanwhile, the same `deque` is reused, which avoids destroying the container on each iteration. So both tests should work well for `deque`.

However, it seems that we still face a dilemma for `vector` here:

- On one hand, we want to avoid destroying the vector every time.
- On the other hand, we want to test the reallocation scenario each time.

It seems impossible to achieve both goals simultaneously for `vector` (unlike `deque`). Because `vector` uses contiguous memory, erasing elements does not shrink its capacity, so reallocation only occurs during the first insertion. Each subsequent insertion into the same vector happens after erasing as many elements as were just inserted, so the capacity already suffices and no further reallocation takes place (unlike `deque`). This means that, for `vector`, the two tests exercise reallocation only on the first iteration and reduce to the no-reallocation case afterwards.
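
To make the point above concrete, here is a minimal sketch that mimics the insert/erase loop of `reserve_almost_no_realloc` outside of any benchmark harness and counts how often `capacity()` changes. The helper name `count_capacity_changes` is hypothetical, and the sketch uses plain vector iterators rather than `cpp17_input_iterator`, which does not affect the capacity behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the patch): repeats the benchmark's
// insert-then-erase pattern and counts iterations in which capacity() changed.
std::size_t count_capacity_changes(std::size_t size, int iterations) {
  std::vector<int> in(size, 42);
  const std::size_t overflow = size / 10; // same 10% setup as the benchmark

  std::vector<int> c;
  c.reserve(size);
  c.insert(c.end(), overflow, 7); // fits in the reserved storage

  std::size_t changes = 0;
  for (int i = 0; i < iterations; ++i) {
    const std::size_t before = c.capacity();
    c.insert(c.begin(), in.begin(), in.end());
    if (c.capacity() != before)
      ++changes; // a reallocation happened during this insert

    // Same cleanup as the benchmark: erase() leaves capacity() untouched.
    c.erase(c.begin() + static_cast<std::ptrdiff_t>(overflow), c.end());
    assert(c.capacity() >= size + overflow || changes == 0);
  }
  return changes;
}
```

Calling `count_capacity_changes(1000, 50)` can report at most one capacity change, no matter how many iterations run, because every insertion after the first finds the enlarged buffer still in place.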

Conversely, if we want reallocation to happen every time, we need a new `vector` on each iteration, with less spare capacity than the number of elements to be inserted. However, this makes destroying the old `vector` inevitable.
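
The alternative described above could look roughly like the following sketch, which builds a fresh, nearly full `vector` for every insertion. The helper name `count_reallocating_inserts` is hypothetical, and the sketch assumes that `reserve(n)` does not allocate much more than `n` elements, which the standard permits but common implementations do not do. In a real benchmark the setup and destruction would presumably sit in a paused-timing region.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the patch): every iteration starts from a
// new vector, so every insertion has to grow the storage.
std::size_t count_reallocating_inserts(std::size_t size, int iterations) {
  std::vector<int> in(size, 42);
  const std::size_t overflow = size / 10;

  std::size_t reallocations = 0;
  for (int i = 0; i < iterations; ++i) {
    std::vector<int> c; // fresh container: this is the unavoidable cost
    c.reserve(size);
    c.insert(c.end(), overflow, 7);

    const std::size_t before = c.capacity();
    c.insert(c.begin(), in.begin(), in.end());
    if (c.capacity() != before)
      ++reallocations;
  } // c is destroyed here on every iteration
  return reallocations;
}
```

Under the stated assumption about `reserve`, every iteration reallocates, at the price of constructing and destroying a `vector` each time, which is exactly the trade-off the two goals run into.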



https://github.com/llvm/llvm-project/pull/119763

