[libcxx-commits] [libcxx] Optimize __assign_with_sentinel in std::vector (PR #113852)

Tue Nov 12 08:06:59 PST 2024

================
@@ -48,6 +48,76 @@ void BM_Assignment(benchmark::State& st, Container) {
   }
 }
 
+// Wrap any Iterator into an input iterator
+template <typename Iterator>
+class InputIterator {
+  using iter_traits = std::iterator_traits<Iterator>;
+
+public:
+  using iterator_category = std::input_iterator_tag;
+  using value_type        = typename iter_traits::value_type;
+  using difference_type   = typename iter_traits::difference_type;
+  using pointer           = typename iter_traits::pointer;
+  using reference         = typename iter_traits::reference;
+
+  InputIterator(Iterator it) : current_(it) {}
+
+  reference operator*() { return *current_; }
+  InputIterator& operator++() {
+    ++current_;
+    return *this;
+  }
+  InputIterator operator++(int) {
+    InputIterator tmp = *this;
+    ++(*this);
+    return tmp;
+  }
+
+  friend bool operator==(const InputIterator& lhs, const InputIterator& rhs) { return lhs.current_ == rhs.current_; }
+  friend bool operator!=(const InputIterator& lhs, const InputIterator& rhs) { return !(lhs == rhs); }
+
+private:
+  Iterator current_;
+};
+
+template <typename Iterator>
+InputIterator<Iterator> make_input_iterator(Iterator it) {
+  return InputIterator<Iterator>(it);
+}
+
+template <class Container,
+          class GenInputs,
+          typename std::enable_if<std::is_trivial<typename Container::value_type>::value>::type* = nullptr>
+void BM_AssignInputIterIter(benchmark::State& st, Container c, GenInputs gen) {
+  auto in = gen(st.range(1));
+  benchmark::DoNotOptimize(&in);
+  for (auto _ : st) {
+    st.PauseTiming();
+    c.resize(st.range(0));
+    benchmark::DoNotOptimize(&c);
+    st.ResumeTiming();
+    c.assign(make_input_iterator(in.begin()), make_input_iterator(in.end()));
+    benchmark::ClobberMemory();
----------------
winner245 wrote:

I am not sure if the question was "Why use `DoNotOptimize()` and `ClobberMemory()`?" If that is the case, here is my answer.

I used both `DoNotOptimize()` and `ClobberMemory()` to ensure that the compiler does not optimize away any operations or memory allocations, so that the measurements reflect actual performance. These functions only add an extra layer of assurance that the benchmark measures what it is supposed to. All the existing benchmarks in `ContainerBenchmarks.h` use these functions, so I followed the same approach. Additionally, I used the `optimization=speed` option when running the benchmarks. I believe the measurements I obtained are accurate.
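
To make the intent concrete, here is a rough sketch of what the two helpers do. The names `do_not_optimize`, `clobber_memory`, and `assign_once` are hypothetical stand-ins written for illustration, not the library's code, and the sketch assumes GCC or Clang inline assembly. To my understanding it is close to how Google Benchmark achieves the same effect on those compilers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for benchmark::DoNotOptimize (assumes GCC/Clang).
// The empty asm statement claims to read `value`, so the compiler has to
// actually compute it instead of discarding it as unused.
template <class T>
inline void do_not_optimize(T const& value) {
  asm volatile("" : : "r,m"(value) : "memory");
}

// Hypothetical stand-in for benchmark::ClobberMemory (assumes GCC/Clang).
// The "memory" clobber tells the compiler that memory may be read or written
// here, so earlier stores cannot be dropped or moved past this point.
inline void clobber_memory() {
  asm volatile("" : : : "memory");
}

// Shape of one timed iteration in the benchmark: let the container's address
// escape, perform the assignment, then force the writes to be observable.
inline std::size_t assign_once(std::vector<int>& c, const std::vector<int>& in) {
  do_not_optimize(&c);
  c.assign(in.begin(), in.end());
  clobber_memory();
  return c.size();
}
```

Neither helper emits any instructions of its own. They only constrain what the optimizer may assume, which is why keeping them costs nothing when the compiler would not have removed the work anyway.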

I have also run some extra tests with `DoNotOptimize()` and `ClobberMemory()` removed, in case you are curious about the results.

- Before
![before1](https://github.com/user-attachments/assets/0db577a4-3859-4422-86dc-85c665bca555)


- After
![after1](https://github.com/user-attachments/assets/44734ae6-a2f1-4755-9262-afcd0959c401)

These results are consistent with my tests that use `DoNotOptimize()` and `ClobberMemory()`. I believe we lose nothing by using these functions even if the compiler does not optimize anything away; they only make the benchmarks more reliable.

If I did not interpret your question correctly, please feel free to clarify.

https://github.com/llvm/llvm-project/pull/113852