[libcxx-commits] [libcxx] Optimize std::__tree::__assign_multi to insert the provided range at the end of the tree every time (PR #131030)

via libcxx-commits libcxx-commits at lists.llvm.org
Thu Apr 10 09:42:58 PDT 2025


higher-performance wrote:

@philnik777 oh. That's because this is an algorithmic improvement, which you can't assess by relying on microbenchmarks. They miss the bigger picture and give you highly misleading results.

To elaborate: The number of comparisons is being cut down significantly, and comparisons can be *expensive*. Not merely because of the container size (side note: 2^24 is really not "extremely large" for a tree of `size_t` in 2025, it's just 16 million) but because elements are not always CPU-friendly `size_t`s, and the comparisons themselves can be arbitrarily expensive.

To illustrate, just try running this:
```cpp
#include <stddef.h>

#include <chrono>
#include <iostream>
#include <set>
#include <string>
#include <utility>

using Clock = std::chrono::high_resolution_clock;

int main() {
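  std::set<std::string> a, b;
  // Fill b with 1,000,000 strings that share a 3000-character prefix,
  // so every comparison has to scan past that prefix before it can decide.
  for (size_t i = 0; i < 1000000; ++i) {
    std::string s = std::to_string(i);
    s.insert(0, 3000, 'h');
    b.insert(std::move(s));
  }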
  // Time only the copy assignment, which is the operation this PR changes.
  Clock::time_point start = Clock::now();
  a = b;
  std::chrono::duration<double> diff = Clock::now() - start;
  std::cout << "Time taken: " << diff.count() << std::endl;
}
```

On my machine it's 1.65 ms vs. 5.81 ms, which is a rather catastrophic > 3x performance hit with the old code.

https://github.com/llvm/llvm-project/pull/131030

