[libcxx-commits] [libcxx] Optimize std::__tree::__assign_multi to insert the provided range at the end of the tree every time (PR #131030)
via libcxx-commits
libcxx-commits at lists.llvm.org
Thu Apr 10 09:42:58 PDT 2025
higher-performance wrote:
@philnik777 oh. That's because this is an algorithmic improvement, which you can't assess by relying on microbenchmarks. They miss the bigger picture and give you highly misleading results.
To elaborate: The number of comparisons is being cut down significantly, and comparisons can be *expensive*. Not merely because of the container size (side note: 2^24 is really not "extremely large" for a tree of `size_t` in 2025, it's just 16 million) but because elements are not always CPU-friendly `size_t`s, and the comparisons themselves can be arbitrarily expensive.
To illustrate, just try running this:
```
#include <stddef.h>
#include <chrono>
#include <iostream>
#include <set>
#include <string>
#include <utility>
using Clock = std::chrono::high_resolution_clock;
int main() {
  std::set<std::string> a, b;
  for (size_t i = 0; i < 1000000; ++i) {
    std::string s = std::to_string(i);
    s.insert(0, 3000, 'h');
    b.insert(std::move(s));
  }
  Clock::time_point start = Clock::now();
  a = b;
  std::chrono::duration<double> diff = Clock::now() - start;
  std::cout << "Time taken: " << diff.count() << std::endl;
}
```
On my machine it's 1.65 ms vs. 5.81 ms, which is a rather catastrophic > 3x performance hit with the old code.
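If you'd rather not trust wall-clock numbers at all, you can count comparator calls directly. What follows is only a rough sketch using the public `std::set` API, not the actual `__tree::__assign_multi` code from the PR: `CountingLess`, `make_sorted_keys`, `count_plain` and `count_hinted` are names I made up for illustration, and inserting with an `end()` hint is just an analogue of "insert at the end of the tree every time".

```
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical comparator (not part of libc++): bumps an external counter
// on every call, so cost can be measured in comparisons rather than time.
struct CountingLess {
  std::size_t* calls;
  bool operator()(const std::string& x, const std::string& y) const {
    ++*calls;
    return x < y;
  }
};

using CountedSet = std::set<std::string, CountingLess>;

// Builds n keys sharing a long common prefix, as in the benchmark above.
// Iterating the returned set yields them in sorted order.
std::set<std::string> make_sorted_keys(std::size_t n, std::size_t prefix_len) {
  std::set<std::string> keys;
  for (std::size_t i = 0; i < n; ++i) {
    std::string s = std::to_string(i);
    s.insert(0, prefix_len, 'h');
    keys.insert(std::move(s));
  }
  return keys;
}

// Inserts every key with a full search from the root; returns comparisons.
std::size_t count_plain(const std::set<std::string>& src) {
  std::size_t calls = 0;
  CountingLess cmp{&calls};
  CountedSet dst(cmp);
  for (const std::string& s : src) dst.insert(s);
  assert(dst.size() == src.size());
  return calls;
}

// Inserts every key with end() as the hint; since the input is already
// sorted, each new key really does belong at the end of the tree.
std::size_t count_hinted(const std::set<std::string>& src) {
  std::size_t calls = 0;
  CountingLess cmp{&calls};
  CountedSet dst(cmp);
  for (const std::string& s : src) dst.insert(dst.end(), s);
  assert(dst.size() == src.size());
  return calls;
}
```

Call `count_plain(keys)` and `count_hinted(keys)` on the same `make_sorted_keys(n, prefix_len)` result and compare the two totals. The exact numbers depend on the standard library, but the plain version should grow roughly like n log n while the hinted one stays close to linear, and every one of those comparisons has to walk the shared prefix first.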
https://github.com/llvm/llvm-project/pull/131030