[libcxx-commits] [libcxx] [libc++] Optimize make_heap() and sift_down() (PR #121480)
Yang Kun via libcxx-commits
libcxx-commits at lists.llvm.org
Fri Jan 3 02:01:59 PST 2025
omikrun wrote:
> Please provide the benchmark you ran. You can see examples in `libcxx/test/benchmarks`.
Oh, these results are the time taken to call `std::make_heap` on an array of 10 million random elements. I applied [these changes](https://github.com/llvm/llvm-project/pull/121480.diff) manually to the headers in my toolchain (MSYS2).
The benchmark source code looks like this:
```cpp
#include <algorithm>
#include <chrono>
#include <numeric>
#include <print>
#include <random>
#include <string_view>
#include <vector>

void make_heap_before(std::vector<int>::iterator first, std::vector<int>::iterator last);
void make_heap_after(std::vector<int>::iterator first, std::vector<int>::iterator last);

// Prints the elapsed time since construction when it goes out of scope.
class timer {
    std::string_view m_msg;
    std::chrono::high_resolution_clock::time_point m_start = std::chrono::high_resolution_clock::now();

public:
    explicit timer(std::string_view msg) :
        m_msg(msg) {}
    ~timer() {
        std::println("{}: {}", m_msg, std::chrono::high_resolution_clock::now() - m_start);
    }
};

int main() {
    // Both runs get the same shuffled input.
    std::vector<int> vec1(10000000), vec2;
    std::iota(vec1.begin(), vec1.end(), 0);
    std::shuffle(vec1.begin(), vec1.end(), std::random_device{});
    vec2 = vec1;
    {
        timer t{"Before"};
        make_heap_before(vec1.begin(), vec1.end());
    }
    {
        timer t{"After"};
        make_heap_after(vec2.begin(), vec2.end());
    }
    return 0;
}
```
```cpp
// make_heap_before.cc
#include <algorithm>
#include <vector>
void make_heap_before(std::vector<int>::iterator first, std::vector<int>::iterator last) {
    std::make_heap(first, last);
}
```
```cpp
// make_heap_after.cc
#include <algorithm>
#include <vector>
void make_heap_after(std::vector<int>::iterator first, std::vector<int>::iterator last) {
    std::make_heap(first, last);
}
```
`make_heap_before.cc` was compiled before I applied these changes, and `make_heap_after.cc` was compiled after I applied them.
I've also run the benchmark program in `libcxx/test/benchmarks/algorithms/make_heap.bench.cpp`; here are partial results taken from the output.
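Since the two translation units differ only in the headers they were compiled against, it may be worth confirming that the patched build still produces a valid heap before comparing timings. A minimal sketch, assuming a hypothetical helper `make_heap_is_valid` (not part of the benchmark above) and a fixed-seed `std::mt19937` so that runs are repeatable:

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper, for illustration only: fills a vector with 0..n-1,
// shuffles it, heapifies it with whichever std::make_heap the current
// headers provide, and reports whether the result is a valid max-heap.
bool make_heap_is_valid(std::size_t n, unsigned seed) {
    std::vector<int> v(n);
    std::iota(v.begin(), v.end(), 0);

    std::mt19937 rng(seed);  // fixed seed keeps the input reproducible
    std::shuffle(v.begin(), v.end(), rng);

    std::make_heap(v.begin(), v.end());

    // A max-heap over 0..n-1 must satisfy the heap property and have
    // n-1 at the front (skipped for an empty range).
    return std::is_heap(v.begin(), v.end()) &&
           (n == 0 || v.front() == static_cast<int>(n) - 1);
}
```

Calling `make_heap_is_valid(10000000, 42)` once in each build would cover the same input size as the timing run.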
```text
Running make_heap.bench.before.exe
Run on (16 X 3194 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x8)
L1 Instruction 32 KiB (x8)
L2 Unified 512 KiB (x8)
L3 Unified 16384 KiB (x1)
--------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------------------------------------------------
BM_MakeHeap_uint32_Random_1 1.02 ns 0.958 ns 783024128
BM_MakeHeap_uint32_Random_4 3.57 ns 3.66 ns 234881024
BM_MakeHeap_uint32_Random_16 3.85 ns 3.67 ns 195821568
BM_MakeHeap_uint32_Random_64 3.91 ns 3.91 ns 167772160
BM_MakeHeap_uint32_Random_256 3.95 ns 3.75 ns 195821568
BM_MakeHeap_uint32_Random_1024 3.95 ns 3.91 ns 195821568
BM_MakeHeap_uint32_Random_16384 4.07 ns 4.07 ns 195821568
BM_MakeHeap_uint32_Random_262144 4.52 ns 4.47 ns 146800640
```
```text
Running make_heap.bench.after.exe
Run on (16 X 3194 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x8)
L1 Instruction 32 KiB (x8)
L2 Unified 512 KiB (x8)
L3 Unified 16384 KiB (x1)
--------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
--------------------------------------------------------------------------------------------------------------
BM_MakeHeap_uint32_Random_1 1.00 ns 1.16 ns 618135552
BM_MakeHeap_uint32_Random_4 3.36 ns 3.45 ns 167772160
BM_MakeHeap_uint32_Random_16 3.12 ns 3.06 ns 234881024
BM_MakeHeap_uint32_Random_64 2.80 ns 2.93 ns 293601280
BM_MakeHeap_uint32_Random_256 2.56 ns 2.55 ns 293601280
BM_MakeHeap_uint32_Random_1024 2.50 ns 2.55 ns 293601280
BM_MakeHeap_uint32_Random_16384 2.57 ns 2.50 ns 262144000
BM_MakeHeap_uint32_Random_262144 3.18 ns 3.19 ns 293601280
```
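I think the patched version is better than the original: the reported wall-clock time is lower at every size, with the clearest gains from 16 elements up.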
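To make the comparison easier to read, the ratio between the two runs can be computed per size. This is only a sketch: the numbers are the `Time` column copied from the two outputs above, and the names `Row`, `rows`, `speedup`, and `print_speedups` are made up for illustration.

```cpp
#include <array>
#include <cstdio>

// One benchmark size with the "Time" value (ns) from each run above.
struct Row {
    int size;
    double before_ns;
    double after_ns;
};

constexpr std::array<Row, 8> rows{{
    {1, 1.02, 1.00},
    {4, 3.57, 3.36},
    {16, 3.85, 3.12},
    {64, 3.91, 2.80},
    {256, 3.95, 2.56},
    {1024, 3.95, 2.50},
    {16384, 4.07, 2.57},
    {262144, 4.52, 3.18},
}};

// A value above 1.0 means the patched build took less time.
constexpr double speedup(const Row& r) {
    return r.before_ns / r.after_ns;
}

// Prints one line per size: before, after, and the ratio.
void print_speedups() {
    for (const Row& r : rows) {
        std::printf("%7d  %.2f ns -> %.2f ns  (x%.2f)\n",
                    r.size, r.before_ns, r.after_ns, speedup(r));
    }
}
```

By these numbers the ratio peaks at roughly 1.58x for 1024 and 16384 elements and is close to 1.0 for sizes 1 and 4. These are single runs, so small differences may be noise.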
https://github.com/llvm/llvm-project/pull/121480