[libcxx-commits] [libcxx] [libcxx] Added segmented iterator for count_if (PR #105888)

via libcxx-commits libcxx-commits at lists.llvm.org
Mon Sep 2 01:47:18 PDT 2024


adeel10x wrote:

> @philnik777
> 
> **Benchmark Results**
> 
> No Segmented Iterators:
> 
> ```
> --------------------------------------------------------------------
> Benchmark                          Time             CPU   Iterations
> --------------------------------------------------------------------
> bm_deque_count_if/1             3.08 ns         3.08 ns    226984141
> bm_deque_count_if/2             4.46 ns         4.46 ns    156429484
> bm_deque_count_if/3             5.98 ns         5.98 ns    115885151
> bm_deque_count_if/4             7.52 ns         7.52 ns     93043742
> bm_deque_count_if/5             9.05 ns         9.05 ns     75287677
> bm_deque_count_if/6             10.4 ns         10.4 ns     66611545
> bm_deque_count_if/7             11.7 ns         11.7 ns     59514310
> bm_deque_count_if/8             13.2 ns         13.2 ns     53095296
> bm_deque_count_if/16            25.1 ns         25.1 ns     27443718
> bm_deque_count_if/64            90.9 ns         90.9 ns      7570199
> bm_deque_count_if/512            576 ns          575 ns      1199055
> bm_deque_count_if/4096          4512 ns         4510 ns       157707
> bm_deque_count_if/32768        35522 ns        35516 ns        18694
> bm_deque_count_if/262144      284234 ns       284217 ns         2443
> bm_deque_count_if/1048576    1137323 ns      1137278 ns          613
> ```
> 
> With Segmented Iterators:
> 
> ```
> --------------------------------------------------------------------
> Benchmark                          Time             CPU   Iterations
> --------------------------------------------------------------------
> bm_deque_count_if/1             3.28 ns         3.28 ns    211244390
> bm_deque_count_if/2             5.23 ns         5.23 ns    173814635
> bm_deque_count_if/3             6.44 ns         6.32 ns    137650491
> bm_deque_count_if/4             6.69 ns         6.69 ns    100575185
> bm_deque_count_if/5             7.61 ns         7.60 ns     88425507
> bm_deque_count_if/6             8.54 ns         8.54 ns     81768351
> bm_deque_count_if/7             9.37 ns         9.37 ns     73757989
> bm_deque_count_if/8             9.41 ns         9.41 ns     74463399
> bm_deque_count_if/16            13.5 ns         13.5 ns     51855512
> bm_deque_count_if/64            42.3 ns         42.3 ns     16525076
> bm_deque_count_if/512            322 ns          322 ns      2182643
> bm_deque_count_if/4096          2503 ns         2502 ns       279781
> bm_deque_count_if/32768        19837 ns        19834 ns        34882
> bm_deque_count_if/262144      158401 ns       158381 ns         4410
> bm_deque_count_if/1048576     635963 ns       635955 ns         1078
> ```

These results are from a **Release** build.
Do you have any idea why the performance improvement is so small (roughly 1.8x for sizes of 512 and above)? From what I've observed, using segmented iterators usually yields a 35-40x improvement.
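
For readers unfamiliar with the technique, here is a minimal sketch of what a segmented `count_if` does. It is not the code from this PR and does not use libc++'s internal segmented-iterator machinery: `Blocks`, `make_blocks`, and `segmented_count_if` are hypothetical names, and a vector of vectors stands in for a deque's block storage. The idea is that the block boundary is checked once per block in the outer loop, so the inner loop runs over raw pointers, whereas a plain deque iterator has to check for the end of its block on every increment.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a deque's storage: a sequence of contiguous blocks.
// This is NOT libc++'s internal layout, only an illustration.
template <class T>
using Blocks = std::vector<std::vector<T>>;

// Split the values 0..n-1 into blocks of at most `block_size` elements.
inline Blocks<int> make_blocks(std::size_t n, std::size_t block_size) {
  assert(block_size > 0);
  Blocks<int> blocks;
  for (std::size_t i = 0; i < n; ++i) {
    if (i % block_size == 0)
      blocks.emplace_back();  // start a new block
    blocks.back().push_back(static_cast<int>(i));
  }
  return blocks;
}

// Segmented count_if: the outer loop walks the blocks, the inner loop walks
// a contiguous range through raw pointers with no per-element boundary check.
template <class T, class Pred>
std::size_t segmented_count_if(const Blocks<T>& blocks, Pred pred) {
  std::size_t count = 0;
  for (const auto& block : blocks) {
    const T* first = block.data();
    const T* last  = first + block.size();
    for (; first != last; ++first) {
      if (pred(*first))
        ++count;
    }
  }
  return count;
}
```

Calling `segmented_count_if(make_blocks(10, 4), pred)` with a predicate that tests for even values walks three blocks (4, 4, and 2 elements). How much this helps in practice depends on how cheap the predicate is and on whether the compiler can vectorize the inner loop, so the sketch says nothing about the speedup to expect.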

https://github.com/llvm/llvm-project/pull/105888

