[libcxx-commits] [libcxx] [libc++][ranges] optimize the performance of `ranges::starts_with` (PR #84570)
Xiaoyang Liu via libcxx-commits
libcxx-commits at lists.llvm.org
Mon Apr 8 01:01:03 PDT 2024
xiaoyang-sde wrote:
Here's the benchmark result of the unoptimized version of `ranges::starts_with` with vectorized `mismatch`:
```txt
2024-04-08T00:51:04-07:00
Running ./ranges_starts_with.libcxx.out
Run on (12 X 2496 MHz CPU s)
CPU Caches:
L1 Data 48 KiB (x6)
L1 Instruction 32 KiB (x6)
L2 Unified 1280 KiB (x6)
L3 Unified 18432 KiB (x1)
Load Average: 1.27, 2.29, 1.34
-----------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------------------------------------------
bm_starts_with_contiguous_iter_with_memcmp_optimization/16 3.09 ns 3.09 ns 212067553
bm_starts_with_contiguous_iter_with_memcmp_optimization/256 20.9 ns 20.9 ns 30603943
bm_starts_with_contiguous_iter_with_memcmp_optimization/4096 310 ns 310 ns 2176699
bm_starts_with_contiguous_iter_with_memcmp_optimization/65536 5394 ns 5394 ns 136959
bm_starts_with_contiguous_iter_with_memcmp_optimization/1048576 134816 ns 134817 ns 4916
bm_starts_with_contiguous_iter_with_memcmp_optimization/16777216 4792548 ns 4792472 ns 150
bm_starts_with_contiguous_iter/16 5.38 ns 5.38 ns 123790610
bm_starts_with_contiguous_iter/256 81.4 ns 81.4 ns 8541221
bm_starts_with_contiguous_iter/4096 1107 ns 1107 ns 615306
bm_starts_with_contiguous_iter/65536 16869 ns 16868 ns 41362
bm_starts_with_contiguous_iter/1048576 293850 ns 293845 ns 2276
bm_starts_with_contiguous_iter/16777216 6079325 ns 6079284 ns 108
bm_starts_with_random_access_iter/16 7.17 ns 7.04 ns 101903410
bm_starts_with_random_access_iter/256 100 ns 100 ns 6806306
bm_starts_with_random_access_iter/4096 1449 ns 1449 ns 482025
bm_starts_with_random_access_iter/65536 23967 ns 23967 ns 27789
bm_starts_with_random_access_iter/1048576 387731 ns 387727 ns 1752
bm_starts_with_random_access_iter/16777216 6773716 ns 6773727 ns 100
bm_starts_with_bidirectional_iter/16 6.95 ns 6.95 ns 97150439
bm_starts_with_bidirectional_iter/256 103 ns 101 ns 6666863
bm_starts_with_bidirectional_iter/4096 1459 ns 1459 ns 477527
bm_starts_with_bidirectional_iter/65536 23458 ns 23458 ns 29772
bm_starts_with_bidirectional_iter/1048576 405195 ns 405189 ns 1822
bm_starts_with_bidirectional_iter/16777216 6768824 ns 6768460 ns 99
bm_starts_with_forward_iter/16 6.86 ns 6.76 ns 99222099
bm_starts_with_forward_iter/256 99.3 ns 99.3 ns 6935465
bm_starts_with_forward_iter/4096 1455 ns 1455 ns 482635
bm_starts_with_forward_iter/65536 23077 ns 23077 ns 29802
bm_starts_with_forward_iter/1048576 379438 ns 379432 ns 1795
bm_starts_with_forward_iter/16777216 6722769 ns 6722515 ns 104
```
Here's the benchmark result of the optimized version of `ranges::starts_with`:
```txt
2024-04-08T00:58:01-07:00
Running ./ranges_starts_with.libcxx.out
Run on (12 X 2496 MHz CPU s)
CPU Caches:
L1 Data 48 KiB (x6)
L1 Instruction 32 KiB (x6)
L2 Unified 1280 KiB (x6)
L3 Unified 18432 KiB (x1)
Load Average: 1.71, 2.34, 1.64
-----------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------------------------------------------
bm_starts_with_contiguous_iter_with_memcmp_optimization/16 1.46 ns 1.46 ns 470246496
bm_starts_with_contiguous_iter_with_memcmp_optimization/256 12.4 ns 12.4 ns 58289034
bm_starts_with_contiguous_iter_with_memcmp_optimization/4096 172 ns 172 ns 4097284
bm_starts_with_contiguous_iter_with_memcmp_optimization/65536 4276 ns 4276 ns 161096
bm_starts_with_contiguous_iter_with_memcmp_optimization/1048576 136488 ns 136487 ns 6118
bm_starts_with_contiguous_iter_with_memcmp_optimization/16777216 3955840 ns 3955785 ns 178
bm_starts_with_contiguous_iter/16 6.22 ns 6.22 ns 102345915
bm_starts_with_contiguous_iter/256 98.3 ns 98.3 ns 6888855
bm_starts_with_contiguous_iter/4096 1473 ns 1444 ns 479038
bm_starts_with_contiguous_iter/65536 23314 ns 23314 ns 29179
bm_starts_with_contiguous_iter/1048576 372080 ns 372078 ns 1873
bm_starts_with_contiguous_iter/16777216 7160219 ns 7149335 ns 103
bm_starts_with_random_access_iter/16 6.32 ns 6.32 ns 108101430
bm_starts_with_random_access_iter/256 108 ns 106 ns 6451084
bm_starts_with_random_access_iter/4096 1525 ns 1525 ns 451699
bm_starts_with_random_access_iter/65536 23463 ns 23463 ns 29878
bm_starts_with_random_access_iter/1048576 389128 ns 389126 ns 1882
bm_starts_with_random_access_iter/16777216 6599507 ns 6599458 ns 99
bm_starts_with_bidirectional_iter/16 6.54 ns 6.40 ns 92338661
bm_starts_with_bidirectional_iter/256 105 ns 105 ns 6695816
bm_starts_with_bidirectional_iter/4096 1519 ns 1519 ns 451521
bm_starts_with_bidirectional_iter/65536 23141 ns 23141 ns 30112
bm_starts_with_bidirectional_iter/1048576 398679 ns 398672 ns 1855
bm_starts_with_bidirectional_iter/16777216 6574845 ns 6574846 ns 107
bm_starts_with_forward_iter/16 6.43 ns 6.43 ns 103394439
bm_starts_with_forward_iter/256 105 ns 105 ns 6459829
bm_starts_with_forward_iter/4096 1529 ns 1529 ns 452248
bm_starts_with_forward_iter/65536 23380 ns 23380 ns 30325
bm_starts_with_forward_iter/1048576 399525 ns 399523 ns 1744
bm_starts_with_forward_iter/16777216 6918514 ns 6918404 ns 102
```
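For readers unfamiliar with what the `..._with_memcmp_optimization` rows are measuring, here is a minimal sketch of the general idea, not the code from the PR. The function names `starts_with_generic` and `starts_with_memcmp` are hypothetical, and restricting the fast path to integral types is an assumption made for this illustration (the actual libc++ condition may differ). The generic path compares element by element via `std::mismatch`; the fast path compares the raw bytes of two contiguous buffers with `std::memcmp`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Generic path (hypothetical name): walk both sequences with std::mismatch
// and report success only if the whole prefix was consumed.
template <class It1, class It2>
bool starts_with_generic(It1 first1, It1 last1, It2 first2, It2 last2) {
  return std::mismatch(first1, last1, first2, last2).second == last2;
}

// Fast path (hypothetical name): for contiguous storage of a type whose
// equality is assumed to match bytewise equality, compare raw bytes.
// Assumption for this sketch: only integral types are accepted.
template <class T>
bool starts_with_memcmp(const T* data, std::size_t n,
                        const T* prefix, std::size_t m) {
  static_assert(std::is_integral<T>::value,
                "sketch only assumes bytewise comparison for integral types");
  if (m > n) return false;   // a longer prefix can never match
  if (m == 0) return true;   // empty prefix; also avoids memcmp on null pointers
  return std::memcmp(data, prefix, m * sizeof(T)) == 0;
}

// Small consistency check: both paths should agree on the same inputs.
inline void check_agreement() {
  std::vector<int> haystack{1, 2, 3, 4};
  std::vector<int> good{1, 2};
  std::vector<int> bad{1, 3};
  assert(starts_with_memcmp(haystack.data(), haystack.size(), good.data(), good.size()) ==
         starts_with_generic(haystack.begin(), haystack.end(), good.begin(), good.end()));
  assert(starts_with_memcmp(haystack.data(), haystack.size(), bad.data(), bad.size()) ==
         starts_with_generic(haystack.begin(), haystack.end(), bad.begin(), bad.end()));
}
```

In this sketch the size check happens before any element is touched, so a prefix longer than the input is rejected in constant time. That check is only available when both lengths are known up front, as they are for contiguous buffers.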
https://github.com/llvm/llvm-project/pull/84570