[libcxx-commits] [PATCH] D131395: [libc++] Implement `lexicographical_compare_three_way`
Konstantin Varlamov via Phabricator via libcxx-commits
libcxx-commits at lists.llvm.org
Thu Dec 8 16:41:33 PST 2022
var-const added a comment.
@avogelsgesang I tried out the benchmarks, and it looks like you were accidentally running them against a debug build (unfortunately, that's a very easy mistake to make, since it's the default). With a debug build, I get timings very similar to what you saw earlier:
-----------------------------------------------------------------------------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------------------------------------------------------------------------
BM_lexicographical_compare_three_way<IntPtr>/1 25.7 ns 25.7 ns 27179505
BM_lexicographical_compare_three_way<IntPtr>/4 64.1 ns 64.0 ns 10934595
BM_lexicographical_compare_three_way<IntPtr>/16 316 ns 249 ns 2924673
BM_lexicographical_compare_three_way<IntPtr>/64 880 ns 864 ns 820595
BM_lexicographical_compare_three_way<IntPtr>/256 3328 ns 3324 ns 211011
BM_lexicographical_compare_three_way<IntPtr>/1024 13151 ns 13145 ns 53256
BM_lexicographical_compare_three_way<IntPtr>/4096 52541 ns 52531 ns 13321
BM_lexicographical_compare_three_way<IntPtr>/16384 210420 ns 210398 ns 3342
BM_lexicographical_compare_three_way<IntPtr>/65536 847864 ns 847372 ns 825
BM_lexicographical_compare_three_way<IntPtr>/262144 3394918 ns 3393657 ns 207
BM_lexicographical_compare_three_way<IntPtr>/1048576 13569716 ns 13563941 ns 51
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1 37.7 ns 37.7 ns 18632581
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/4 85.6 ns 85.6 ns 8158698
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/16 286 ns 285 ns 2474206
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/64 1036 ns 1035 ns 676002
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/256 4018 ns 4017 ns 167247
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1024 16138 ns 15993 ns 43945
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/4096 63604 ns 63602 ns 10994
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/16384 255664 ns 255574 ns 2749
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/65536 1025497 ns 1025498 ns 683
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/262144 4113879 ns 4111688 ns 170
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1048576 16455950 ns 16454558 ns 43
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1 29.5 ns 29.5 ns 23748940
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/4 99.7 ns 99.7 ns 6798031
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/16 300 ns 300 ns 2339940
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/64 1116 ns 1115 ns 632037
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/256 4347 ns 4344 ns 161197
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1024 17271 ns 17270 ns 40467
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/4096 68959 ns 68958 ns 10135
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/16384 276437 ns 276348 ns 2533
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/65536 1111640 ns 1111475 ns 629
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/262144 4451482 ns 4450650 ns 157
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1048576 17807151 ns 17807154 ns 39
These numbers show the same pattern you observed: `random_access_iterator<int*>` is consistently slower than `int*` (which makes sense in an unoptimized build).
Rebuilding with `-DCMAKE_BUILD_TYPE=Release`, however, makes the difference between `int*` and `random_access_iterator<int*>` go away. Those timings are now within the margin of error of each other (not to mention many times faster than in the debug build):
BM_lexicographical_compare_three_way<IntPtr>/1 0.447 ns 0.444 ns 1000000000
BM_lexicographical_compare_three_way<IntPtr>/4 1.99 ns 1.96 ns 361161702
BM_lexicographical_compare_three_way<IntPtr>/16 5.86 ns 5.81 ns 121777252
BM_lexicographical_compare_three_way<IntPtr>/64 21.6 ns 21.4 ns 32736439
BM_lexicographical_compare_three_way<IntPtr>/256 97.6 ns 96.2 ns 7379296
BM_lexicographical_compare_three_way<IntPtr>/1024 346 ns 343 ns 2028815
BM_lexicographical_compare_three_way<IntPtr>/4096 1352 ns 1339 ns 515878
BM_lexicographical_compare_three_way<IntPtr>/16384 5356 ns 5312 ns 132878
BM_lexicographical_compare_three_way<IntPtr>/65536 21379 ns 21179 ns 32955
BM_lexicographical_compare_three_way<IntPtr>/262144 85883 ns 84892 ns 8327
BM_lexicographical_compare_three_way<IntPtr>/1048576 359011 ns 355722 ns 1956
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1 0.448 ns 0.444 ns 1000000000
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/4 1.96 ns 1.94 ns 359261768
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/16 5.89 ns 5.83 ns 121641817
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/64 21.5 ns 21.3 ns 32731847
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/256 97.2 ns 96.2 ns 7337449
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1024 346 ns 344 ns 2040876
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/4096 1347 ns 1336 ns 516457
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/16384 5378 ns 5329 ns 131757
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/65536 21296 ns 21168 ns 32939
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/262144 85516 ns 84770 ns 8164
BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1048576 362302 ns 358839 ns 1979
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1 0.576 ns 0.568 ns 1000000000
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/4 2.27 ns 2.26 ns 309139488
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/16 7.85 ns 7.80 ns 87860227
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/64 37.9 ns 37.7 ns 18627871
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/256 131 ns 130 ns 5375353
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1024 503 ns 501 ns 1386276
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/4096 1995 ns 1986 ns 352332
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/16384 7947 ns 7919 ns 88986
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/65536 31716 ns 31642 ns 22141
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/262144 127179 ns 126551 ns 5454
BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1048576 517207 ns 514653 ns 1360
I think that explains it -- we assumed we were looking at optimized results, which wasn't actually the case. It also means the code is doing the right thing, so there's no actual issue, which is great!
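For anyone who wants to poke at the wrapper-vs-pointer effect outside the benchmark suite, here is a minimal sketch. Everything in it is an assumption for illustration, not the patch's code: `wrapped_iter` is a hypothetical stand-in for the test suite's `random_access_iterator<int*>`, and `compare_three_way_sketch` is a hand-rolled loop that returns an `int` (-1, 0, 1) rather than a comparison category, so it doesn't need C++20. The point is only that the wrapper adds a layer of trivial member calls that an optimizing build can inline away, while an unoptimized build keeps them as real calls.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical thin wrapper around a pointer; it stands in for the test
// suite's random_access_iterator<int*> and exposes only what the loop needs.
struct wrapped_iter {
    const int* p;
    const int& operator*() const { return *p; }
    wrapped_iter& operator++() { ++p; return *this; }
    friend bool operator==(wrapped_iter a, wrapped_iter b) { return a.p == b.p; }
    friend bool operator!=(wrapped_iter a, wrapped_iter b) { return a.p != b.p; }
};

// Hand-rolled three-way lexicographical comparison (NOT libc++'s
// implementation). Returns -1, 0, or 1 instead of a comparison category.
template <class It1, class It2>
int compare_three_way_sketch(It1 first1, It1 last1, It2 first2, It2 last2) {
    for (; first1 != last1 && first2 != last2; ++first1, ++first2) {
        if (*first1 < *first2) return -1;
        if (*first2 < *first1) return 1;
    }
    // One or both ranges are exhausted: the shorter range orders first.
    if (first1 == last1 && first2 == last2) return 0;
    return first1 == last1 ? -1 : 1;
}

// Same algorithm, driven through raw pointers.
inline int compare_raw(const std::vector<int>& a, const std::vector<int>& b) {
    return compare_three_way_sketch(a.data(), a.data() + a.size(),
                                    b.data(), b.data() + b.size());
}

// Same algorithm, driven through the wrapper iterator.
inline int compare_wrapped(const std::vector<int>& a, const std::vector<int>& b) {
    return compare_three_way_sketch(
        wrapped_iter{a.data()}, wrapped_iter{a.data() + a.size()},
        wrapped_iter{b.data()}, wrapped_iter{b.data() + b.size()});
}
```

Both helpers go through the same template, so any timing gap between them comes from the wrapper's member functions alone. Timing them once without optimization and once with it should show whether the gap survives inlining.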
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D131395/new/
https://reviews.llvm.org/D131395