[libcxx-commits] [PATCH] D131395: [libc++] Implement `lexicographical_compare_three_way`

Konstantin Varlamov via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Thu Dec 8 16:41:33 PST 2022


var-const added a comment.

@avogelsgesang I tried out the benchmarks, and it looks like you were probably running them against a debug build by accident (unfortunately, that's a very easy mistake to make, since it's the default). With a debug build, I get timings very similar to what you saw earlier:

  -----------------------------------------------------------------------------------------------------------------------
  Benchmark                                                                             Time             CPU   Iterations
  -----------------------------------------------------------------------------------------------------------------------
  BM_lexicographical_compare_three_way<IntPtr>/1                                     25.7 ns         25.7 ns     27179505
  BM_lexicographical_compare_three_way<IntPtr>/4                                     64.1 ns         64.0 ns     10934595
  BM_lexicographical_compare_three_way<IntPtr>/16                                     316 ns          249 ns      2924673
  BM_lexicographical_compare_three_way<IntPtr>/64                                     880 ns          864 ns       820595
  BM_lexicographical_compare_three_way<IntPtr>/256                                   3328 ns         3324 ns       211011
  BM_lexicographical_compare_three_way<IntPtr>/1024                                 13151 ns        13145 ns        53256
  BM_lexicographical_compare_three_way<IntPtr>/4096                                 52541 ns        52531 ns        13321
  BM_lexicographical_compare_three_way<IntPtr>/16384                               210420 ns       210398 ns         3342
  BM_lexicographical_compare_three_way<IntPtr>/65536                               847864 ns       847372 ns          825
  BM_lexicographical_compare_three_way<IntPtr>/262144                             3394918 ns      3393657 ns          207
  BM_lexicographical_compare_three_way<IntPtr>/1048576                           13569716 ns     13563941 ns           51
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1             37.7 ns         37.7 ns     18632581
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/4             85.6 ns         85.6 ns      8158698
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/16             286 ns          285 ns      2474206
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/64            1036 ns         1035 ns       676002
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/256           4018 ns         4017 ns       167247
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1024         16138 ns        15993 ns        43945
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/4096         63604 ns        63602 ns        10994
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/16384       255664 ns       255574 ns         2749
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/65536      1025497 ns      1025498 ns          683
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/262144     4113879 ns      4111688 ns          170
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1048576   16455950 ns     16454558 ns           43
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1               29.5 ns         29.5 ns     23748940
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/4               99.7 ns         99.7 ns      6798031
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/16               300 ns          300 ns      2339940
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/64              1116 ns         1115 ns       632037
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/256             4347 ns         4344 ns       161197
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1024           17271 ns        17270 ns        40467
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/4096           68959 ns        68958 ns        10135
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/16384         276437 ns       276348 ns         2533
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/65536        1111640 ns      1111475 ns          629
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/262144       4451482 ns      4450650 ns          157
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1048576     17807151 ns     17807154 ns           39

I also see the same pattern you observed: `random_access_iterator<int*>` is consistently slower than `int*`, which makes sense in an unoptimized build.

Rebuilding with `-DCMAKE_BUILD_TYPE=Release` and rerunning, however, makes the difference go away -- the timings for `random_access_iterator<int*>` and `int*` are now within the margin of error of each other (not to mention many times faster):

  BM_lexicographical_compare_three_way<IntPtr>/1                                    0.447 ns        0.444 ns   1000000000
  BM_lexicographical_compare_three_way<IntPtr>/4                                     1.99 ns         1.96 ns    361161702
  BM_lexicographical_compare_three_way<IntPtr>/16                                    5.86 ns         5.81 ns    121777252
  BM_lexicographical_compare_three_way<IntPtr>/64                                    21.6 ns         21.4 ns     32736439
  BM_lexicographical_compare_three_way<IntPtr>/256                                   97.6 ns         96.2 ns      7379296
  BM_lexicographical_compare_three_way<IntPtr>/1024                                   346 ns          343 ns      2028815
  BM_lexicographical_compare_three_way<IntPtr>/4096                                  1352 ns         1339 ns       515878
  BM_lexicographical_compare_three_way<IntPtr>/16384                                 5356 ns         5312 ns       132878
  BM_lexicographical_compare_three_way<IntPtr>/65536                                21379 ns        21179 ns        32955
  BM_lexicographical_compare_three_way<IntPtr>/262144                               85883 ns        84892 ns         8327
  BM_lexicographical_compare_three_way<IntPtr>/1048576                             359011 ns       355722 ns         1956
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1            0.448 ns        0.444 ns   1000000000
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/4             1.96 ns         1.94 ns    359261768
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/16            5.89 ns         5.83 ns    121641817
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/64            21.5 ns         21.3 ns     32731847
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/256           97.2 ns         96.2 ns      7337449
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1024           346 ns          344 ns      2040876
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/4096          1347 ns         1336 ns       516457
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/16384         5378 ns         5329 ns       131757
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/65536        21296 ns        21168 ns        32939
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/262144       85516 ns        84770 ns         8164
  BM_lexicographical_compare_three_way<random_access_iterator<IntPtr>>/1048576     362302 ns       358839 ns         1979
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1              0.576 ns        0.568 ns   1000000000
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/4               2.27 ns         2.26 ns    309139488
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/16              7.85 ns         7.80 ns     87860227
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/64              37.9 ns         37.7 ns     18627871
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/256              131 ns          130 ns      5375353
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1024             503 ns          501 ns      1386276
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/4096            1995 ns         1986 ns       352332
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/16384           7947 ns         7919 ns        88986
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/65536          31716 ns        31642 ns        22141
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/262144        127179 ns       126551 ns         5454
  BM_lexicographical_compare_three_way<cpp17_input_iterator<IntPtr>>/1048576       517207 ns       514653 ns         1360

I think that explains it -- we were assuming we were looking at optimized results, which wasn't actually the case. It also means the code is doing the right thing, so there's no actual issue, which is great!
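One way to catch this kind of mistake earlier is a small guard in the benchmark binary. The following is only a sketch, assuming GCC or Clang, which predefine `__OPTIMIZE__` when optimization is enabled; `built_with_optimizations`, `build_kind` and `warn_if_unoptimized` are hypothetical helper names, not part of libc++ or of the benchmark harness.

```cpp
#include <cstdio>
#include <string_view>

// True when the translation unit was compiled with optimization enabled.
// Assumption: relies on the GCC/Clang predefined macro __OPTIMIZE__;
// other compilers may never define it, in which case this reports false.
constexpr bool built_with_optimizations() {
#if defined(__OPTIMIZE__)
    return true;
#else
    return false;
#endif
}

// Human-readable label for the current build.
constexpr std::string_view build_kind() {
    return built_with_optimizations() ? "optimized" : "unoptimized";
}

// Print a warning to stderr before running benchmarks in an unoptimized build.
void warn_if_unoptimized() {
    if (!built_with_optimizations()) {
        std::fputs("warning: benchmarks built without optimization; "
                   "timings are not representative\n", stderr);
    }
}
```

Calling `warn_if_unoptimized()` at the start of a benchmark's `main` would make an accidental debug run visible in the output.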


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D131395/new/

https://reviews.llvm.org/D131395


