[all-commits] [llvm/llvm-project] 4eddbf: std::sort: add BlockQuickSort partitioning algorit...

Nilay Vaish via All-commits all-commits at lists.llvm.org
Thu Dec 22 14:47:39 PST 2022


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 4eddbf9f10a6d1881c93d84f4363d6d881daf848
      https://github.com/llvm/llvm-project/commit/4eddbf9f10a6d1881c93d84f4363d6d881daf848
  Author: Nilay Vaish <nilayvaish at google.com>
  Date:   2022-12-22 (Thu, 22 Dec 2022)

  Changed paths:
    M libcxx/include/__algorithm/sort.h
    M libcxx/include/__bits

  Log Message:
  -----------
  std::sort: add BlockQuickSort partitioning algorithm for arithmetic types

This diff modifies std::sort in two ways:

* for arithmetic types we update the core partitioning algorithm to use
BlockQuickSort for partitioning. The partition function was carefully
written to let the compiler generate SIMD instructions without actually
writing SIMD intrinsics in the loop. We see up to 50% better performance
for sorting arithmetic types. The use of the BlockQuickSort partitioning
has been limited to arithmetic types since the algorithm works well when
branch instructions can be avoided during partitioning. This is usually
not true for types other than the arithmetic ones.

* for other types (tuples, strings) updates have been made to improve
performance by about 10%.

Performance numbers comparing std::sort (old) and Bitset sort (new) on
the libcxx benchmark:

name                                                             old cpu/op  new cpu/op  delta
BM_Sort_uint32_Random_1                                          3.72ns ± 5%  3.78ns ±16%      ~     (p=0.819 n=36+34)
BM_Sort_uint32_Random_4                                          5.42ns ± 5%  5.29ns ± 7%    -2.42%  (p=0.000 n=35+31)
BM_Sort_uint32_Random_16                                         10.5ns ± 3%  11.9ns ±15%   +13.08%  (p=0.000 n=36+40)
BM_Sort_uint32_Random_64                                         18.6ns ± 7%  18.5ns ±15%    -0.95%  (p=0.002 n=33+40)
BM_Sort_uint32_Random_256                                        26.2ns ± 4%  21.3ns ± 8%   -18.89%  (p=0.000 n=37+34)
BM_Sort_uint32_Random_1024                                       33.4ns ± 5%  23.3ns ± 4%   -30.37%  (p=0.000 n=39+35)
BM_Sort_uint32_Random_16384                                      47.7ns ± 5%  26.7ns ± 5%   -44.06%  (p=0.000 n=39+35)
BM_Sort_uint32_Random_262144                                     62.6ns ± 3%  30.1ns ± 6%   -51.81%  (p=0.000 n=37+36)
BM_Sort_uint32_Ascending_1                                       3.71ns ± 3%  4.28ns ± 3%   +15.53%  (p=0.000 n=37+35)
BM_Sort_uint32_Ascending_4                                       1.47ns ± 3%  1.46ns ± 3%      ~     (p=0.083 n=36+37)
BM_Sort_uint32_Ascending_16                                      0.93ns ± 4%  1.02ns ± 3%    +9.32%  (p=0.000 n=36+36)
BM_Sort_uint32_Ascending_64                                      1.23ns ± 5%  1.51ns ± 3%   +22.56%  (p=0.000 n=34+36)
BM_Sort_uint32_Ascending_256                                     1.21ns ± 3%  1.57ns ± 4%   +29.77%  (p=0.000 n=33+35)
BM_Sort_uint32_Ascending_1024                                    1.03ns ± 4%  1.43ns ± 3%   +38.44%  (p=0.000 n=32+35)
BM_Sort_uint32_Ascending_16384                                   0.94ns ± 8%  1.36ns ± 5%   +44.09%  (p=0.000 n=32+35)
BM_Sort_uint32_Ascending_262144                                  0.93ns ± 3%  1.35ns ± 7%   +45.06%  (p=0.000 n=32+36)
BM_Sort_uint32_Descending_1                                      3.69ns ± 2%  4.27ns ± 3%   +15.73%  (p=0.000 n=31+36)
BM_Sort_uint32_Descending_4                                      1.74ns ± 2%  1.78ns ± 3%    +2.29%  (p=0.000 n=31+38)
BM_Sort_uint32_Descending_16                                     3.92ns ± 4%  4.20ns ± 4%    +7.13%  (p=0.000 n=32+38)
BM_Sort_uint32_Descending_64                                     2.09ns ± 4%  3.25ns ± 4%   +55.10%  (p=0.000 n=33+37)
BM_Sort_uint32_Descending_256                                    1.98ns ± 7%  2.93ns ± 4%   +47.95%  (p=0.000 n=34+36)
BM_Sort_uint32_Descending_1024                                   2.23ns ± 6%  2.64ns ± 3%   +18.22%  (p=0.000 n=34+38)
BM_Sort_uint32_Descending_16384                                  1.93ns ± 6%  2.43ns ± 4%   +25.99%  (p=0.000 n=34+35)
BM_Sort_uint32_Descending_262144                                 1.89ns ± 3%  2.38ns ± 4%   +25.41%  (p=0.000 n=33+35)
BM_Sort_uint32_SingleElement_1                                   3.67ns ± 2%  4.28ns ± 4%   +16.60%  (p=0.000 n=34+34)
BM_Sort_uint32_SingleElement_4                                   1.48ns ± 4%  1.48ns ± 5%      ~     (p=0.951 n=35+33)
BM_Sort_uint32_SingleElement_16                                  0.93ns ± 3%  1.02ns ± 4%    +9.51%  (p=0.000 n=36+33)
BM_Sort_uint32_SingleElement_64                                  0.76ns ± 3%  1.59ns ± 8%  +109.78%  (p=0.000 n=36+32)
BM_Sort_uint32_SingleElement_256                                 0.82ns ± 4%  1.45ns ± 5%   +76.62%  (p=0.000 n=37+34)
BM_Sort_uint32_SingleElement_1024                                0.77ns ± 4%  1.31ns ± 4%   +71.40%  (p=0.000 n=34+34)
BM_Sort_uint32_SingleElement_16384                               0.64ns ± 4%  1.24ns ± 6%   +93.29%  (p=0.000 n=35+36)
BM_Sort_uint32_SingleElement_262144                              0.63ns ± 3%  1.23ns ± 4%   +95.17%  (p=0.000 n=35+35)
BM_Sort_uint32_PipeOrgan_1                                       3.68ns ± 2%  4.42ns ± 3%   +20.31%  (p=0.000 n=34+36)
BM_Sort_uint32_PipeOrgan_4                                       1.54ns ± 3%  1.53ns ± 3%      ~     (p=0.128 n=34+36)
BM_Sort_uint32_PipeOrgan_16                                      2.22ns ± 3%  1.99ns ± 3%   -10.28%  (p=0.000 n=33+36)
BM_Sort_uint32_PipeOrgan_64                                      4.41ns ± 3%  3.39ns ± 4%   -23.17%  (p=0.000 n=35+37)
BM_Sort_uint32_PipeOrgan_256                                     2.75ns ± 5%  3.07ns ± 3%   +11.74%  (p=0.000 n=37+37)
BM_Sort_uint32_PipeOrgan_1024                                    3.58ns ± 2%  5.48ns ± 3%   +52.97%  (p=0.000 n=37+36)
BM_Sort_uint32_PipeOrgan_16384                                   4.10ns ± 3%  6.53ns ± 3%   +59.27%  (p=0.000 n=37+37)
BM_Sort_uint32_PipeOrgan_262144                                  4.90ns ± 3%  7.39ns ± 3%   +50.71%  (p=0.000 n=34+37)
BM_Sort_uint32_QuickSortAdversary_1                              3.68ns ± 2%  4.28ns ± 3%   +16.19%  (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_4                              1.46ns ± 4%  1.46ns ± 3%      ~     (p=0.736 n=35+38)
BM_Sort_uint32_QuickSortAdversary_16                             0.93ns ± 3%  1.02ns ± 4%    +9.69%  (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_64                             13.6ns ± 4%  17.9ns ± 8%   +31.37%  (p=0.000 n=36+35)
BM_Sort_uint32_QuickSortAdversary_256                            20.0ns ± 4%  25.7ns ± 4%   +28.69%  (p=0.000 n=36+35)
BM_Sort_uint32_QuickSortAdversary_1024                           28.3ns ± 6%  31.7ns ± 3%   +12.12%  (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_16384                          45.8ns ± 3%  50.6ns ± 4%   +10.32%  (p=0.000 n=38+36)
BM_Sort_uint32_QuickSortAdversary_262144                         61.6ns ± 4%  68.2ns ± 4%   +10.68%  (p=0.000 n=37+37)
BM_Sort_uint64_Random_1                                          3.71ns ± 4%  4.00ns ± 4%    +7.93%  (p=0.000 n=34+35)
BM_Sort_uint64_Random_4                                          5.52ns ± 8%  5.22ns ± 6%    -5.41%  (p=0.000 n=32+32)
BM_Sort_uint64_Random_16                                         10.7ns ±15%  10.2ns ± 7%      ~     (p=0.077 n=40+31)
BM_Sort_uint64_Random_64                                         19.0ns ±14%  18.2ns ±14%    -4.31%  (p=0.001 n=40+40)
BM_Sort_uint64_Random_256                                        25.7ns ± 9%  22.1ns ±15%   -13.82%  (p=0.000 n=33+40)
BM_Sort_uint64_Random_1024                                       32.4ns ± 6%  23.8ns ±16%   -26.64%  (p=0.000 n=33+40)
BM_Sort_uint64_Random_16384                                      46.8ns ± 3%  27.1ns ±16%   -42.15%  (p=0.000 n=33+40)
BM_Sort_uint64_Random_262144                                     61.3ns ± 4%  30.4ns ±16%   -50.34%  (p=0.000 n=34+40)
BM_Sort_uint64_Ascending_1                                       3.67ns ± 3%  3.87ns ±16%    +5.36%  (p=0.049 n=35+40)
BM_Sort_uint64_Ascending_4                                       1.46ns ± 3%  1.46ns ± 3%      ~     (p=0.130 n=37+31)
BM_Sort_uint64_Ascending_16                                      1.09ns ± 3%  0.91ns ± 6%   -16.79%  (p=0.000 n=38+32)
BM_Sort_uint64_Ascending_64                                      1.25ns ± 3%  1.29ns ± 5%    +3.11%  (p=0.000 n=38+34)
BM_Sort_uint64_Ascending_256                                     1.37ns ± 3%  1.42ns ± 3%    +3.07%  (p=0.000 n=39+35)
BM_Sort_uint64_Ascending_1024                                    1.12ns ± 3%  1.17ns ± 3%    +5.28%  (p=0.000 n=37+36)
BM_Sort_uint64_Ascending_16384                                   0.98ns ± 3%  1.09ns ± 3%   +10.95%  (p=0.000 n=36+37)
BM_Sort_uint64_Ascending_262144                                  0.98ns ± 3%  1.08ns ± 3%   +10.97%  (p=0.000 n=36+37)
BM_Sort_uint64_Descending_1                                      3.68ns ± 3%  3.67ns ± 3%      ~     (p=0.652 n=36+36)
BM_Sort_uint64_Descending_4                                      1.71ns ± 3%  1.73ns ± 3%    +1.50%  (p=0.000 n=33+34)
BM_Sort_uint64_Descending_16                                     4.96ns ± 2%  5.49ns ± 3%   +10.73%  (p=0.000 n=31+36)
BM_Sort_uint64_Descending_64                                     2.14ns ± 6%  3.03ns ± 3%   +41.72%  (p=0.000 n=32+35)
BM_Sort_uint64_Descending_256                                    2.03ns ± 4%  2.86ns ± 4%   +40.55%  (p=0.000 n=32+34)
BM_Sort_uint64_Descending_1024                                   2.20ns ± 2%  2.29ns ± 3%    +4.20%  (p=0.000 n=31+36)
BM_Sort_uint64_Descending_16384                                  1.89ns ± 3%  2.08ns ± 3%   +10.00%  (p=0.000 n=31+37)
BM_Sort_uint64_Descending_262144                                 1.92ns ± 3%  2.07ns ± 4%    +7.95%  (p=0.000 n=31+36)
BM_Sort_uint64_SingleElement_1                                   3.68ns ± 5%  3.67ns ± 3%      ~     (p=0.716 n=31+37)
BM_Sort_uint64_SingleElement_4                                   1.46ns ± 3%  1.46ns ± 3%      ~     (p=0.557 n=34+37)
BM_Sort_uint64_SingleElement_16                                  1.09ns ± 2%  0.91ns ± 3%   -16.93%  (p=0.000 n=33+36)
BM_Sort_uint64_SingleElement_64                                  0.83ns ± 4%  1.47ns ± 4%   +78.03%  (p=0.000 n=34+34)
BM_Sort_uint64_SingleElement_256                                 0.95ns ± 4%  1.28ns ± 4%   +35.17%  (p=0.000 n=35+35)
BM_Sort_uint64_SingleElement_1024                                0.76ns ± 3%  1.05ns ± 3%   +37.78%  (p=0.000 n=35+33)
BM_Sort_uint64_SingleElement_16384                               0.71ns ± 2%  0.98ns ± 5%   +38.43%  (p=0.000 n=34+33)
BM_Sort_uint64_SingleElement_262144                              0.72ns ± 3%  0.98ns ± 4%   +35.93%  (p=0.000 n=35+33)
BM_Sort_uint64_PipeOrgan_1                                       3.68ns ± 3%  3.68ns ± 3%      ~     (p=0.650 n=35+33)
BM_Sort_uint64_PipeOrgan_4                                       1.53ns ± 2%  1.54ns ± 4%      ~     (p=0.424 n=33+36)
BM_Sort_uint64_PipeOrgan_16                                      2.23ns ± 3%  2.06ns ± 4%    -7.68%  (p=0.000 n=34+35)
BM_Sort_uint64_PipeOrgan_64                                      5.46ns ± 2%  3.41ns ± 4%   -37.67%  (p=0.000 n=33+36)
BM_Sort_uint64_PipeOrgan_256                                     2.92ns ± 4%  2.91ns ± 3%      ~     (p=0.257 n=35+35)
BM_Sort_uint64_PipeOrgan_1024                                    3.72ns ± 3%  5.35ns ± 4%   +43.95%  (p=0.000 n=35+35)
BM_Sort_uint64_PipeOrgan_16384                                   4.12ns ± 3%  6.37ns ± 3%   +54.74%  (p=0.000 n=34+36)
BM_Sort_uint64_PipeOrgan_262144                                  4.99ns ± 3%  7.25ns ± 5%   +45.45%  (p=0.000 n=35+35)
BM_Sort_uint64_QuickSortAdversary_1                              3.67ns ± 2%  3.65ns ± 3%      ~     (p=0.071 n=35+37)
BM_Sort_uint64_QuickSortAdversary_4                              1.46ns ± 3%  1.46ns ± 3%      ~     (p=0.214 n=36+37)
BM_Sort_uint64_QuickSortAdversary_16                             1.09ns ± 3%  0.91ns ± 3%   -16.73%  (p=0.000 n=36+38)
BM_Sort_uint64_QuickSortAdversary_64                             13.7ns ± 3%  17.8ns ± 5%   +29.86%  (p=0.000 n=36+37)
BM_Sort_uint64_QuickSortAdversary_256                            20.0ns ± 3%  25.9ns ± 3%   +29.25%  (p=0.000 n=35+38)
BM_Sort_uint64_QuickSortAdversary_1024                           28.1ns ± 3%  31.0ns ± 4%   +10.35%  (p=0.000 n=33+37)
BM_Sort_uint64_QuickSortAdversary_16384                          45.8ns ± 2%  50.5ns ± 4%   +10.29%  (p=0.000 n=36+37)
BM_Sort_uint64_QuickSortAdversary_262144                         64.9ns ± 3%  69.5ns ± 3%    +7.15%  (p=0.000 n=36+36)
BM_Sort_pair<uint32, uint32>_Random_1                            4.03ns ± 5%  4.33ns ± 4%    +7.31%  (p=0.000 n=36+36)
BM_Sort_pair<uint32, uint32>_Random_4                            6.78ns ± 5%  6.71ns ± 4%    -1.09%  (p=0.040 n=35+35)
BM_Sort_pair<uint32, uint32>_Random_16                           25.2ns ± 6%  16.8ns ± 7%   -33.35%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_Random_64                           35.6ns ± 7%  27.2ns ± 8%   -23.73%  (p=0.000 n=34+36)
BM_Sort_pair<uint32, uint32>_Random_256                          43.5ns ±13%  34.0ns ± 8%   -21.78%  (p=0.000 n=32+34)
BM_Sort_pair<uint32, uint32>_Random_1024                         50.6ns ± 8%  40.8ns ± 5%   -19.35%  (p=0.000 n=32+32)
BM_Sort_pair<uint32, uint32>_Random_16384                        66.0ns ± 3%  55.9ns ± 6%   -15.24%  (p=0.000 n=32+32)
BM_Sort_pair<uint32, uint32>_Random_262144                       82.4ns ± 4%  72.0ns ± 5%   -12.64%  (p=0.000 n=32+31)
BM_Sort_pair<uint32, uint32>_Ascending_1                         4.00ns ± 2%  4.50ns ±16%   +12.59%  (p=0.000 n=33+40)
BM_Sort_pair<uint32, uint32>_Ascending_4                         2.22ns ± 3%  2.34ns ±16%    +5.46%  (p=0.041 n=33+40)
BM_Sort_pair<uint32, uint32>_Ascending_16                        2.33ns ± 4%  1.30ns ±15%   -44.33%  (p=0.000 n=34+40)
BM_Sort_pair<uint32, uint32>_Ascending_64                        1.39ns ± 4%  1.50ns ± 8%    +8.48%  (p=0.000 n=35+32)
BM_Sort_pair<uint32, uint32>_Ascending_256                       1.47ns ± 4%  1.56ns ± 3%    +5.96%  (p=0.000 n=37+31)
BM_Sort_pair<uint32, uint32>_Ascending_1024                      1.34ns ± 3%  1.35ns ± 4%    +1.22%  (p=0.000 n=38+31)
BM_Sort_pair<uint32, uint32>_Ascending_16384                     1.18ns ± 2%  1.18ns ± 3%      ~     (p=0.687 n=37+32)
BM_Sort_pair<uint32, uint32>_Ascending_262144                    1.18ns ± 3%  1.17ns ± 2%      ~     (p=0.153 n=38+34)
BM_Sort_pair<uint32, uint32>_Descending_1                        4.00ns ± 2%  4.29ns ± 3%    +7.22%  (p=0.000 n=37+36)
BM_Sort_pair<uint32, uint32>_Descending_4                        2.91ns ± 3%  2.92ns ± 3%      ~     (p=0.065 n=37+35)
BM_Sort_pair<uint32, uint32>_Descending_16                       4.96ns ± 4%  6.51ns ± 2%   +31.36%  (p=0.000 n=37+30)
BM_Sort_pair<uint32, uint32>_Descending_64                       3.13ns ± 2%  2.92ns ± 3%    -6.71%  (p=0.000 n=36+37)
BM_Sort_pair<uint32, uint32>_Descending_256                      2.56ns ± 3%  2.73ns ± 5%    +6.55%  (p=0.000 n=35+37)
BM_Sort_pair<uint32, uint32>_Descending_1024                     3.11ns ± 3%  2.34ns ± 4%   -24.85%  (p=0.000 n=36+35)
BM_Sort_pair<uint32, uint32>_Descending_16384                    2.84ns ± 3%  2.14ns ± 5%   -24.48%  (p=0.000 n=37+37)
BM_Sort_pair<uint32, uint32>_Descending_262144                   2.86ns ± 3%  2.15ns ± 3%   -25.08%  (p=0.000 n=36+35)
BM_Sort_pair<uint32, uint32>_SingleElement_1                     3.99ns ± 3%  4.28ns ± 3%    +7.08%  (p=0.000 n=33+35)
BM_Sort_pair<uint32, uint32>_SingleElement_4                     2.32ns ± 6%  2.30ns ± 3%    -0.77%  (p=0.032 n=32+35)
BM_Sort_pair<uint32, uint32>_SingleElement_16                    1.67ns ± 4%  1.27ns ± 4%   -24.13%  (p=0.000 n=32+35)
BM_Sort_pair<uint32, uint32>_SingleElement_64                    1.64ns ± 7%  1.83ns ± 4%   +11.54%  (p=0.000 n=31+35)
BM_Sort_pair<uint32, uint32>_SingleElement_256                   1.57ns ± 3%  1.90ns ± 3%   +21.46%  (p=0.000 n=31+36)
BM_Sort_pair<uint32, uint32>_SingleElement_1024                  1.49ns ±15%  1.63ns ± 3%    +9.42%  (p=0.000 n=40+37)
BM_Sort_pair<uint32, uint32>_SingleElement_16384                 1.29ns ±17%  1.57ns ± 3%   +21.51%  (p=0.000 n=33+36)
BM_Sort_pair<uint32, uint32>_SingleElement_262144                1.26ns ± 4%  1.56ns ± 4%   +24.11%  (p=0.000 n=33+36)
BM_Sort_pair<uint32, uint32>_PipeOrgan_1                         4.01ns ± 2%  4.28ns ± 3%    +6.68%  (p=0.000 n=32+35)
BM_Sort_pair<uint32, uint32>_PipeOrgan_4                         2.38ns ± 5%  2.42ns ± 4%    +1.61%  (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_PipeOrgan_16                        4.83ns ± 2%  2.71ns ± 7%   -43.96%  (p=0.000 n=34+34)
BM_Sort_pair<uint32, uint32>_PipeOrgan_64                        4.53ns ± 3%  3.89ns ± 7%   -14.11%  (p=0.000 n=35+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_256                       5.53ns ± 4%  2.81ns ± 4%   -49.13%  (p=0.000 n=36+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_1024                      6.49ns ± 4%  5.29ns ± 3%   -18.50%  (p=0.000 n=35+32)
BM_Sort_pair<uint32, uint32>_PipeOrgan_16384                     7.21ns ± 4%  5.97ns ± 3%   -17.24%  (p=0.000 n=36+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_262144                    7.98ns ± 5%  6.59ns ± 3%   -17.46%  (p=0.000 n=33+33)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_1                3.99ns ± 3%  4.27ns ± 3%    +6.95%  (p=0.000 n=36+34)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_4                2.40ns ± 3%  2.37ns ± 3%    -1.00%  (p=0.007 n=34+34)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_16               4.96ns ± 5%  2.72ns ± 7%   -45.07%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_64               7.24ns ± 4%  7.51ns ± 4%    +3.63%  (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_256              9.85ns ± 5%  7.12ns ± 4%   -27.70%  (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_1024             11.6ns ± 6%   8.8ns ± 5%   -23.86%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_16384            32.7ns ± 3%  20.8ns ± 4%   -36.26%  (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_262144           36.4ns ± 3%  24.0ns ± 4%   -34.12%  (p=0.000 n=34+36)
BM_Sort_tuple<uint32, uint64, uint32>_Random_1                   4.04ns ± 6%  4.34ns ± 4%    +7.55%  (p=0.000 n=37+37)
BM_Sort_tuple<uint32, uint64, uint32>_Random_4                   7.19ns ± 6%  7.26ns ± 5%    +0.99%  (p=0.042 n=36+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_16                  30.4ns ± 6%  21.8ns ± 7%   -28.28%  (p=0.000 n=34+37)
BM_Sort_tuple<uint32, uint64, uint32>_Random_64                  42.8ns ±11%  33.5ns ± 9%   -21.70%  (p=0.000 n=36+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_256                 49.9ns ± 6%  40.3ns ± 9%   -19.20%  (p=0.000 n=35+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_1024                56.3ns ± 3%  46.1ns ± 4%   -18.08%  (p=0.000 n=35+35)
BM_Sort_tuple<uint32, uint64, uint32>_Random_16384               72.2ns ± 5%  62.1ns ± 3%   -14.05%  (p=0.000 n=37+36)
BM_Sort_tuple<uint32, uint64, uint32>_Random_262144              88.7ns ± 6%  79.0ns ± 6%   -10.93%  (p=0.000 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_1                3.96ns ± 3%  4.36ns ± 3%    +9.96%  (p=0.000 n=34+37)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_4                2.39ns ± 2%  2.39ns ± 3%      ~     (p=0.604 n=36+37)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_16               3.04ns ± 4%  1.48ns ± 3%   -51.20%  (p=0.000 n=34+35)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_64               2.44ns ± 3%  2.30ns ± 5%    -5.61%  (p=0.000 n=36+35)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_256              2.35ns ± 3%  2.39ns ± 5%    +1.78%  (p=0.000 n=33+34)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_1024             2.12ns ± 5%  2.08ns ± 4%    -1.80%  (p=0.000 n=33+34)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_16384            2.02ns ± 3%  2.00ns ± 5%    -1.25%  (p=0.000 n=32+32)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_262144           2.06ns ± 5%  2.11ns ± 9%      ~     (p=0.618 n=32+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_1               3.97ns ± 2%  4.57ns ±16%   +15.19%  (p=0.000 n=32+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_4               3.64ns ± 3%  4.05ns ±15%   +11.05%  (p=0.000 n=33+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_16              5.68ns ± 5%  9.36ns ±16%   +64.92%  (p=0.000 n=35+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_64              4.27ns ± 4%  3.88ns ± 8%    -9.13%  (p=0.000 n=35+32)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_256             3.58ns ± 3%  3.76ns ±14%    +5.12%  (p=0.002 n=38+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_1024            4.16ns ± 3%  3.21ns ± 5%   -22.77%  (p=0.000 n=38+31)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_16384           3.90ns ± 4%  3.00ns ± 3%   -23.12%  (p=0.000 n=38+32)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_262144          4.52ns ± 3%  3.42ns ± 3%   -24.29%  (p=0.000 n=38+33)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_1            3.97ns ± 3%  4.31ns ± 3%    +8.78%  (p=0.000 n=39+34)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_4            2.54ns ± 2%  2.54ns ± 4%      ~     (p=0.341 n=38+36)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_16           2.39ns ± 3%  1.70ns ± 6%   -28.90%  (p=0.000 n=38+35)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_64           2.61ns ± 2%  3.23ns ± 3%   +24.07%  (p=0.000 n=35+35)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_256          2.83ns ± 2%  2.97ns ± 4%    +4.83%  (p=0.000 n=35+37)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_1024         2.44ns ± 4%  2.44ns ± 3%      ~     (p=0.481 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_16384        2.19ns ± 3%  2.37ns ± 6%    +8.01%  (p=0.000 n=36+37)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_262144       2.34ns ± 2%  2.36ns ± 5%    +1.11%  (p=0.001 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_1                3.96ns ± 2%  4.31ns ± 3%    +8.76%  (p=0.000 n=33+35)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_4                2.65ns ± 6%  2.67ns ± 4%      ~     (p=0.139 n=32+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_16               5.64ns ± 3%  3.56ns ± 3%   -36.80%  (p=0.000 n=31+35)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_64               6.12ns ±16%  5.04ns ± 4%   -17.64%  (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_256              6.78ns ± 6%  3.73ns ± 3%   -44.94%  (p=0.000 n=31+36)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_1024             8.36ns ±15%  6.51ns ± 4%   -22.13%  (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_16384            9.24ns ±15%  7.91ns ± 3%   -14.34%  (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_262144           10.7ns ± 3%   9.3ns ± 6%   -12.36%  (p=0.000 n=32+36)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_1       3.97ns ± 3%  4.31ns ± 3%    +8.63%  (p=0.000 n=32+35)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_4       2.79ns ± 3%  2.76ns ± 4%    -0.95%  (p=0.002 n=33+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_16      5.07ns ± 3%  3.69ns ± 4%   -27.35%  (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_64      9.26ns ± 3%  8.34ns ± 7%    -9.88%  (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_256     11.8ns ± 5%   9.7ns ± 3%   -17.83%  (p=0.000 n=37+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_1024    19.2ns ± 4%  14.5ns ±10%   -24.59%  (p=0.000 n=36+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_16384   45.5ns ± 4%  37.4ns ± 9%   -17.71%  (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_262144  50.0ns ± 4%  43.2ns ± 3%   -13.69%  (p=0.000 n=35+34)
BM_Sort_string_Random_1                                          4.66ns ± 6%  4.40ns ± 4%    -5.55%  (p=0.000 n=35+37)
BM_Sort_string_Random_4                                          14.9ns ± 3%  15.0ns ± 6%      ~     (p=0.863 n=36+38)
BM_Sort_string_Random_16                                         45.5ns ± 6%  35.8ns ± 8%   -21.37%  (p=0.000 n=36+36)
BM_Sort_string_Random_64                                         66.6ns ± 4%  58.2ns ± 3%   -12.69%  (p=0.000 n=36+37)
BM_Sort_string_Random_256                                        86.0ns ± 5%  77.4ns ± 3%   -10.01%  (p=0.000 n=37+37)
BM_Sort_string_Random_1024                                        106ns ± 3%    96ns ± 6%    -9.39%  (p=0.000 n=37+37)
BM_Sort_string_Random_16384                                       154ns ± 3%   141ns ± 5%    -8.03%  (p=0.000 n=35+36)
BM_Sort_string_Random_262144                                      213ns ± 4%   197ns ± 4%    -7.59%  (p=0.000 n=34+34)
BM_Sort_string_Ascending_1                                       4.59ns ± 2%  4.56ns ±17%    -0.60%  (p=0.002 n=32+40)
BM_Sort_string_Ascending_4                                       7.52ns ± 9%  7.54ns ±12%      ~     (p=0.554 n=37+40)
BM_Sort_string_Ascending_16                                      13.1ns ± 6%   8.8ns ±12%   -33.26%  (p=0.000 n=39+38)
BM_Sort_string_Ascending_64                                      14.8ns ±10%  14.5ns ±11%    -2.15%  (p=0.013 n=40+37)
BM_Sort_string_Ascending_256                                     14.0ns ± 6%  14.1ns ±10%      ~     (p=0.760 n=37+40)
BM_Sort_string_Ascending_1024                                    12.9ns ±10%  12.8ns ±20%      ~     (p=0.055 n=35+40)
BM_Sort_string_Ascending_16384                                   17.2ns ±13%  17.4ns ±21%      ~     (p=1.000 n=37+40)
BM_Sort_string_Ascending_262144                                  17.5ns ±12%  17.5ns ±25%      ~     (p=0.392 n=35+39)
BM_Sort_string_Descending_1                                      4.59ns ± 3%  4.34ns ± 3%    -5.51%  (p=0.000 n=32+33)
BM_Sort_string_Descending_4                                      10.1ns ± 5%   9.8ns ± 4%    -2.84%  (p=0.000 n=36+34)
BM_Sort_string_Descending_16                                     22.0ns ± 4%  39.6ns ± 4%   +79.84%  (p=0.000 n=36+33)
BM_Sort_string_Descending_64                                     21.4ns ±12%  21.3ns ±14%      ~     (p=0.542 n=37+39)
BM_Sort_string_Descending_256                                    19.4ns ±13%  18.9ns ±13%    -2.74%  (p=0.039 n=37+39)
BM_Sort_string_Descending_1024                                   22.7ns ± 5%  17.6ns ±15%   -22.52%  (p=0.000 n=35+40)
BM_Sort_string_Descending_16384                                  27.9ns ±14%  22.6ns ±10%   -19.11%  (p=0.000 n=40+37)
BM_Sort_string_Descending_262144                                 33.8ns ±14%  26.1ns ±21%   -22.74%  (p=0.000 n=39+38)
BM_Sort_string_SingleElement_1                                   4.58ns ± 2%  4.35ns ± 3%    -5.14%  (p=0.000 n=35+37)
BM_Sort_string_SingleElement_4                                   7.92ns ± 3%  7.92ns ± 7%      ~     (p=0.625 n=38+39)
BM_Sort_string_SingleElement_16                                  18.0ns ± 3%   7.9ns ± 6%   -56.23%  (p=0.000 n=36+35)
BM_Sort_string_SingleElement_64                                  20.3ns ± 5%  19.3ns ±15%    -4.83%  (p=0.000 n=34+38)
BM_Sort_string_SingleElement_256                                 19.4ns ± 7%  18.1ns ±14%    -6.67%  (p=0.000 n=36+39)
BM_Sort_string_SingleElement_1024                                19.3ns ± 9%  17.4ns ±17%    -9.40%  (p=0.000 n=35+40)
BM_Sort_string_SingleElement_16384                               17.5ns ±12%  16.2ns ±20%    -7.91%  (p=0.000 n=37+40)
BM_Sort_string_SingleElement_262144                              16.7ns ±18%  15.3ns ±27%    -8.56%  (p=0.000 n=40+40)
BM_Sort_string_PipeOrgan_1                                       4.60ns ± 2%  4.33ns ± 3%    -5.80%  (p=0.000 n=33+31)
BM_Sort_string_PipeOrgan_4                                       8.29ns ± 4%  8.17ns ± 8%    -1.50%  (p=0.004 n=39+36)
BM_Sort_string_PipeOrgan_16                                      22.9ns ± 3%  16.4ns ± 6%   -28.45%  (p=0.000 n=39+38)
BM_Sort_string_PipeOrgan_64                                      30.7ns ± 4%  28.9ns ± 7%    -6.05%  (p=0.000 n=38+37)
BM_Sort_string_PipeOrgan_256                                     38.1ns ± 3%  22.5ns ± 9%   -40.78%  (p=0.000 n=37+37)
BM_Sort_string_PipeOrgan_1024                                    45.4ns ± 4%  36.2ns ± 6%   -20.33%  (p=0.000 n=37+37)
BM_Sort_string_PipeOrgan_16384                                   56.2ns ± 4%  49.0ns ± 8%   -12.73%  (p=0.000 n=36+38)
BM_Sort_string_PipeOrgan_262144                                  77.8ns ±13%  62.8ns ±10%   -19.27%  (p=0.000 n=39+39)
BM_Sort_string_QuickSortAdversary_1                              4.80ns ±16%  4.34ns ± 4%    -9.56%  (p=0.000 n=39+34)
BM_Sort_string_QuickSortAdversary_4                              14.8ns ± 5%  14.7ns ± 4%    -0.80%  (p=0.037 n=33+33)
BM_Sort_string_QuickSortAdversary_16                             44.6ns ± 4%  34.8ns ± 5%   -21.98%  (p=0.000 n=35+34)
BM_Sort_string_QuickSortAdversary_64                             66.2ns ± 3%  58.1ns ± 4%   -12.32%  (p=0.000 n=36+35)
BM_Sort_string_QuickSortAdversary_256                            85.4ns ± 5%  76.9ns ± 6%    -9.99%  (p=0.000 n=36+36)
BM_Sort_string_QuickSortAdversary_1024                            106ns ± 4%    96ns ± 3%    -9.62%  (p=0.000 n=34+37)
BM_Sort_string_QuickSortAdversary_16384                           153ns ± 3%   141ns ± 4%    -8.22%  (p=0.000 n=34+37)
BM_Sort_string_QuickSortAdversary_262144                          211ns ± 5%   195ns ± 6%    -7.77%  (p=0.000 n=35+38)
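
The branch-free block partitioning described in the log message can be
sketched roughly as follows. This is only an illustration of the general
BlockQuickSort idea under stated assumptions, not the code from this
commit: the function name block_partition, the block size of 64, the
offset buffers, and the std::partition fallback for the tail are all
hypothetical choices made for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical sketch: partitions v so that elements < pivot come first.
// Returns the index of the first element that is not < pivot.
template <class T>
std::size_t block_partition(std::vector<T>& v, T pivot) {
    static_assert(std::is_arithmetic<T>::value,
                  "sketch is meant for arithmetic types");
    constexpr std::size_t kBlock = 64;          // assumed block size
    unsigned char off_l[kBlock], off_r[kBlock]; // offsets of misplaced elements
    std::size_t lo = 0, hi = v.size();          // [lo, hi) is still unsettled
    std::size_t num_l = 0, num_r = 0, start_l = 0, start_r = 0;

    while (hi - lo >= 2 * kBlock) {
        if (num_l == 0) {  // scan a fresh left block
            start_l = 0;
            for (std::size_t i = 0; i < kBlock; ++i) {
                // Always store the offset; only advance the count when the
                // element is misplaced. The loop body has no
                // data-dependent branch.
                off_l[num_l] = static_cast<unsigned char>(i);
                num_l += !(v[lo + i] < pivot);
            }
        }
        if (num_r == 0) {  // scan a fresh right block, walking leftwards
            start_r = 0;
            for (std::size_t i = 0; i < kBlock; ++i) {
                off_r[num_r] = static_cast<unsigned char>(i);
                num_r += (v[hi - 1 - i] < pivot);
            }
        }
        // Swap as many misplaced pairs as both blocks can supply.
        const std::size_t num = std::min(num_l, num_r);
        for (std::size_t j = 0; j < num; ++j) {
            std::swap(v[lo + off_l[start_l + j]],
                      v[hi - 1 - off_r[start_r + j]]);
        }
        num_l -= num; start_l += num;
        num_r -= num; start_r += num;
        if (num_l == 0) lo += kBlock;  // left block fully settled
        if (num_r == 0) hi -= kBlock;  // right block fully settled
    }

    // Tail: fewer than two blocks remain. A plain std::partition over the
    // unsettled range keeps the sketch short.
    assert(lo <= hi);
    auto mid = std::partition(v.begin() + static_cast<std::ptrdiff_t>(lo),
                              v.begin() + static_cast<std::ptrdiff_t>(hi),
                              [&](const T& x) { return x < pivot; });
    return static_cast<std::size_t>(mid - v.begin());
}
```

Calling block_partition(v, p) on a std::vector<int> leaves every element
smaller than p in front of the returned index. The two scanning loops only
compare and store, which is the kind of code shape the log message says
compilers can vectorize; whether a given compiler actually does so is not
guaranteed by this sketch.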

Differential Revision: https://reviews.llvm.org/D122780



