[all-commits] [llvm/llvm-project] 4eddbf: std::sort: add BlockQuickSort partitioning algorit...
Nilay Vaish via All-commits
all-commits at lists.llvm.org
Thu Dec 22 14:47:39 PST 2022
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 4eddbf9f10a6d1881c93d84f4363d6d881daf848
https://github.com/llvm/llvm-project/commit/4eddbf9f10a6d1881c93d84f4363d6d881daf848
Author: Nilay Vaish <nilayvaish at google.com>
Date: 2022-12-22 (Thu, 22 Dec 2022)
Changed paths:
M libcxx/include/__algorithm/sort.h
M libcxx/include/__bits
Log Message:
-----------
std::sort: add BlockQuickSort partitioning algorithm for arithmetic types
This diff modifies std::sort in two ways:
* For arithmetic types, the core partitioning algorithm now uses
  BlockQuickSort partitioning. The partition function is written so that
  the compiler can generate SIMD instructions without explicit SIMD
  intrinsics in the loop. We see up to 50% better performance when sorting
  random inputs of arithmetic types. BlockQuickSort partitioning is
  limited to arithmetic types because the algorithm works well only when
  branch instructions can be avoided during partitioning, which is usually
  not true for other types.
* For other types (tuples, strings), updates have been made that improve
  performance by about 10%.
Performance numbers comparing std::sort (old) and Bitset sort (new) on the
libcxx benchmarks:
name old cpu/op new cpu/op delta
BM_Sort_uint32_Random_1 3.72ns ± 5% 3.78ns ±16% ~ (p=0.819 n=36+34)
BM_Sort_uint32_Random_4 5.42ns ± 5% 5.29ns ± 7% -2.42% (p=0.000 n=35+31)
BM_Sort_uint32_Random_16 10.5ns ± 3% 11.9ns ±15% +13.08% (p=0.000 n=36+40)
BM_Sort_uint32_Random_64 18.6ns ± 7% 18.5ns ±15% -0.95% (p=0.002 n=33+40)
BM_Sort_uint32_Random_256 26.2ns ± 4% 21.3ns ± 8% -18.89% (p=0.000 n=37+34)
BM_Sort_uint32_Random_1024 33.4ns ± 5% 23.3ns ± 4% -30.37% (p=0.000 n=39+35)
BM_Sort_uint32_Random_16384 47.7ns ± 5% 26.7ns ± 5% -44.06% (p=0.000 n=39+35)
BM_Sort_uint32_Random_262144 62.6ns ± 3% 30.1ns ± 6% -51.81% (p=0.000 n=37+36)
BM_Sort_uint32_Ascending_1 3.71ns ± 3% 4.28ns ± 3% +15.53% (p=0.000 n=37+35)
BM_Sort_uint32_Ascending_4 1.47ns ± 3% 1.46ns ± 3% ~ (p=0.083 n=36+37)
BM_Sort_uint32_Ascending_16 0.93ns ± 4% 1.02ns ± 3% +9.32% (p=0.000 n=36+36)
BM_Sort_uint32_Ascending_64 1.23ns ± 5% 1.51ns ± 3% +22.56% (p=0.000 n=34+36)
BM_Sort_uint32_Ascending_256 1.21ns ± 3% 1.57ns ± 4% +29.77% (p=0.000 n=33+35)
BM_Sort_uint32_Ascending_1024 1.03ns ± 4% 1.43ns ± 3% +38.44% (p=0.000 n=32+35)
BM_Sort_uint32_Ascending_16384 0.94ns ± 8% 1.36ns ± 5% +44.09% (p=0.000 n=32+35)
BM_Sort_uint32_Ascending_262144 0.93ns ± 3% 1.35ns ± 7% +45.06% (p=0.000 n=32+36)
BM_Sort_uint32_Descending_1 3.69ns ± 2% 4.27ns ± 3% +15.73% (p=0.000 n=31+36)
BM_Sort_uint32_Descending_4 1.74ns ± 2% 1.78ns ± 3% +2.29% (p=0.000 n=31+38)
BM_Sort_uint32_Descending_16 3.92ns ± 4% 4.20ns ± 4% +7.13% (p=0.000 n=32+38)
BM_Sort_uint32_Descending_64 2.09ns ± 4% 3.25ns ± 4% +55.10% (p=0.000 n=33+37)
BM_Sort_uint32_Descending_256 1.98ns ± 7% 2.93ns ± 4% +47.95% (p=0.000 n=34+36)
BM_Sort_uint32_Descending_1024 2.23ns ± 6% 2.64ns ± 3% +18.22% (p=0.000 n=34+38)
BM_Sort_uint32_Descending_16384 1.93ns ± 6% 2.43ns ± 4% +25.99% (p=0.000 n=34+35)
BM_Sort_uint32_Descending_262144 1.89ns ± 3% 2.38ns ± 4% +25.41% (p=0.000 n=33+35)
BM_Sort_uint32_SingleElement_1 3.67ns ± 2% 4.28ns ± 4% +16.60% (p=0.000 n=34+34)
BM_Sort_uint32_SingleElement_4 1.48ns ± 4% 1.48ns ± 5% ~ (p=0.951 n=35+33)
BM_Sort_uint32_SingleElement_16 0.93ns ± 3% 1.02ns ± 4% +9.51% (p=0.000 n=36+33)
BM_Sort_uint32_SingleElement_64 0.76ns ± 3% 1.59ns ± 8% +109.78% (p=0.000 n=36+32)
BM_Sort_uint32_SingleElement_256 0.82ns ± 4% 1.45ns ± 5% +76.62% (p=0.000 n=37+34)
BM_Sort_uint32_SingleElement_1024 0.77ns ± 4% 1.31ns ± 4% +71.40% (p=0.000 n=34+34)
BM_Sort_uint32_SingleElement_16384 0.64ns ± 4% 1.24ns ± 6% +93.29% (p=0.000 n=35+36)
BM_Sort_uint32_SingleElement_262144 0.63ns ± 3% 1.23ns ± 4% +95.17% (p=0.000 n=35+35)
BM_Sort_uint32_PipeOrgan_1 3.68ns ± 2% 4.42ns ± 3% +20.31% (p=0.000 n=34+36)
BM_Sort_uint32_PipeOrgan_4 1.54ns ± 3% 1.53ns ± 3% ~ (p=0.128 n=34+36)
BM_Sort_uint32_PipeOrgan_16 2.22ns ± 3% 1.99ns ± 3% -10.28% (p=0.000 n=33+36)
BM_Sort_uint32_PipeOrgan_64 4.41ns ± 3% 3.39ns ± 4% -23.17% (p=0.000 n=35+37)
BM_Sort_uint32_PipeOrgan_256 2.75ns ± 5% 3.07ns ± 3% +11.74% (p=0.000 n=37+37)
BM_Sort_uint32_PipeOrgan_1024 3.58ns ± 2% 5.48ns ± 3% +52.97% (p=0.000 n=37+36)
BM_Sort_uint32_PipeOrgan_16384 4.10ns ± 3% 6.53ns ± 3% +59.27% (p=0.000 n=37+37)
BM_Sort_uint32_PipeOrgan_262144 4.90ns ± 3% 7.39ns ± 3% +50.71% (p=0.000 n=34+37)
BM_Sort_uint32_QuickSortAdversary_1 3.68ns ± 2% 4.28ns ± 3% +16.19% (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_4 1.46ns ± 4% 1.46ns ± 3% ~ (p=0.736 n=35+38)
BM_Sort_uint32_QuickSortAdversary_16 0.93ns ± 3% 1.02ns ± 4% +9.69% (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_64 13.6ns ± 4% 17.9ns ± 8% +31.37% (p=0.000 n=36+35)
BM_Sort_uint32_QuickSortAdversary_256 20.0ns ± 4% 25.7ns ± 4% +28.69% (p=0.000 n=36+35)
BM_Sort_uint32_QuickSortAdversary_1024 28.3ns ± 6% 31.7ns ± 3% +12.12% (p=0.000 n=36+37)
BM_Sort_uint32_QuickSortAdversary_16384 45.8ns ± 3% 50.6ns ± 4% +10.32% (p=0.000 n=38+36)
BM_Sort_uint32_QuickSortAdversary_262144 61.6ns ± 4% 68.2ns ± 4% +10.68% (p=0.000 n=37+37)
BM_Sort_uint64_Random_1 3.71ns ± 4% 4.00ns ± 4% +7.93% (p=0.000 n=34+35)
BM_Sort_uint64_Random_4 5.52ns ± 8% 5.22ns ± 6% -5.41% (p=0.000 n=32+32)
BM_Sort_uint64_Random_16 10.7ns ±15% 10.2ns ± 7% ~ (p=0.077 n=40+31)
BM_Sort_uint64_Random_64 19.0ns ±14% 18.2ns ±14% -4.31% (p=0.001 n=40+40)
BM_Sort_uint64_Random_256 25.7ns ± 9% 22.1ns ±15% -13.82% (p=0.000 n=33+40)
BM_Sort_uint64_Random_1024 32.4ns ± 6% 23.8ns ±16% -26.64% (p=0.000 n=33+40)
BM_Sort_uint64_Random_16384 46.8ns ± 3% 27.1ns ±16% -42.15% (p=0.000 n=33+40)
BM_Sort_uint64_Random_262144 61.3ns ± 4% 30.4ns ±16% -50.34% (p=0.000 n=34+40)
BM_Sort_uint64_Ascending_1 3.67ns ± 3% 3.87ns ±16% +5.36% (p=0.049 n=35+40)
BM_Sort_uint64_Ascending_4 1.46ns ± 3% 1.46ns ± 3% ~ (p=0.130 n=37+31)
BM_Sort_uint64_Ascending_16 1.09ns ± 3% 0.91ns ± 6% -16.79% (p=0.000 n=38+32)
BM_Sort_uint64_Ascending_64 1.25ns ± 3% 1.29ns ± 5% +3.11% (p=0.000 n=38+34)
BM_Sort_uint64_Ascending_256 1.37ns ± 3% 1.42ns ± 3% +3.07% (p=0.000 n=39+35)
BM_Sort_uint64_Ascending_1024 1.12ns ± 3% 1.17ns ± 3% +5.28% (p=0.000 n=37+36)
BM_Sort_uint64_Ascending_16384 0.98ns ± 3% 1.09ns ± 3% +10.95% (p=0.000 n=36+37)
BM_Sort_uint64_Ascending_262144 0.98ns ± 3% 1.08ns ± 3% +10.97% (p=0.000 n=36+37)
BM_Sort_uint64_Descending_1 3.68ns ± 3% 3.67ns ± 3% ~ (p=0.652 n=36+36)
BM_Sort_uint64_Descending_4 1.71ns ± 3% 1.73ns ± 3% +1.50% (p=0.000 n=33+34)
BM_Sort_uint64_Descending_16 4.96ns ± 2% 5.49ns ± 3% +10.73% (p=0.000 n=31+36)
BM_Sort_uint64_Descending_64 2.14ns ± 6% 3.03ns ± 3% +41.72% (p=0.000 n=32+35)
BM_Sort_uint64_Descending_256 2.03ns ± 4% 2.86ns ± 4% +40.55% (p=0.000 n=32+34)
BM_Sort_uint64_Descending_1024 2.20ns ± 2% 2.29ns ± 3% +4.20% (p=0.000 n=31+36)
BM_Sort_uint64_Descending_16384 1.89ns ± 3% 2.08ns ± 3% +10.00% (p=0.000 n=31+37)
BM_Sort_uint64_Descending_262144 1.92ns ± 3% 2.07ns ± 4% +7.95% (p=0.000 n=31+36)
BM_Sort_uint64_SingleElement_1 3.68ns ± 5% 3.67ns ± 3% ~ (p=0.716 n=31+37)
BM_Sort_uint64_SingleElement_4 1.46ns ± 3% 1.46ns ± 3% ~ (p=0.557 n=34+37)
BM_Sort_uint64_SingleElement_16 1.09ns ± 2% 0.91ns ± 3% -16.93% (p=0.000 n=33+36)
BM_Sort_uint64_SingleElement_64 0.83ns ± 4% 1.47ns ± 4% +78.03% (p=0.000 n=34+34)
BM_Sort_uint64_SingleElement_256 0.95ns ± 4% 1.28ns ± 4% +35.17% (p=0.000 n=35+35)
BM_Sort_uint64_SingleElement_1024 0.76ns ± 3% 1.05ns ± 3% +37.78% (p=0.000 n=35+33)
BM_Sort_uint64_SingleElement_16384 0.71ns ± 2% 0.98ns ± 5% +38.43% (p=0.000 n=34+33)
BM_Sort_uint64_SingleElement_262144 0.72ns ± 3% 0.98ns ± 4% +35.93% (p=0.000 n=35+33)
BM_Sort_uint64_PipeOrgan_1 3.68ns ± 3% 3.68ns ± 3% ~ (p=0.650 n=35+33)
BM_Sort_uint64_PipeOrgan_4 1.53ns ± 2% 1.54ns ± 4% ~ (p=0.424 n=33+36)
BM_Sort_uint64_PipeOrgan_16 2.23ns ± 3% 2.06ns ± 4% -7.68% (p=0.000 n=34+35)
BM_Sort_uint64_PipeOrgan_64 5.46ns ± 2% 3.41ns ± 4% -37.67% (p=0.000 n=33+36)
BM_Sort_uint64_PipeOrgan_256 2.92ns ± 4% 2.91ns ± 3% ~ (p=0.257 n=35+35)
BM_Sort_uint64_PipeOrgan_1024 3.72ns ± 3% 5.35ns ± 4% +43.95% (p=0.000 n=35+35)
BM_Sort_uint64_PipeOrgan_16384 4.12ns ± 3% 6.37ns ± 3% +54.74% (p=0.000 n=34+36)
BM_Sort_uint64_PipeOrgan_262144 4.99ns ± 3% 7.25ns ± 5% +45.45% (p=0.000 n=35+35)
BM_Sort_uint64_QuickSortAdversary_1 3.67ns ± 2% 3.65ns ± 3% ~ (p=0.071 n=35+37)
BM_Sort_uint64_QuickSortAdversary_4 1.46ns ± 3% 1.46ns ± 3% ~ (p=0.214 n=36+37)
BM_Sort_uint64_QuickSortAdversary_16 1.09ns ± 3% 0.91ns ± 3% -16.73% (p=0.000 n=36+38)
BM_Sort_uint64_QuickSortAdversary_64 13.7ns ± 3% 17.8ns ± 5% +29.86% (p=0.000 n=36+37)
BM_Sort_uint64_QuickSortAdversary_256 20.0ns ± 3% 25.9ns ± 3% +29.25% (p=0.000 n=35+38)
BM_Sort_uint64_QuickSortAdversary_1024 28.1ns ± 3% 31.0ns ± 4% +10.35% (p=0.000 n=33+37)
BM_Sort_uint64_QuickSortAdversary_16384 45.8ns ± 2% 50.5ns ± 4% +10.29% (p=0.000 n=36+37)
BM_Sort_uint64_QuickSortAdversary_262144 64.9ns ± 3% 69.5ns ± 3% +7.15% (p=0.000 n=36+36)
BM_Sort_pair<uint32, uint32>_Random_1 4.03ns ± 5% 4.33ns ± 4% +7.31% (p=0.000 n=36+36)
BM_Sort_pair<uint32, uint32>_Random_4 6.78ns ± 5% 6.71ns ± 4% -1.09% (p=0.040 n=35+35)
BM_Sort_pair<uint32, uint32>_Random_16 25.2ns ± 6% 16.8ns ± 7% -33.35% (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_Random_64 35.6ns ± 7% 27.2ns ± 8% -23.73% (p=0.000 n=34+36)
BM_Sort_pair<uint32, uint32>_Random_256 43.5ns ±13% 34.0ns ± 8% -21.78% (p=0.000 n=32+34)
BM_Sort_pair<uint32, uint32>_Random_1024 50.6ns ± 8% 40.8ns ± 5% -19.35% (p=0.000 n=32+32)
BM_Sort_pair<uint32, uint32>_Random_16384 66.0ns ± 3% 55.9ns ± 6% -15.24% (p=0.000 n=32+32)
BM_Sort_pair<uint32, uint32>_Random_262144 82.4ns ± 4% 72.0ns ± 5% -12.64% (p=0.000 n=32+31)
BM_Sort_pair<uint32, uint32>_Ascending_1 4.00ns ± 2% 4.50ns ±16% +12.59% (p=0.000 n=33+40)
BM_Sort_pair<uint32, uint32>_Ascending_4 2.22ns ± 3% 2.34ns ±16% +5.46% (p=0.041 n=33+40)
BM_Sort_pair<uint32, uint32>_Ascending_16 2.33ns ± 4% 1.30ns ±15% -44.33% (p=0.000 n=34+40)
BM_Sort_pair<uint32, uint32>_Ascending_64 1.39ns ± 4% 1.50ns ± 8% +8.48% (p=0.000 n=35+32)
BM_Sort_pair<uint32, uint32>_Ascending_256 1.47ns ± 4% 1.56ns ± 3% +5.96% (p=0.000 n=37+31)
BM_Sort_pair<uint32, uint32>_Ascending_1024 1.34ns ± 3% 1.35ns ± 4% +1.22% (p=0.000 n=38+31)
BM_Sort_pair<uint32, uint32>_Ascending_16384 1.18ns ± 2% 1.18ns ± 3% ~ (p=0.687 n=37+32)
BM_Sort_pair<uint32, uint32>_Ascending_262144 1.18ns ± 3% 1.17ns ± 2% ~ (p=0.153 n=38+34)
BM_Sort_pair<uint32, uint32>_Descending_1 4.00ns ± 2% 4.29ns ± 3% +7.22% (p=0.000 n=37+36)
BM_Sort_pair<uint32, uint32>_Descending_4 2.91ns ± 3% 2.92ns ± 3% ~ (p=0.065 n=37+35)
BM_Sort_pair<uint32, uint32>_Descending_16 4.96ns ± 4% 6.51ns ± 2% +31.36% (p=0.000 n=37+30)
BM_Sort_pair<uint32, uint32>_Descending_64 3.13ns ± 2% 2.92ns ± 3% -6.71% (p=0.000 n=36+37)
BM_Sort_pair<uint32, uint32>_Descending_256 2.56ns ± 3% 2.73ns ± 5% +6.55% (p=0.000 n=35+37)
BM_Sort_pair<uint32, uint32>_Descending_1024 3.11ns ± 3% 2.34ns ± 4% -24.85% (p=0.000 n=36+35)
BM_Sort_pair<uint32, uint32>_Descending_16384 2.84ns ± 3% 2.14ns ± 5% -24.48% (p=0.000 n=37+37)
BM_Sort_pair<uint32, uint32>_Descending_262144 2.86ns ± 3% 2.15ns ± 3% -25.08% (p=0.000 n=36+35)
BM_Sort_pair<uint32, uint32>_SingleElement_1 3.99ns ± 3% 4.28ns ± 3% +7.08% (p=0.000 n=33+35)
BM_Sort_pair<uint32, uint32>_SingleElement_4 2.32ns ± 6% 2.30ns ± 3% -0.77% (p=0.032 n=32+35)
BM_Sort_pair<uint32, uint32>_SingleElement_16 1.67ns ± 4% 1.27ns ± 4% -24.13% (p=0.000 n=32+35)
BM_Sort_pair<uint32, uint32>_SingleElement_64 1.64ns ± 7% 1.83ns ± 4% +11.54% (p=0.000 n=31+35)
BM_Sort_pair<uint32, uint32>_SingleElement_256 1.57ns ± 3% 1.90ns ± 3% +21.46% (p=0.000 n=31+36)
BM_Sort_pair<uint32, uint32>_SingleElement_1024 1.49ns ±15% 1.63ns ± 3% +9.42% (p=0.000 n=40+37)
BM_Sort_pair<uint32, uint32>_SingleElement_16384 1.29ns ±17% 1.57ns ± 3% +21.51% (p=0.000 n=33+36)
BM_Sort_pair<uint32, uint32>_SingleElement_262144 1.26ns ± 4% 1.56ns ± 4% +24.11% (p=0.000 n=33+36)
BM_Sort_pair<uint32, uint32>_PipeOrgan_1 4.01ns ± 2% 4.28ns ± 3% +6.68% (p=0.000 n=32+35)
BM_Sort_pair<uint32, uint32>_PipeOrgan_4 2.38ns ± 5% 2.42ns ± 4% +1.61% (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_PipeOrgan_16 4.83ns ± 2% 2.71ns ± 7% -43.96% (p=0.000 n=34+34)
BM_Sort_pair<uint32, uint32>_PipeOrgan_64 4.53ns ± 3% 3.89ns ± 7% -14.11% (p=0.000 n=35+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_256 5.53ns ± 4% 2.81ns ± 4% -49.13% (p=0.000 n=36+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_1024 6.49ns ± 4% 5.29ns ± 3% -18.50% (p=0.000 n=35+32)
BM_Sort_pair<uint32, uint32>_PipeOrgan_16384 7.21ns ± 4% 5.97ns ± 3% -17.24% (p=0.000 n=36+33)
BM_Sort_pair<uint32, uint32>_PipeOrgan_262144 7.98ns ± 5% 6.59ns ± 3% -17.46% (p=0.000 n=33+33)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_1 3.99ns ± 3% 4.27ns ± 3% +6.95% (p=0.000 n=36+34)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_4 2.40ns ± 3% 2.37ns ± 3% -1.00% (p=0.007 n=34+34)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_16 4.96ns ± 5% 2.72ns ± 7% -45.07% (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_64 7.24ns ± 4% 7.51ns ± 4% +3.63% (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_256 9.85ns ± 5% 7.12ns ± 4% -27.70% (p=0.000 n=34+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_1024 11.6ns ± 6% 8.8ns ± 5% -23.86% (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_16384 32.7ns ± 3% 20.8ns ± 4% -36.26% (p=0.000 n=35+35)
BM_Sort_pair<uint32, uint32>_QuickSortAdversary_262144 36.4ns ± 3% 24.0ns ± 4% -34.12% (p=0.000 n=34+36)
BM_Sort_tuple<uint32, uint64, uint32>_Random_1 4.04ns ± 6% 4.34ns ± 4% +7.55% (p=0.000 n=37+37)
BM_Sort_tuple<uint32, uint64, uint32>_Random_4 7.19ns ± 6% 7.26ns ± 5% +0.99% (p=0.042 n=36+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_16 30.4ns ± 6% 21.8ns ± 7% -28.28% (p=0.000 n=34+37)
BM_Sort_tuple<uint32, uint64, uint32>_Random_64 42.8ns ±11% 33.5ns ± 9% -21.70% (p=0.000 n=36+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_256 49.9ns ± 6% 40.3ns ± 9% -19.20% (p=0.000 n=35+38)
BM_Sort_tuple<uint32, uint64, uint32>_Random_1024 56.3ns ± 3% 46.1ns ± 4% -18.08% (p=0.000 n=35+35)
BM_Sort_tuple<uint32, uint64, uint32>_Random_16384 72.2ns ± 5% 62.1ns ± 3% -14.05% (p=0.000 n=37+36)
BM_Sort_tuple<uint32, uint64, uint32>_Random_262144 88.7ns ± 6% 79.0ns ± 6% -10.93% (p=0.000 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_1 3.96ns ± 3% 4.36ns ± 3% +9.96% (p=0.000 n=34+37)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_4 2.39ns ± 2% 2.39ns ± 3% ~ (p=0.604 n=36+37)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_16 3.04ns ± 4% 1.48ns ± 3% -51.20% (p=0.000 n=34+35)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_64 2.44ns ± 3% 2.30ns ± 5% -5.61% (p=0.000 n=36+35)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_256 2.35ns ± 3% 2.39ns ± 5% +1.78% (p=0.000 n=33+34)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_1024 2.12ns ± 5% 2.08ns ± 4% -1.80% (p=0.000 n=33+34)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_16384 2.02ns ± 3% 2.00ns ± 5% -1.25% (p=0.000 n=32+32)
BM_Sort_tuple<uint32, uint64, uint32>_Ascending_262144 2.06ns ± 5% 2.11ns ± 9% ~ (p=0.618 n=32+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_1 3.97ns ± 2% 4.57ns ±16% +15.19% (p=0.000 n=32+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_4 3.64ns ± 3% 4.05ns ±15% +11.05% (p=0.000 n=33+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_16 5.68ns ± 5% 9.36ns ±16% +64.92% (p=0.000 n=35+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_64 4.27ns ± 4% 3.88ns ± 8% -9.13% (p=0.000 n=35+32)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_256 3.58ns ± 3% 3.76ns ±14% +5.12% (p=0.002 n=38+40)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_1024 4.16ns ± 3% 3.21ns ± 5% -22.77% (p=0.000 n=38+31)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_16384 3.90ns ± 4% 3.00ns ± 3% -23.12% (p=0.000 n=38+32)
BM_Sort_tuple<uint32, uint64, uint32>_Descending_262144 4.52ns ± 3% 3.42ns ± 3% -24.29% (p=0.000 n=38+33)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_1 3.97ns ± 3% 4.31ns ± 3% +8.78% (p=0.000 n=39+34)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_4 2.54ns ± 2% 2.54ns ± 4% ~ (p=0.341 n=38+36)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_16 2.39ns ± 3% 1.70ns ± 6% -28.90% (p=0.000 n=38+35)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_64 2.61ns ± 2% 3.23ns ± 3% +24.07% (p=0.000 n=35+35)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_256 2.83ns ± 2% 2.97ns ± 4% +4.83% (p=0.000 n=35+37)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_1024 2.44ns ± 4% 2.44ns ± 3% ~ (p=0.481 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_16384 2.19ns ± 3% 2.37ns ± 6% +8.01% (p=0.000 n=36+37)
BM_Sort_tuple<uint32, uint64, uint32>_SingleElement_262144 2.34ns ± 2% 2.36ns ± 5% +1.11% (p=0.001 n=36+36)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_1 3.96ns ± 2% 4.31ns ± 3% +8.76% (p=0.000 n=33+35)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_4 2.65ns ± 6% 2.67ns ± 4% ~ (p=0.139 n=32+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_16 5.64ns ± 3% 3.56ns ± 3% -36.80% (p=0.000 n=31+35)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_64 6.12ns ±16% 5.04ns ± 4% -17.64% (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_256 6.78ns ± 6% 3.73ns ± 3% -44.94% (p=0.000 n=31+36)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_1024 8.36ns ±15% 6.51ns ± 4% -22.13% (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_16384 9.24ns ±15% 7.91ns ± 3% -14.34% (p=0.000 n=40+37)
BM_Sort_tuple<uint32, uint64, uint32>_PipeOrgan_262144 10.7ns ± 3% 9.3ns ± 6% -12.36% (p=0.000 n=32+36)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_1 3.97ns ± 3% 4.31ns ± 3% +8.63% (p=0.000 n=32+35)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_4 2.79ns ± 3% 2.76ns ± 4% -0.95% (p=0.002 n=33+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_16 5.07ns ± 3% 3.69ns ± 4% -27.35% (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_64 9.26ns ± 3% 8.34ns ± 7% -9.88% (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_256 11.8ns ± 5% 9.7ns ± 3% -17.83% (p=0.000 n=37+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_1024 19.2ns ± 4% 14.5ns ±10% -24.59% (p=0.000 n=36+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_16384 45.5ns ± 4% 37.4ns ± 9% -17.71% (p=0.000 n=35+33)
BM_Sort_tuple<uint32, uint64, uint32>_QuickSortAdversary_262144 50.0ns ± 4% 43.2ns ± 3% -13.69% (p=0.000 n=35+34)
BM_Sort_string_Random_1 4.66ns ± 6% 4.40ns ± 4% -5.55% (p=0.000 n=35+37)
BM_Sort_string_Random_4 14.9ns ± 3% 15.0ns ± 6% ~ (p=0.863 n=36+38)
BM_Sort_string_Random_16 45.5ns ± 6% 35.8ns ± 8% -21.37% (p=0.000 n=36+36)
BM_Sort_string_Random_64 66.6ns ± 4% 58.2ns ± 3% -12.69% (p=0.000 n=36+37)
BM_Sort_string_Random_256 86.0ns ± 5% 77.4ns ± 3% -10.01% (p=0.000 n=37+37)
BM_Sort_string_Random_1024 106ns ± 3% 96ns ± 6% -9.39% (p=0.000 n=37+37)
BM_Sort_string_Random_16384 154ns ± 3% 141ns ± 5% -8.03% (p=0.000 n=35+36)
BM_Sort_string_Random_262144 213ns ± 4% 197ns ± 4% -7.59% (p=0.000 n=34+34)
BM_Sort_string_Ascending_1 4.59ns ± 2% 4.56ns ±17% -0.60% (p=0.002 n=32+40)
BM_Sort_string_Ascending_4 7.52ns ± 9% 7.54ns ±12% ~ (p=0.554 n=37+40)
BM_Sort_string_Ascending_16 13.1ns ± 6% 8.8ns ±12% -33.26% (p=0.000 n=39+38)
BM_Sort_string_Ascending_64 14.8ns ±10% 14.5ns ±11% -2.15% (p=0.013 n=40+37)
BM_Sort_string_Ascending_256 14.0ns ± 6% 14.1ns ±10% ~ (p=0.760 n=37+40)
BM_Sort_string_Ascending_1024 12.9ns ±10% 12.8ns ±20% ~ (p=0.055 n=35+40)
BM_Sort_string_Ascending_16384 17.2ns ±13% 17.4ns ±21% ~ (p=1.000 n=37+40)
BM_Sort_string_Ascending_262144 17.5ns ±12% 17.5ns ±25% ~ (p=0.392 n=35+39)
BM_Sort_string_Descending_1 4.59ns ± 3% 4.34ns ± 3% -5.51% (p=0.000 n=32+33)
BM_Sort_string_Descending_4 10.1ns ± 5% 9.8ns ± 4% -2.84% (p=0.000 n=36+34)
BM_Sort_string_Descending_16 22.0ns ± 4% 39.6ns ± 4% +79.84% (p=0.000 n=36+33)
BM_Sort_string_Descending_64 21.4ns ±12% 21.3ns ±14% ~ (p=0.542 n=37+39)
BM_Sort_string_Descending_256 19.4ns ±13% 18.9ns ±13% -2.74% (p=0.039 n=37+39)
BM_Sort_string_Descending_1024 22.7ns ± 5% 17.6ns ±15% -22.52% (p=0.000 n=35+40)
BM_Sort_string_Descending_16384 27.9ns ±14% 22.6ns ±10% -19.11% (p=0.000 n=40+37)
BM_Sort_string_Descending_262144 33.8ns ±14% 26.1ns ±21% -22.74% (p=0.000 n=39+38)
BM_Sort_string_SingleElement_1 4.58ns ± 2% 4.35ns ± 3% -5.14% (p=0.000 n=35+37)
BM_Sort_string_SingleElement_4 7.92ns ± 3% 7.92ns ± 7% ~ (p=0.625 n=38+39)
BM_Sort_string_SingleElement_16 18.0ns ± 3% 7.9ns ± 6% -56.23% (p=0.000 n=36+35)
BM_Sort_string_SingleElement_64 20.3ns ± 5% 19.3ns ±15% -4.83% (p=0.000 n=34+38)
BM_Sort_string_SingleElement_256 19.4ns ± 7% 18.1ns ±14% -6.67% (p=0.000 n=36+39)
BM_Sort_string_SingleElement_1024 19.3ns ± 9% 17.4ns ±17% -9.40% (p=0.000 n=35+40)
BM_Sort_string_SingleElement_16384 17.5ns ±12% 16.2ns ±20% -7.91% (p=0.000 n=37+40)
BM_Sort_string_SingleElement_262144 16.7ns ±18% 15.3ns ±27% -8.56% (p=0.000 n=40+40)
BM_Sort_string_PipeOrgan_1 4.60ns ± 2% 4.33ns ± 3% -5.80% (p=0.000 n=33+31)
BM_Sort_string_PipeOrgan_4 8.29ns ± 4% 8.17ns ± 8% -1.50% (p=0.004 n=39+36)
BM_Sort_string_PipeOrgan_16 22.9ns ± 3% 16.4ns ± 6% -28.45% (p=0.000 n=39+38)
BM_Sort_string_PipeOrgan_64 30.7ns ± 4% 28.9ns ± 7% -6.05% (p=0.000 n=38+37)
BM_Sort_string_PipeOrgan_256 38.1ns ± 3% 22.5ns ± 9% -40.78% (p=0.000 n=37+37)
BM_Sort_string_PipeOrgan_1024 45.4ns ± 4% 36.2ns ± 6% -20.33% (p=0.000 n=37+37)
BM_Sort_string_PipeOrgan_16384 56.2ns ± 4% 49.0ns ± 8% -12.73% (p=0.000 n=36+38)
BM_Sort_string_PipeOrgan_262144 77.8ns ±13% 62.8ns ±10% -19.27% (p=0.000 n=39+39)
BM_Sort_string_QuickSortAdversary_1 4.80ns ±16% 4.34ns ± 4% -9.56% (p=0.000 n=39+34)
BM_Sort_string_QuickSortAdversary_4 14.8ns ± 5% 14.7ns ± 4% -0.80% (p=0.037 n=33+33)
BM_Sort_string_QuickSortAdversary_16 44.6ns ± 4% 34.8ns ± 5% -21.98% (p=0.000 n=35+34)
BM_Sort_string_QuickSortAdversary_64 66.2ns ± 3% 58.1ns ± 4% -12.32% (p=0.000 n=36+35)
BM_Sort_string_QuickSortAdversary_256 85.4ns ± 5% 76.9ns ± 6% -9.99% (p=0.000 n=36+36)
BM_Sort_string_QuickSortAdversary_1024 106ns ± 4% 96ns ± 3% -9.62% (p=0.000 n=34+37)
BM_Sort_string_QuickSortAdversary_16384 153ns ± 3% 141ns ± 4% -8.22% (p=0.000 n=34+37)
BM_Sort_string_QuickSortAdversary_262144 211ns ± 5% 195ns ± 6% -7.77% (p=0.000 n=35+38)
Differential Revision: https://reviews.llvm.org/D122780
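The branch-avoiding partitioning idea described in the log message can be
sketched as follows. This is a minimal illustration under stated
assumptions, not the code in libcxx/include/__algorithm/sort.h: the names
block_partition and is_partitioned_around, the block size of 64, the
restriction to std::vector<int>, and the use of byte offset buffers are all
choices made for this example. Each block is classified with a loop that
adds the comparison result as 0 or 1 instead of branching on it, which is
the kind of loop a compiler may be able to vectorize.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not a libc++ function): rearranges v so that every
// element < pivot precedes every element >= pivot, and returns the index
// of the first element >= pivot.
std::size_t block_partition(std::vector<int>& v, int pivot) {
    constexpr std::size_t kBlock = 64;   // illustrative block size
    unsigned char left_off[kBlock];      // offsets of left-block elements >= pivot
    unsigned char right_off[kBlock];     // offsets of right-block elements < pivot
    std::size_t l = 0, r = v.size();     // [l, r) is not yet fully partitioned
    std::size_t n_left = 0, n_right = 0; // misplaced elements still pending
    std::size_t s_left = 0, s_right = 0; // read positions in the offset buffers

    while (r - l >= 2 * kBlock) {
        if (n_left == 0) {               // classify a fresh left block
            s_left = 0;
            for (std::size_t i = 0; i < kBlock; ++i) {
                left_off[n_left] = static_cast<unsigned char>(i);
                n_left += (v[l + i] >= pivot);      // 0 or 1, no branch
            }
        }
        if (n_right == 0) {              // classify a fresh right block
            s_right = 0;
            for (std::size_t i = 0; i < kBlock; ++i) {
                right_off[n_right] = static_cast<unsigned char>(i);
                n_right += (v[r - 1 - i] < pivot);  // 0 or 1, no branch
            }
        }
        // Swap as many misplaced pairs as both sides currently offer.
        const std::size_t n = std::min(n_left, n_right);
        for (std::size_t k = 0; k < n; ++k) {
            std::swap(v[l + left_off[s_left + k]],
                      v[r - 1 - right_off[s_right + k]]);
        }
        n_left -= n;  s_left += n;
        n_right -= n; s_right += n;
        if (n_left == 0) l += kBlock;    // left block is now entirely < pivot
        if (n_right == 0) r -= kBlock;   // right block is now entirely >= pivot
    }
    assert(l <= r);

    // Fewer than two blocks remain: finish with an ordinary partition.
    auto mid = std::partition(v.begin() + static_cast<std::ptrdiff_t>(l),
                              v.begin() + static_cast<std::ptrdiff_t>(r),
                              [pivot](int x) { return x < pivot; });
    return static_cast<std::size_t>(mid - v.begin());
}

// Hypothetical checker: true if v[0, mid) < pivot and v[mid, size) >= pivot.
bool is_partitioned_around(const std::vector<int>& v, std::size_t mid, int pivot) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if ((i < mid) != (v[i] < pivot)) return false;
    }
    return true;
}
```

Calling block_partition(v, pivot) on a vector and then passing the returned
index to is_partitioned_around is one way to check the result. The sketch
finishes the last few elements with std::partition for simplicity; a tuned
implementation would handle the tail without discarding the pending offsets.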