[all-commits] [llvm/llvm-project] 69279a: [libc++][test] add benchmarks for `std::atomic::wa...

Hui via All-commits all-commits at lists.llvm.org
Wed Feb 21 05:43:46 PST 2024


  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: 69279a8413e08dd24168bad961975e79a50d9c19
      https://github.com/llvm/llvm-project/commit/69279a8413e08dd24168bad961975e79a50d9c19
  Author: Hui <hui.xie1990 at gmail.com>
  Date:   2024-02-21 (Wed, 21 Feb 2024)

  Changed paths:
    M libcxx/benchmarks/CMakeLists.txt
    A libcxx/benchmarks/atomic_wait.bench.cpp
    A libcxx/benchmarks/atomic_wait_vs_mutex_lock.bench.cpp

  Log Message:
  -----------
  [libc++][test] add benchmarks for `std::atomic::wait` (#70571)

For the mutex vs atomic  test:

Old: `unique_lock<mutex>`
New: a lock implemented with `atomic::wait`

On 10 years old Intel Macbook, `atomic::wait` is 50% slower than `mutex`

```
Benchmark                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------
BM_multi_thread_lock_unlock/1024                   +0.3735         +2.4497       1724726       2368935        153159        528354
BM_multi_thread_lock_unlock/2048                   +0.4174         +1.2487       3410538       4834012        435062        978311
BM_multi_thread_lock_unlock/4096                   +0.5256         +1.9824       6903783      10532681        590266       1760405
BM_multi_thread_lock_unlock/8192                   +0.5415         +0.4578      14536391      22408399       1456328       2123075
BM_multi_thread_lock_unlock/16384                  +0.5663         +0.0513      30181991      47275023       3316850       3486950
BM_multi_thread_lock_unlock/32768                  +0.5635         -0.2081      62027663      96977726       6477076       5129190
BM_multi_thread_lock_unlock/65536                  +0.5228         -0.3273     129637761     197408739      11341630       7628955
BM_multi_thread_lock_unlock/131072                 +0.4825         -0.1070     266256295     394712193      10379800       9269200
BM_multi_thread_lock_unlock/262144                 +0.4793         +0.2795     539732340     798409253      10802200      13821100
BM_multi_thread_lock_unlock/524288                 +0.5272         +0.2847    1070035132    1634124353      14523000      18657800
BM_multi_thread_lock_unlock/1048576                +0.4799         +0.3353    2125510441    3145636119      13404200      17899000
OVERALL_GEOMEAN                                    +0.4970         +0.3886             0             0             0             0
```

On Apple Arm, `atomic::wait` is 200% slower than `mutex`. And
`atomic::wait` is even slower than my 10 years old Intel CPU Macbook

```
Benchmark                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------
BM_multi_thread_lock_unlock/1024                   +2.1811         +3.9854       2036726       6478993        119817        597334
BM_multi_thread_lock_unlock/2048                   +1.6736         +1.4301       3162161       8454415        426201       1035727
BM_multi_thread_lock_unlock/4096                   +1.1017         +0.6456       6620503      13914159        893019       1469578
BM_multi_thread_lock_unlock/8192                   +0.6688         +0.2148      12089392      20174635       1489000       1808799
BM_multi_thread_lock_unlock/16384                  +1.4217         -0.2436      19365999      46899345       2068266       1564530
BM_multi_thread_lock_unlock/32768                  +2.6161         -0.4927      31371052     113440165       3715100       1884540
BM_multi_thread_lock_unlock/65536                  +2.6286         -0.3967      54314581     197086847       5912764       3567410
BM_multi_thread_lock_unlock/131072                 +2.3554         +0.4990     103176565     346201425       9260407      13880900
BM_multi_thread_lock_unlock/262144                 +2.8780         +0.4995     182355400     707170733      16335852      24496000
BM_multi_thread_lock_unlock/524288                 +3.0280         +0.3001     360953079    1453902595      32548700      42316364
BM_multi_thread_lock_unlock/1048576                +3.7480         +1.2374     714500462    3392470417      48603455     108747000
OVERALL_GEOMEAN                                    +2.0791         +0.3874             0             0             0             0
```











For the atomic_wait test:

On my 2013 MacBook with Intel CPU

```
Run on (8 X 2300 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 256 KiB (x4)
  L3 Unified 6144 KiB (x1)
Load Average: 1.95, 3.77, 4.13
-----------------------------------------------------------------------------------------------------
Benchmark                                                           Time             CPU   Iterations
-----------------------------------------------------------------------------------------------------
BM_atomic_wait_one_thread_one_atomic_wait/1024                 184455 ns       183979 ns         3760
BM_atomic_wait_one_thread_one_atomic_wait/2048                 361607 ns       360917 ns         1912
BM_atomic_wait_one_thread_one_atomic_wait/4096                 709055 ns       708326 ns          929
BM_atomic_wait_one_thread_one_atomic_wait/8192                1469063 ns      1467430 ns          488
BM_atomic_wait_one_thread_one_atomic_wait/16384               2865332 ns      2863473 ns          237
BM_atomic_wait_one_thread_one_atomic_wait/32768               5839429 ns      5834708 ns          113
BM_atomic_wait_one_thread_one_atomic_wait/65536              11460822 ns     11453183 ns           60
BM_atomic_wait_one_thread_one_atomic_wait/131072             23052804 ns     23035000 ns           30
BM_atomic_wait_one_thread_one_atomic_wait/262144             46958743 ns     46712733 ns           15
BM_atomic_wait_one_thread_one_atomic_wait/524288             93151904 ns     92977429 ns            7
BM_atomic_wait_one_thread_one_atomic_wait/1048576           186100011 ns    185888500 ns            4
BM_atomic_wait_one_thread_one_atomic_wait/2097152           364548135 ns    364280000 ns            2
BM_atomic_wait_one_thread_one_atomic_wait/4194304           747181672 ns    745056000 ns            1
BM_atomic_wait_one_thread_one_atomic_wait/8388608          1473070400 ns   1471165000 ns            1
BM_atomic_wait_one_thread_one_atomic_wait/16777216         2950352547 ns   2947373000 ns            1
BM_atomic_wait_multi_thread_one_atomic_wait/1024               668544 ns       167233 ns         4496
BM_atomic_wait_multi_thread_one_atomic_wait/2048              1384668 ns       369750 ns         1941
BM_atomic_wait_multi_thread_one_atomic_wait/4096              2851627 ns       768559 ns          995
BM_atomic_wait_multi_thread_one_atomic_wait/8192              5797669 ns      1476876 ns          526
BM_atomic_wait_multi_thread_one_atomic_wait/16384            11597952 ns      2692792 ns          260
BM_atomic_wait_multi_thread_one_atomic_wait/32768            23528028 ns      5291465 ns          142
BM_atomic_wait_multi_thread_one_atomic_wait/65536            46287247 ns      8547713 ns           87
BM_atomic_wait_multi_thread_one_atomic_wait/131072           90315848 ns     13294492 ns           61
BM_atomic_wait_multi_thread_one_atomic_wait/262144          190722393 ns     16193917 ns           36
BM_atomic_wait_multi_thread_one_atomic_wait/524288          408456684 ns     23641600 ns           10
BM_atomic_wait_multi_thread_one_atomic_wait/1048576         708809670 ns     36361900 ns           10
BM_atomic_wait_multi_thread_wait_different_atomics/1024       2116444 ns        11669 ns        10000
BM_atomic_wait_multi_thread_wait_different_atomics/2048      12435259 ns        21905 ns         1000
BM_atomic_wait_multi_thread_wait_different_atomics/4096       6393816 ns        17819 ns         1000
BM_atomic_wait_multi_thread_wait_different_atomics/8192      11930400 ns        28637 ns         1000
BM_atomic_wait_multi_thread_wait_different_atomics/16384     20987224 ns        35272 ns         1000
BM_atomic_wait_multi_thread_wait_different_atomics/32768     44335820 ns        66660 ns          100
BM_atomic_wait_multi_thread_wait_different_atomics/65536     91395912 ns       129030 ns          100
BM_atomic_wait_multi_thread_wait_different_atomics/131072   145440007 ns       165960 ns          100
BM_atomic_wait_multi_thread_wait_different_atomics/262144   368219935 ns       420800 ns           10
BM_atomic_wait_multi_thread_wait_different_atomics/524288   630106863 ns       809500 ns           10
BM_atomic_wait_multi_thread_wait_different_atomics/1048576 1138174673 ns      1093000 ns           10
```

On apple arm

```
Run on (8 X 24.1208 MHz CPU s)
CPU Caches:
  L1 Data 64 KiB (x8)
  L1 Instruction 128 KiB (x8)
  L2 Unified 4096 KiB (x2)
Load Average: 1.34, 1.58, 1.66
-----------------------------------------------------------------------------------------------------
Benchmark                                                           Time             CPU   Iterations
-----------------------------------------------------------------------------------------------------
BM_atomic_wait_one_thread_one_atomic_wait/1024                  61602 ns        61602 ns         8701
BM_atomic_wait_one_thread_one_atomic_wait/2048                 123148 ns       123146 ns         5688
BM_atomic_wait_one_thread_one_atomic_wait/4096                 246248 ns       246249 ns         2888
BM_atomic_wait_one_thread_one_atomic_wait/8192                 480373 ns       480359 ns         1455
BM_atomic_wait_one_thread_one_atomic_wait/16384                974725 ns       974721 ns          724
BM_atomic_wait_one_thread_one_atomic_wait/32768               1922185 ns      1922115 ns          355
BM_atomic_wait_one_thread_one_atomic_wait/65536               3940632 ns      3940608 ns          181
BM_atomic_wait_one_thread_one_atomic_wait/131072              7886302 ns      7886102 ns           88
BM_atomic_wait_one_thread_one_atomic_wait/262144             15393156 ns     15393000 ns           45
BM_atomic_wait_one_thread_one_atomic_wait/524288             30833221 ns     30832174 ns           23
BM_atomic_wait_one_thread_one_atomic_wait/1048576            62551936 ns     62551909 ns           11
BM_atomic_wait_one_thread_one_atomic_wait/2097152           123155625 ns    123155667 ns            6
BM_atomic_wait_one_thread_one_atomic_wait/4194304           252468180 ns    252458667 ns            3
BM_atomic_wait_one_thread_one_atomic_wait/8388608           505075604 ns    505075500 ns            2
BM_atomic_wait_one_thread_one_atomic_wait/16777216          992977209 ns    992935000 ns            1
BM_atomic_wait_multi_thread_one_atomic_wait/1024               531411 ns       239695 ns         2783
BM_atomic_wait_multi_thread_one_atomic_wait/2048              1030592 ns       484868 ns         1413
BM_atomic_wait_multi_thread_one_atomic_wait/4096              1951896 ns       922357 ns          631
BM_atomic_wait_multi_thread_one_atomic_wait/8192              3759893 ns      1952074 ns          390
BM_atomic_wait_multi_thread_one_atomic_wait/16384             7417929 ns      3458309 ns          233
BM_atomic_wait_multi_thread_one_atomic_wait/32768            14386361 ns      5590830 ns          100
BM_atomic_wait_multi_thread_one_atomic_wait/65536            29725536 ns      6521887 ns          115
BM_atomic_wait_multi_thread_one_atomic_wait/131072           60023797 ns     10766795 ns           73
BM_atomic_wait_multi_thread_one_atomic_wait/262144          120782267 ns     17532091 ns           44
BM_atomic_wait_multi_thread_one_atomic_wait/524288          242539333 ns     27506920 ns           25
BM_atomic_wait_multi_thread_one_atomic_wait/1048576         482833787 ns     53721600 ns           10
BM_atomic_wait_multi_thread_wait_different_atomics/1024       2230048 ns       626042 ns         1000
BM_atomic_wait_multi_thread_wait_different_atomics/2048       3931958 ns       837540 ns          884
BM_atomic_wait_multi_thread_wait_different_atomics/4096       6506887 ns      1127922 ns          586
BM_atomic_wait_multi_thread_wait_different_atomics/8192      10528008 ns      1651254 ns          456
BM_atomic_wait_multi_thread_wait_different_atomics/16384     18055829 ns      2066379 ns          317
BM_atomic_wait_multi_thread_wait_different_atomics/32768     29878496 ns      2875600 ns          100
BM_atomic_wait_multi_thread_wait_different_atomics/65536     50523799 ns      3193170 ns          100
BM_atomic_wait_multi_thread_wait_different_atomics/131072    85926943 ns      4121950 ns          100
BM_atomic_wait_multi_thread_wait_different_atomics/262144   154602296 ns      5879050 ns          100
BM_atomic_wait_multi_thread_wait_different_atomics/524288   279121754 ns     10063400 ns           10
BM_atomic_wait_multi_thread_wait_different_atomics/1048576  522796900 ns     12370300 ns           10
```



To unsubscribe from these emails, change your notification settings at https://github.com/llvm/llvm-project/settings/notifications


More information about the All-commits mailing list