[libcxx-commits] [PATCH] D126882: [libc++] Do not yield from __sp_mut::lock()

Louis Dionne via Phabricator via libcxx-commits libcxx-commits at lists.llvm.org
Tue Jun 7 06:49:06 PDT 2022


ldionne added a comment.

In D126882#3558306 <https://reviews.llvm.org/D126882#3558306>, @dim wrote:

> To answer my own question partially, the original commit is Howard's: rG088e37c77aafaec5ead8fbe7ebf918265e6b86f2 <https://reviews.llvm.org/rG088e37c77aafaec5ead8fbe7ebf918265e6b86f2>:
>
>> Despite my pathological distrust of spin locks, the number just don't lie.  I've put a small spin in __sp_mut::lock() on std::mutex::try_lock(), which is testing quite well.  In my experience, putting in a yield for every failed iteration is also a major performance booster.  This change makes one of the performance tests I was using (a highly contended one) run about 20 times faster.
>
> But this was on macOS at the time, so ~2012...

Right, and macOS's locking primitives have improved a lot since then. According to our perf engineers, the current pattern of yielding is actually actively harmful for performance on our platform, and apparently on Linux too.

In D126882#3558302 <https://reviews.llvm.org/D126882#3558302>, @dim wrote:

> Or maybe I should ask the question differently: what was the original reason for adding the while loop in the first place?

One way to see this is that Howard originally added this complexity layer on top of the OS's mutex because experimentation showed it to be better. However, as OSes improve their threading primitives, it becomes better to give as much information to the OS as possible, which is what this patch does.



================
Comment at: libcxx/src/memory.cpp:141
 
-static constexpr std::size_t __sp_mut_count = 16;
+static constexpr std::size_t __sp_mut_count = 32;
 static constinit __libcpp_mutex_t mut_back[__sp_mut_count] =
----------------
Mordante wrote:
> ldionne wrote:
> > Mordante wrote:
> > > Since you change this value it might be a good moment to document why 32 is "the right" value.
> > Honestly, I just decided to double it to decrease contention a bit, but it wasn't done in any scientific kind of way.
> Fair point. Since you looked at it with a performance engineer I hoped there was some additional info on this change.
> If there isn't more info then it's fine by me too. 
The reasoning is basically that by doubling the number of mutexes, we should roughly half the amount of contention. Since there's virtually no size increase by doubling the number of mutexes, it's an easy tradeoff to make.


With 16 mutexes:
```
$ bloaty build/default/lib/libc++.dylib
    FILE SIZE        VM SIZE
 --------------  --------------
  37.3%   906Ki  37.0%   906Ki    String Table
  30.2%   733Ki  29.9%   733Ki    __TEXT,__text
  19.6%   476Ki  19.5%   476Ki    Symbol Table
   5.2%   125Ki   5.1%   125Ki    __TEXT,__const
   3.2%  77.9Ki   3.3%  80.0Ki    [__LINKEDIT]
   0.0%       0   1.0%  23.3Ki    __DATA,__bss
   0.8%  18.7Ki   0.8%  18.7Ki    [__TEXT]
   0.8%  18.5Ki   0.8%  18.5Ki    __DATA_CONST,__const
   0.7%  16.7Ki   0.7%  16.7Ki    __TEXT,__gcc_except_tab
   0.6%  14.3Ki   0.2%  5.52Ki    [__DATA]
   0.5%  11.9Ki   0.5%  11.9Ki    [__DATA_CONST]
   0.4%  10.2Ki   0.4%  10.2Ki    __TEXT,__unwind_info
   0.3%  6.58Ki   0.3%  6.58Ki    Function Start Addresses
   0.2%  4.93Ki   0.2%  4.93Ki    __TEXT,__cstring
   0.1%  1.71Ki   0.1%  1.71Ki    __DATA,__data
   0.1%  1.70Ki   0.1%  1.66Ki    [Mach-O Headers]
   0.1%  1.59Ki   0.1%  1.59Ki    __DATA_CONST,__got
   0.0%       0   0.1%  1.45Ki    __DATA,__common
   0.1%  1.41Ki   0.1%  1.41Ki    Indirect Symbol Table
   0.0%     954   0.0%     954    __TEXT,__stubs
   0.0%     496   0.0%     496    [2 Others]
 100.0%  2.37Mi 100.0%  2.39Mi    TOTAL

$ bloaty build/default/lib/libc++.a
    FILE SIZE        VM SIZE
 --------------  --------------
 100.0%  7.43Mi   NAN%       0    [AR Non-ELF Member File]
   0.0%  2.82Ki   NAN%       0    [AR Headers]
 100.0%  7.44Mi 100.0%       0    TOTAL
```

With 32 mutexes:
```
$ bloaty build/default/lib/libc++.dylib
    FILE SIZE        VM SIZE
 --------------  --------------
  37.3%   906Ki  37.0%   906Ki    String Table
  30.2%   733Ki  29.9%   733Ki    __TEXT,__text
  19.6%   476Ki  19.5%   476Ki    Symbol Table
   5.2%   125Ki   5.1%   125Ki    __TEXT,__const
   3.2%  77.9Ki   3.3%  80.2Ki    [__LINKEDIT]
   0.0%       0   1.0%  23.3Ki    __DATA,__bss
   0.8%  18.8Ki   0.8%  18.8Ki    [__TEXT]
   0.8%  18.5Ki   0.8%  18.5Ki    __DATA_CONST,__const
   0.7%  16.7Ki   0.7%  16.7Ki    __TEXT,__gcc_except_tab
   0.5%  13.2Ki   0.2%  4.39Ki    [__DATA]
   0.5%  11.9Ki   0.5%  11.9Ki    [__DATA_CONST]
   0.4%  10.2Ki   0.4%  10.2Ki    __TEXT,__unwind_info
   0.3%  6.57Ki   0.3%  6.57Ki    Function Start Addresses
   0.2%  4.93Ki   0.2%  4.93Ki    __TEXT,__cstring
   0.1%  2.84Ki   0.1%  2.84Ki    __DATA,__data
   0.1%  1.70Ki   0.1%  1.66Ki    [Mach-O Headers]
   0.1%  1.59Ki   0.1%  1.59Ki    __DATA_CONST,__got
   0.0%       0   0.1%  1.45Ki    __DATA,__common
   0.1%  1.41Ki   0.1%  1.41Ki    Indirect Symbol Table
   0.0%     954   0.0%     954    __TEXT,__stubs
   0.0%     496   0.0%     496    [2 Others]
 100.0%  2.37Mi 100.0%  2.39Mi    TOTAL

$ bloaty build/default/lib/libc++.a
    FILE SIZE        VM SIZE
 --------------  --------------
 100.0%  7.43Mi   NAN%       0    [AR Non-ELF Member File]
   0.0%  2.82Ki   NAN%       0    [AR Headers]
 100.0%  7.44Mi 100.0%       0    TOTAL
```




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D126882/new/

https://reviews.llvm.org/D126882



More information about the libcxx-commits mailing list