[libcxx-commits] [libcxx] [libcxx] Shared Mutex no longer holds the lock when calling notify_* on gates. (PR #107876)

Sun Jan 12 03:17:42 PST 2025

https://github.com/huixie90 approved this pull request.

Thanks! The implementation looks correct and should help with performance. We delegate to the underlying platform's pthread implementation (if there is one). I am not aware of any optimisations there like "if the notifying thread holds the mutex that the cv is associated with, skip the wake-up". I did a simple benchmark on macOS (with an M4 CPU):

```cpp
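#include <array>
#include <chrono>
#include <cstddef>
#include <iostream>
#include <mutex>
#include <shared_mutex>
#include <stop_token>
#include <thread>
#include <vector>

using namespace std::chrono_literals; // for the 2s literal below

// One writer and NumReaders readers contend on the same shared_mutex for 2 seconds.
template <std::size_t NumReaders>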
void test() {
  std::shared_mutex m;
  std::size_t writerCount = 0;
  std::array<std::size_t, NumReaders> readerCounts{};

  std::vector<std::jthread> threads;
  threads.reserve(NumReaders + 1);

  threads.emplace_back([&](std::stop_token st) {
    while (!st.stop_requested()) {
      std::unique_lock lock(m);
      writerCount++;
    }
  });

  for (std::size_t i = 0; i < NumReaders; ++i) {
    threads.emplace_back([&](std::stop_token st) {
      while (!st.stop_requested()) {
        std::shared_lock lock(m);
        readerCounts[i]++;
      }
    });
  }

  std::this_thread::sleep_for(2s);
  threads.clear();

  std::cout << "writerCount: " << writerCount << std::endl;
  for (std::size_t i = 0; i < NumReaders; ++i) {
    std::cout << "readerCounts[" << i << "]: " << readerCounts[i] << std::endl;
  }
}
```

Before your change, a typical output looks like this:

```bash
writerCount: 16492645
readerCounts[0]: 0
readerCounts[1]: 0
readerCounts[2]: 0
readerCounts[3]: 39
readerCounts[4]: 0
```

And with your fix, the output looks like this:

```bash
writerCount: 17607546
readerCounts[0]: 0
readerCounts[1]: 258
readerCounts[2]: 193
readerCounts[3]: 22
readerCounts[4]: 0
```

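Both overall throughput and fairness have improved: in these runs the writer count rose by roughly 7%, and the readers went from 39 acquisitions in total to 473.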
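
For anyone reading along, the general pattern under discussion (release the mutex first, then notify) can be sketched roughly as below. This is only an illustration: `Gate` and `run_demo` are hypothetical names made up for this example, and the code is not the libc++ `shared_mutex` implementation.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical one-shot gate, for illustration only; not libc++'s code.
class Gate {
  std::mutex mut_;
  std::condition_variable cv_;
  bool open_ = false;

public:
  // Blocks until open() has been called.
  void wait() {
    std::unique_lock<std::mutex> lk(mut_);
    cv_.wait(lk, [this] { return open_; });
  }

  // Updates the state under the lock, then notifies after releasing it.
  void open() {
    {
      std::lock_guard<std::mutex> lk(mut_);
      open_ = true;
    } // mut_ is released here
    // A woken waiter can now reacquire mut_ without finding it
    // still held by the notifying thread.
    cv_.notify_all();
  }
};

// Returns the value the waiter observed after the gate opened.
int run_demo() {
  Gate gate;
  int payload = 0;
  int seen = -1;

  std::thread waiter([&] {
    gate.wait();    // synchronizes with open() through mut_
    seen = payload; // safe: payload was written before open()
  });

  payload = 42;
  gate.open();
  waiter.join();

  assert(seen == 42);
  return seen;
}
```

The state change still happens under the lock, so waiters cannot miss it; only the `notify_all()` call moves outside the critical section.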

https://github.com/llvm/llvm-project/pull/107876