[libc-commits] [libc] [libc][rwlock] fix timeout writer signal stealing problem (PR #201937)

Schrodinger ZHU Yifan via libc-commits libc-commits at lists.llvm.org
Wed Jun 10 08:28:51 PDT 2026


https://github.com/SchrodingerZhu updated https://github.com/llvm/llvm-project/pull/201937

>From 1bd653703dcc6e728bfb7b92d20e8641d453b7f8 Mon Sep 17 00:00:00 2001
From: Yifan Zhu <yfzhu at google.com>
Date: Fri, 5 Jun 2026 13:35:46 -0700
Subject: [PATCH 1/3] [libc][rwlock] fix timeout writer signal stealing problem

When a timeout triggers, the waiting thread wakes up, unregisters
itself from the waiting queue, and exits. However, if the timing-out
thread is preempted after waking up but before it can unregister,
and a concurrent unlock occurs during this window, the timing-out
thread may consume the wake-up signal.

For example, assume the lock is in writer-preference mode and a
writer (W0) holds the lock. A reader (R) and another writer (W1,
with a short timeout) arrive and join the queue. W1's timeout
expires, so it wakes up and attempts to acquire the queue lock,
but is preempted before succeeding. W0 then releases the lock and,
preferring writers, sends a wake-up signal to W1. When W1 resumes,
it acquires the queue lock, unregisters, and exits due to the
timeout, ignoring the wake-up signal. As a result, the reader (R)
is left waiting indefinitely, leading to a deadlock.

To fix this, we track whether the serialization number changed
specifically for writers, and propagate the wake signal if it did.
If the timing-out thread is a reader, signal consumption is safe
because:
1. If there are pending writers, they will be woken up first.
2. Otherwise, if there are pending readers, they are all woken up
   via broadcasting (notify_all), so one reader timing out does not
   steal others' signal.

Assisted-by: AI tools, manually checked
---
 libc/src/__support/threads/raw_rwlock.h | 37 +++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/libc/src/__support/threads/raw_rwlock.h b/libc/src/__support/threads/raw_rwlock.h
index 1b0bba9c72675..0e85c40102faa 100644
--- a/libc/src/__support/threads/raw_rwlock.h
+++ b/libc/src/__support/threads/raw_rwlock.h
@@ -405,22 +405,55 @@ class RawRwLock {
       }
 
       // Phase 7: unregister ourselves as a pending reader/writer.
+      bool writer_serial_changed = false;
       {
         // Similarly, the unregister operation should also be an atomic
         // transaction.
         WaitingQueue::Guard guard = queue.acquire(is_pshared);
         guard.pending_count<role>()--;
-        // Clear the flag if we are the last reader. The flag must be
+        // Clear the flag if we are the last one. The flag must be
         // cleared otherwise operations like trylock may fail even though
         // there is no competitors.
         if (guard.pending_count<role>() == 0)
           RwState::fetch_clear_pending_bit<role>(state,
                                                  cpp::MemoryOrder::RELAXED);
+        if constexpr (role == Role::Writer) {
+          int new_serial =
+              guard.serialization<role>().load(cpp::MemoryOrder::RELAXED);
+          writer_serial_changed = new_serial != serial_number;
+        }
       }
 
       // Phase 8: exit the loop is timeout is reached.
-      if (timeout_flag)
+      if (timeout_flag) {
+        // When a timeout triggers, the waiting thread wakes up, unregisters
+        // itself from the waiting queue, and exits. However, if the timing-out
+        // thread is preempted after waking up but before it can unregister,
+        // and a concurrent unlock occurs during this window, the timing-out
+        // thread may consume the wake-up signal.
+        //
+        // For example, assume the lock is in writer-preference mode and a
+        // writer (W0) holds the lock. A reader (R) and another writer (W1,
+        // with a short timeout) arrive and join the queue. W1's timeout
+        // expires, so it wakes up and attempts to acquire the queue lock,
+        // but is preempted before succeeding. W0 then releases the lock and,
+        // preferring writers, sends a wake-up signal to W1. When W1 resumes,
+        // it acquires the queue lock, unregisters, and exits due to the
+        // timeout, ignoring the wake-up signal. As a result, the reader (R)
+        // is left waiting indefinitely, leading to a deadlock.
+        //
+        // To fix this, we track whether the serialization number changed
+        // specifically for writers, and propagate the wake signal if it did.
+        // If the timing-out thread is a reader, signal consumption is safe
+        // because:
+        // 1. If there are pending writers, they will be woken up first.
+        // 2. Otherwise, if there are pending readers, they are all woken up
+        //    via broadcasting (notify_all), so one reader timing out does not
+        //    steal others' signal.
+        if (writer_serial_changed)
+          notify_pending_threads();
         return LockResult::TimedOut;
+      }
 
       // Phase 9: reload the state and retry the acquisition.
       old = RwState::spin_reload<role>(state, get_preference(), spin_count);

>From 3896fbed5c0c9a9876fd02761c27c829ba035b4b Mon Sep 17 00:00:00 2001
From: Schrodinger ZHU Yifan <yifanzhu at rochester.edu>
Date: Tue, 9 Jun 2026 10:55:37 -0700
Subject: [PATCH 2/3] fix typo

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot at users.noreply.github.com>
---
 libc/src/__support/threads/raw_rwlock.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libc/src/__support/threads/raw_rwlock.h b/libc/src/__support/threads/raw_rwlock.h
index 0e85c40102faa..03c2f699860e0 100644
--- a/libc/src/__support/threads/raw_rwlock.h
+++ b/libc/src/__support/threads/raw_rwlock.h
@@ -424,7 +424,7 @@ class RawRwLock {
         }
       }
 
-      // Phase 8: exit the loop is timeout is reached.
+      // Phase 8: exit the loop if timeout is reached.
       if (timeout_flag) {
         // When a timeout triggers, the waiting thread wakes up, unregisters
         // itself from the waiting queue, and exits. However, if the timing-out

>From fbfc2729af87279fc5c5e10d47390a3efa1ea47a Mon Sep 17 00:00:00 2001
From: Yifan Zhu <yfzhu at google.com>
Date: Wed, 10 Jun 2026 08:28:33 -0700
Subject: [PATCH 3/3] [libc][rwlock] adjust comment styles and use
 FutexWordType to replace int

---
 libc/src/__support/threads/raw_rwlock.h | 45 +++++++++----------------
 1 file changed, 16 insertions(+), 29 deletions(-)

diff --git a/libc/src/__support/threads/raw_rwlock.h b/libc/src/__support/threads/raw_rwlock.h
index 03c2f699860e0..6b61ff3a2eefc 100644
--- a/libc/src/__support/threads/raw_rwlock.h
+++ b/libc/src/__support/threads/raw_rwlock.h
@@ -375,7 +375,7 @@ class RawRwLock {
         return result;
 
       // Phase 5: register ourselves as a  reader.
-      int serial_number;
+      FutexWordType serial_number;
       {
         // The queue need to be protected by a mutex since the operations in
         // this block must be executed as a whole transaction. It is possible
@@ -411,14 +411,14 @@ class RawRwLock {
         // transaction.
         WaitingQueue::Guard guard = queue.acquire(is_pshared);
         guard.pending_count<role>()--;
-        // Clear the flag if we are the last one. The flag must be
-        // cleared otherwise operations like trylock may fail even though
-        // there is no competitors.
+        // Clear the flag if we are the last pending thread of this role. The
+        // flag must be cleared; otherwise operations like trylock may fail even
+        // though there are no competitors.
         if (guard.pending_count<role>() == 0)
           RwState::fetch_clear_pending_bit<role>(state,
                                                  cpp::MemoryOrder::RELAXED);
         if constexpr (role == Role::Writer) {
-          int new_serial =
+          FutexWordType new_serial =
               guard.serialization<role>().load(cpp::MemoryOrder::RELAXED);
           writer_serial_changed = new_serial != serial_number;
         }
@@ -426,30 +426,10 @@ class RawRwLock {
 
       // Phase 8: exit the loop if timeout is reached.
       if (timeout_flag) {
-        // When a timeout triggers, the waiting thread wakes up, unregisters
-        // itself from the waiting queue, and exits. However, if the timing-out
-        // thread is preempted after waking up but before it can unregister,
-        // and a concurrent unlock occurs during this window, the timing-out
-        // thread may consume the wake-up signal.
-        //
-        // For example, assume the lock is in writer-preference mode and a
-        // writer (W0) holds the lock. A reader (R) and another writer (W1,
-        // with a short timeout) arrive and join the queue. W1's timeout
-        // expires, so it wakes up and attempts to acquire the queue lock,
-        // but is preempted before succeeding. W0 then releases the lock and,
-        // preferring writers, sends a wake-up signal to W1. When W1 resumes,
-        // it acquires the queue lock, unregisters, and exits due to the
-        // timeout, ignoring the wake-up signal. As a result, the reader (R)
-        // is left waiting indefinitely, leading to a deadlock.
-        //
-        // To fix this, we track whether the serialization number changed
-        // specifically for writers, and propagate the wake signal if it did.
-        // If the timing-out thread is a reader, signal consumption is safe
-        // because:
-        // 1. If there are pending writers, they will be woken up first.
-        // 2. Otherwise, if there are pending readers, they are all woken up
-        //    via broadcasting (notify_all), so one reader timing out does not
-        //    steal others' signal.
+        // A timed out (but not unregistered) thread can still be the target of
+        // a wakeup signal. If this happens, we must wake up the next thread to
+        // avoid a deadlock. This is only a problem for writers because readers
+        // are woken up all at once and we prefer waking up writers first.
         if (writer_serial_changed)
           notify_pending_threads();
         return LockResult::TimedOut;
@@ -465,6 +445,13 @@ class RawRwLock {
   // Since notifcation routine is colder we mark it as noinline explicitly.
   [[gnu::noinline]]
   LIBC_INLINE void notify_pending_threads() {
+    // We wake up writers first. This ordering is relevant for the correctness
+    // of timeout handling in lock_slow. Because writers are preferred, we do
+    // not need to handle reader timeouts specially:
+    // 1. If there are pending writers, they are woken up first, so a reader
+    //    cannot steal their wake-up signal.
+    // 2. If there are only pending readers, they are all woken up together
+    //    (via notify_all), so a timed-out reader cannot starve other readers.
     enum class WakeTarget { Readers, Writers, None };
     WakeTarget status;
 



More information about the libc-commits mailing list