[PATCH] D78856: [Support] Simplify and optimize ThreadPool

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Sat Apr 25 11:08:27 PDT 2020


MaskRay marked an inline comment as done.
MaskRay added a comment.

In D78856#2003725 <https://reviews.llvm.org/D78856#2003725>, @mehdi_amini wrote:

> > No. Only the thread(s) calling ThreadPool::Wait() waits on CompletionCondition. When it gets stuck, it will not awake spuriously. Note, before, we took locks 3 times in the loop body. Now it is 2...
>
> I wasn't worried about contention with the waiting threads, but between working threads: they are taking the QueueLock more often now (it was taken once or twice per iteration of the loop body before and is now taken an extra time).
>  I am not sure we'd be able to measure any difference here, just curious how you thought about this tradeoff!


There are two stages, before and after the execution of a task. The original code sequence used a different lock for each. I suppose this is what you meant by less contention: a worker in the first stage (holding QueueLock but not CompletionLock) cannot contend with a worker in the second stage (holding CompletionLock).

But note that:

- there is still a window in which a worker in the first stage can hold CompletionLock. The window is small, but it exists.
- spurious signalling of a condition variable (CompletionCondition) can itself cause contention, because a condition variable has an internal lock.

> Originally I think I started writing all this code with `ActiveThreads` being an atomic so that the working threads can increment/decrement it without taking any lock: however adding the "wait()" then I couldn't see how to avoid taking the `CompletionLock` because of how std::condition_variable is setup. 
>  That said since the waiting thread takes the QueueLock with your change, would leaving the  `ActiveThreads` atomic allow to not take the Queue lock on task completion? 
>  I'm wondering about something like this:
> 
>   // Notify task completion if this is the last active thread, in case
>   // someone waits on ThreadPool::wait().
>   bool Notify = --ActiveThreads == 0;
>   if (Notify)
>     CompletionCondition.notify_all();
>    

The suggested code sequence has the problem of lost signals. It happens because a worker thread can change the observed state between the waiter's predicate test and its blocking.

    worker: pop one item from Tasks and Tasks is now empty
  waiter: lock QueueLock
  waiter: test that `!ActiveThreads && Tasks.empty()` is false and decide to unlock and block
    worker: decrement ActiveThreads to zero
    worker: signal CompletionCondition  # the signal is lost
  waiter: unlock QueueLock and block

With an atomic `ActiveThreads`, an alternative code sequence is possible, but I'm not sure it is better:

  // Decrement without the lock, then re-check the queue under QueueLock,
  // so the waiter cannot observe a stale state and block past the notify.
  bool Notify = --ActiveThreads == 0;
  { std::lock_guard<std::mutex> Guard(QueueLock); Notify &= Tasks.empty(); }
  if (Notify) CompletionCondition.notify_all();



================
Comment at: llvm/lib/Support/ThreadPool.cpp:57
+          std::lock_guard<std::mutex> LockGuard(QueueLock);
+          Notify = --ActiveThreads == 0 && Tasks.empty();
         }
----------------
aganea wrote:
> Is it worth generalizing the notify condition between this and `ThreadPool::wait()` below, to ease future maintenance/comprehension?

Sorry, I'm not following this suggestion. Can you elaborate?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78856/new/

https://reviews.llvm.org/D78856
