[PATCH] D18811: Fix a race condition in support library ThreadPool

Wed Apr 6 13:59:54 PDT 2016

jhen added inline comments.

================
Comment at: include/llvm/Support/ThreadPool.h:130
@@ -132,3 +129,3 @@
   bool EnableFlag;
 #endif
 };
----------------
jlebar wrote:
> Maybe worth adding comments here indicating which variables are guarded by the mutex, so we are less likely to have the bug we previously had?  I'd also move the definition of Mutex and the condvars above whatever they protect.
> 
> It would also be nice to have a testcase which illustrates the bug (even if only under tsan), but I don't know how hard that would be.
Done for the comments and rearranging.

The current tests show the bug under tsan, but I can't think of a way to get it to reliably manifest itself otherwise.

================
Comment at: lib/Support/ThreadPool.cpp:39
@@ -40,1 +38,3 @@
+          SubmissionCondition.wait(
+              LockGuard, [&] { return !EnableFlag || !Tasks.empty(); });
           // Exit condition
----------------
jlebar wrote:
> Hmm, this means that if there are tasks in the queue when we run the destructor, we skip them.  That is, we only block until we complete the currently-running tasks.
> 
> At least this should be documented in the destructor comment; it's not clear to me from that comment that this is our behavior.  But I'm also not sure it's desirable, as can leave us with futures that will never be completed.  Specifically, if you have threads waiting on futures, and then you destroy the threadpool without calling wait() on the threadpool first, those waiting threads will never wake up.
> 
> Since the destructor has to wait for running threads to complete *anyway*, clearing out the whole task queue seems pretty reasonable to me.
It looks to me like it's not actually leaving any tasks in the queue. The statement below is the only way the thread can terminate from within, and it checks for Tasks.empty() before returning. If there are still tasks in the queue, the thread will grab one and run it regardless of whether the EnableFlag is set.

================
Comment at: lib/Support/ThreadPool.cpp:88
@@ -90,3 +87,3 @@
     // Don't allow enqueueing after disabling the pool
     assert(EnableFlag && "Queuing a thread during ThreadPool destruction");
 
----------------
jlebar wrote:
> This is interesting, as it means that you're not allowed to enqueue new tasks from a task T, if it's possible that T is still running when we destroy the threadpool.  Seems like it's worth a comment at least.  I think the right behavior probably is to explicitly not allow new tasks after we've begun shutting down -- maybe we return a null future?  Can we do that?
> 
> Threadpool shutdown is always so complicated...
I think this is OK because the only way to destroy the ThreadPool is to call its destructor. If anyone is calling the method to enqueue work after the destructor was called, then I would say that is a bug in their program, not in this ThreadPool, and I think this assert failure is appropriate. If a method is added one day to destroy the pool without calling its destructor, then I could see this being a problem, so I changed the name of the EnableFlag to InDestructorFlag to make anybody think twice about setting it if they are not in the destructor.


http://reviews.llvm.org/D18811