[PATCH] D36607: [Support/Parallel] - Do not spawn thread for single/last task.

George Rimar via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Aug 16 04:06:16 PDT 2017


grimar added a comment.

In https://reviews.llvm.org/D36607#840916, @ruiu wrote:

> I still wonder if a communication between threads can make something that slow. Where is the exact location where you observe the slowdown? Until we understand the exact reason, we cannot exclude the possibility that this change is hiding a real issue.


It is ThreadPoolExecutor::work().
(https://github.com/llvm-mirror/llvm/blob/master/lib/Support/Parallel.cpp#L102)

Currently it has the following implementation:

  void work() {
    while (true) {
      std::unique_lock<std::mutex> Lock(Mutex);
      Cond.wait(Lock, [&] { return Stop || !WorkStack.empty(); });
      if (Stop)
        break;
      auto Task = WorkStack.top();
      WorkStack.pop();
      Lock.unlock();
      Task();
    }
    Done.dec();
  }

If I modify it slightly to avoid waiting on the condition variable (and use busy-waiting instead), then it works much faster for me:

  void work() {
    while (true) {
      std::unique_lock<std::mutex> Lock(Mutex);
      if (!WorkStack.empty()) {
        auto Task = WorkStack.top();
        WorkStack.pop();
        Lock.unlock();
        Task();
      }
      if (Stop)
        break;
    }
    Done.dec();
  }

This makes me think that the sync overhead is significant in this case.
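
For anyone who wants to try this comparison outside of LLVM, a rough standalone sketch follows. It is not the code from Parallel.cpp: `MiniExecutor`, `RoundTripResult` and `timeRoundTrips` are hypothetical names, there is only one worker thread, the busy-wait loop yields between polls, and the worker drains the stack before exiting. It submits tiny tasks one at a time and waits for each to finish, so the measured time is dominated by the hand-off between threads. Numbers will vary by platform and are not claimed to match what I see in the linker.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <stack>
#include <thread>

// Hypothetical single-worker executor, loosely modelled on the loop above.
class MiniExecutor {
public:
  explicit MiniExecutor(bool Spin) : BusyWait(Spin), Worker([this] { work(); }) {}

  ~MiniExecutor() {
    {
      std::lock_guard<std::mutex> Lock(Mutex);
      Stop = true;
    }
    Cond.notify_all();
    Worker.join();
  }

  void add(std::function<void()> F) {
    {
      std::lock_guard<std::mutex> Lock(Mutex);
      WorkStack.push(std::move(F));
    }
    if (!BusyWait)
      Cond.notify_one();
  }

private:
  void work() {
    while (true) {
      std::unique_lock<std::mutex> Lock(Mutex);
      if (BusyWait) {
        // Poll instead of sleeping on the condition variable.
        if (WorkStack.empty()) {
          if (Stop)
            break;
          Lock.unlock();
          std::this_thread::yield();
          continue;
        }
      } else {
        Cond.wait(Lock, [&] { return Stop || !WorkStack.empty(); });
        if (WorkStack.empty())
          break; // Woken by Stop with nothing left to run.
      }
      auto Task = std::move(WorkStack.top());
      WorkStack.pop();
      Lock.unlock();
      Task();
    }
  }

  std::mutex Mutex;
  std::condition_variable Cond;
  std::stack<std::function<void()>> WorkStack;
  bool Stop = false;
  bool BusyWait;
  std::thread Worker; // Declared last so the other members exist first.
};

struct RoundTripResult {
  int Ran;
  long long Micros;
};

// Submits N trivial tasks, waiting for each one before adding the next.
RoundTripResult timeRoundTrips(bool BusyWait, int N) {
  std::atomic<int> Ran{0};
  auto Start = std::chrono::steady_clock::now();
  {
    MiniExecutor Exec(BusyWait);
    for (int I = 0; I < N; ++I) {
      Exec.add([&Ran] { Ran.fetch_add(1); });
      while (Ran.load() <= I)
        std::this_thread::yield();
    }
  }
  auto End = std::chrono::steady_clock::now();
  assert(Ran.load() == N);
  return {Ran.load(),
          std::chrono::duration_cast<std::chrono::microseconds>(End - Start)
              .count()};
}
```

Calling `timeRoundTrips(false, N)` and `timeRoundTrips(true, N)` and comparing the `Micros` fields gives a crude idea of how much the condition variable wake-up costs per task on a given machine.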

Another example is when I stop calling `Task()` from `work()` and call it from `add()` instead.
There are 2 cases: in the first I commented out all the sync stuff, in the second I left the sync stuff as is:

  void add(std::function<void()> F) override {
    //std::unique_lock<std::mutex> Lock(Mutex); // COMMENTED OUT
    //WorkStack.push(F);                        // COMMENTED OUT
    //Lock.unlock();                            // COMMENTED OUT
    //Cond.notify_one();                        // COMMENTED OUT
    F();                                        // NEW LINE
  }

  void add(std::function<void()> F) override {
    std::unique_lock<std::mutex> Lock(Mutex);
    WorkStack.push(F);
    Lock.unlock();
    Cond.notify_one();
    F();                                        // NEW LINE
  }

In the first case (when all sync is commented out in `add()`) the link time is instant; in the second it takes some time,
even though all of the job is done in `add()` by `F()` anyway, which I believe means all of the time is spent on the sync stuff.
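
A reduced version of this second experiment might look like the sketch below. It is an approximation under my own assumptions, not the real executor: `timeInlineAdd` and `InlineResult` are hypothetical names, and the single worker here only pops and discards entries so that every task body runs on the calling thread. With `WithSync` set, each call also takes the mutex, pushes, and notifies the sleeping worker, so the difference between the two timings is the cost of the sync alone.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <stack>
#include <thread>

struct InlineResult {
  int Calls;
  long long Micros;
};

// Runs F() inline N times, optionally paying for lock + push + notify first.
InlineResult timeInlineAdd(bool WithSync, int N) {
  std::mutex Mutex;
  std::condition_variable Cond;
  std::stack<std::function<void()>> WorkStack;
  bool Stop = false;

  // Worker that sleeps on the condition variable and discards what it pops.
  std::thread Worker([&] {
    while (true) {
      std::unique_lock<std::mutex> Lock(Mutex);
      Cond.wait(Lock, [&] { return Stop || !WorkStack.empty(); });
      if (WorkStack.empty())
        break; // Woken by Stop with nothing left.
      WorkStack.pop();
    }
  });

  int Calls = 0; // Only touched by the calling thread.
  std::function<void()> F = [&Calls] { ++Calls; };

  auto Start = std::chrono::steady_clock::now();
  for (int I = 0; I < N; ++I) {
    if (WithSync) {
      std::unique_lock<std::mutex> Lock(Mutex);
      WorkStack.push(F);
      Lock.unlock();
      Cond.notify_one();
    }
    F(); // The work itself always happens here, as in both cases above.
  }
  auto End = std::chrono::steady_clock::now();

  {
    std::lock_guard<std::mutex> Lock(Mutex);
    Stop = true;
  }
  Cond.notify_all();
  Worker.join();

  assert(Calls == N);
  return {Calls,
          std::chrono::duration_cast<std::chrono::microseconds>(End - Start)
              .count()};
}
```

Comparing `timeInlineAdd(false, N).Micros` with `timeInlineAdd(true, N).Micros` for a large `N` shows how much of the time goes to the mutex and the notify rather than to `F()` itself.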


https://reviews.llvm.org/D36607




