[PATCH] D36607: [Support/Parallel] - Do not spawn thread for single/last task.

Fri Aug 11 03:52:19 PDT 2017

grimar created this revision.

We have an issue with one of the test cases in LLD; it is PR32942.
In short: the test case takes much longer to finish with threads
than without them.

I found that it does not reproduce on Windows, but the Linux implementation
is affected. The issue is caused by the case where we have about 65000 calls
of `for_each_n`, each with `Begin=0`, `End=1`.
The Windows implementation does not spawn any threads at all in that case;
see its internal implementation:

  \VC\include\ppl.h
  template <typename _Index_type, typename _Function, typename _Partitioner>
  void _Parallel_for_impl(_Index_type _First, _Index_type _Last, _Index_type _Step, const _Function& _Func, _Partitioner&& _Part)
  {
  .....
      if (_Range_size <= _Diff_step)
      {
          _Func(_First); // THIS BRANCH IS CALLED
      }
      else
      {
          _Parallel_for_partitioned_impl<_Index_type, _Diff_type, _Function>(_First, _Range_size, _Step, _Func, std::forward<_Partitioner>(_Part));
      }
  }

But the Linux implementation spawns about 65k threads, which is really slow and looks useless.
I suggest not creating threads for a single or last task; that fixes the issue I am observing.


https://reviews.llvm.org/D36607

Files:
  Parallel.h


Index: Parallel.h
===================================================================

--- Parallel.h
+++ Parallel.h
@@ -179,10 +179,8 @@
         Fn(J);
     });
   }
-  TG.spawn([=, &Fn] {
-    for (IndexTy J = I; J < End; ++J)
-      Fn(J);
-  });
+  for (IndexTy J = I; J < End; ++J)
+    Fn(J);
 }
 
 #endif

