[Openmp-commits] [PATCH] D11990: Lock-free start of serialized parallel regions

Andrey Churbanov via Openmp-commits openmp-commits at lists.llvm.org
Mon Aug 17 03:50:30 PDT 2015


AndreyChurbanov added inline comments.

================
Comment at: runtime/src/kmp_runtime.c:1740
@@ +1739,3 @@
+            if ( nthreads == 1 ) {
+                __kmp_release_bootstrap_lock( &__kmp_forkjoin_lock );
+            }
----------------
hfinkel wrote:
> Why do we only release the lock when nthreads == 1? Does __kmp_reserve_threads release it otherwise?
> 
> (I realize that you've only moved this line from down below, but this seems non-obvious)
> 
No, __kmp_reserve_threads does not release the lock.

Let me detail the rationale for the change:

Old code: the lock is always acquired at the beginning, then released on line 1756 for nthreads==1, or on line 2168 for nthreads>1 after a lot of multithread-sensitive actions have completed.

New code: the lock is skipped for simple 1-thread cases, but is still acquired for other 1-thread cases (e.g. when serial execution is caused by dynamic thread adjustment inside __kmp_reserve_threads). As a result, the lock release for the 1-thread case moved here, because a release at the old location can no longer apply to all 1-thread cases (some of them never take the lock). The multi-thread case releases the lock in the same place as before.
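
To illustrate the control flow described above, here is a minimal sketch. It is not the actual kmp_runtime.c code: toy_lock_t, toy_reserve_threads and toy_fork are hypothetical names, the lock only counts calls instead of blocking, and the "available" and "trivially_serial" parameters are assumptions standing in for the runtime's real checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the bootstrap lock: it only tracks state, it does not block. */
typedef struct {
    int held;          /* 1 while the lock is held */
    int acquisitions;  /* how many times the lock was taken */
} toy_lock_t;

static void toy_acquire(toy_lock_t *l) { assert(!l->held); l->held = 1; l->acquisitions++; }
static void toy_release(toy_lock_t *l) { assert(l->held); l->held = 0; }

/* Hypothetical stand-in for __kmp_reserve_threads: it may shrink the
 * request (down to 1) but never touches the lock. */
static int toy_reserve_threads(int requested, int available) {
    if (available < 1) available = 1;
    return requested <= available ? requested : available;
}

/* Models the new flow: returns the number of threads the region will use. */
static int toy_fork(toy_lock_t *lock, int requested, int available,
                    bool trivially_serial) {
    int nthreads;
    if (trivially_serial) {
        nthreads = 1;                 /* simple 1-thread case: lock never taken */
    } else {
        toy_acquire(lock);
        nthreads = toy_reserve_threads(requested, available);
        if (nthreads == 1) {
            toy_release(lock);        /* serialized after all: release right here */
        }
    }
    if (nthreads > 1) {
        /* ... multithread-sensitive team setup would happen here ... */
        toy_release(lock);            /* multi-thread case: released later, as before */
    }
    return nthreads;
}
```

Calling toy_fork with trivially_serial set leaves the acquisition counter untouched, which is the lock-free path the patch adds; the other two paths each take and release the lock exactly once.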

Performance result: a 10x or greater speedup for code like
  <long loop>
    #pragma omp parallel
      #pragma omp parallel
where the inner parallel regions are serialized by default because OpenMP nesting is disabled, and the number of threads in the outer region is large (e.g. 60 threads on Xeon Phi to keep all cores busy).
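
A compilable version of that pattern might look like the sketch below. The function name run_nested and its parameters are hypothetical, and the call to omp_set_max_active_levels(1) is an assumption that makes the "nesting disabled" default explicit. The counters exist only so the behavior can be checked; they are not part of the original benchmark.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Runs `iterations` outer parallel regions, each thread of which starts one
 * inner parallel region. Returns the number of inner-region executions and
 * stores the number of outer-thread entries in *outer_entries. */
static long run_nested(int iterations, int outer_threads, long *outer_entries) {
    long inner = 0;
    long outer = 0;
#ifdef _OPENMP
    omp_set_max_active_levels(1);   /* assumption: only one active level, so inner regions serialize */
#endif
    for (int i = 0; i < iterations; ++i) {          /* the "long loop" */
        #pragma omp parallel num_threads(outer_threads)
        {
            #pragma omp atomic
            outer++;
            /* Serialized inner region: a team of one thread per outer thread. */
            #pragma omp parallel
            {
                #pragma omp atomic
                inner++;
            }
        }
    }
    *outer_entries = outer;
    return inner;
}
```

Because every inner region runs with a single thread, the two counters should match. Timing a call such as run_nested(100000, 60, &n) before and after the patch is one way to observe the contention on the fork/join lock, though the actual numbers depend on the machine.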



Repository:
  rL LLVM

http://reviews.llvm.org/D11990




