[Openmp-commits] [openmp] r268640 - [STATS] Use partitioned timer scheme

Hal Finkel via Openmp-commits openmp-commits at lists.llvm.org
Wed May 11 18:00:34 PDT 2016


----- Original Message -----
> From: "Hal Finkel" <hfinkel at anl.gov>
> To: "Hal Finkel" <hfinkel at anl.gov>
> Cc: openmp-commits at lists.llvm.org, "Alexey Bataev" <a.bataev at hotmail.com>, "Jonathan Peyton"
> <jonathan.l.peyton at intel.com>
> Sent: Wednesday, May 11, 2016 7:56:14 PM
> Subject: Re: [Openmp-commits] [openmp] r268640 - [STATS] Use partitioned timer scheme
> 
> Hi again,
> 
> I found the problem... there was a presumably-unintentional change in
> this commit:
> 
>  static inline void __kmp_null_resume_wrapper(int gtid, volatile void *flag) {
> -    if (!flag) return;
> -
>      switch (((kmp_flag_64 *)flag)->get_type()) {
>      case flag32: __kmp_resume_32(gtid, __null); break;
>      case flag64: __kmp_resume_64(gtid, __null); break;
>      case flag_oncore: __kmp_resume_oncore(gtid, __null); break;
>      }
>  }
> 
> Restoring that check fixes the segfaulting tests... I'll fix it.

r269259.

 -Hal
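
For anyone following along, here is a minimal stand-alone sketch of why the restored check matters. It is not the libomp code: `sketch_flag` and `null_resume_wrapper_sketch` are hypothetical stand-ins, and the return value exists only so the behaviour can be observed. The point is simply that the type dispatch dereferences the pointer, so a null sleep location has to be filtered out first.

```cpp
#include <cassert>

// Hypothetical stand-ins for the runtime's flag types (not the libomp definitions).
enum flag_type { flag32, flag64, flag_oncore };

struct sketch_flag {
    flag_type type;
    flag_type get_type() const { return type; }
};

// Returns 1 if a resume would have been issued, 0 if the call was skipped.
inline int null_resume_wrapper_sketch(int gtid, volatile void *flag) {
    (void)gtid;           // unused in this sketch
    if (!flag) return 0;  // the early return that the commit dropped

    // Without the check above, this dereference is undefined for a null flag.
    switch (((sketch_flag *)flag)->get_type()) {
    case flag32:      return 1;
    case flag64:      return 1;
    case flag_oncore: return 1;
    }
    return 0;
}
```

Calling the sketch with a null pointer returns without touching the flag; calling it with a valid `sketch_flag` reaches the switch.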

> 
>  -Hal
> 
> ----- Original Message -----
> > From: "Hal Finkel via Openmp-commits"
> > <openmp-commits at lists.llvm.org>
> > To: "Jonathan Peyton" <jonathan.l.peyton at intel.com>
> > Cc: openmp-commits at lists.llvm.org, "Alexey Bataev"
> > <a.bataev at hotmail.com>
> > Sent: Wednesday, May 11, 2016 3:42:12 PM
> > Subject: Re: [Openmp-commits] [openmp] r268640 - [STATS] Use
> > partitioned timer	scheme
> > 
> > Hi Jonathan, et al.,
> > 
> > > It seems that my x86_64 nightly builds have been broken, starting on
> > > the morning of May 6th (8am UTC / 3am CT). The OpenMP regression
> > > tests fail with:
> > 
> > Failing Tests (7):
> >     libomp :: tasking/omp_task.c
> >     libomp :: tasking/omp_task_final.c
> >     libomp :: tasking/omp_task_imp_firstprivate.c
> >     libomp :: tasking/omp_task_private.c
> >     libomp :: tasking/omp_task_shared.c
> >     libomp :: tasking/omp_taskwait.c
> >     libomp :: tasking/omp_taskyield.c
> > 
> > > My last known-good revision was r268613. These tests are being run
> > > from a self-hosted *debug* build. Although they run alright when run
> > > individually, when I run several copies of the test in parallel, I
> > > start seeing segfaults. It is fairly deterministic once at least
> > > 8-10 tests are run in parallel.
> > 
> >     libomp :: tasking/kmp_taskloop.c
> > 
> > also sometimes fails.
> > 
> > There was only one commit made in the relevant time frame to the
> > runtime library (this one), and reverting this locally removes all
> > of the failures (except for libomp :: tasking/kmp_taskloop.c).
> > Unless this can be fixed quickly, we should revert this commit.
> > 
> > Thanks again,
> > Hal
> > 
> > ----- Original Message -----
> > > From: "Jonathan Peyton via Openmp-commits"
> > > <openmp-commits at lists.llvm.org>
> > > To: openmp-commits at lists.llvm.org
> > > Sent: Thursday, May 5, 2016 11:15:58 AM
> > > Subject: [Openmp-commits] [openmp] r268640 - [STATS] Use
> > > partitioned timer	scheme
> > > 
> > > Author: jlpeyton
> > > Date: Thu May  5 11:15:57 2016
> > > New Revision: 268640
> > > 
> > > URL: http://llvm.org/viewvc/llvm-project?rev=268640&view=rev
> > > Log:
> > > [STATS] Use partitioned timer scheme
> > > 
> > > > This change replaces the current timers with ones that partition time
> > > > properly. The current timers are nested, so that if a new timer, B,
> > > > starts when the current timer, A, is already timing, A's time will
> > > > include B's. To eliminate this problem, the partitioned timers are
> > > > designed to stop the current timer (A), let the new timer run (B), and
> > > > when the new timer is finished, restart the previously running timer
> > > > (A). With this partitioning of time, a thread's timers all sum up to
> > > > the OMP_worker_thread_life time and can now easily show the percentage
> > > > of time a thread is spending in different parts of the runtime or user
> > > > code.
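
The stop/run/restart scheme described in the log can be sketched as a small stack of timers. This is an illustration under stated assumptions, not the code from the commit: the class name, the `timer_id` values, and the explicit `now` tick argument are all hypothetical, and instead of reproducing the pause()/resume() bookkeeping it simply charges elapsed ticks to whichever timer is on top of the stack. The property it is meant to show is the one claimed above: the per-timer totals add up to the thread's whole lifetime.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical timer ids; the runtime generates its own enum from a macro list.
enum timer_id { T_LIFE, T_PARALLEL, T_BARRIER, T_COUNT };

class PartitionedTimersSketch {
    std::vector<timer_id> stack_;             // currently nested timers, innermost last
    std::uint64_t total_[T_COUNT] = {0, 0, 0};
    std::uint64_t last_switch_ = 0;           // tick at which the top timer last started running

    // Charge the ticks since the last switch to the timer on top of the stack.
    void charge(std::uint64_t now) {
        total_[stack_.back()] += now - last_switch_;
        last_switch_ = now;
    }

public:
    void init(timer_id first, std::uint64_t now) {
        assert(stack_.empty());
        stack_.push_back(first);
        last_switch_ = now;
    }
    // Stop the current timer (A) and let the new one (B) run.
    void push(timer_id t, std::uint64_t now) {
        assert(!stack_.empty());
        charge(now);
        stack_.push_back(t);
    }
    // Finish the current timer (B) and restart the previous one (A).
    void pop(std::uint64_t now) {
        assert(stack_.size() > 1);
        charge(now);
        stack_.pop_back();
    }
    // Close out whatever is still running; init() is needed before reuse.
    void windup(std::uint64_t now) {
        if (!stack_.empty()) {
            charge(now);
            stack_.clear();
        }
    }
    std::uint64_t total(timer_id t) const { return total_[t]; }
};
```

With init at tick 0, a push of T_PARALLEL at 10, a push of T_BARRIER at 25, pops at 40 and 50, and windup at 60, the totals come out as 20, 25 and 15 ticks, which sum to the 60 ticks of lifetime.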
> > > 
> > > > There is also a new state variable associated with each thread which
> > > > tells where it is executing a task. This corresponds with the timers
> > > > OMP_task_*, e.g., if time is spent in OMP_task_taskwait, then that
> > > > thread executed tasks inside a #pragma omp taskwait construct.
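
One way to picture the state-to-timer correspondence is a lookup from the thread state to the OMP_task_* timer that should be charged while a task runs. The sketch below is an assumption about how such a mapping could look, not the logic in kmp_tasking.c (that hunk is not shown here): the state names match the stats_state_e enum added by this commit, but `task_timer_e`, the function name, and the fallback to OMP_task_immediate for the remaining states are hypothetical.

```cpp
#include <cassert>

// State names as listed in the stats_state_e enum added by the commit.
enum stats_state_e {
    IDLE, SERIAL_REGION, FORK_JOIN_BARRIER, PLAIN_BARRIER,
    TASKWAIT, TASKYIELD, TASKGROUP, IMPLICIT_TASK, EXPLICIT_TASK
};

// Hypothetical enum for this sketch; the names mirror the OMP_task_* timers.
enum task_timer_e {
    OMP_task_immediate, OMP_task_taskwait, OMP_task_taskyield,
    OMP_task_taskgroup, OMP_task_join_bar, OMP_task_plain_bar
};

// Pick the task timer that matches where the thread was when it picked up a task.
inline task_timer_e task_timer_for_state_sketch(stats_state_e s) {
    assert(s >= IDLE && s <= EXPLICIT_TASK);
    switch (s) {
    case TASKWAIT:          return OMP_task_taskwait;
    case TASKYIELD:         return OMP_task_taskyield;
    case TASKGROUP:         return OMP_task_taskgroup;
    case FORK_JOIN_BARRIER: return OMP_task_join_bar;
    case PLAIN_BARRIER:     return OMP_task_plain_bar;
    default:                return OMP_task_immediate;  // assumption: everything else
    }
}
```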
> > > 
> > > > The changes are mostly changing the macros to use the new
> > > > PARTITIONED_* macros, the new partitionedTimers class and its methods,
> > > > and new state logic.
> > > 
> > > Differential Revision: http://reviews.llvm.org/D19229
> > > 
> > > Modified:
> > >     openmp/trunk/runtime/src/kmp_barrier.cpp
> > >     openmp/trunk/runtime/src/kmp_csupport.c
> > >     openmp/trunk/runtime/src/kmp_dispatch.cpp
> > >     openmp/trunk/runtime/src/kmp_runtime.c
> > >     openmp/trunk/runtime/src/kmp_sched.cpp
> > >     openmp/trunk/runtime/src/kmp_stats.cpp
> > >     openmp/trunk/runtime/src/kmp_stats.h
> > >     openmp/trunk/runtime/src/kmp_stats_timing.h
> > >     openmp/trunk/runtime/src/kmp_tasking.c
> > >     openmp/trunk/runtime/src/kmp_wait_release.h
> > >     openmp/trunk/runtime/src/z_Linux_util.c
> > > 
> > > Modified: openmp/trunk/runtime/src/kmp_barrier.cpp
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/kmp_barrier.cpp?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/kmp_barrier.cpp (original)
> > > +++ openmp/trunk/runtime/src/kmp_barrier.cpp Thu May  5 11:15:57
> > > 2016
> > > @@ -1048,6 +1048,8 @@ __kmp_barrier(enum barrier_type bt, int
> > >                void *reduce_data, void (*reduce)(void *, void *))
> > >  {
> > >      KMP_TIME_DEVELOPER_BLOCK(KMP_barrier);
> > > +    KMP_SET_THREAD_STATE_BLOCK(PLAIN_BARRIER);
> > > +    KMP_TIME_PARTITIONED_BLOCK(OMP_plain_barrier);
> > >      register int tid = __kmp_tid_from_gtid(gtid);
> > >      register kmp_info_t *this_thr = __kmp_threads[gtid];
> > >      register kmp_team_t *team = this_thr->th.th_team;
> > > @@ -1348,6 +1350,8 @@ __kmp_end_split_barrier(enum barrier_typ
> > >  void
> > >  __kmp_join_barrier(int gtid)
> > >  {
> > > +    KMP_TIME_PARTITIONED_BLOCK(OMP_fork_join_barrier);
> > > +    KMP_SET_THREAD_STATE_BLOCK(FORK_JOIN_BARRIER);
> > >      KMP_TIME_DEVELOPER_BLOCK(KMP_join_barrier);
> > >      register kmp_info_t *this_thr = __kmp_threads[gtid];
> > >      register kmp_team_t *team;
> > > @@ -1463,6 +1467,18 @@ __kmp_join_barrier(int gtid)
> > >              __kmp_task_team_wait(this_thr, team
> > >                                   USE_ITT_BUILD_ARG(itt_sync_obj)
> > >                                   );
> > >          }
> > > +#if KMP_STATS_ENABLED
> > > > +        // Have master thread flag the workers to indicate they are now waiting for
> > > > +        // next parallel region, Also wake them up so they switch their timers to idle.
> > > > +        for (int i=0; i<team->t.t_nproc; ++i) {
> > > > +            kmp_info_t* team_thread = team->t.t_threads[i];
> > > > +            if (team_thread == this_thr)
> > > > +                continue;
> > > > +            team_thread->th.th_stats->setIdleFlag();
> > > > +            if (__kmp_dflt_blocktime != KMP_MAX_BLOCKTIME && team_thread->th.th_sleep_loc != NULL)
> > > > +                __kmp_null_resume_wrapper(__kmp_gtid_from_thread(team_thread), team_thread->th.th_sleep_loc);
> > > +        }
> > > +#endif
> > >  #if USE_ITT_BUILD
> > >          if (__itt_sync_create_ptr || KMP_ITT_DEBUG)
> > >              __kmp_itt_barrier_middle(gtid, itt_sync_obj);
> > > @@ -1546,6 +1562,8 @@ __kmp_join_barrier(int gtid)
> > >  void
> > >  __kmp_fork_barrier(int gtid, int tid)
> > >  {
> > > +    KMP_TIME_PARTITIONED_BLOCK(OMP_fork_join_barrier);
> > > +    KMP_SET_THREAD_STATE_BLOCK(FORK_JOIN_BARRIER);
> > >      KMP_TIME_DEVELOPER_BLOCK(KMP_fork_barrier);
> > >      kmp_info_t *this_thr = __kmp_threads[gtid];
> > >      kmp_team_t *team = (tid == 0) ? this_thr->th.th_team : NULL;
> > > 
> > > Modified: openmp/trunk/runtime/src/kmp_csupport.c
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/kmp_csupport.c?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/kmp_csupport.c (original)
> > > +++ openmp/trunk/runtime/src/kmp_csupport.c Thu May  5 11:15:57
> > > 2016
> > > @@ -290,7 +290,6 @@ __kmpc_fork_call(ident_t *loc, kmp_int32
> > >    }
> > >    else
> > >    {
> > > -      KMP_STOP_EXPLICIT_TIMER(OMP_serial);
> > >        KMP_COUNT_BLOCK(OMP_PARALLEL);
> > >    }
> > >  #endif
> > > @@ -345,10 +344,6 @@ __kmpc_fork_call(ident_t *loc, kmp_int32
> > >      }
> > >  #endif
> > >    }
> > > -#if (KMP_STATS_ENABLED)
> > > -  if (!inParallel)
> > > -      KMP_START_EXPLICIT_TIMER(OMP_serial);
> > > -#endif
> > >  }
> > >  
> > >  #if OMP_40_ENABLED
> > > @@ -669,7 +664,6 @@ void
> > >  __kmpc_barrier(ident_t *loc, kmp_int32 global_tid)
> > >  {
> > >      KMP_COUNT_BLOCK(OMP_BARRIER);
> > > -    KMP_TIME_BLOCK(OMP_barrier);
> > >      KC_TRACE( 10, ("__kmpc_barrier: called T#%d\n", global_tid )
> > >      );
> > >  
> > >      if (! TCR_4(__kmp_init_parallel))
> > > @@ -713,7 +707,7 @@ __kmpc_master(ident_t *loc, kmp_int32 gl
> > >  
> > >      if( KMP_MASTER_GTID( global_tid )) {
> > >          KMP_COUNT_BLOCK(OMP_MASTER);
> > > -        KMP_START_EXPLICIT_TIMER(OMP_master);
> > > +        KMP_PUSH_PARTITIONED_TIMER(OMP_master);
> > >          status = 1;
> > >      }
> > >  
> > > @@ -763,7 +757,7 @@ __kmpc_end_master(ident_t *loc, kmp_int3
> > >      KC_TRACE( 10, ("__kmpc_end_master: called T#%d\n",
> > >      global_tid
> > >      )
> > >      );
> > >  
> > >      KMP_DEBUG_ASSERT( KMP_MASTER_GTID( global_tid ));
> > > -    KMP_STOP_EXPLICIT_TIMER(OMP_master);
> > > +    KMP_POP_PARTITIONED_TIMER();
> > >  
> > >  #if OMPT_SUPPORT && OMPT_TRACE
> > >      kmp_info_t  *this_thr        = __kmp_threads[ global_tid ];
> > > @@ -1088,7 +1082,7 @@ __kmpc_critical( ident_t * loc, kmp_int3
> > >      __kmpc_critical_with_hint(loc, global_tid, crit,
> > >      omp_lock_hint_none);
> > >  #else
> > >      KMP_COUNT_BLOCK(OMP_CRITICAL);
> > > -    KMP_TIME_BLOCK(OMP_critical_wait);        /* Time spent
> > > waiting
> > > to enter the critical section */
> > > +    KMP_TIME_PARTITIONED_BLOCK(OMP_critical_wait);        /*
> > > Time
> > > spent waiting to enter the critical section */
> > >      kmp_user_lock_p lck;
> > >  
> > >      KC_TRACE( 10, ("__kmpc_critical: called T#%d\n", global_tid
> > >      )
> > >      );
> > > @@ -1250,6 +1244,7 @@ __kmpc_critical_with_hint( ident_t * loc
> > >      __kmp_itt_critical_acquired( lck );
> > >  #endif /* USE_ITT_BUILD */
> > >  
> > > +    KMP_PUSH_PARTITIONED_TIMER(OMP_critical);
> > >      KA_TRACE( 15, ("__kmpc_critical: done T#%d\n", global_tid
> > >      ));
> > >  } // __kmpc_critical_with_hint
> > >  
> > > @@ -1342,7 +1337,7 @@ __kmpc_end_critical(ident_t *loc, kmp_in
> > >  #endif
> > >  
> > >  #endif // KMP_USE_DYNAMIC_LOCK
> > > -    KMP_STOP_EXPLICIT_TIMER(OMP_critical);
> > > +    KMP_POP_PARTITIONED_TIMER();
> > >      KA_TRACE( 15, ("__kmpc_end_critical: done T#%d\n",
> > >      global_tid
> > >      ));
> > >  }
> > >  
> > > @@ -1464,7 +1459,7 @@ __kmpc_single(ident_t *loc, kmp_int32 gl
> > >      if (rc) {
> > >          // We are going to execute the single statement, so we
> > >          should count it.
> > >          KMP_COUNT_BLOCK(OMP_SINGLE);
> > > -        KMP_START_EXPLICIT_TIMER(OMP_single);
> > > +        KMP_PUSH_PARTITIONED_TIMER(OMP_single);
> > >      }
> > >  
> > >  #if OMPT_SUPPORT && OMPT_TRACE
> > > @@ -1507,7 +1502,7 @@ void
> > >  __kmpc_end_single(ident_t *loc, kmp_int32 global_tid)
> > >  {
> > >      __kmp_exit_single( global_tid );
> > > -    KMP_STOP_EXPLICIT_TIMER(OMP_single);
> > > +    KMP_POP_PARTITIONED_TIMER();
> > >  
> > >  #if OMPT_SUPPORT && OMPT_TRACE
> > >      kmp_info_t *this_thr        = __kmp_threads[ global_tid ];
> > > 
> > > Modified: openmp/trunk/runtime/src/kmp_dispatch.cpp
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/kmp_dispatch.cpp?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/kmp_dispatch.cpp (original)
> > > +++ openmp/trunk/runtime/src/kmp_dispatch.cpp Thu May  5 11:15:57
> > > 2016
> > > @@ -1424,7 +1424,7 @@ __kmp_dispatch_next(
> > >      // This is potentially slightly misleading,
> > >      schedule(runtime)
> > >      will appear here even if the actual runtme schedule
> > >      // is static. (Which points out a disadavantage of
> > >      schedule(runtime): even when static scheduling is used it
> > >      costs
> > >      // more than a compile time choice to use static scheduling
> > >      would.)
> > > -    KMP_TIME_BLOCK(FOR_dynamic_scheduling);
> > > +    KMP_TIME_PARTITIONED_BLOCK(FOR_dynamic_scheduling);
> > >  
> > >      int                                   status;
> > >      dispatch_private_info_template< T > * pr;
> > > 
> > > Modified: openmp/trunk/runtime/src/kmp_runtime.c
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/kmp_runtime.c?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/kmp_runtime.c (original)
> > > +++ openmp/trunk/runtime/src/kmp_runtime.c Thu May  5 11:15:57
> > > 2016
> > > @@ -1543,7 +1543,8 @@ __kmp_fork_call(
> > >  #endif
> > >  
> > >              {
> > > -                KMP_TIME_BLOCK(OMP_work);
> > > +                KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
> > > +                KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
> > >                  __kmp_invoke_microtask( microtask, gtid, 0,
> > >                  argc,
> > >                  parent_team->t.t_argv
> > >  #if OMPT_SUPPORT
> > >                                          , exit_runtime_p
> > > @@ -1618,7 +1619,8 @@ __kmp_fork_call(
> > >                      gtid, parent_team->t.t_id,
> > >                      parent_team->t.t_pkfn
> > >                      ) );
> > >  
> > >          {
> > > -            KMP_TIME_BLOCK(OMP_work);
> > > +            KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
> > > +            KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
> > >              if (! parent_team->t.t_invoke( gtid )) {
> > >                  KMP_ASSERT2( 0, "cannot invoke microtask for
> > >                  MASTER
> > >                  thread" );
> > >              }
> > > @@ -1738,7 +1740,8 @@ __kmp_fork_call(
> > >  #endif
> > >  
> > >                  {
> > > -                    KMP_TIME_BLOCK(OMP_work);
> > > +                    KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
> > > +                    KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
> > >                      __kmp_invoke_microtask( microtask, gtid, 0,
> > >                      argc, parent_team->t.t_argv
> > >  #if OMPT_SUPPORT
> > >                          , exit_runtime_p
> > > @@ -1795,7 +1798,8 @@ __kmp_fork_call(
> > >                  team->t.t_level--;
> > >                  // AC: call special invoker for outer "parallel"
> > >                  of
> > >                  the teams construct
> > >                  {
> > > -                    KMP_TIME_BLOCK(OMP_work);
> > > +                    KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
> > > +                    KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
> > >                      invoker(gtid);
> > >                  }
> > >              } else {
> > > @@ -1842,7 +1846,8 @@ __kmp_fork_call(
> > >  #endif
> > >  
> > >                  {
> > > -                    KMP_TIME_BLOCK(OMP_work);
> > > +                    KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
> > > +                    KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
> > >                      __kmp_invoke_microtask( microtask, gtid, 0,
> > >                      argc, args
> > >  #if OMPT_SUPPORT
> > >                          , exit_runtime_p
> > > @@ -2178,7 +2183,8 @@ __kmp_fork_call(
> > >      }  // END of timer KMP_fork_call block
> > >  
> > >      {
> > > -        KMP_TIME_BLOCK(OMP_work);
> > > +        KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
> > > +        KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
> > >          // KMP_TIME_DEVELOPER_BLOCK(USER_master_invoke);
> > >          if (! team->t.t_invoke( gtid )) {
> > >              KMP_ASSERT2( 0, "cannot invoke microtask for MASTER
> > >              thread" );
> > > @@ -5448,6 +5454,8 @@ __kmp_launch_thread( kmp_info_t *this_th
> > >                  KMP_STOP_DEVELOPER_EXPLICIT_TIMER(USER_launch_thread_loop);
> > >                  {
> > >                      KMP_TIME_DEVELOPER_BLOCK(USER_worker_invoke);
> > > +                    KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
> > > +                    KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
> > >                      rc = (*pteam)->t.t_invoke( gtid );
> > >                  }
> > >                  KMP_START_DEVELOPER_EXPLICIT_TIMER(USER_launch_thread_loop);
> > > @@ -6783,7 +6791,8 @@ __kmp_invoke_task_func( int gtid )
> > >  #endif
> > >  
> > >      {
> > > -        KMP_TIME_BLOCK(OMP_work);
> > > +        KMP_TIME_PARTITIONED_BLOCK(OMP_parallel);
> > > +        KMP_SET_THREAD_STATE_BLOCK(IMPLICIT_TASK);
> > >          rc = __kmp_invoke_microtask( (microtask_t)
> > >          TCR_SYNC_PTR(team->t.t_pkfn),
> > >                                       gtid, tid, (int)
> > >                                       team->t.t_argc, (void **)
> > >                                       team->t.t_argv
> > >  #if OMPT_SUPPORT
> > > 
> > > Modified: openmp/trunk/runtime/src/kmp_sched.cpp
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/kmp_sched.cpp?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/kmp_sched.cpp (original)
> > > +++ openmp/trunk/runtime/src/kmp_sched.cpp Thu May  5 11:15:57
> > > 2016
> > > @@ -84,7 +84,7 @@ __kmp_for_static_init(
> > >      typename traits_t< T >::signed_t  chunk
> > >  ) {
> > >      KMP_COUNT_BLOCK(OMP_FOR_static);
> > > -    KMP_TIME_BLOCK (FOR_static_scheduling);
> > > +    KMP_TIME_PARTITIONED_BLOCK(FOR_static_scheduling);
> > >  
> > >      typedef typename traits_t< T >::unsigned_t  UT;
> > >      typedef typename traits_t< T >::signed_t    ST;
> > > 
> > > Modified: openmp/trunk/runtime/src/kmp_stats.cpp
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/kmp_stats.cpp?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/kmp_stats.cpp (original)
> > > +++ openmp/trunk/runtime/src/kmp_stats.cpp Thu May  5 11:15:57
> > > 2016
> > > @@ -157,6 +157,7 @@ std::string statistic::format(char unit,
> > >  
> > >  void explicitTimer::start(timer_e timerEnumValue) {
> > >      startTime = tsc_tick_count::now();
> > > +    totalPauseTime = 0;
> > >      if(timeStat::logEvent(timerEnumValue)) {
> > >          __kmp_stats_thread_ptr->incrementNestValue();
> > >      }
> > > @@ -170,7 +171,7 @@ void explicitTimer::stop(timer_e timerEn
> > >      tsc_tick_count finishTime = tsc_tick_count::now();
> > >  
> > > >      //stat->addSample ((tsc_tick_count::now() - startTime).ticks());
> > > > -    stat->addSample ((finishTime - startTime).ticks());
> > > > +    stat->addSample(((finishTime - startTime) - totalPauseTime).ticks());
> > >  
> > >      if(timeStat::logEvent(timerEnumValue)) {
> > >          __kmp_stats_thread_ptr->push_event(startTime.getValue()
> > >          -
> > >          __kmp_stats_start_time.getValue(), finishTime.getValue()
> > >          -
> > >          __kmp_stats_start_time.getValue(),
> > >          __kmp_stats_thread_ptr->getNestValue(), timerEnumValue);
> > > @@ -182,6 +183,74 @@ void explicitTimer::stop(timer_e timerEn
> > >      return;
> > >  }
> > >  
> > > +/*
> > > **************************************************************
> > > */
> > > +/* ************* partitionedTimers member functions
> > > *************
> > > */
> > > +partitionedTimers::partitionedTimers() {
> > > +    timer_stack.reserve(8);
> > > +}
> > > +
> > > +// add a timer to this collection of partitioned timers.
> > > +void partitionedTimers::add_timer(explicit_timer_e timer_index,
> > > explicitTimer* timer_pointer) {
> > > +    KMP_DEBUG_ASSERT((int)timer_index <
> > > (int)EXPLICIT_TIMER_LAST+1);
> > > +    timers[timer_index] = timer_pointer;
> > > +}
> > > +
> > > +// initialize the paritioned timers to an initial timer
> > > +void partitionedTimers::init(timerPair init_timer_pair) {
> > > +    KMP_DEBUG_ASSERT(this->timer_stack.size() == 0);
> > > +    timer_stack.push_back(init_timer_pair);
> > > +
> > >    timers[init_timer_pair.get_index()]->start(init_timer_pair.get_timer());
> > > +}
> > > +
> > > +// stop/save the current timer, and start the new timer
> > > (timer_pair)
> > > +// There is a special condition where if the current timer is
> > > equal
> > > to
> > > +// the one you are trying to push, then it only manipulates the
> > > stack,
> > > +// and it won't stop/start the currently running timer.
> > > +void partitionedTimers::push(timerPair timer_pair) {
> > > +    // get the current timer
> > > +    // stop current timer
> > > +    // push new timer
> > > +    // start the new timer
> > > +    KMP_DEBUG_ASSERT(this->timer_stack.size() > 0);
> > > +    timerPair current_timer = timer_stack.back();
> > > +    timer_stack.push_back(timer_pair);
> > > +    if(current_timer != timer_pair) {
> > > +        timers[current_timer.get_index()]->pause();
> > > +
> > >        timers[timer_pair.get_index()]->start(timer_pair.get_timer());
> > > +    }
> > > +}
> > > +
> > > +// stop/discard the current timer, and start the previously
> > > saved
> > > timer
> > > +void partitionedTimers::pop() {
> > > +    // get the current timer
> > > +    // stop current timer
> > > +    // pop current timer
> > > +    // get the new current timer and start it back up
> > > +    KMP_DEBUG_ASSERT(this->timer_stack.size() > 1);
> > > +    timerPair current_timer = timer_stack.back();
> > > +    timer_stack.pop_back();
> > > +    timerPair new_timer = timer_stack.back();
> > > +    if(current_timer != new_timer) {
> > > +
> > >        timers[current_timer.get_index()]->stop(current_timer.get_timer());
> > > +        timers[new_timer.get_index()]->resume();
> > > +    }
> > > +}
> > > +
> > > +// Wind up all the currently running timers.
> > > +// This pops off all the timers from the stack and clears the
> > > stack
> > > +// After this is called, init() must be run again to initialize
> > > the
> > > +// stack of timers
> > > +void partitionedTimers::windup() {
> > > +    while(timer_stack.size() > 1) {
> > > +        this->pop();
> > > +    }
> > > +    if(timer_stack.size() > 0) {
> > > +        timerPair last_timer = timer_stack.back();
> > > +        timer_stack.pop_back();
> > > +
> > >        timers[last_timer.get_index()]->stop(last_timer.get_timer());
> > > +    }
> > > +}
> > > +
> > >  /*
> > >  *******************************************************************
> > >  */
> > >  /* ************* kmp_stats_event_vector member functions
> > >  ************* */
> > >  
> > > @@ -397,8 +466,10 @@ void kmp_stats_output_module::windupExpl
> > > >      // If the timer wasn't running, this won't record anything anyway.
> > > >      kmp_stats_list::iterator it;
> > > >      for(it = __kmp_stats_list.begin(); it != __kmp_stats_list.end(); it++) {
> > > > +        kmp_stats_list* ptr = *it;
> > > > +        ptr->getPartitionedTimers()->windup();
> > > >          for (int timer=0; timer<EXPLICIT_TIMER_LAST; timer++) {
> > > > -            (*it)->getExplicitTimer(explicit_timer_e(timer))->stop((timer_e)timer);
> > > > +            ptr->getExplicitTimer(explicit_timer_e(timer))->stop((timer_e)timer);
> > >          }
> > >      }
> > >  }
> > > @@ -595,11 +666,7 @@ void __kmp_reset_stats()
> > >  
> > >          // reset the event vector so all previous events are
> > >          "erased"
> > >          (*it)->resetEventVector();
> > > -
> > > -        // May need to restart the explicit timers in thread
> > > zero?
> > >      }
> > > -    KMP_START_EXPLICIT_TIMER(OMP_serial);
> > > -    KMP_START_EXPLICIT_TIMER(OMP_start_end);
> > >  }
> > >  
> > >  // This function will reset all stats and stop all threads'
> > >  explicit
> > >  timers if they haven't been stopped already.
> > > 
> > > Modified: openmp/trunk/runtime/src/kmp_stats.h
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/kmp_stats.h?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/kmp_stats.h (original)
> > > +++ openmp/trunk/runtime/src/kmp_stats.h Thu May  5 11:15:57 2016
> > > @@ -27,6 +27,7 @@
> > >  
> > >  #include <limits>
> > >  #include <math.h>
> > > +#include <vector>
> > >  #include <string>
> > >  #include <stdint.h>
> > >  #include <new> // placement new
> > > @@ -52,6 +53,23 @@ enum stats_flags_e {
> > >  };
> > >  
> > >  /*!
> > > + * @ingroup STATS_GATHERING
> > > + * \brief the states which a thread can be in
> > > + *
> > > + */
> > > +enum stats_state_e {
> > > +    IDLE,
> > > +    SERIAL_REGION,
> > > +    FORK_JOIN_BARRIER,
> > > +    PLAIN_BARRIER,
> > > +    TASKWAIT,
> > > +    TASKYIELD,
> > > +    TASKGROUP,
> > > +    IMPLICIT_TASK,
> > > +    EXPLICIT_TASK
> > > +};
> > > +
> > > +/*!
> > >   * \brief Add new counters under KMP_FOREACH_COUNTER() macro in
> > >   kmp_stats.h
> > >   *
> > >   * @param macro a user defined macro that takes three arguments
> > >   -
> > >   macro(COUNTER_NAME, flags, arg)
> > > @@ -103,18 +121,25 @@ enum stats_flags_e {
> > >   *
> > >   * @ingroup STATS_GATHERING2
> > >   */
> > > > -#define KMP_FOREACH_TIMER(macro, arg)                                               \
> > > > -    macro (OMP_start_end, stats_flags_e::onlyInMaster | stats_flags_e::noTotal, arg) \
> > > > -    macro (OMP_serial,    stats_flags_e::onlyInMaster | stats_flags_e::noTotal, arg) \
> > > > -    macro (OMP_work,      0, arg)                                                   \
> > > > -    macro (OMP_barrier,   0, arg)                                                   \
> > > > -    macro (FOR_static_scheduling, 0, arg)                                           \
> > > > -    macro (FOR_dynamic_scheduling, 0, arg)                                          \
> > > > -    macro (OMP_task,      0, arg)                                                   \
> > > > -    macro (OMP_critical,  0, arg)                                                   \
> > > > -    macro (OMP_critical_wait,  0, arg)                                              \
> > > > -    macro (OMP_single,    0, arg)                                                   \
> > > > -    macro (OMP_master,    0, arg)                                                   \
> > > > +#define KMP_FOREACH_TIMER(macro, arg)              \
> > > > +    macro (OMP_worker_thread_life, 0, arg)         \
> > > > +    macro (FOR_static_scheduling, 0, arg)          \
> > > > +    macro (FOR_dynamic_scheduling, 0, arg)         \
> > > > +    macro (OMP_critical,  0, arg)                  \
> > > > +    macro (OMP_critical_wait,  0, arg)             \
> > > > +    macro (OMP_single,    0, arg)                  \
> > > > +    macro (OMP_master,    0, arg)                  \
> > > > +    macro (OMP_idle, 0, arg)                       \
> > > > +    macro (OMP_plain_barrier, 0, arg)              \
> > > > +    macro (OMP_fork_join_barrier, 0, arg)          \
> > > > +    macro (OMP_parallel, 0, arg)                   \
> > > > +    macro (OMP_task_immediate, 0, arg)             \
> > > > +    macro (OMP_task_taskwait, 0, arg)              \
> > > > +    macro (OMP_task_taskyield, 0, arg)             \
> > > > +    macro (OMP_task_taskgroup, 0, arg)             \
> > > > +    macro (OMP_task_join_bar, 0, arg)              \
> > > > +    macro (OMP_task_plain_bar, 0, arg)             \
> > > > +    macro (OMP_serial, 0, arg)                     \
> > > >      macro (OMP_set_numthreads,    stats_flags_e::noUnits | stats_flags_e::noTotal, arg) \
> > > >      macro (OMP_PARALLEL_args,     stats_flags_e::noUnits | stats_flags_e::noTotal, arg) \
> > > >      macro (FOR_static_iterations, stats_flags_e::noUnits | stats_flags_e::noTotal, arg) \
> > > @@ -129,7 +154,16 @@ enum stats_flags_e {
> > > >  // OMP_barrier            -- Time at "real" barriers (includes task time)
> > > >  // FOR_static_scheduling  -- Time spent doing scheduling for a static "for"
> > > >  // FOR_dynamic_scheduling -- Time spent doing scheduling for a dynamic "for"
> > > > -// OMP_task               -- Time spent executing tasks
> > > > +// OMP_idle               -- Worker threads time spent waiting for inclusion in a parallel region
> > > > +// OMP_plain_barrier      -- Time spent in a barrier construct
> > > > +// OMP_fork_join_barrier  -- Time spent in a the fork-join barrier surrounding a parallel region
> > > > +// OMP_parallel           -- Time spent inside a parallel construct
> > > > +// OMP_task_immediate     -- Time spent executing non-deferred tasks
> > > > +// OMP_task_taskwait      -- Time spent executing tasks inside a taskwait construct
> > > > +// OMP_task_taskyield     -- Time spent executing tasks inside a taskyield construct
> > > > +// OMP_task_taskgroup     -- Time spent executing tasks inside a taskygroup construct
> > > > +// OMP_task_join_bar      -- Time spent executing tasks inside a join barrier
> > > > +// OMP_task_plain_bar     -- Time spent executing tasks inside a barrier construct
> > > >  // OMP_single             -- Time spent executing a "single" region
> > > >  // OMP_master             -- Time spent executing a "master" region
> > > >  // OMP_set_numthreads     -- Values passed to omp_set_num_threads
> > > @@ -197,12 +231,25 @@ enum stats_flags_e {
> > >   *
> > >   * @ingroup STATS_GATHERING
> > >  */
> > > -#define KMP_FOREACH_EXPLICIT_TIMER(macro, arg)          \
> > > -    macro(OMP_serial, 0, arg)                           \
> > > -    macro(OMP_start_end, 0, arg)                        \
> > > -    macro(OMP_critical, 0, arg)                         \
> > > -    macro(OMP_single, 0, arg)                           \
> > > -    macro(OMP_master, 0, arg)                           \
> > > +#define KMP_FOREACH_EXPLICIT_TIMER(macro, arg)     \
> > > +    macro(OMP_worker_thread_life, 0, arg)          \
> > > +    macro(FOR_static_scheduling, 0, arg)           \
> > > +    macro(FOR_dynamic_scheduling, 0, arg)          \
> > > +    macro(OMP_critical, 0, arg)                    \
> > > +    macro(OMP_critical_wait, 0, arg)               \
> > > +    macro(OMP_single, 0, arg)                      \
> > > +    macro(OMP_master, 0, arg)                      \
> > > +    macro(OMP_idle, 0, arg)                        \
> > > +    macro(OMP_plain_barrier, 0, arg)               \
> > > +    macro(OMP_fork_join_barrier, 0, arg)           \
> > > +    macro(OMP_parallel, 0, arg)                    \
> > > +    macro(OMP_task_immediate, 0, arg)              \
> > > +    macro(OMP_task_taskwait, 0, arg)               \
> > > +    macro(OMP_task_taskyield, 0, arg)              \
> > > +    macro(OMP_task_taskgroup, 0, arg)              \
> > > +    macro(OMP_task_join_bar, 0, arg)               \
> > > +    macro(OMP_task_plain_bar, 0, arg)              \
> > > +    macro(OMP_serial, 0, arg)                      \
> > >      KMP_FOREACH_EXPLICIT_DEVELOPER_TIMER(macro,arg)     \
> > >      macro(LAST, 0, arg)
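
The list above is an X-macro: the same list is expanded several times with different `macro` arguments to keep an enum and its parallel tables in sync. Below is a minimal sketch of that pattern. Every `DEMO_`/`demo_` name is hypothetical, and the three-entry list only stands in for KMP_FOREACH_EXPLICIT_TIMER.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical timer list, standing in for KMP_FOREACH_EXPLICIT_TIMER.
#define DEMO_FOREACH_TIMER(macro, arg) \
    macro(OMP_idle, 0, arg)            \
    macro(OMP_parallel, 0, arg)        \
    macro(OMP_serial, 0, arg)          \
    macro(LAST, 0, arg)

// First expansion: one enumerator per list entry (DEMO_TIMER_OMP_idle, ...).
#define DEMO_ENUMERATE(name, ignore, prefix) prefix##name,
enum demo_timer_e { DEMO_FOREACH_TIMER(DEMO_ENUMERATE, DEMO_TIMER_) };
#undef DEMO_ENUMERATE

// Second expansion: a table of printable names in the same order.
#define DEMO_NAME(name, ignore1, ignore2) #name,
static const char *const demo_timer_names[] = { DEMO_FOREACH_TIMER(DEMO_NAME, 0) };
#undef DEMO_NAME

// The LAST entry doubles as the number of real timers.
static_assert(DEMO_TIMER_LAST == 3, "three real timers precede LAST");

// Look a timer up by name; returns DEMO_TIMER_LAST when it is not found.
static int demo_timer_index(const char *name) {
    assert(name != nullptr);
    for (int i = 0; i < DEMO_TIMER_LAST; ++i)
        if (std::strcmp(demo_timer_names[i], name) == 0) return i;
    return DEMO_TIMER_LAST;
}
```

Adding a timer then means touching only the list; the enum and the name table follow automatically.
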
> > >  
> > > @@ -227,6 +274,21 @@ enum counter_e {
> > >  };
> > >  #undef ENUMERATE
> > >  
> > > +class timerPair {
> > > +    explicit_timer_e timer_index;
> > > +    timer_e timer;
> > > + public:
> > > +    timerPair(explicit_timer_e ti, timer_e t) : timer_index(ti), timer(t) {}
> > > +    inline explicit_timer_e get_index() const { return timer_index; }
> > > +    inline timer_e get_timer() const { return timer; }
> > > +    bool operator==(const timerPair & rhs) {
> > > +        return this->get_index() == rhs.get_index();
> > > +    }
> > > +    bool operator!=(const timerPair & rhs) {
> > > +        return !(*this == rhs);
> > > +    }
> > > +};
> > > +
> > >  class statistic
> > >  {
> > >      double   minVal;
> > > @@ -294,15 +356,19 @@ class explicitTimer
> > >  {
> > >      timeStat * stat;
> > >      tsc_tick_count startTime;
> > > +    tsc_tick_count pauseStartTime;
> > > +    tsc_tick_count::tsc_interval_t totalPauseTime;
> > >  
> > >   public:
> > > -    explicitTimer () : stat(0), startTime(0) { }
> > > -    explicitTimer (timeStat * s) : stat(s), startTime() { }
> > > +    explicitTimer () : stat(0), startTime(0), pauseStartTime(0), totalPauseTime() { }
> > > +    explicitTimer (timeStat * s) : stat(s), startTime(), pauseStartTime(0), totalPauseTime() { }
> > >  
> > >      void setStat (timeStat *s) { stat = s; }
> > >      void start(timer_e timerEnumValue);
> > > +    void pause() { pauseStartTime = tsc_tick_count::now(); }
> > > +    void resume() { totalPauseTime += (tsc_tick_count::now() - pauseStartTime); }
> > >      void stop(timer_e timerEnumValue);
> > > -    void reset() { startTime = 0; }
> > > +    void reset() { startTime = 0; pauseStartTime = 0; totalPauseTime = 0; }
> > >  };
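
A minimal sketch of the pause/resume bookkeeping added to explicitTimer above. It assumes that stop() reports wall-clock ticks minus paused ticks, which this hunk does not show. `demoPausableTimer`, `demo_now` and the manually advanced `demo_ticks` clock are hypothetical stand-ins for explicitTimer and tsc_tick_count::now().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical manual clock so the example is deterministic.
static int64_t demo_ticks = 0;
static int64_t demo_now() { return demo_ticks; }

class demoPausableTimer {
    int64_t startTime = 0;
    int64_t pauseStartTime = 0;
    int64_t totalPauseTime = 0;
public:
    void start() { startTime = demo_now(); totalPauseTime = 0; }
    void pause() { pauseStartTime = demo_now(); }
    void resume() {
        assert(demo_now() >= pauseStartTime);
        totalPauseTime += demo_now() - pauseStartTime;
    }
    // Assumption: the reported interval excludes every paused stretch.
    int64_t stop() const { return (demo_now() - startTime) - totalPauseTime; }
};

// Runs 100..200 with a pause over 130..170, so 40 of the 100 ticks are excluded.
static int64_t demo_paused_interval() {
    demoPausableTimer t;
    demo_ticks = 100; t.start();
    demo_ticks = 130; t.pause();
    demo_ticks = 170; t.resume();
    demo_ticks = 200;
    return t.stop();
}
```
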
> > >  
> > >  // Where all you need is to time a block, this is enough.
> > > @@ -315,6 +381,49 @@ class blockTimer : public explicitTimer
> > >      ~blockTimer() { stop(timerEnumValue); }
> > >  };
> > >  
> > > +// Where you need to partition a thread's clock ticks into separate states
> > > +// e.g., a partitionedTimers class with two timers of EXECUTING_TASK, and
> > > +//   DOING_NOTHING would render these conditions:
> > > +//   time(EXECUTING_TASK) + time(DOING_NOTHING) = total time thread is alive
> > > +//   No clock tick in the EXECUTING_TASK is a member of DOING_NOTHING and vice versa
> > > +class partitionedTimers
> > > +{
> > > + private:
> > > +    explicitTimer* timers[EXPLICIT_TIMER_LAST+1];
> > > +    std::vector<timerPair> timer_stack;
> > > + public:
> > > +    partitionedTimers();
> > > +    void add_timer(explicit_timer_e timer_index, explicitTimer* timer_pointer);
> > > +    void init(timerPair timer_index);
> > > +    void push(timerPair timer_index);
> > > +    void pop();
> > > +    void windup();
> > > +};
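
A minimal sketch of the partitioning invariant described in the comment above: every tick is charged to exactly one timer, the one on top of the stack. It is not the patch's implementation. It charges elapsed ticks at each switch instead of pausing and resuming explicitTimer objects, and `demoPartitionedTimers`, `demo_clock` and the two timer indices are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

static int64_t demo_clock = 0;  // hypothetical manually advanced clock

class demoPartitionedTimers {
    std::vector<int64_t> totals;  // accumulated ticks per timer index
    std::vector<int> stack;       // active timer is stack.back()
    int64_t lastSwitch = 0;

    // Charge the ticks since the last switch to the currently active timer.
    void charge_top() {
        if (!stack.empty()) totals[stack.back()] += demo_clock - lastSwitch;
        lastSwitch = demo_clock;
    }
public:
    explicit demoPartitionedTimers(int n) : totals(n, 0) {}
    void init(int timer) { stack.assign(1, timer); lastSwitch = demo_clock; }
    void push(int timer) { charge_top(); stack.push_back(timer); }
    void pop() { charge_top(); if (stack.size() > 1) stack.pop_back(); }
    void windup() { charge_top(); stack.clear(); }
    int64_t total(int timer) const {
        assert(timer >= 0 && timer < static_cast<int>(totals.size()));
        return totals[timer];
    }
};

enum { DEMO_EXECUTING_TASK = 0, DEMO_DOING_NOTHING = 1 };

// Thread lives for 50 ticks and executes a task during ticks 10..35.
static bool demo_partition_holds() {
    demoPartitionedTimers pt(2);
    demo_clock = 0;  pt.init(DEMO_DOING_NOTHING);
    demo_clock = 10; pt.push(DEMO_EXECUTING_TASK);
    demo_clock = 35; pt.pop();
    demo_clock = 50; pt.windup();
    return pt.total(DEMO_EXECUTING_TASK) == 25 &&
           pt.total(DEMO_DOING_NOTHING) == 25 &&
           pt.total(DEMO_EXECUTING_TASK) + pt.total(DEMO_DOING_NOTHING) == 50;
}
```
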
> > > +
> > > +// Special wrapper around the partitioned timers to aid timing code blocks
> > > +// It avoids the need to have an explicit end, leaving the scope suffices.
> > > +class blockPartitionedTimer
> > > +{
> > > +    partitionedTimers* part_timers;
> > > +    timerPair timer_pair;
> > > + public:
> > > +    blockPartitionedTimer(partitionedTimers* pt, timerPair tp) : part_timers(pt), timer_pair(tp) { part_timers->push(timer_pair); }
> > > +   ~blockPartitionedTimer() { part_timers->pop(); }
> > > +};
> > > +
> > > +// Special wrapper around the thread state to aid in keeping state in code blocks
> > > +// It avoids the need to have an explicit end, leaving the scope suffices.
> > > +class blockThreadState
> > > +{
> > > +    stats_state_e* state_pointer;
> > > +    stats_state_e  old_state;
> > > + public:
> > > +    blockThreadState(stats_state_e* thread_state_pointer, stats_state_e new_state) : state_pointer(thread_state_pointer), old_state(*thread_state_pointer) {
> > > +        *state_pointer = new_state;
> > > +    }
> > > +   ~blockThreadState() { *state_pointer = old_state;  }
> > > +};
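
A minimal sketch of the scope-guard idea behind blockThreadState: the constructor saves the old state and installs the new one, and the destructor restores it, so nested blocks unwind in order. `demoBlockState` and the `DEMO_*` states are hypothetical. Deleting the copy operations is an addition of this sketch, not something the patch does.

```cpp
#include <cassert>

enum demo_state_e { DEMO_IDLE, DEMO_TASKWAIT, DEMO_TASKGROUP };

class demoBlockState {
    demo_state_e *state_pointer;
    demo_state_e old_state;
public:
    demoBlockState(demo_state_e *p, demo_state_e new_state)
        : state_pointer(p), old_state(*p) { *state_pointer = new_state; }
    ~demoBlockState() { *state_pointer = old_state; }
    // Copying a guard would restore the old state twice, so forbid it here.
    demoBlockState(const demoBlockState &) = delete;
    demoBlockState &operator=(const demoBlockState &) = delete;
};

// Enters a taskgroup state, nests a taskwait inside it, and returns the
// state seen once both guards have gone out of scope.
static demo_state_e demo_nested_states() {
    demo_state_e state = DEMO_IDLE;
    {
        demoBlockState outer(&state, DEMO_TASKGROUP);
        assert(state == DEMO_TASKGROUP);
        {
            demoBlockState inner(&state, DEMO_TASKWAIT);
            assert(state == DEMO_TASKWAIT);
        }
        assert(state == DEMO_TASKGROUP);  // inner guard restored the outer state
    }
    return state;
}
```
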
> > > +
> > >  // If all you want is a count, then you can use this...
> > >  // The individual per-thread counts will be aggregated into a statistic at program exit.
> > >  class counter
> > > @@ -473,14 +582,19 @@ class kmp_stats_list {
> > >      timeStat      _timers[TIMER_LAST+1];
> > >      counter       _counters[COUNTER_LAST+1];
> > >      explicitTimer _explicitTimers[EXPLICIT_TIMER_LAST+1];
> > > +    partitionedTimers _partitionedTimers;
> > >      int           _nestLevel; // one per thread
> > >      kmp_stats_event_vector _event_vector;
> > >      kmp_stats_list* next;
> > >      kmp_stats_list* prev;
> > > +    stats_state_e state;
> > > +    int thread_is_idle_flag;
> > >   public:
> > > -    kmp_stats_list() : next(this) , prev(this) , _event_vector(), _nestLevel(0) {
> > > +    kmp_stats_list() : _nestLevel(0), _event_vector(), next(this), prev(this),
> > > +      state(IDLE), thread_is_idle_flag(0) {
> > >  #define doInit(name,ignore1,ignore2) \
> > > -        getExplicitTimer(EXPLICIT_TIMER_##name)->setStat(getTimer(TIMER_##name));
> > > +        getExplicitTimer(EXPLICIT_TIMER_##name)->setStat(getTimer(TIMER_##name)); \
> > > +        _partitionedTimers.add_timer(EXPLICIT_TIMER_##name, getExplicitTimer(EXPLICIT_TIMER_##name));
> > >          KMP_FOREACH_EXPLICIT_TIMER(doInit,0);
> > >  #undef doInit
> > >      }
> > > @@ -488,6 +602,7 @@ class kmp_stats_list {
> > >      inline timeStat *      getTimer(timer_e idx)                  { return &_timers[idx]; }
> > >      inline counter  *      getCounter(counter_e idx)              { return &_counters[idx]; }
> > >      inline explicitTimer * getExplicitTimer(explicit_timer_e idx) { return &_explicitTimers[idx]; }
> > > +    inline partitionedTimers * getPartitionedTimers()             { return &_partitionedTimers; }
> > >      inline timeStat *      getTimers()                            { return _timers; }
> > >      inline counter  *      getCounters()                          { return _counters; }
> > >      inline explicitTimer * getExplicitTimers()                    { return _explicitTimers; }
> > > @@ -498,6 +613,12 @@ class kmp_stats_list {
> > >      inline void decrementNestValue()             { _nestLevel--; }
> > >      inline int  getGtid() const                  { return gtid; }
> > >      inline void setGtid(int newgtid)             { gtid = newgtid; }
> > > +    inline void setState(stats_state_e newstate) { state = newstate; }
> > > +    inline stats_state_e getState() const        { return state; }
> > > +    inline stats_state_e * getStatePointer()     { return &state; }
> > > +    inline bool  isIdle()                        { return thread_is_idle_flag==1; }
> > > +    inline void setIdleFlag()                    { thread_is_idle_flag = 1; }
> > > +    inline void resetIdleFlag()                  { thread_is_idle_flag = 0; }
> > >      kmp_stats_list* push_back(int gtid); // returns newly created list node
> > >      inline void     push_event(uint64_t start_time, uint64_t stop_time, int nest_level, timer_e name) {
> > >          _event_vector.push_back(start_time, stop_time, nest_level, name);
> > > @@ -699,6 +820,35 @@ extern kmp_stats_output_module __kmp_sta
> > >      __kmp_output_stats(heading_string)
> > >  
> > >  /*!
> > > + * \brief Initializes the partitioned timers to begin with name.
> > > + *
> > > + * @param name timer which you want this thread to begin with
> > > + *
> > > + * @ingroup STATS_GATHERING
> > > +*/
> > > +#define KMP_INIT_PARTITIONED_TIMERS(name) \
> > > +    __kmp_stats_thread_ptr->getPartitionedTimers()->init(timerPair(EXPLICIT_TIMER_##name, TIMER_##name))
> > > +
> > > +#define KMP_TIME_PARTITIONED_BLOCK(name) \
> > > +    blockPartitionedTimer __PBLOCKTIME__(__kmp_stats_thread_ptr->getPartitionedTimers(), \
> > > +        timerPair(EXPLICIT_TIMER_##name, TIMER_##name))
> > > +
> > > +#define KMP_PUSH_PARTITIONED_TIMER(name) \
> > > +    __kmp_stats_thread_ptr->getPartitionedTimers()->push(timerPair(EXPLICIT_TIMER_##name, TIMER_##name))
> > > +
> > > +#define KMP_POP_PARTITIONED_TIMER() \
> > > +    __kmp_stats_thread_ptr->getPartitionedTimers()->pop()
> > > +
> > > +#define KMP_SET_THREAD_STATE(state_name) \
> > > +    __kmp_stats_thread_ptr->setState(state_name)
> > > +
> > > +#define KMP_GET_THREAD_STATE() \
> > > +    __kmp_stats_thread_ptr->getState()
> > > +
> > > +#define KMP_SET_THREAD_STATE_BLOCK(state_name) \
> > > +    blockThreadState __BTHREADSTATE__(__kmp_stats_thread_ptr->getStatePointer(), state_name)
> > > +
> > > +/*!
> > >   * \brief resets all stats (counters to 0, timers to 0 elapsed ticks)
> > >   *
> > >   * \details Reset all stats for all threads.
> > > @@ -739,6 +889,13 @@ extern kmp_stats_output_module __kmp_sta
> > >  #define KMP_COUNT_DEVELOPER_BLOCK(n)            ((void)0)
> > >  #define KMP_START_DEVELOPER_EXPLICIT_TIMER(n)   ((void)0)
> > >  #define KMP_STOP_DEVELOPER_EXPLICIT_TIMER(n)    ((void)0)
> > > +#define KMP_INIT_PARTITIONED_TIMERS(name)       ((void)0)
> > > +#define KMP_TIME_PARTITIONED_BLOCK(name)        ((void)0)
> > > +#define KMP_PUSH_PARTITIONED_TIMER(name)        ((void)0)
> > > +#define KMP_POP_PARTITIONED_TIMER()             ((void)0)
> > > +#define KMP_SET_THREAD_STATE(state_name)        ((void)0)
> > > +#define KMP_GET_THREAD_STATE()                  ((void)0)
> > > +#define KMP_SET_THREAD_STATE_BLOCK(state_name)  ((void)0)
> > >  #endif  // KMP_STATS_ENABLED
> > >  
> > >  #endif // KMP_STATS_H
> > > 
> > > Modified: openmp/trunk/runtime/src/kmp_stats_timing.h
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/kmp_stats_timing.h?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/kmp_stats_timing.h (original)
> > > +++ openmp/trunk/runtime/src/kmp_stats_timing.h Thu May  5 11:15:57 2016
> > > @@ -40,11 +40,16 @@ class tsc_tick_count {
> > >  #endif
> > >          double ticks() const { return double(value); }
> > >          int64_t getValue() const { return value; }
> > > +        tsc_interval_t& operator=(int64_t nvalue) { value = nvalue; return *this; }
> > >  
> > >          friend class tsc_tick_count;
> > >  
> > > -        friend tsc_interval_t operator-(
> > > -        const tsc_tick_count t1, const tsc_tick_count t0);
> > > +        friend tsc_interval_t operator-(const tsc_tick_count& t1,
> > > +                                        const tsc_tick_count& t0);
> > > +        friend tsc_interval_t operator-(const tsc_tick_count::tsc_interval_t& i1,
> > > +                                        const tsc_tick_count::tsc_interval_t& i0);
> > > +        friend tsc_interval_t& operator+=(tsc_tick_count::tsc_interval_t& i1,
> > > +                                         const tsc_tick_count::tsc_interval_t& i0);
> > >      };
> > >  
> > >  #if KMP_HAVE___BUILTIN_READCYCLECOUNTER
> > > @@ -66,14 +71,25 @@ class tsc_tick_count {
> > >      static double tick_time(); // returns seconds per cycle (period) of clock
> > >  #endif
> > >      static tsc_tick_count now() { return tsc_tick_count(); } // returns the rdtsc register value
> > > -    friend tsc_tick_count::tsc_interval_t operator-(const tsc_tick_count t1, const tsc_tick_count t0);
> > > +    friend tsc_tick_count::tsc_interval_t operator-(const tsc_tick_count& t1, const tsc_tick_count& t0);
> > >  };
> > >  
> > > -inline tsc_tick_count::tsc_interval_t operator-(const tsc_tick_count t1, const tsc_tick_count t0)
> > > +inline tsc_tick_count::tsc_interval_t operator-(const tsc_tick_count& t1, const tsc_tick_count& t0)
> > >  {
> > >      return tsc_tick_count::tsc_interval_t( t1.my_count-t0.my_count );
> > >  }
> > >  
> > > +inline tsc_tick_count::tsc_interval_t operator-(const tsc_tick_count::tsc_interval_t& i1, const tsc_tick_count::tsc_interval_t& i0)
> > > +{
> > > +    return tsc_tick_count::tsc_interval_t( i1.value-i0.value );
> > > +}
> > > +
> > > +inline tsc_tick_count::tsc_interval_t& operator+=(tsc_tick_count::tsc_interval_t& i1, const tsc_tick_count::tsc_interval_t& i0)
> > > +{
> > > +    i1.value += i0.value;
> > > +    return i1;
> > > +}
> > > +
> > >  #if KMP_HAVE_TICK_TIME
> > >  inline double tsc_tick_count::tsc_interval_t::seconds() const
> > >  {
> > > 
> > > Modified: openmp/trunk/runtime/src/kmp_tasking.c
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/kmp_tasking.c?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/kmp_tasking.c (original)
> > > +++ openmp/trunk/runtime/src/kmp_tasking.c Thu May  5 11:15:57 2016
> > > @@ -36,16 +36,6 @@ static int  __kmp_realloc_task_threads_d
> > >  static void __kmp_bottom_half_finish_proxy( kmp_int32 gtid, kmp_task_t * ptask );
> > >  #endif
> > >  
> > > -static inline void __kmp_null_resume_wrapper(int gtid, volatile void *flag) {
> > > -    if (!flag) return;
> > > -    // Attempt to wake up a thread: examine its type and call appropriate template
> > > -    switch (((kmp_flag_64 *)flag)->get_type()) {
> > > -    case flag32: __kmp_resume_32(gtid, NULL); break;
> > > -    case flag64: __kmp_resume_64(gtid, NULL); break;
> > > -    case flag_oncore: __kmp_resume_oncore(gtid, NULL); break;
> > > -    }
> > > -}
> > > -
> > >  #ifdef BUILD_TIED_TASK_STACK
> > >  
> > >  //---------------------------------------------------------------------------
> > > @@ -1207,8 +1197,17 @@ __kmp_invoke_task( kmp_int32 gtid, kmp_t
> > >      // Thunks generated by gcc take a different argument list.
> > >      //
> > >      if (!discard) {
> > > +#if KMP_STATS_ENABLED
> > >          KMP_COUNT_BLOCK(TASK_executed);
> > > -        KMP_TIME_BLOCK (OMP_task);
> > > +        switch(KMP_GET_THREAD_STATE()) {
> > > +         case FORK_JOIN_BARRIER: KMP_PUSH_PARTITIONED_TIMER(OMP_task_join_bar); break;
> > > +         case PLAIN_BARRIER:     KMP_PUSH_PARTITIONED_TIMER(OMP_task_plain_bar); break;
> > > +         case TASKYIELD:         KMP_PUSH_PARTITIONED_TIMER(OMP_task_taskyield); break;
> > > +         case TASKWAIT:          KMP_PUSH_PARTITIONED_TIMER(OMP_task_taskwait); break;
> > > +         case TASKGROUP:         KMP_PUSH_PARTITIONED_TIMER(OMP_task_taskgroup); break;
> > > +         default:                KMP_PUSH_PARTITIONED_TIMER(OMP_task_immediate); break;
> > > +        }
> > > +#endif // KMP_STATS_ENABLED
> > >  #endif // OMP_40_ENABLED
> > >  
> > >  #if OMPT_SUPPORT && OMPT_TRACE
> > > @@ -1231,6 +1230,7 @@ __kmp_invoke_task( kmp_int32 gtid, kmp_t
> > >          {
> > >              (*(task->routine))(gtid, task);
> > >          }
> > > +        KMP_POP_PARTITIONED_TIMER();
> > >  
> > >  #if OMPT_SUPPORT && OMPT_TRACE
> > >          /* let OMPT know that we're returning to the callee task */
> > > @@ -1369,6 +1369,7 @@ kmp_int32
> > >  __kmpc_omp_task( ident_t *loc_ref, kmp_int32 gtid, kmp_task_t * new_task)
> > >  {
> > >      kmp_int32 res;
> > > +    KMP_SET_THREAD_STATE_BLOCK(EXPLICIT_TASK);
> > >  
> > >  #if KMP_DEBUG
> > >      kmp_taskdata_t * new_taskdata = KMP_TASK_TO_TASKDATA(new_task);
> > > @@ -1392,6 +1393,7 @@ __kmpc_omp_taskwait( ident_t *loc_ref, k
> > >      kmp_taskdata_t * taskdata;
> > >      kmp_info_t * thread;
> > >      int thread_finished = FALSE;
> > > +    KMP_SET_THREAD_STATE_BLOCK(TASKWAIT);
> > >  
> > >      KA_TRACE(10, ("__kmpc_omp_taskwait(enter): T#%d loc=%p\n", gtid, loc_ref) );
> > >  
> > > @@ -1481,6 +1483,7 @@ __kmpc_omp_taskyield( ident_t *loc_ref,
> > >      int thread_finished = FALSE;
> > >  
> > >      KMP_COUNT_BLOCK(OMP_TASKYIELD);
> > > +    KMP_SET_THREAD_STATE_BLOCK(TASKYIELD);
> > >  
> > >      KA_TRACE(10, ("__kmpc_omp_taskyield(enter): T#%d loc=%p end_part = %d\n",
> > >                    gtid, loc_ref, end_part) );
> > > @@ -1561,6 +1564,7 @@ __kmpc_end_taskgroup( ident_t* loc, int
> > >  
> > >      KA_TRACE(10, ("__kmpc_end_taskgroup(enter): T#%d loc=%p\n", gtid, loc) );
> > >      KMP_DEBUG_ASSERT( taskgroup != NULL );
> > > +    KMP_SET_THREAD_STATE_BLOCK(TASKGROUP);
> > >  
> > >      if ( __kmp_tasking_mode != tskm_immediate_exec ) {
> > >  #if USE_ITT_BUILD
> > > 
> > > Modified: openmp/trunk/runtime/src/kmp_wait_release.h
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/kmp_wait_release.h?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/kmp_wait_release.h (original)
> > > +++ openmp/trunk/runtime/src/kmp_wait_release.h Thu May  5 11:15:57 2016
> > > @@ -18,6 +18,7 @@
> > >  
> > >  #include "kmp.h"
> > >  #include "kmp_itt.h"
> > > +#include "kmp_stats.h"
> > >  
> > >  /*!
> > >  @defgroup WAIT_RELEASE Wait/Release operations
> > > @@ -104,6 +105,9 @@ __kmp_wait_template(kmp_info_t *this_thr
> > >      }
> > >      th_gtid = this_thr->th.th_info.ds.ds_gtid;
> > >      KA_TRACE(20, ("__kmp_wait_sleep: T#%d waiting for flag(%p)\n", th_gtid, flag));
> > > +#if KMP_STATS_ENABLED
> > > +    stats_state_e thread_state = KMP_GET_THREAD_STATE();
> > > +#endif
> > >  
> > >  #if OMPT_SUPPORT && OMPT_BLAME
> > >      ompt_state_t ompt_state = this_thr->th.ompt_thread_info.state;
> > > @@ -223,6 +227,15 @@ __kmp_wait_template(kmp_info_t *this_thr
> > >              }
> > >          }
> > >  
> > > +#if KMP_STATS_ENABLED
> > > +        // Check if thread has been signalled to idle state
> > > +        // This indicates that the logical "join-barrier" has finished
> > > +        if (this_thr->th.th_stats->isIdle() && KMP_GET_THREAD_STATE() == FORK_JOIN_BARRIER) {
> > > +            KMP_SET_THREAD_STATE(IDLE);
> > > +            KMP_PUSH_PARTITIONED_TIMER(OMP_idle);
> > > +        }
> > > +#endif
> > > +
> > >          // Don't suspend if KMP_BLOCKTIME is set to "infinite"
> > >          if (__kmp_dflt_blocktime == KMP_MAX_BLOCKTIME)
> > >              continue;
> > > @@ -273,6 +286,14 @@ __kmp_wait_template(kmp_info_t *this_thr
> > >          }
> > >      }
> > >  #endif
> > > +#if KMP_STATS_ENABLED
> > > +    // If we were put into idle state, pop that off the state stack
> > > +    if (KMP_GET_THREAD_STATE() == IDLE) {
> > > +        KMP_POP_PARTITIONED_TIMER();
> > > +        KMP_SET_THREAD_STATE(thread_state);
> > > +        this_thr->th.th_stats->resetIdleFlag();
> > > +    }
> > > +#endif
> > >  
> > >      KMP_FSYNC_SPIN_ACQUIRED(spin);
> > >  }
> > > @@ -556,6 +577,15 @@ public:
> > >      flag_type get_ptr_type() { return flag_oncore; }
> > >  };
> > >  
> > > +// Used to wake up threads, volatile void* flag is usually the th_sleep_loc associated
> > > +// with int gtid.
> > > +static inline void __kmp_null_resume_wrapper(int gtid, volatile void *flag) {
> > > +    switch (((kmp_flag_64 *)flag)->get_type()) {
> > > +    case flag32: __kmp_resume_32(gtid, NULL); break;
> > > +    case flag64: __kmp_resume_64(gtid, NULL); break;
> > > +    case flag_oncore: __kmp_resume_oncore(gtid, NULL); break;
> > > +    }
> > > +}
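
The moved wrapper above no longer has the `if (!flag) return;` guard that the removed kmp_tasking.c copy had, so a NULL flag is dereferenced by get_type(). Below is a minimal sketch of the guarded form with hypothetical stand-in types. `demo_flag`, `demo_resume` and the `demo_flag*` enumerators are not runtime names, and a single `demo_resume` replaces the three __kmp_resume_* variants.

```cpp
#include <cassert>

enum demo_flag_type { demo_flag32, demo_flag64, demo_flag_oncore };

struct demo_flag {
    demo_flag_type type;
    demo_flag_type get_type() const { return type; }
};

static int demo_resume_calls = 0;
static void demo_resume(int /*gtid*/) { ++demo_resume_calls; }

static void demo_null_resume_wrapper(int gtid, volatile void *flag) {
    if (!flag) return;  // nothing to wake; dereferencing NULL here would crash
    switch (((demo_flag *)flag)->get_type()) {
    case demo_flag32:      demo_resume(gtid); break;
    case demo_flag64:      demo_resume(gtid); break;
    case demo_flag_oncore: demo_resume(gtid); break;
    }
}

// Calls the wrapper once with NULL and once with a real flag, and returns
// how many resume calls were issued in total.
static int demo_wake_calls() {
    demo_resume_calls = 0;
    demo_null_resume_wrapper(0, nullptr);
    assert(demo_resume_calls == 0);  // NULL flag is ignored
    demo_flag f = { demo_flag64 };
    demo_null_resume_wrapper(0, &f);
    return demo_resume_calls;
}
```
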
> > >  
> > >  /*!
> > >  @}
> > > 
> > > Modified: openmp/trunk/runtime/src/z_Linux_util.c
> > > URL:
> > > http://llvm.org/viewvc/llvm-project/openmp/trunk/runtime/src/z_Linux_util.c?rev=268640&r1=268639&r2=268640&view=diff
> > > ==============================================================================
> > > --- openmp/trunk/runtime/src/z_Linux_util.c (original)
> > > +++ openmp/trunk/runtime/src/z_Linux_util.c Thu May  5 11:15:57 2016
> > > @@ -697,6 +697,9 @@ __kmp_launch_worker( void *thr )
> > >  #if KMP_STATS_ENABLED
> > >      // set __thread local index to point to thread-specific stats
> > >      __kmp_stats_thread_ptr = ((kmp_info_t*)thr)->th.th_stats;
> > > +    KMP_START_EXPLICIT_TIMER(OMP_worker_thread_life);
> > > +    KMP_SET_THREAD_STATE(IDLE);
> > > +    KMP_INIT_PARTITIONED_TIMERS(OMP_idle);
> > >  #endif
> > >  
> > >  #if USE_ITT_BUILD
> > > @@ -972,8 +975,9 @@ __kmp_create_worker( int gtid, kmp_info_
> > >          __kmp_stats_start_time = tsc_tick_count::now();
> > >          __kmp_stats_thread_ptr = th->th.th_stats;
> > >          __kmp_stats_init();
> > > -        KMP_START_EXPLICIT_TIMER(OMP_serial);
> > > -        KMP_START_EXPLICIT_TIMER(OMP_start_end);
> > > +        KMP_START_EXPLICIT_TIMER(OMP_worker_thread_life);
> > > +        KMP_SET_THREAD_STATE(SERIAL_REGION);
> > > +        KMP_INIT_PARTITIONED_TIMERS(OMP_serial);
> > >      }
> > >      __kmp_release_tas_lock(&__kmp_stats_lock, gtid);
> > >  
> > > @@ -1856,6 +1860,7 @@ void __kmp_resume_oncore(int target_gtid
> > >  void
> > >  __kmp_resume_monitor()
> > >  {
> > > +    KMP_TIME_DEVELOPER_BLOCK(USER_resume);
> > >      int status;
> > >  #ifdef KMP_DEBUG
> > >      int gtid = TCR_4(__kmp_init_gtid) ? __kmp_get_gtid() : -1;
> > > 
> > > 
> > > _______________________________________________
> > > Openmp-commits mailing list
> > > Openmp-commits at lists.llvm.org
> > > http://lists.llvm.org/cgi-bin/mailman/listinfo/openmp-commits
> > > 
> > 
> > --
> > Hal Finkel
> > Assistant Computational Scientist
> > Leadership Computing Facility
> > Argonne National Laboratory
> > 
> 
> 

-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

