[Openmp-dev] 100% CPU usage when threads can't steal tasks

Adhityaa Chandrasekar via Openmp-dev openmp-dev at lists.llvm.org
Thu Sep 21 23:22:06 PDT 2017


On Fri, Sep 22, 2017 at 2:21 AM, Jeff Hammond <jeff.science at gmail.com>
wrote:

> You might find
> https://software.intel.com/en-us/forums/intel-many-integrated-core/topic/556869
> relevant.
>

Thanks for that link. I agree that time to solution is a more important
metric than CPU time. However, a thread spinning at 100% CPU is detrimental
to other, unrelated processes on the same machine.

Is there anything else you want me to notice in that thread?


>
> I consider idling in the OpenMP runtime to be a sign of a badly behaved
> application, not a runtime bug in need of fixing.  But in any case,
> OMP_WAIT_POLICY/KMP_BLOCKTIME exist to address that.
>

OMP_WAIT_POLICY and KMP_BLOCKTIME are indeed designed to solve that.
However, I don't think the clang OpenMP runtime is respecting the
OMP_WAIT_POLICY value: I set `export OMP_WAIT_POLICY=passive` and
compiled the example program with g++ and clang++. The g++ binary used
virtually no CPU, while the clang++ binary had three threads at 100%
(three because OMP_NUM_THREADS=4 and only one thread is executing a task;
the other three are idle).

Is this intended? Are there any other parameters to tweak?
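
For anyone who wants to try this, here is a minimal sketch of the kind of
program I mean. It is not the program from the bug report; the function
name `cpu_to_wall_ratio` and the sleep-based task are stand-ins chosen for
illustration. One thread runs a single long task while the rest of the team
waits at the barrier, and the function reports process CPU time divided by
wall-clock time. It uses only pragmas, so it also builds (serially) without
-fopenmp.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <thread>

// Hypothetical reproducer: one long-running task, the other threads idle.
// Returns (process CPU time) / (wall-clock time) for the parallel region.
double cpu_to_wall_ratio(int task_ms) {
  assert(task_ms > 0);
  const auto wall_start = std::chrono::steady_clock::now();
  const std::clock_t cpu_start = std::clock();

#pragma omp parallel
  {
#pragma omp single
    {
#pragma omp task
      {
        // Stand-in for real work; sleeping itself burns no CPU, so any
        // CPU time measured comes from threads waiting in the runtime.
        std::this_thread::sleep_for(std::chrono::milliseconds(task_ms));
      }
    }
  }  // implicit barrier: threads without a task wait here

  const std::clock_t cpu_end = std::clock();
  const auto wall_end = std::chrono::steady_clock::now();

  const double cpu_s =
      static_cast<double>(cpu_end - cpu_start) / CLOCKS_PER_SEC;
  const double wall_s =
      std::chrono::duration<double>(wall_end - wall_start).count();
  return cpu_s / wall_s;
}
```

Calling `cpu_to_wall_ratio(5000)` from `main` with OMP_NUM_THREADS=4 should
give a ratio near 0 if the idle threads sleep, and a ratio approaching 3 if
three of them spin for the whole task. This assumes std::clock() reports
CPU time summed over all threads of the process, as it does on Linux; I
have not verified the numbers for this exact program.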

In any case, I think a default of "temporarily active for a while before
switching to passive" instead of "always active unless the user manually
overrides" is saner, no?
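
To make that proposed default concrete, here is a rough sketch of "spin for
a while, then go passive" using plain C++ threads. This is not how libomp
implements its wait loop; the class name `SpinThenBlockFlag` and the
`spin_budget` parameter are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Illustrative one-shot flag with a hybrid wait policy.
class SpinThenBlockFlag {
 public:
  // Polls the flag for up to `spin_budget`, then sleeps until released.
  // Returns true if the flag was observed during the spin phase,
  // false if the caller had to fall back to the blocking path.
  bool wait(std::chrono::milliseconds spin_budget) {
    assert(spin_budget.count() >= 0);
    const auto deadline = std::chrono::steady_clock::now() + spin_budget;
    while (std::chrono::steady_clock::now() < deadline) {
      if (done_.load(std::memory_order_acquire)) return true;
      std::this_thread::yield();  // active phase: burns CPU, low latency
    }
    // Passive phase: give up the CPU until release() notifies us.
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return done_.load(std::memory_order_acquire); });
    return false;
  }

  // Sets the flag under the mutex, then wakes every sleeping waiter.
  void release() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_.store(true, std::memory_order_release);
    }
    cv_.notify_all();
  }

 private:
  std::atomic<bool> done_{false};
  std::mutex mutex_;
  std::condition_variable cv_;
};
```

The flag is set while holding the mutex so that a waiter cannot check it,
miss the update, and then sleep forever. Waking all sleepers with
notify_all() is similar in spirit to the FUTEX_WAKE for INT_MAX threads
mentioned in the quoted message below.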


>
> Best,
>
> Jeff
>
> On Thu, Sep 21, 2017 at 1:11 PM, Adhityaa Chandrasekar via Openmp-dev <
> openmp-dev at lists.llvm.org> wrote:
>
>> Hi,
>>
>> My name is Adhityaa and I'm an undergrad student. This is my first time
>> getting involved in OpenMP development at LLVM. I have used the library a
>> bunch of times before, but I haven't been involved in the behind-the-scenes
>> work before.
>>
>> So I was just taking a look at
>> https://bugs.llvm.org/show_bug.cgi?id=33543 and I tried solving it (it
>> looked like a simple, reproducible bug that seemed quite important). First
>> I ran strace on the program to see exactly what was happening: I found that
>> the threads without any tasks were issuing a stream of `sched_yield()`
>> calls, driving the CPU to 100%.
>>
>> Then I tried going into the code. I came up with a fairly simple patch
>> that almost solved the issue:
>>
>> --- kmp_wait_release.h (revision 313888)
>> +++ kmp_wait_release.h (working copy)
>> @@ -188,6 +188,7 @@
>>    // Main wait spin loop
>>    while (flag->notdone_check()) {
>>      int in_pool;
>> +    int executed_tasks = 1;
>>      kmp_task_team_t *task_team = NULL;
>>      if (__kmp_tasking_mode != tskm_immediate_exec) {
>>        task_team = this_thr->th.th_task_team;
>> @@ -200,10 +201,11 @@
>>           disabled (KMP_TASKING=0).  */
>>        if (task_team != NULL) {
>>          if (TCR_SYNC_4(task_team->tt.tt_active)) {
>> -          if (KMP_TASKING_ENABLED(task_team))
>> -            flag->execute_tasks(
>> +          if (KMP_TASKING_ENABLED(task_team)) {
>> +            executed_tasks = flag->execute_tasks(
>>                  this_thr, th_gtid, final_spin,
>>                  &tasks_completed USE_ITT_BUILD_ARG(itt_sync_obj), 0);
>> +          }
>>            else
>>              this_thr->th.th_reap_state = KMP_SAFE_TO_REAP;
>>          } else {
>> @@ -269,7 +271,7 @@
>>        continue;
>>
>>      // Don't suspend if there is a likelihood of new tasks being spawned.
>> -    if ((task_team != NULL) && TCR_4(task_team->tt.tt_found_tasks))
>> +    if ((task_team != NULL) && TCR_4(task_team->tt.tt_found_tasks) && executed_tasks)
>>        continue;
>>
>>  #if KMP_USE_MONITOR
>>
>> Alas, this led to a deadlock: the thread went into a futex wait and was
>> never woken again. So I looked at how GCC does things, and it issues a
>> FUTEX_WAKE_PRIVATE for INT_MAX threads at the end of things. This should
>> solve the problem, I think?
>>
>> Anyway, it's pretty late here and I've been at this for 6-7 hours
>> straight, so I'm really tired. I was just wondering if I could get some
>> help on how to proceed from here (and whether I'm even on the right track).
>>
>> Thanks,
>> Adhityaa
>>
>
>
> --
> Jeff Hammond
> jeff.science at gmail.com
> http://jeffhammond.github.io/
>