[Openmp-dev] [kmp_tasking] segfault while benchmarking on AArch64
Paul Osmialowski via Openmp-dev
openmp-dev at lists.llvm.org
Thu Feb 18 15:25:32 PST 2016
I investigated this case further and found that the bug does not occur
when the environment variable KMP_TASKING=1 is set. That setting, however,
hurts performance, so I worked out the following workaround:
--- a/runtime/src/kmp_tasking.c
+++ b/runtime/src/kmp_tasking.c
@@ -1650,7 +1650,7 @@ __kmp_steal_task( kmp_info_t *victim, kmp_int32 gtid, kmp_task_team_t *task_team
 gtid, __kmp_gtid_from_thread( victim ), task_team,
 victim_td->td.td_deque_ntasks,
 victim_td->td.td_deque_head,
 victim_td->td.td_deque_tail) );
- if ( (TCR_4(victim_td -> td.td_deque_ntasks) == 0) || // Caller should not check this condition
+ if ( (TCR_4(victim_td -> td.td_deque_ntasks) <= 1) || // Caller should not check this condition
 (TCR_PTR(victim->th.th_task_team) != task_team)) // GEH: why would this happen?
 {
 KA_TRACE(10, ("__kmp_steal_task(exit #1): T#%d could not steal from T#%d: task_team=%p "
@@ -1663,7 +1663,7 @@ __kmp_steal_task( kmp_info_t *victim, kmp_int32 gtid, kmp_task_team_t *task_team
 __kmp_acquire_bootstrap_lock( & victim_td -> td.td_deque_lock );
 // Check again after we acquire the lock
- if ( (TCR_4(victim_td -> td.td_deque_ntasks) == 0) ||
+ if ( (TCR_4(victim_td -> td.td_deque_ntasks) <= 1) ||
 (TCR_PTR(victim->th.th_task_team) != task_team)) // GEH: why would this happen?
 {
 __kmp_release_bootstrap_lock( & victim_td -> td.td_deque_lock );
This patch avoids stealing from a thread that has fewer than two waiting
tasks. For some reason, stealing the only task from a victim causes the
crash on non-x86(_64) architectures.
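
To make the changed condition easier to follow outside the runtime, here is
a minimal, self-contained sketch of the same double-checked guard. It is not
the libomp code: toy_deque_t, toy_deque_init, toy_deque_push, toy_steal and
TOY_DEQUE_CAP are hypothetical names invented for illustration, tasks are
plain ints, and a pthread mutex plus a C11 atomic counter stand in for the
bootstrap lock and the TCR_4 read. It only mirrors the "<= 1" test from the
patch; it does not explain or reproduce the underlying bug.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define TOY_DEQUE_CAP 8 /* hypothetical capacity, for illustration only */

/* Hypothetical, simplified stand-in for a per-thread task deque. */
typedef struct {
    pthread_mutex_t lock;     /* plays the role of td_deque_lock */
    int tasks[TOY_DEQUE_CAP]; /* task ids instead of task pointers */
    int head;                 /* index of the oldest task (steal end) */
    atomic_int ntasks;        /* plays the role of td_deque_ntasks */
} toy_deque_t;

static void toy_deque_init(toy_deque_t *d) {
    pthread_mutex_init(&d->lock, NULL);
    d->head = 0;
    atomic_init(&d->ntasks, 0);
}

/* Owner side: append one task at the tail. */
static void toy_deque_push(toy_deque_t *d, int task) {
    pthread_mutex_lock(&d->lock);
    int n = atomic_load(&d->ntasks);
    assert(n < TOY_DEQUE_CAP);
    d->tasks[(d->head + n) % TOY_DEQUE_CAP] = task;
    atomic_store(&d->ntasks, n + 1);
    pthread_mutex_unlock(&d->lock);
}

/* Thief side: returns 1 and stores the task in *out, or 0 if refused. */
static int toy_steal(toy_deque_t *victim, int *out) {
    /* Cheap unlocked pre-check: leave a victim's last task alone. */
    if (atomic_load(&victim->ntasks) <= 1)
        return 0;
    pthread_mutex_lock(&victim->lock);
    /* Check again after acquiring the lock; the count may have changed. */
    int n = atomic_load(&victim->ntasks);
    if (n <= 1) {
        pthread_mutex_unlock(&victim->lock);
        return 0;
    }
    *out = victim->tasks[victim->head];
    victim->head = (victim->head + 1) % TOY_DEQUE_CAP;
    atomic_store(&victim->ntasks, n - 1);
    pthread_mutex_unlock(&victim->lock);
    return 1;
}
```

In this sketch, a victim holding two tasks gives up the older one on the
first toy_steal call and refuses the second call, because only one task is
left at that point.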
Best regards,
Paul
>> Hello,
>>
>> I'm experiencing a crash while using LLVM libomp from git master (SHA
>> 306381bbdaf32d784487bbaf021c4acf68d68553). I'm running the taskwait_bench
>> benchmark from OpenMPTaskBench-1.0 on a SoftIron board with a 64-bit AMD
>> Opteron A1100 ARMv8 CPU:
>>
>> *********************************************************
>> ** Running OpenMP benchmark on 6 thread(s) **
>> ** delaylength = 500, tasks per test = 10101 **
>> *********************************************************
>>
>>
>> *****************************************
>> Running tests for tasks
>> using "taskwait" clause.
>>
>> Branching factor = 100
>> Task graph depth = 3
>> Total taskwaits generated = 10101
>> *****************************************
>>
>>
>> ****************************************************************
>> Computing TASKWAIT reference time
>> ****************************************************************
>> Sample Size Mean Min Max S.D. Outliers
>> 25 1.9861 1.2388 3.6929 0.8457 0
>> Reference time = 1.986125 usec +/- 1.657646
>>
>> ********************Measuring parallel time*********************
>> Segmentation fault (core dumped)
>>
>> With OMP_NUM_THREADS I encouraged taskwait_bench to use 6 of 8 CPU cores.
>>
>> Sometimes this benchmark completes, but mostly it fails with SIGSEGV. On
>> one occasion it failed on an assertion:
>>
>> OMP: Error #13: Assertion failure at kmp_tasking.c(670).
>>
>>
>> I could not reproduce this assertion failure again, though.
>>
>> I also couldn't reproduce this failure on my x86_64 Linux desktop with
>> libomp built from the same SHA.
>>
>> Post-mortem gdb looks like this:
>>
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> Core was generated by `./taskwait_bench_clang'.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0 0x000003ffa0a96870 in ___kmp_fast_free () from
>> /home/pawosm01/llvm/lib/libomp.so
>> Missing separate debuginfos, use: dnf debuginfo-install
>> glibc-2.21-8.fc22.aarch64
>> (gdb) bt
>> #0 0x000003ffa0a96870 in ___kmp_fast_free () from
>> /home/pawosm01/llvm/lib/libomp.so
>> #1 0x000003ffa0abbb00 in __kmp_task_finish(int, kmp_task*, kmp_taskdata*)
>> () from /home/pawosm01/llvm/lib/libomp.so
>> #2 0x000003ffa0abc77c in __kmp_invoke_task(int, kmp_task*, kmp_taskdata*)
>> () from /home/pawosm01/llvm/lib/libomp.so
>> #3 0x000003ffa0abd484 in __kmp_execute_tasks_32 () from
>> /home/pawosm01/llvm/lib/libomp.so
>> #4 0x000003ffa0abca4c in __kmpc_omp_taskwait () from
>> /home/pawosm01/llvm/lib/libomp.so
>> #5 0x0000000000401268 in ?? ()
>> #6 0x000003ffa0abc71c in __kmp_invoke_task(int, kmp_task*, kmp_taskdata*)
>> () from /home/pawosm01/llvm/lib/libomp.so
>> #7 0x000003ffa0abe020 in __kmp_execute_tasks_64 () from
>> /home/pawosm01/llvm/lib/libomp.so
>> #8 0x000003ffa0ad71dc in kmp_flag_64::wait(kmp_info*, int, void*) () from
>> /home/pawosm01/llvm/lib/libomp.so
>> #9 0x000003ffa0ad5ff0 in __kmp_linear_barrier_release(barrier_type,
>> kmp_info*, int, int, int, void*) () from /home/pawosm01/llvm/lib/libomp.so
>> #10 0x000003ffa0ad6c4c in __kmp_fork_barrier(int, int) () from
>> /home/pawosm01/llvm/lib/libomp.so
>> #11 0x000003ffa0aabc80 in __kmp_launch_thread () from
>> /home/pawosm01/llvm/lib/libomp.so
>> #12 0x000003ffa0ac3780 in __kmp_launch_worker(void*) () from
>> /home/pawosm01/llvm/lib/libomp.so
>> #13 0x000003ffa0a46f0c in start_thread () from /lib64/libpthread.so.0
>> #14 0x000003ffa0990af0 in thread_start () from /lib64/libc.so.6
>> (gdb)
>>
>> System: Fedora 22 on AArch64
>>
>> Libomp and the benchmark were built with clang; clang and LLVM were
>> themselves built from git master (libomp was built separately, outside
>> the LLVM directory tree). Libomp cmake flags:
>>
>> CC=$HOME/llvm/bin/clang CXX=$HOME/llvm/bin/clang++ cmake .. \
>> -DBUILD_SHARED_LIBS=True \
>> -DCMAKE_BUILD_TYPE=Release \
>> -DLIBOMP_OMPT_SUPPORT=ON \
>> -DLIBOMP_LLVM_LIT_EXECUTABLE=$HOME/llvm.repo/build-shared-release/bin/llvm-lit \
>> -DCMAKE_INSTALL_PREFIX=$HOME/llvm
>>
>>
>> When built with -DCMAKE_BUILD_TYPE=Debug I can see more output:
>>
>> *********************************************************
>> ** Running OpenMP benchmark on 6 thread(s) **
>> ** delaylength = 500, tasks per test = 10101 **
>> *********************************************************
>>
>>
>> *****************************************
>> Running tests for tasks
>> using "taskwait" clause.
>>
>> Branching factor = 100
>> Task graph depth = 3
>> Total taskwaits generated = 10101
>> *****************************************
>>
>>
>> ****************************************************************
>> Computing TASKWAIT reference time
>> ****************************************************************
>> Sample Size Mean Min Max S.D. Outliers
>> 25 1.3623 1.2385 3.2206 0.4349 1
>> re-running reference time test; 1 outliers.
>> Reference time = 1.362320 usec +/- 0.852464
>>
>> ********************Measuring parallel time*********************
>> Assertion failure at kmp_tasking.c(440): taskdata -> td_flags.started == 0.
>> Assertion failure at kmp_tasking.c(422): taskdata -> td_flags.tasktype == 1.
>> OMP: Error #13: Assertion failure at kmp_tasking.c(440).
>> OMP: Hint: Please submit a bug report with this message, compile and run
>> commands used, and machine configuration info including native compiler
>> and operating system versions. Faster response will be obtained by
>> including all program sources. For information on submitting this issue,
>> please see http://www.intel.com/software/products/support/.
>> OMP: Error #13: Assertion failure at kmp_tasking.c(422).
>> OMP: Hint: Please submit a bug report with this message, compile and run
>> commands used, and machine configuration info including native compiler
>> and operating system versions. Faster response will be obtained by
>> including all program sources. For information on submitting this issue,
>> please see http://www.intel.com/software/products/support/.
>> Aborted (core dumped)
>>
>>
>>
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib64/libthread_db.so.1".
>> Core was generated by `./taskwait_bench_clang'.
>> Program terminated with signal SIGABRT, Aborted.
>> #0 0x000003ff79783310 in raise () from /lib64/libc.so.6
>> Missing separate debuginfos, use: dnf debuginfo-install
>> glibc-2.21-8.fc22.aarch64
>> (gdb) bt
>> #0 0x000003ff79783310 in raise () from /lib64/libc.so.6
>> #1 0x000003ff79785008 in abort () from /lib64/libc.so.6
>> #2 0x000003ff79953998 in __kmp_abort_process () at
>> /home/pawosm01/openmp.repo/runtime/src/kmp_runtime.c:434
>> #3 0x000003ff79951628 in __kmp_msg (severity=kmp_ms_fatal, message=...)
>> at /home/pawosm01/openmp.repo/runtime/src/kmp_i18n.c:965
>> #4 0x000003ff7994a52c in __kmp_debug_assert (msg=0x3ff799e86c0 "taskdata
>> -> td_flags.started == 0", file=0x3ff799e78f3 "kmp_tasking.c", line=440)
>> at /home/pawosm01/openmp.repo/runtime/src/kmp_debug.c:85
>> #5 0x000003ff79976440 in __kmp_task_start (gtid=2, task=0x3ff5c0fd9c0,
>> current_task=0xc516180) at
>> /home/pawosm01/openmp.repo/runtime/src/kmp_tasking.c:440
>> #6 0x000003ff7997870c in __kmp_invoke_task (gtid=2, task=0x3ff5c0fd9c0,
>> current_task=0xc516180) at
>> /home/pawosm01/openmp.repo/runtime/src/kmp_tasking.c:1113
>> #7 0x000003ff7997a40c in __kmp_execute_tasks_template<kmp_flag_64>
>> (thread=0xc51bf80, gtid=2, flag=0x3ff795fe7e8, final_spin=1,
>> thread_finished=0x3ff795fe748, itt_sync_obj=0x0, is_constrained=0)
>> at /home/pawosm01/openmp.repo/runtime/src/kmp_tasking.c:1957
>> #8 0x000003ff79979c50 in __kmp_execute_tasks_64 (thread=0xc51bf80,
>> gtid=2, flag=0x3ff795fe7e8, final_spin=1, thread_finished=0x3ff795fe748,
>> itt_sync_obj=0x0, is_constrained=0) at
>> /home/pawosm01/openmp.repo/runtime/src/kmp_tasking.c:2042
>> #9 0x000003ff799a5d58 in kmp_flag_64::execute_tasks (this=0x3ff795fe7e8,
>> this_thr=0xc51bf80, gtid=2, final_spin=1, thread_finished=0x3ff795fe748,
>> itt_sync_obj=0x0, is_constrained=0)
>> at /home/pawosm01/openmp.repo/runtime/src/kmp_wait_release.h:455
>> #10 0x000003ff799a476c in __kmp_wait_template<kmp_flag_64>
>> (this_thr=0xc51bf80, flag=0x3ff795fe7e8, final_spin=1, itt_sync_obj=0x0)
>> at /home/pawosm01/openmp.repo/runtime/src/kmp_wait_release.h:184
>> #11 0x000003ff799a5ac0 in kmp_flag_64::wait (this=0x3ff795fe7e8,
>> this_thr=0xc51bf80, final_spin=1, itt_sync_obj=0x0) at
>> /home/pawosm01/openmp.repo/runtime/src/kmp_wait_release.h:460
>> #12 0x000003ff799a2cac in __kmp_linear_barrier_release
>> (bt=bs_forkjoin_barrier, this_thr=0xc51bf80, gtid=2, tid=-2,
>> propagate_icvs=1, itt_sync_obj=0x0) at
>> /home/pawosm01/openmp.repo/runtime/src/kmp_barrier.cpp:181
>> #13 0x000003ff799a3e6c in __kmp_fork_barrier (gtid=2, tid=-2) at
>> /home/pawosm01/openmp.repo/runtime/src/kmp_barrier.cpp:1617
>> #14 0x000003ff7995f9a0 in __kmp_launch_thread (this_thr=0xc51bf80) at
>> /home/pawosm01/openmp.repo/runtime/src/kmp_runtime.c:5495
>> #15 0x000003ff79986b7c in __kmp_launch_worker (thr=0xc51bf80) at
>> /home/pawosm01/openmp.repo/runtime/src/z_Linux_util.c:747
>> #16 0x000003ff798e6f0c in start_thread () from /lib64/libpthread.so.0
>> #17 0x000003ff79830af0 in thread_start () from /lib64/libc.so.6
>> (gdb) info threads
>> Id Target Id Frame
>> 7 Thread 0x3ff7970f260 (LWP 27860) 0x000000000040128c in ?? ()
>> 6 Thread 0x3ff73fef360 (LWP 27862) 0x000000000040128c in ?? ()
>> 5 Thread 0x3ff794ef3e0 (LWP 27863) 0x000000000040128c in ?? ()
>> 4 Thread 0x3ff79aff1e0 (LWP 27859) 0x000003ff798ecba4 in
>> pthread_cond_timedwait@@GLIBC_2.17 () from /lib64/libpthread.so.0
>> 3 Thread 0x3ff79b545c0 (LWP 27858) 0x000003ff799c911c in
>> __kmp_bakery_check (value=0, checker=1) at
>> /home/pawosm01/openmp.repo/runtime/src/kmp_lock.cpp:739
>> 2 Thread 0x3ff793df460 (LWP 27864) 0x000003ff7997c6d4 in
>> __kmp_free_task_and_ancestors (gtid=5, taskdata=0x3ff5c0fd700,
>> thread=0xc524200) at
>> /home/pawosm01/openmp.repo/runtime/src/kmp_tasking.c:591
>> * 1 Thread 0x3ff795ff2e0 (LWP 27861) 0x000003ff79783310 in raise ()
>> from /lib64/libc.so.6
>> (gdb)
>>
>>
>>
>>
>> ________________________________________
>>
>>
>> Paul Osmialowski