<table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Issue</th>
<td>
<a href=https://github.com/llvm/llvm-project/issues/85963>85963</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>
[OpenMP] Race condition in __kmpc_omp_taskwait_deps_51
</td>
</tr>
<tr>
<th>Labels</th>
<td>
new issue
</td>
</tr>
<tr>
<th>Assignees</th>
<td>
</td>
</tr>
<tr>
<th>Reporter</th>
<td>
uweigand
</td>
</tr>
</table>
<pre>
The target_nowait_target test is sporadically failing on the openmp-s390x-linux builder. With debug assertions enabled, the failing case triggers the following assertion:
```
Assertion failure at kmp_taskdeps.h(26): n >= 0.
```
with the following stack trace:
```
#6 0x000003fff7a47d5e in __kmp_debug_assert (msg=0x3fff7b6d296 "n >= 0", file=0x3fff7b6ec37 "kmp_taskdeps.h", line=26)
at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_debug.cpp:74
#7 0x000003fff7aa0728 in __kmp_node_deref (thread=0x3ffe4073a00, node=0x3ffffff8200) at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_taskdeps.h:26
#8 0x000003fff7aa2b0c in __kmp_release_deps (gtid=7, task=0x2aa001be480) at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_taskdeps.h:193
#9 0x000003fff7a9db2c in __kmp_task_finish<false> (gtid=7, task=0x2aa001be780, resumed_task=0x3ffe400be00)
at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_tasking.cpp:1179
#10 0x000003fff7a91936 in __kmp_invoke_task (gtid=7, task=0x2aa001be780, current_task=0x3ffe400be00)
at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_tasking.cpp:1944
#11 0x000003fff7a968e4 in __kmp_execute_tasks_template<kmp_flag_64<false, true> > (thread=0x3ffe4073a00, gtid=7, flag=0x3ffe89001c0, final_spin=1,
thread_finished=0x3ffe88fff54, itt_sync_obj=0x0, is_constrained=0) at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_tasking.cpp:3486
#12 0x000003fff7aa8bca in __kmp_execute_tasks_64<false, true> (thread=0x3ffe4073a00, gtid=7, flag=0x3ffe89001c0, final_spin=1, thread_finished=0x3ffe88fff54,
itt_sync_obj=0x0, is_constrained=0) at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_tasking.cpp:3599
#13 0x000003fff7ac136a in kmp_flag_64<false, true>::execute_tasks (this=0x3ffe89001c0, this_thr=0x3ffe4073a00, gtid=7, final_spin=1, thread_finished=0x3ffe88fff54,
itt_sync_obj=0x0, is_constrained=0) at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_wait_release.h:874
#14 0x000003fff7ababfc in __kmp_wait_template<kmp_flag_64<false, true>, true, false, true> (this_thr=0x3ffe4073a00, flag=0x3ffe89001c0, itt_sync_obj=0x0)
at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_wait_release.h:539
#15 0x000003fff7ac11aa in kmp_flag_64<false, true>::wait (this=0x3ffe89001c0, this_thr=0x3ffe4073a00, final_spin=1, itt_sync_obj=0x0)
at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_wait_release.h:881
#16 0x000003fff7ab2310 in __kmp_hyper_barrier_release (bt=bs_forkjoin_barrier, this_thr=0x3ffe4073a00, gtid=7, tid=-2, propagate_icvs=1, itt_sync_obj=0x0)
at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_barrier.cpp:1161
#17 0x000003fff7ab7970 in __kmp_fork_barrier (gtid=7, tid=-2) at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_barrier.cpp:2474
#18 0x000003fff7a65baa in __kmp_launch_thread (this_thr=0x3ffe4073a00) at /home/uweigand/llvm/llvm-head/openmp/runtime/src/kmp_runtime.cpp:6033
#19 0x000003fff7b1f3b2 in __kmp_launch_worker (thr=0x3ffe4073a00) at /home/uweigand/llvm/llvm-head/openmp/runtime/src/z_Linux_util.cpp:565
```
The node=0x3ffffff8200 pointer argument to __kmp_node_deref points to a local variable on the stack frame of __kmpc_omp_taskwait_deps_51 here:
```
kmp_depnode_t node = {0};
```
but that function is no longer running. It was called from the test case's main routine on the main thread and has already returned; the main thread is now executing glibc cleanup handlers as part of exiting the process.
The __kmpc_omp_taskwait_deps_51 function does the following before returning:
```
int thread_finished = FALSE;
kmp_flag_32<false, false> flag(
(std::atomic<kmp_uint32> *)&node.dn.npredecessors, 0U);
while (node.dn.npredecessors > 0) {
flag.execute_tasks(thread, gtid, FALSE,
&thread_finished USE_ITT_BUILD_ARG(NULL),
__kmp_task_stealing_constraint);
}
```
and it seems to assume that as soon as the dn.npredecessors counter drops to zero, it is OK to deallocate the local stack frame.
However, the __kmp_release_deps routine does the following:
```
for (kmp_depnode_list_t *p = node->dn.successors; p; p = next) {
kmp_depnode_t *successor = p->node;
#if USE_ITT_BUILD && USE_ITT_NOTIFY
__itt_sync_releasing(successor);
#endif
kmp_int32 npredecessors = KMP_ATOMIC_DEC(&successor->dn.npredecessors) - 1;
// successor task can be NULL for wait_depends or because deps are still
// being processed
if (npredecessors == 0) {
[...]
}
next = p->next;
__kmp_node_deref(thread, p->node);
[^^^ crash here]
```
Note that p->node->dn.npredecessors is decremented relatively early in the loop, but the kmp_depnode_t struct pointed to by p->node is still being accessed throughout the remainder of the loop. This seems to open a race where another thread deallocates the kmp_depnode_t struct immediately after npredecessors goes to zero.
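To make the window concrete, here is a minimal standalone sketch of the pattern, together with one possible way to close it. This is not the runtime code: DepNode, release_dep and wait_on_stack_node are hypothetical stand-ins for kmp_depnode_t, the tail of the __kmp_release_deps loop, and __kmpc_omp_taskwait_deps_51, and only the two counters are modeled. The idea shown is that the waiter also waits for the releasing thread to drop its reference before leaving the frame that holds the node. Whether that is the right fix for the runtime is an assumption.
```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for kmp_depnode_t; only the two counters are modeled.
struct DepNode {
  std::atomic<int> npredecessors{0};
  std::atomic<int> nrefs{0};
};

// Models the tail of the release loop: the predecessor count is dropped
// first, and the node is touched again afterwards when the reference is
// released (the equivalent of __kmp_node_deref).
void release_dep(DepNode *successor) {
  successor->npredecessors.fetch_sub(1); // a waiter may observe 0 from here on
  // ... the node is still in use between these two statements ...
  int n = successor->nrefs.fetch_sub(1) - 1;
  assert(n >= 0); // counterpart of the failing "n >= 0" check
  (void)n;
}

// Models a waiter whose node lives in its own stack frame.
int wait_on_stack_node() {
  DepNode node;
  node.npredecessors.store(1);
  node.nrefs.store(2); // one reference for the waiter, one for the releaser

  std::thread releaser(release_dep, &node);

  // The wait as described in the report: predecessors only.
  while (node.npredecessors.load() > 0)
    std::this_thread::yield();

  // Assumed extra step: do not leave the frame until the releaser has
  // dropped its reference, so nobody can touch the node afterwards.
  while (node.nrefs.load() > 1)
    std::this_thread::yield();

  int remaining = node.nrefs.load();
  // join() only reaps the std::thread in this sketch; the runtime has no
  // such join, which is why the second wait loop is what matters.
  releaser.join();
  return remaining;
}
```
Without the second loop, the waiter could return while release_dep is still between its two statements, which is the situation the stack trace above shows.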
</pre>