[Openmp-commits] [openmp] [OpenMP] Fix work-stealing stack clobber with taskwait (PR #126049)
Jonathan Peyton via Openmp-commits
openmp-commits at lists.llvm.org
Fri Feb 7 23:42:25 PST 2025
jpeyton52 wrote:
It took me an embarrassingly long time to find this, but the timing hole is this sequence of events:
1) THREAD 1: A regular task with dependences is created, call it T1
2) THREAD 1: Call into `__kmpc_omp_taskwait_deps_51()` and create a stack based depnode (`NULL` task), call it T2 (stack)
3) THREAD 2: Steals task T1 and executes it getting to `__kmp_release_deps()` region.
4) THREAD 1: During processing of dependences for T2 (stack) (within `__kmp_check_deps()` region), a link is created T1 -> T2. This increases T2's (stack) `nrefs` count.
5) THREAD 2: Iterates through the successors list: decrement the T2's (stack) npredecessor count. BUT HASN'T YET `__kmp_node_deref()`-ed it.
6) THREAD 1: Now when finished with `__kmp_check_deps()`, it returns false because npredecessor count is 0, but T2's (stack) `nrefs` count is 2 because THREAD 2 still references it!
7) THREAD 1: Because `__kmp_check_deps()` returns false, early exit.
_Now the stack based depnode is invalid, but THREAD 2 still references it._
We've reached improper stack referencing behavior. Varied results/crashes/asserts can occur if THREAD 1 comes back and recreates the exact same depnode in the exact same stack address during the same time THREAD 2 calls `__kmp_node_deref()`.
One solution is along the lines of this patch which is to allocate all depnodes on the heap -- you may still need to have a deref() in the early exit as well.
The other is to stick another:
```
// Wait until the last __kmp_release_deps is finished before we free the
// current stack frame holding the "node" variable; once its nrefs count
// reaches 1, we're sure nobody else can try to reference it again.
while (node.dn.nrefs > 1)
KMP_YIELD(TRUE);
```
right before the early exit if `__kmp_check_deps()` returns false. This isn't ideal, but that's the current state of things.
https://github.com/llvm/llvm-project/pull/126049
More information about the Openmp-commits
mailing list