[Openmp-commits] [openmp] [OpenMP] Fix work-stealing stack clobber with taskwait (PR #126049)

Jonathan Peyton via Openmp-commits openmp-commits at lists.llvm.org
Fri Feb 7 23:42:25 PST 2025


jpeyton52 wrote:

It took me an embarrassingly long time to find this, but the timing hole is this sequence of events:

1) THREAD 1: A regular task with dependences is created, call it T1

2) THREAD 1: Call into `__kmpc_omp_taskwait_deps_51()` and create a stack based depnode (`NULL` task), call it T2 (stack)

3) THREAD 2: Steals task T1 and executes it getting to `__kmp_release_deps()` region.

4) THREAD 1: During processing of dependences for T2 (stack) (within `__kmp_check_deps()` region),  a link is created T1 -> T2. This increases T2's (stack) `nrefs` count.

5) THREAD 2: Iterates through the successors list: decrement the T2's (stack) npredecessor count. BUT HASN'T YET `__kmp_node_deref()`-ed it.

6) THREAD 1: Now when finished with `__kmp_check_deps()`, it returns false because npredecessor count is 0, but T2's (stack) `nrefs`  count is 2 because THREAD 2 still references it!

7) THREAD 1: Because `__kmp_check_deps()` returns false, early exit. 
_Now the stack based depnode is invalid, but THREAD 2 still references it._

We've reached improper stack referencing behavior. Varied results/crashes/asserts can occur if THREAD 1 comes back and recreates the exact same depnode in the exact same stack address during the same time THREAD 2 calls `__kmp_node_deref()`.

One solution is along the lines of this patch which is to allocate all depnodes on the heap -- you may still need to have a deref() in the early exit as well.

The other is to stick another:
```
    // Wait until the last __kmp_release_deps is finished before we free the
    // current stack frame holding the "node" variable; once its nrefs count
    // reaches 1, we're sure nobody else can try to reference it again.
    while (node.dn.nrefs > 1)
      KMP_YIELD(TRUE);
```
right before the early exit if `__kmp_check_deps()` returns false. This isn't ideal, but that's the current state of things.

https://github.com/llvm/llvm-project/pull/126049


More information about the Openmp-commits mailing list