[Openmp-commits] [openmp] [OpenMP] Adding a throttling threshold to bound dependent tasking mem… (PR #82274)

PEREIRA Romain via Openmp-commits openmp-commits at lists.llvm.org
Fri Feb 23 08:50:15 PST 2024


================
@@ -438,10 +438,9 @@ static kmp_int32 __kmp_push_priority_task(kmp_int32 gtid, kmp_info_t *thread,
 
   __kmp_acquire_bootstrap_lock(&thread_data->td.td_deque_lock);
   // Check if deque is full
-  if (TCR_4(thread_data->td.td_deque_ntasks) >=
-      TASK_DEQUE_SIZE(thread_data->td)) {
-    if (__kmp_enable_task_throttling &&
-        __kmp_task_is_allowed(gtid, __kmp_task_stealing_constraint, taskdata,
+  if (__kmp_enable_task_throttling && TCR_4(thread_data->td.td_deque_ntasks) >=
+                                          __kmp_task_maximum_ready_per_thread) {
+    if (__kmp_task_is_allowed(gtid, __kmp_task_stealing_constraint, taskdata,
----------------
rpereira-dev wrote:

I pulled from main and updated the patch with your feedback.

#### On performance
A single run on a 16-core Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz with `OMP_PLACES="cores" OMP_PROC_BIND=close OMP_NUM_THREADS=16`
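
The source of `a.out` is not included in this message. For readers who want to try something comparable, here is a minimal sketch of a task-based Fibonacci of the kind commonly used to stress tasking runtimes. It is an assumption about the shape of the benchmark, not the actual program behind the timings in this comment, and the names `fib_task` and `fib_parallel` are hypothetical.

```c
/* Hypothetical stand-in for the fib benchmark; NOT the program used above. */

/* Recursive Fibonacci: every call with n >= 2 spawns two child tasks. */
static long fib_task(int n) {
  if (n < 2)
    return n;
  long a, b;
#pragma omp task shared(a) firstprivate(n)
  a = fib_task(n - 1);
#pragma omp task shared(b) firstprivate(n)
  b = fib_task(n - 2);
  /* Wait for both children before combining their results. */
#pragma omp taskwait
  return a + b;
}

/* One thread creates the root of the task tree; the team executes the tasks. */
long fib_parallel(int n) {
  long result = 0;
#pragma omp parallel
#pragma omp single
  result = fib_task(n);
  return result;
}
```

Because there is no cutoff, this creates a very large number of tiny tasks, which is what makes it sensitive to how the runtime throttles ready tasks. `fib_parallel(34)` returns 5702887, the value printed in the runs reported here.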

Compiling with `#define KMP_COMPILE_GLOBAL_TASK_THROTTLING 0` and running with `KMP_ENABLE_TASK_THROTTLING=0`
```bash
$ time ./a.out 34
fib(34) = 5702887
real	0m0.210s
```

Compiling with `#define KMP_COMPILE_GLOBAL_TASK_THROTTLING 0` and running with `KMP_ENABLE_TASK_THROTTLING=1` (= current default behavior)
```bash
$ time ./a.out 34
fib(34) = 5702887
real	0m0.201s
```

Compiling with `#define KMP_COMPILE_GLOBAL_TASK_THROTTLING 1` and running with `KMP_ENABLE_TASK_THROTTLING=0`
```bash
$ time ./a.out 34
fib(34) = 5702887
real	0m0.210s
```

Compiling with `#define KMP_COMPILE_GLOBAL_TASK_THROTTLING 1` and running with `KMP_ENABLE_TASK_THROTTLING=1`
```bash
$ time ./a.out 34
fib(34) = 5702887
real	0m7.786s
```

Given the slowdown in the last configuration, I am thinking of adding a run-time option `KMP_ENABLE_GLOBAL_TASK_THROTTLING` that disables only "global throttling" at run time even if it is compiled in, so users can still use "per-thread throttling".
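
To make the proposed semantics concrete, here is a small standalone model of how the compile-time switch and the two run-time switches could combine. It is only a sketch: the helper names are hypothetical, the flag parsing is deliberately simplified and is not the runtime's own settings code, and treating an unset `KMP_ENABLE_GLOBAL_TASK_THROTTLING` as enabled is an assumption.

```c
#include <string.h>

/* Stand-in for the compile-time switch discussed in this PR. */
#ifndef KMP_COMPILE_GLOBAL_TASK_THROTTLING
#define KMP_COMPILE_GLOBAL_TASK_THROTTLING 1
#endif

/* Simplified parsing (assumption): unset or empty -> dflt,
   "0" -> off, anything else -> on. */
static int flag_is_on(const char *value, int dflt) {
  if (value == NULL || value[0] == '\0')
    return dflt;
  return strcmp(value, "0") != 0;
}

/* Per-thread throttling follows KMP_ENABLE_TASK_THROTTLING only. */
int per_thread_throttling_on(const char *enable_throttling) {
  return flag_is_on(enable_throttling, 1);
}

/* Global throttling needs: compiled in, throttling enabled, and the
   proposed KMP_ENABLE_GLOBAL_TASK_THROTTLING not switched off. */
int global_throttling_on(const char *enable_throttling,
                         const char *enable_global) {
  if (!KMP_COMPILE_GLOBAL_TASK_THROTTLING)
    return 0;
  return flag_is_on(enable_throttling, 1) && flag_is_on(enable_global, 1);
}
```

In a real program the two arguments would come from `getenv("KMP_ENABLE_TASK_THROTTLING")` and `getenv("KMP_ENABLE_GLOBAL_TASK_THROTTLING")`. With the values `"1"` and `"0"`, per-thread throttling stays on while global throttling is off, which is the combination the option is meant to allow.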

#### On livelock
I could not reproduce it on the above machine and configuration using taskbench from epcc-openmp-microbenchmarks.
(I ran it 200+ times with the default taskbench options.)
I built LLVM with `-DCMAKE_BUILD_TYPE=RelWithDebInfo`, which on this machine means gcc-12 with `-O2 -g`.

#### On unit tests
In its current state, the test `omp_throttling_max.c` will fail if "global throttling" is not compiled in (`#define KMP_COMPILE_GLOBAL_TASK_THROTTLING 0`). Any advice on disabling the test in that case?

https://github.com/llvm/llvm-project/pull/82274

