[Openmp-commits] [openmp] [OpenMP] Adding a throttling threshold to bound dependent tasking mem… (PR #82274)

via Openmp-commits openmp-commits at lists.llvm.org
Thu Feb 22 10:05:07 PST 2024


https://github.com/jprotze requested changes to this pull request.

I rebased the patch to f7c2e5fa05d, because your branch did not build with my build configuration. Then I built llvm for these two versions.

First, I saw a flaky livelock when running taskbench from epcc-openmp-microbenchmarks. I could reproduce the livelock only with your patch, not with the version from main. Your patch seems to introduce a race condition in task scheduling or the barrier logic.

Second, I see a significant performance impact from this patch. I executed the Fibonacci code below with 96 threads on our machine and got 2.459s (with the patch) vs. 0.068s (without the patch), which is a 36x runtime increase.
Even execution with a single thread is significantly impacted (0.670s vs. 0.500s), and the serial execution is faster than the execution with 48 threads.

The crucial point is: never add anything to the code path for included tasks (if(0)).

```
#include <stdio.h>
#include <stdlib.h>
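int fib(int n) {
  int i, j;
  if (n<2) {
    return n;
  } else {
    // For n<=15 the if clause is false, so both tasks are executed
    // immediately instead of being deferred.
    #pragma omp task shared(i) if(n>15)
    i=fib(n-1);
    #pragma omp task shared(j) if(n>15)
    j=fib(n-2);
    // Only deferred tasks (n>15) need to be waited for.
    if (n>15) {
      #pragma omp taskwait
    }
    return i+j;
  }
}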

int main(int argc, char** argv) {
  int n = 5;
  if (argc>1) 
    n = atoi(argv[1]);
  #pragma omp parallel
  #pragma omp single
  {
    printf("fib(%i) = %i\n", n, fib(n));
  }
  return 0;
}
```
Compiled as:
```
clang -fopenmp -g -O3 fib-if0.c
```
Executed as:
```
time env OMP_PLACES="cores" OMP_PROC_BIND=close OMP_NUM_THREADS=96 ~/testdir/openmp/a.out 34
```
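
To illustrate why the `if(0)` path is so hot in the reproducer: for `n<=15` every task is executed immediately, and the generating task is suspended until the task body has completed, so the result is visible without a `taskwait`. The following is a minimal sketch of that behavior only. The function name `count_undeferred` is hypothetical and appears neither in the patch nor in the runtime.

```c
#include <assert.h>

/* Hypothetical helper for illustration; not part of the patch. */
int count_undeferred(int n) {
  int count = 0;
  #pragma omp parallel
  #pragma omp single
  for (int k = 0; k < n; ++k) {
    /* if(0): the task is not deferred. The generating task is
       suspended until the task body has completed. */
    #pragma omp task shared(count) if(0)
    count++;
    /* No taskwait needed: the increment is already visible here. */
    assert(count == k + 1);
  }
  return count;
}
```

Each of these tasks still goes through the runtime's task entry path, so any extra bookkeeping placed on that path is paid once per call. In the Fibonacci reproducer, that is the vast majority of all tasks created.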


https://github.com/llvm/llvm-project/pull/82274

