[llvm-bugs] [Bug 39725] New: Incorrect results when splitting teams distribute and parallel for

via llvm-bugs llvm-bugs at lists.llvm.org
Tue Nov 20 09:29:08 PST 2018


https://bugs.llvm.org/show_bug.cgi?id=39725

            Bug ID: 39725
           Summary: Incorrect results when splitting teams distribute and
                    parallel for
           Product: OpenMP
           Version: unspecified
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P
         Component: Runtime Library
          Assignee: unassignedbugs at nondot.org
          Reporter: csdaley at lbl.gov
                CC: llvm-bugs at lists.llvm.org

Hello all,

My simple test program produces incorrect results when "teams distribute" and
"parallel for" are placed on different loops. The program doubles each value in
array x and should print 0, 2, 4, 6. It works when "teams distribute" and
"parallel for" are combined into a single directive on the outer loop. I am
using the LLVM 7.0.0 release on an Intel Skylake + Nvidia Volta V100 system.
Both programs work with Clang-ykt. Please see below.

$ cat target.c
#include <stdio.h>

#define N 2
int main()
{
  double x[N*N];
  int i, j, k;
  for (k=0; k<N*N; ++k) x[k] = k;

#pragma omp target
#pragma omp teams distribute
  for (i=0; i<N; ++i) {
#pragma omp parallel for
    for (j=0; j<N; ++j) {
      x[j+N*i] *= 2.0;
    }
  }

  for (int i=0; i<N; ++i) {
    for (int j=0; j<N; ++j) {
      printf("x[%d]=%.0f\n", j+N*i, x[j+N*i]);
    }
  }
  return 0;
}

$ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda target.c -o target
$ ./target
x[0]=0
x[1]=2
x[2]=2
x[3]=3

These values are incorrect: x[2] and x[3] are unchanged instead of being
doubled to 4 and 6. The program works if I move all the OpenMP directives
within the target region to the outer loop:

$ cat target2.c 
#include <stdio.h>

#define N 2
int main()
{
  double x[N*N];
  int i, j, k;
  for (k=0; k<N*N; ++k) x[k] = k;

#pragma omp target
#pragma omp teams distribute parallel for collapse(2)
  for (i=0; i<N; ++i) {
    for (j=0; j<N; ++j) {
      x[j+N*i] *= 2.0;
    }
  }

  for (int i=0; i<N; ++i) {
    for (int j=0; j<N; ++j) {
      printf("x[%d]=%.0f\n", j+N*i, x[j+N*i]);
    }
  }
  return 0;
}

$ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda target2.c -o target2
$ ./target2 
x[0]=0
x[1]=2
x[2]=4
x[3]=6
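
For automated testing, the split-directive reproducer can be made
self-checking so the failure shows up as a nonzero count rather than by reading
the printed values. The following is only a sketch and was not part of the
tested programs above: the function name run_split_check is hypothetical, and
the loops and directives are copied unchanged from target.c. When compiled
without -fopenmp the pragmas are ignored and the loops run serially on the
host, which gives the expected values.

```c
#include <stdio.h>

#define N 2

/* Hypothetical helper: runs the split "teams distribute" / "parallel for"
   loops from target.c and returns the number of elements that were not
   doubled. A return value of 0 means every x[k] equals 2*k. */
int run_split_check(void)
{
  double x[N*N];
  int i, j, k;
  int mismatches = 0;

  for (k=0; k<N*N; ++k) x[k] = k;

#pragma omp target
#pragma omp teams distribute
  for (i=0; i<N; ++i) {
#pragma omp parallel for
    for (j=0; j<N; ++j) {
      x[j+N*i] *= 2.0;
    }
  }

  /* Compare against the expected result on the host. */
  for (k=0; k<N*N; ++k) {
    if (x[k] != 2.0*k) {
      printf("x[%d]=%.0f (expected %.0f)\n", k, x[k], 2.0*k);
      ++mismatches;
    }
  }
  return mismatches;
}
```

Calling run_split_check() from main and returning its result as the exit status
lets a script flag the failure without parsing the output.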


$ clang -v
clang version 7.0.0 (tags/RELEASE_700/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/acceptance/csdaley/software/llvm/7.0.0/bin
Found candidate GCC installation: /usr/lib/gcc/i686-redhat-linux/4.8.2
Found candidate GCC installation: /usr/lib/gcc/i686-redhat-linux/4.8.5
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.2
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.5
Selected GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.5
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
Found CUDA installation: /usr/local/cuda-9.2, version 9.2
