<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - Incorrect results when splitting teams distribute and parallel for"
href="https://bugs.llvm.org/show_bug.cgi?id=39725">39725</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>Incorrect results when splitting teams distribute and parallel for
</td>
</tr>
<tr>
<th>Product</th>
<td>OpenMP
</td>
</tr>
<tr>
<th>Version</th>
<td>unspecified
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>normal
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Runtime Library
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>csdaley@lbl.gov
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>Hello all,

My simple test program produces incorrect results when "teams distribute" and
"parallel for" are applied to different loops. The program doubles each value
in array x and should print 0, 2, 4, 6. It works when the "teams distribute"
and "parallel for" directives are combined on the outer loop. I am using the
released version of LLVM 7.0.0 on an Intel Skylake + Nvidia Volta V100 system.
Both programs work with Clang-ykt. Please see below.

$ cat target.c
#include <stdio.h>
#define N 2
int main()
{
  double x[N*N];
  int i, j, k;
  for (k=0; k<N*N; ++k) x[k] = k;
  #pragma omp target
  #pragma omp teams distribute
  for (i=0; i<N; ++i) {
    #pragma omp parallel for
    for (j=0; j<N; ++j) {
      x[j+N*i] *= 2.0;
    }
  }
  for (int i=0; i<N; ++i) {
    for (int j=0; j<N; ++j) {
      printf("x[%d]=%.0f\n", j+N*i, x[j+N*i]);
    }
  }
  return 0;
}
$ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda target.c -o target
$ ./target
x[0]=0
x[1]=2
x[2]=2
x[3]=3

These values are incorrect: the expected output is 0, 2, 4, 6. The program
works if I move all the OpenMP directives within the target region onto the
outer loop, combined with collapse(2):

$ cat target2.c
#include <stdio.h>
#define N 2
int main()
{
  double x[N*N];
  int i, j, k;
  for (k=0; k<N*N; ++k) x[k] = k;
  #pragma omp target
  #pragma omp teams distribute parallel for collapse(2)
  for (i=0; i<N; ++i) {
    for (j=0; j<N; ++j) {
      x[j+N*i] *= 2.0;
    }
  }
  for (int i=0; i<N; ++i) {
    for (int j=0; j<N; ++j) {
      printf("x[%d]=%.0f\n", j+N*i, x[j+N*i]);
    }
  }
  return 0;
}
$ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda target2.c -o target2
$ ./target2
x[0]=0
x[1]=2
x[2]=4
x[3]=6
$ clang -v
clang version 7.0.0 (tags/RELEASE_700/final)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/acceptance/csdaley/software/llvm/7.0.0/bin
Found candidate GCC installation: /usr/lib/gcc/i686-redhat-linux/4.8.2
Found candidate GCC installation: /usr/lib/gcc/i686-redhat-linux/4.8.5
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.2
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.5
Selected GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.5
Candidate multilib: .;@m64
Candidate multilib: 32;@m32
Selected multilib: .;@m64
Found CUDA installation: /usr/local/cuda-9.2, version 9.2</pre>
</div>
</p>
</body>
</html>