<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Offload Hierarchical Reduction on multiple Thread Give incorrect result"
   href="https://bugs.llvm.org/show_bug.cgi?id=49979">49979</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Offload Hierarchical Reduction on multiple Thread Give incorrect result
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>OpenMP
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Runtime Library
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>tapplencourt@anl.gov
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre># Overview

Running an OpenMP reduction on the host whose loop body is an offloaded
hierarchical reduction (first a reduction over teams, then a reduction over
parallel threads) yields an incorrect value that changes from run to run,
which suggests a race condition.

## Note
- Works with `gcc`.
- Tested on NVIDIA hardware.
- Removing the first reduction gives the correct value.
- Merging the 2 offload reductions gives the correct value.
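
For reference, a minimal sketch of what the "merged" variant from the last
note could look like. This is an assumption about what merging means here: the
`teams distribute` and `parallel for` levels are fused into one combined
construct with `collapse(2)`. The function name and its parameters are
hypothetical and are not part of the reproducer below.

```
// Hypothetical helper, not from the original reproducer.
// Same host-side reduction as the failing test, but the two offload
// reduction levels are fused into a single combined construct.
float count_with_merged_offload(int n0, int n1, int n2) {
  float counter{};
  #pragma omp parallel for reduction(+: counter)
  for (int i0 = 0 ; i0 < n0 ; i0++ )
  {
    // Assumption: "merge" = one combined construct over both inner loops.
    #pragma omp target teams distribute parallel for collapse(2) reduction(+: counter)
    for (int i1 = 0 ; i1 < n1 ; i1++ )
    {
      for (int i2 = 0 ; i2 < n2 ; i2++ )
      {
          counter = counter + 1;
      }
    }
  }
  return counter;
}
```

Calling `count_with_merged_offload(32, 32, 32)` should return 32768, the same
expected value as in the reproducer.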

# How to reproduce

```
cat test_parallel_teams_distribute__parallel_for.cpp
#include <iostream>
#include <cstdlib>
#include <cmath>
bool almost_equal(float x, float gold, float tol) {
  if ( std::signbit(x) != std::signbit(gold) )
  {
    x = std::abs(gold) - std::abs(x);
  }
  return std::abs(gold) * (1-tol) <= std::abs(x) &&
         std::abs(x) <= std::abs(gold) * (1 + tol);
}
void test_parallel_teams_distribute__parallel_for() {
  const int N0 { 32 };
  const int N1 { 32 };
  const int N2 { 32 };
  const float expected_value { N0*N1*N2 };
  float counter_N0{};
  #pragma omp parallel for reduction(+:counter_N0)
  for (int i0 = 0 ; i0 < N0 ; i0++ )
  {
    #pragma omp target teams distribute reduction(+: counter_N0)
    for (int i1 = 0 ; i1 < N1 ; i1++ )
    {
      #pragma omp parallel for reduction(+: counter_N0)
      for (int i2 = 0 ; i2 < N2 ; i2++ )
      {
          counter_N0 = counter_N0 + 1;
      }
    }
  }
  if (!almost_equal(counter_N0, expected_value, 0.1)) {
    std::cerr << "Expected: " << expected_value << " Got: " << counter_N0
              << std::endl;
    std::exit(112);
  }
}
int main()
{
    test_parallel_teams_distribute__parallel_for();
}
<span class="quote">> clang --version | head -n1</span >
clang version 13.0.0 (<a href="https://github.com/llvm/llvm-project.git">https://github.com/llvm/llvm-project.git</a>
ea14df695ebde3f5bdd5ba7548d3d49e8f1c4411)
<span class="quote">> clang++ -fopenmp -fopenmp-targets=nvptx64 test_parallel_teams_distribute__parallel_for.cpp
> ./a.out</span >
Expected: 32768 Got: 23552
<span class="quote">> ./a.out</span >
Expected: 32768 Got: 11264
```</pre>
        </div>
      </p>


    </body>
</html>