<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Target followed by Atomic give incorrect result"
   href="https://bugs.llvm.org/show_bug.cgi?id=47039">47039</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Target followed by Atomic give incorrect result
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>OpenMP
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Runtime Library
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>tapplencourt@anl.gov
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>Created <span class=""><a href="attachment.cgi?id=23826" name="attach_23826" title="Reproducer">attachment 23826</a> <a href="attachment.cgi?id=23826&action=edit" title="Reproducer">[details]</a></span>
Reproducer

Found in `clang version 12.0.0 (<a href="https://github.com/llvm/llvm-project.git">https://github.com/llvm/llvm-project.git</a>
55ead5bfffdc00e84cff347ee98471b5616a9f48)` running on an NVIDIA V100

Hi,

When run, the following code:

```
  float counter_target{};
  #pragma omp target map(tofrom: counter_target)
  {
    #pragma omp atomic update
    counter_target = counter_target + 1.;
  }
```
produces 128, whereas we expected the result to be 1: only one thread
should be active in the target region.

Running with LIBOMPTARGET_DEBUG=1 gives a hint about a potential root cause:
```
Target CUDA RTL --> Setting CUDA threads per block to default 128
Target CUDA RTL --> Using default number of teams 128
Target CUDA RTL --> Launch kernel with 128 blocks and 128 threads
```
It looks like the runtime spawned 128 threads, corresponding to 128 teams,
despite the absence of a `teams` directive.
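
For reference, the check can be sketched as a small self-contained function.
This is only a sketch, not the attached reproducer: the helper name
`count_on_target` is hypothetical. It assumes that a `target` region with no
`teams` directive runs with a single initial thread, so exactly one atomic
increment should happen.

```
// Hypothetical helper (not taken from the attached reproducer).
// Runs one atomic increment inside a target region and returns the counter.
float count_on_target() {
  float counter_target{};
  // No `teams` or `parallel` directive: a single thread is expected to run this.
  #pragma omp target map(tofrom: counter_target)
  {
    #pragma omp atomic update
    counter_target = counter_target + 1.;
  }
  // Expected value is 1; the behaviour reported above yields 128 instead.
  return counter_target;
}
```

Calling `count_on_target()` and comparing the result with 1 distinguishes the
expected behaviour from the one reported here. Built without OpenMP support,
the pragmas are ignored and the function simply returns 1.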

Regards,
Thomas

PS: I attached a full reproducer for convenience. You can also find it here:
<a href="https://github.com/TApplencourt/OvO/blob/master/test_src/cpp/hierarchical_parallelism/atomic-float/target.cpp">https://github.com/TApplencourt/OvO/blob/master/test_src/cpp/hierarchical_parallelism/atomic-float/target.cpp</a></pre>
        </div>
      </p>


    </body>
</html>