<table border="1" cellspacing="0" cellpadding="8">
    <tr>
        <th>Issue</th>
        <td>
            <a href=https://github.com/llvm/llvm-project/issues/153435>153435</a>
        </td>
    </tr>

    <tr>
        <th>Summary</th>
        <td>
            Race condition in Flang OpenMP target teams nested loop structure with reduction in CPU runs.
        </td>
    </tr>

    <tr>
      <th>Labels</th>
      <td>
            flang
      </td>
    </tr>

    <tr>
      <th>Assignees</th>
      <td>
      </td>
    </tr>

    <tr>
      <th>Reporter</th>
      <td>
          scamp-nvidia
      </td>
    </tr>
</table>

<pre>
    The following reproducer, a nested loop inside a target teams region, gives results that vary from run to run with Flang, which points to a race condition:

```
program omp_snippet_reproducer
  implicit none
  integer, parameter :: n  = 256
  real :: a(3,0:n), xx1,a1,a2,checksum
  integer  :: i, j

  a = 0.0
!$omp target
!$omp teams private(xx1,a1,a2)
!$omp loop
  do i=0,n-1
    a1=0.0d0
    a2=0.0d0
!$omp loop private(xx1) reduction(+:a1)
    do j=0,n-1
      xx1=j*5/100
      a1 = a1 + xx1
    enddo
    a(1,i) = a1
  enddo
!$omp end teams
!$omp end target

  ! Light-weight check: print a checksum scaled to easily show changes
  checksum = sum(a)-500000
  ! Subtract off the serial answer
  checksum = checksum + 112928
  write(*,*) 'Serial Difference:', checksum

end program omp_snippet_reproducer
```

This code was derived from an internal SPEC OMP port. The serial answer is wired into the checksum that gets printed: a value of 0 means the answer is right, and anything else means it is off. Compiling this code with a recent build of Flang, both without optimization flags and at "-O1", we get a different answer each time we run it:

```
scamp:$ flang test.F90 -o test -fopenmp -v
flang version 22.0.0git (https://github.com/llvm/llvm-project 8c7e1ab98e80c4f224ba2ef7d343534afa237247)
Target: x86_64-unknown-linux-gnu
Thread model: posix
Build config: +assertions
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/13
Found candidate GCC installation: /usr/lib/gcc/x86_64-linux-gnu/14
Selected GCC installation: /usr/lib/gcc/x86_64-linux-gnu/14
Candidate multilib: .;@m64
Selected multilib: .;@m64
 "flang" -fc1 -triple x86_64-unknown-linux-gnu -emit-obj -fcolor-diagnostics -mrelocation-model pic -pic-level 2 -pic-is-pie -target-cpu x86-64 -fopenmp -resource-dir clang/22 -mframe-pointer=all -o /tmp/test-9ad73e.o -x f95 test.F90
warning: loc("test.F90":14:7): Detected standalone OpenMP `loop` directive with thread binding, the associated loop will be rewritten to `simd`.
scamp:$ export OMP_NUM_THREADS=64
scamp:$ ./test ; ./test ; ./test
 Serial Difference: 34776.
 Serial Difference: 28728.
 Serial Difference: 39312.
scamp:$ flang test.F90 -o test -fopenmp -O1
warning: loc("test.F90":14:7): Detected standalone OpenMP `loop` directive with thread binding, the associated loop will be rewritten to `simd`.
scamp:$ ./test ; ./test ; ./test
 Serial Difference: 33264.
 Serial Difference: 30240.
 Serial Difference: 43848.
```
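
As a sanity check on the wired-in constant, here is a minimal serial sketch of the same loop nest with the OpenMP directives removed. The program name `serial_reference` and the extra row-sum print are made up for illustration and are not part of the original port. Assuming `j*5/100` is evaluated in integer arithmetic before being converted to real, each row should sum to 1512, so `sum(a)` should be 256 * 1512 = 387072, which matches 500000 - 112928.

```fortran
program serial_reference
  ! Hypothetical serial version of the reproducer (no OpenMP directives),
  ! used only to recompute the expected value of sum(a).
  implicit none
  integer, parameter :: n = 256
  real :: a(3,0:n), xx1, a1
  integer :: i, j

  a = 0.0
  do i = 0, n-1
    a1 = 0.0
    do j = 0, n-1
      ! j*5/100 is integer division, i.e. floor(j/20): 0..11 twenty times each,
      ! then 12 for j = 240..255
      xx1 = j*5/100
      a1 = a1 + xx1
    enddo
    a(1,i) = a1          ! row sum: 20*(0+1+...+11) + 16*12 = 1512
  enddo

  write(*,*) 'Row sum:', a(1,0)
  ! All other elements of a stay 0, so sum(a) = 256 * 1512 = 387072
  write(*,*) 'Serial sum(a):', sum(a)
  write(*,*) 'Serial Difference:', sum(a) - 500000 + 112928
end program serial_reference
```

Taking 1512 as the row sum, every Flang difference printed above is a whole multiple of it (for example 34776 = 23 * 1512 and 30240 = 20 * 1512), which may help narrow down where the extra contributions come from.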

If I set 'OMP_NUM_THREADS' to 1, I get the serial answer. Alternatively, if I compile with optimization (-O1) and (oddly) comment out the "a2=0.0d0" line, I also get the right answer!

```
scamp:$ flang test-a2commentedout.F90 -o test -fopenmp -O1
warning: loc("test-a2commentedout.F90":14:7): Detected standalone OpenMP `loop` directive with thread binding, the associated loop will be rewritten to `simd`.
scamp:$ ./test ; ./test ; ./test
 Serial Difference: 0.
 Serial Difference: 0.
 Serial Difference: 0.
```

But even with the "a2=0.0d0" line commented out, the build without optimization still shows the race condition.

Interestingly, a similar problem can also be seen with GCC (14.1.0 here) when no optimization is used: the unoptimized build also shows a race condition, while "-O1" gives the correct answer even without commenting out the "a2=0.0d0" line. This makes me suspect that the issue is something subtle, maybe an awkward edge case. NVHPC (25.7) shows the correct answer in all cases.

```
scamp:$ gfortran test.F90 -o test -fopenmp
scamp:$ ./test ; ./test ; ./test
 Serial Difference:   287280.000
 Serial Difference:   317520.000
 Serial Difference:   427896.000
scamp:$ gfortran test.F90 -o test -fopenmp -O1
scamp:$ ./test ; ./test ; ./test
 Serial Difference:   0.00000000
 Serial Difference:   0.00000000
 Serial Difference:   0.00000000
scamp:$ nvfortran test.F90 -o test -mp
scamp:$ ./test ; ./test ; ./test
 Serial Difference:    0.000000
 Serial Difference:    0.000000
 Serial Difference:    0.000000
```
</pre>
KxWjOhH-x9HuTBlnUesICqdRM_qKhJFvdmIfyeyfS5cU4jBN-e2hR_CjH4j1eS5W3o9xzva2x7Aj3n5sgsaoXy--NQgCxOHrQTgJKLcYzUvhl3_9_CH6hc9SAvpFOK70V-RSHbf59AfZue2sC06Y2w3g7RAdu1OWHk9Mr5IAFHk1498nKnm1WM6PJH_NmGN5eSODAKKi50PgjYz8IckVn0vtzP62TW8Zmwsd3pDmjLo7uSrksliKO1zl1WyWLZd8kd_tVrLKM7ko8gbzssn4fDZHOS-KtppXFS7E_E6teMZn2SIv8llRllnKK9k1nRRLybFq5gUrM-yF0ikdfVLrtncxxVZEXszutGhQ-_il63mgZrOHO7eKZ6Vm3HpWZlr54J9ZBBU0rn59-eVKmePR8dhLbn4P88GNbRjdsbOcP68Qh_WH3-i06NO70enVXzvQMb6Jtnk6aUzm7Vf8PwEAAP__NgIrjA">