<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - Segfault in omp_parallel_reduction.c test (AArch64)"
   href="https://bugs.llvm.org/show_bug.cgi?id=44112">44112</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>Segfault in omp_parallel_reduction.c test (AArch64)
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>OpenMP
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>unspecified
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>Other
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>normal
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Runtime Library
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>graham.hunter@arm.com
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>I observed a segmentation fault in the parallel reduction test when running
'make check-openmp'. I took a look through the resulting core file and found
the crash was in the generated reduction function:

<span class="quote">> #0  0x000000000040179c in .omp.reduction.reduction_func ()
> #1  0x0000ffff8fe95320 in __kmp_barrier () from /home/grahun01/Build/hpc-dev/lib/libomp.so
> #2  0x0000ffff8fe65c10 in __kmpc_reduce_nowait () from /home/grahun01/Build/hpc-dev/lib/libomp.so
> #3  0x00000000004016ac in .omp_outlined. ()
> #4  0x0000ffff8fed09ac in __kmp_invoke_microtask () from /home/grahun01/Build/hpc-dev/lib/libomp.so
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)</span >

The test compiles with -O0, so there are a few additional loads/stores in the
disassembly:

<span class="quote">> Dump of assembler code for function .omp.reduction.reduction_func.6:
>    0x000000000040177c <+0>:     sub     sp, sp, #0x10
>    0x0000000000401780 <+4>:     str     x0, [sp, #8]
>    0x0000000000401784 <+8>:     str     x1, [sp]
>    0x0000000000401788 <+12>:    ldr     x8, [sp, #8]
>    0x000000000040178c <+16>:    ldr     x9, [sp]
>    0x0000000000401790 <+20>:    ldr     x9, [x9]
>    0x0000000000401794 <+24>:    ldr     x8, [x8]
>    0x0000000000401798 <+28>:    ldr     d0, [x8]
> => 0x000000000040179c <+32>:    ldr     d1, [x9]
>    0x00000000004017a0 <+36>:    fadd    d0, d0, d1
>    0x00000000004017a4 <+40>:    str     d0, [x8]
>    0x00000000004017a8 <+44>:    add     sp, sp, #0x10
>    0x00000000004017ac <+48>:    ret</span >

The parameters passed into this function come from one of the kmp_barrier
variants:

<span class="quote">>            (*reduce)(this_thr->th.th_local.reduce_data,
>                      child_thr->th.th_local.reduce_data);</span >

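The pointer in x9 at the time of the fault was 0. x9 is loaded through x1 (the
second argument), so the null value is the first entry of the child's
reduce_data.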

My guess is that there's a data race between a child thread writing a pointer
into its reduce_data variable and the thread performing the reduction trying to
read it. A run with TSan reports a race here, but since the test almost always
passes (I tried ~100K runs on different machines without reproducing the
crash), I suspect TSan simply isn't recognizing the mechanism that's supposed
to synchronize access.</pre>
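        <pre>For illustration only, the publish/consume ordering that the reduction step
depends on can be sketched as below. This is not libomp code: worker_t,
child_main, combine and run_reduction_demo are hypothetical names, and the
spin loop is an assumed stand-in for whatever flag the real barrier uses.

```c
/* Headers use the quoted form so the listing can sit unescaped in an HTML page. */
#include "assert.h"
#include "pthread.h"
#include "stdatomic.h"
#include "stddef.h"

/* Hypothetical stand-in for a worker's per-thread state; not libomp's layout. */
typedef struct {
    double partial;                /* this worker's partial sum */
    _Atomic(double *) reduce_data; /* NULL until the worker publishes it */
} worker_t;

/* Child side: fill in the partial result, then publish the pointer.
 * The release store orders the write to `partial` before the pointer
 * becomes visible to other threads. */
static void *child_main(void *arg)
{
    worker_t *w = arg;
    w->partial = 2.0;
    atomic_store_explicit(&w->reduce_data, &w->partial, memory_order_release);
    return NULL;
}

/* Reducing side: wait until the child's pointer is visible, then combine.
 * The acquire load pairs with the release store in child_main. */
static double combine(double *mine, worker_t *child)
{
    double *theirs;
    while ((theirs = atomic_load_explicit(&child->reduce_data,
                                          memory_order_acquire)) == NULL) {
        /* spin; a real runtime would wait on its barrier flag instead */
    }
    *mine += *theirs;
    return *mine;
}

/* Runs one child/reducer handshake and returns the reduced value. */
double run_reduction_demo(void)
{
    worker_t child = { 0.0, NULL };
    double total = 1.0;
    pthread_t tid;
    int rc = pthread_create(&tid, NULL, child_main, &child);
    assert(rc == 0);
    combine(&total, &child);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);
    return total;
}
```

Calling run_reduction_demo() returns 3.0 (1.0 from the reducing thread plus 2.0
from the child). If the pointer were a plain, unsynchronized variable and the
wait were missing, the reducing thread could read NULL and fault on the
dereference, which matches the crash described above.</pre>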
        </div>
      </p>


    </body>
</html>