<html>

    <head>

      <base href="http://llvm.org/bugs/" />

    </head>

    <body><table border="1" cellspacing="0" cellpadding="8">

        <tr>

          <th>Bug ID</th>

          <td><a class="bz_bug_link 

          bz_status_NEW "

   title="NEW --- - wrong vectorization in the presence of C++11 atomics"

   href="http://llvm.org/bugs/show_bug.cgi?id=22306">22306</a>

          </td>

        </tr>

        <tr>

          <th>Summary</th>

          <td>wrong vectorization in the presence of C++11 atomics

          </td>

        </tr>

        <tr>

          <th>Product</th>

          <td>new-bugs

          </td>

        </tr>

        <tr>

          <th>Version</th>

          <td>unspecified

          </td>

        </tr>

        <tr>

          <th>Hardware</th>

          <td>PC

          </td>

        </tr>

        <tr>

          <th>OS</th>

          <td>Linux

          </td>

        </tr>

        <tr>

          <th>Status</th>

          <td>NEW

          </td>

        </tr>

        <tr>

          <th>Severity</th>

          <td>normal

          </td>

        </tr>

        <tr>

          <th>Priority</th>

          <td>P

          </td>

        </tr>

        <tr>

          <th>Component</th>

          <td>new bugs

          </td>

        </tr>

        <tr>

          <th>Assignee</th>

          <td>unassignedbugs@nondot.org

          </td>

        </tr>

        <tr>

          <th>Reporter</th>

          <td>sohachak@mpi-sws.org

          </td>

        </tr>

        <tr>

          <th>CC</th>

          <td>llvmbugs@cs.uiuc.edu

          </td>

        </tr>

        <tr>

          <th>Classification</th>

          <td>Unclassified

          </td>

        </tr></table>

      <p>

        <div>

        <pre>Created <span class=""><a href="attachment.cgi?id=13726" name="attach_13726" title="contains testcase .cpp, .ll, .opt.bc files">attachment 13726</a> <a href="attachment.cgi?id=13726&action=edit" title="contains testcase .cpp, .ll, .opt.bc files">[details]</a></span>

contains testcase .cpp, .ll, .opt.bc files

Hi,

LLVM introduces a data race when compiling the following code; the problem appears in the SLP vectorization phase.

Source

----------

#include <atomic>

using namespace std;

atomic<int> x[4];

int a[4];

void writeA() {

   for(int i=0;i<4;i++) {

     a[i] = 0;

     x[i].store(i,memory_order_release);

   }

}

Compilation command 

---------------------

clang++ -std=c++11 -emit-llvm -pthread <filename>.cpp -S

opt -O3 <filename>.ll -o <filename>.opt.bc -S

<filename>.opt.bc - optimized code

-------- 

define void @_Z6writeAv() #3 {

entry:

  store atomic i32 0, i32* getelementptr inbounds ([4 x %"struct.std::atomic"]* @x, i64 0, i64 0, i32 0, i32 0) release, align 16

  store atomic i32 1, i32* getelementptr inbounds ([4 x %"struct.std::atomic"]* @x, i64 0, i64 1, i32 0, i32 0) release, align 4

  store atomic i32 2, i32* getelementptr inbounds ([4 x %"struct.std::atomic"]* @x, i64 0, i64 2, i32 0, i32 0) release, align 8

  store <4 x i32> zeroinitializer, <4 x i32>* bitcast ([4 x i32]* @a to <4 x i32>*), align 16

  store atomic i32 3, i32* getelementptr inbounds ([4 x %"struct.std::atomic"]* @x, i64 0, i64 3, i32 0, i32 0) release, align 4

  ret void

}

The transformation steps are as follows:

for(int i=0;i<4;i++) {a[i] = 0; x[i].store(i,release);}

1. loop unrolling =>

   a[0] = 0; x[0].store(0,release); a[1] = 0; x[1].store(1,release);

   a[2] = 0; x[2].store(2,release); a[3] = 0; x[3].store(3,release);

2. statement reordering =>

   x[0].store(0,release); x[1].store(1,release); x[2].store(2,release);

   a[0] = 0; a[1] = 0; a[2] = 0; a[3] = 0;

   x[3].store(3,release); // wrong reordering

3. SLP vectorization =>

   x[0].store(0,release); x[1].store(1,release); x[2].store(2,release);

   a[0:3] = 0; x[3].store(3,release);

Moving the writes to the non-atomic shared array a[] after the release stores to the atomic array x[] in step 2 is unsafe.
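
For reference, the optimized IR corresponds roughly to the source-level ordering below. This is only a hand-written sketch: the function name writeA_as_optimized is hypothetical, and the loop over a[] stands in for the single <4 x i32> vector store.

```cpp
#include <atomic>

std::atomic<int> x[4];
int a[4];

// Hand-written equivalent of the optimized IR (hypothetical name).
// All four writes to a[] happen after the release stores to x[0..2],
// so those three release stores no longer publish the matching a[i].
void writeA_as_optimized() {
  x[0].store(0, std::memory_order_release);
  x[1].store(1, std::memory_order_release);
  x[2].store(2, std::memory_order_release);
  for (int i = 0; i < 4; i++) {
    a[i] = 0;  // stands in for: store <4 x i32> zeroinitializer, @a
  }
  x[3].store(3, std::memory_order_release);
}
```

Run on a single thread, this leaves the same final values as writeA(); the difference only matters to a concurrent reader that acquires x[0], x[1] or x[2].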

Consider the following thread running in parallel:

int readA() {

  int r=0; 

  if(x[2].load(memory_order_acquire) == 2){

    r = a[2];

  }    

  return r;

}

The source program has no data race: write(a[2]) happens-before read(a[2]) because x[2].store(2,release) synchronizes with x[2].load(acquire).

The target program, however, is racy because write(a[2]) has been moved after x[2].store(2,release).

The reordering performed for SLP vectorization introduces the error.
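
The two threads can be combined into a small harness along the following lines. This is a sketch, not part of the attached testcase: run_once is a hypothetical helper, and a[] is assumed to start at 7 rather than 0 so that a stale read of a[2] would be distinguishable from the written value.

```cpp
#include <atomic>
#include <thread>

std::atomic<int> x[4];
// Assumption: a[] starts at 7 instead of 0 so that a stale read is visible.
int a[4] = {7, 7, 7, 7};

void writeA() {
  for (int i = 0; i < 4; i++) {
    a[i] = 0;
    x[i].store(i, std::memory_order_release);
  }
}

int readA() {
  int r = 0;
  if (x[2].load(std::memory_order_acquire) == 2) {
    r = a[2];  // ordered after the write a[2] = 0 in the source program
  }
  return r;
}

// Hypothetical helper: resets the shared state, runs both threads once,
// and returns the value produced by readA().
int run_once() {
  for (int i = 0; i < 4; i++) {
    a[i] = 7;
    x[i].store(-1, std::memory_order_relaxed);
  }
  int result = 0;
  std::thread writer(writeA);
  std::thread reader([&result] { result = readA(); });
  writer.join();
  reader.join();
  return result;
}
```

Under the C++11 memory model run_once() can only return 0 for this source. A return value of 7 would mean the reader saw x[2] == 2 before a[2] was written, which is the reordering described in this report; whether that is ever observed in practice depends on the hardware and on timing.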

The testcase .cpp and LLVM IR files are attached.

Regards,

soham</pre>

        </div>

      </p>


    </body>

</html>