[LLVMbugs] [Bug 22708] New: wrong optimization of C++11 code due to unsafe reordering in GVN

Thu Feb 26 05:57:49 PST 2015

http://llvm.org/bugs/show_bug.cgi?id=22708

            Bug ID: 22708
           Summary: wrong optimization of C++11 code due to unsafe
                    reordering in GVN
           Product: tools
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: normal
          Priority: P
         Component: opt
          Assignee: unassignedbugs at nondot.org
          Reporter: sohachak at mpi-sws.org
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Hi,

The following C++11 program is compiled with clang++ and then optimized with
opt -O3. readXa() and writeXa() run concurrently.

CPP
----

#include <atomic>
using namespace std;

atomic<int> x;
int a;

int readXa(bool flag) {
 int r1=0, r2=500;
 if(flag) {
   a = 43;
 }
 r1=x.load(memory_order_acquire);
 if(r1==1) {
   r2 = a;
 }
 return (r1+r2);
}

||

void writeXa(){
  a = 10;
  x.store(1,memory_order_release);
}

compilation command
--------------------
clang++ -std=c++11 -emit-llvm <filename>.cpp -S
opt -O3 <filename>.ll -o <filename>.opt.bc -S

LLVM IR
-------

define i32 @_Z6readXab(i1 zeroext %flag) #3 {
entry:
  br i1 %flag, label %if.then, label %entry.if.end_crit_edge

entry.if.end_crit_edge:                           ; preds = %entry
  %.pre = load i32* @a, align 4
  br label %if.end

if.then:                                          ; preds = %entry
  store i32 43, i32* @a, align 4
  br label %if.end

if.end:                                           ; preds = %entry.if.end_crit_edge, %if.then
  %0 = phi i32 [ %.pre, %entry.if.end_crit_edge ], [ 43, %if.then ]
  %1 = load atomic i32* getelementptr inbounds (%"struct.std::atomic"* @x, i64 0, i32 0, i32 0) acquire, align 4
  %cmp = icmp eq i32 %1, 1
  %. = select i1 %cmp, i32 %0, i32 500
  %add = add nsw i32 %., %1
  ret i32 %add
}

||

define void @_Z7writeXav() #3 {
entry:
  store i32 10, i32* @a, align 4
  store atomic i32 1, i32* getelementptr inbounds (%"struct.std::atomic"* @x, i64 0, i32 0, i32 0) release, align 4
  ret void
}

In the source program, when flag=false and r1==1, the
x.load(memory_order_acquire) in readXa() reads from, and therefore synchronizes
with, the x.store(1,memory_order_release) in writeXa().
As a result, a=10 in writeXa() happens-before r2=a in readXa(), and the program
is race-free.
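
To make the allowed outcomes concrete, here is a minimal harness sketch. It assumes both globals start at 0 and that readXa(false) runs concurrently with writeXa(). run_once() is a hypothetical helper and is not part of the original report. Under the C++11 memory model the only permitted results are 500 (r1==0) and 11 (r1==1, r2==10).

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> x{0};
int a = 0;

int readXa(bool flag) {
  int r1 = 0, r2 = 500;
  if (flag) {
    a = 43;
  }
  r1 = x.load(std::memory_order_acquire);
  if (r1 == 1) {
    r2 = a;  // only reached after synchronizing with the release store
  }
  return r1 + r2;
}

void writeXa() {
  a = 10;
  x.store(1, std::memory_order_release);
}

// Hypothetical helper: resets the shared state, runs readXa(false) and
// writeXa() on two threads, and returns the value computed by readXa().
int run_once() {
  a = 0;
  x.store(0, std::memory_order_relaxed);
  int result = -1;
  std::thread reader([&result] { result = readXa(false); });
  std::thread writer(writeXa);
  reader.join();
  writer.join();
  return result;
}
```

A result such as 1 (a stale a==0 combined with r1==1) would be consistent with a load of a that executed before the acquire load. Whether it is ever observed depends on the hardware and on the code the backend generates, so passing this check does not show the optimized code is correct.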

However, in the target program the load of a is moved before the
x.load(memory_order_acquire) in readXa() (it appears as %.pre in
entry.if.end_crit_edge), so there is no happens-before relation between the
load of a in readXa() and the store to a in writeXa().
The target program is therefore racy, and the transformation is incorrect.

The movement of load(a) before x.load(memory_order_acquire) is seen after the
"Global Value Numbering" phase.
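
For illustration, the optimized IR above corresponds roughly to the following source-level sketch. The name readXaHoisted and the local pre are hypothetical and were chosen here to mirror %.pre and the phi node. This is a reading of the IR, not code produced by any tool.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int> x{0};
int a = 0;

// Hypothetical source-level rendering of the optimized readXa():
// the load of a is performed before the acquire load of x.
int readXaHoisted(bool flag) {
  int pre;
  if (flag) {
    a = 43;
    pre = 43;  // value forwarded from the store (phi operand 43)
  } else {
    pre = a;   // speculative load, no longer ordered after the acquire
  }
  int r1 = x.load(std::memory_order_acquire);
  int r2 = (r1 == 1) ? pre : 500;  // the select in if.end
  return r1 + r2;
}
```

Run on a single thread, this function returns the same values as the original readXa(), which is why the transformation looks valid in isolation. The difference shows only when writeXa() runs concurrently: the else-branch load of a can then race with a = 10.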

Regards,
soham
