[LLVMbugs] [Bug 22514] New: Wrong transformation due to semantic gap between C11 and LLVM semantics

bugzilla-daemon at llvm.org bugzilla-daemon at llvm.org
Mon Feb 9 02:29:34 PST 2015


http://llvm.org/bugs/show_bug.cgi?id=22514

            Bug ID: 22514
           Summary: Wrong transformation due to semantic gap between C11
                    and LLVM semantics
           Product: tools
           Version: trunk
          Hardware: PC
                OS: Windows NT
            Status: NEW
          Severity: normal
          Priority: P
         Component: opt
          Assignee: unassignedbugs at nondot.org
          Reporter: sohachak at mpi-sws.org
                CC: llvmbugs at cs.uiuc.edu
    Classification: Unclassified

Created attachment 13828
  --> http://llvm.org/bugs/attachment.cgi?id=13828&action=edit
source and IR files

Hi,

The following C++11 source code, in which readA() and writeA() run
concurrently, is miscompiled by opt -O3.

Source
----------
#include <atomic>
using namespace std;

atomic<int> x(0);  // direct-initialization, valid in C++11
int a = 0;

int readA(bool flag) {
  int r = 0, r1 = 0;

  if (flag) {
    r = a;
  }
  if (x == 1) {
    r1 = a;
  } else {
    r1 = 42;
  }
  return r + r1;
}

void writeA() {
  a = 42;
  x = 1;
}

Compilation command 
---------------------
clang++ -std=c++11 -emit-llvm -pthread <filename>.cpp -S
opt -O3 <filename>.ll -o <filename>.opt.bc -S

Target
-------- 
define i32 @_Z5readAb(i1 zeroext %flag) #3 {
entry:
  %0 = load i32* @a, align 4
  %. = select i1 %flag, i32 %0, i32 0
  %1 = load atomic i32* getelementptr inbounds (%"struct.std::atomic"* @x, i64 0, i32 0, i32 0) seq_cst, align 4
  %cmp1 = icmp eq i32 %1, 1
  %r1.0 = select i1 %cmp1, i32 %0, i32 42
  %add = add nsw i32 %r1.0, %.
  ret i32 %add
}

define void @_Z6writeAv() #3 {
entry:
  store i32 42, i32* @a, align 4
  store atomic i32 1, i32* getelementptr inbounds (%"struct.std::atomic"* @x, i64 0, i32 0, i32 0) seq_cst, align 4
  ret void
}

Suppose we now run readA(false) in parallel with writeA().
The source program is data-race-free and can only return 42: 'a' is read only
after x==1 has been observed, that is, after the store to 'a' is visible.
The target program, however, is racy and could return any value (in practice,
it returns 0).
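
To make the claim above concrete, here is a minimal sketch that enumerates
every interleaving of readA(false) with writeA() in a small model. It is not
part of the attached testcase. The names Memory, SourceReader, TargetReader
and outcomes are hypothetical, and the model assumes sequentially consistent
interleavings in which a racy read simply sees the current value of 'a'
(under the LLVM model the racy load is undef, so 0 is only what one would
expect to observe in practice).

```cpp
#include <cassert>
#include <set>

// Hypothetical model: shared memory as plain ints, one step at a time.
struct Memory { int a = 0; int x = 0; };

// writeA(): step 0 is "a = 42", step 1 is "x = 1".
void writer_step(Memory& m, int step) {
    if (step == 0) m.a = 42; else m.x = 1;
}

// Source readA(false): read x first, read a only if x == 1.
struct SourceReader {
    int x_seen = 0, r1 = 42, pc = 0;
    void step(const Memory& m) {
        if (pc == 0) x_seen = m.x;
        else if (x_seen == 1) r1 = m.a;  // otherwise r1 stays 42
        ++pc;
    }
    int result() const { return 0 + r1; }  // r == 0 because flag == false
};

// Target readA(false): a is loaded once, before x, and reused afterwards.
struct TargetReader {
    int a_seen = 0, x_seen = 0, pc = 0;
    void step(const Memory& m) {
        if (pc == 0) a_seen = m.a; else x_seen = m.x;
        ++pc;
    }
    int result() const { return 0 + (x_seen == 1 ? a_seen : 42); }
};

// Collect the return values over all interleavings of 2 reader steps
// and 2 writer steps (4 slots, reader occupies the slots whose bit is set).
template <typename Reader>
std::set<int> outcomes() {
    std::set<int> results;
    for (int mask = 0; mask < 16; ++mask) {
        int reader_slots = 0;
        for (int i = 0; i < 4; ++i) reader_slots += (mask >> i) & 1;
        if (reader_slots != 2) continue;
        Memory m;
        Reader r;
        int w = 0;
        for (int i = 0; i < 4; ++i) {
            if ((mask >> i) & 1) r.step(m);
            else writer_step(m, w++);
        }
        results.insert(r.result());
    }
    return results;
}
```

In this model outcomes<SourceReader>() contains only 42, while
outcomes<TargetReader>() also contains 0, which comes from the interleaving
where 'a' is loaded before writeA() runs and 'x' is loaded after it.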


Analysis of the transformation steps
-------------------------------------
(1) The "Simplify CFG" pass introduces a speculative load of 'a' (introducing a
data race). 

IR
---
define i32 @_Z5readAb(i1 zeroext %flag) #3 {
entry:
  %0 = load i32* @a, align 4
  %. = select i1 %flag, i32 %0, i32 0
  %call = call i32 @_ZNKSt13__atomic_baseIiEcviEv(%"struct.std::__atomic_base"* getelementptr inbounds (%"struct.std::atomic"* @x, i64 0, i32 0)) #2
  %cmp1 = icmp eq i32 %call, 1
  %1 = load i32* @a, align 4
  %r1.0 = select i1 %cmp1, i32 %1, i32 42
  %add = add nsw i32 %., %r1.0
  ret i32 %add
}

The discussion in https://groups.google.com/forum/#!topic/llvm-dev/5OH6B-nIRyo
and http://llvm.org/docs/Atomics.html#optimization-outside-atomic suggest that 
"speculative loads are allowed; a load which is part of a race returns undef,
but does not have undefined behavior". 
This is different from standard C11 semantics where a racy program has
undefined behavior. 

At this point the race is "benign": %0 will have the value "undef", but this
value is not used when flag=false.
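
As a rough source-level picture of step (1), the sketch below writes the
if-converted form by hand and compares it with the branchy form when no other
thread is involved. The globals a_plain and x_plain and all function names are
hypothetical stand-ins, not taken from the attachment; the point is only that
the speculation is unobservable without concurrency.

```cpp
#include <cassert>

// Hypothetical single-threaded stand-ins for the globals 'a' and 'x'.
int a_plain = 0;
int x_plain = 0;

// Control flow as written in the source.
int readA_branchy(bool flag) {
    int r = 0, r1 = 0;
    if (flag) r = a_plain;
    if (x_plain == 1) r1 = a_plain; else r1 = 42;
    return r + r1;
}

// Hand-written equivalent of the IR after Simplify CFG.
int readA_speculated(bool flag) {
    int t0 = a_plain;            // %0: loaded even when flag is false
    int r = flag ? t0 : 0;       // select i1 %flag, i32 %0, i32 0
    bool cmp1 = (x_plain == 1);  // %cmp1
    int t1 = a_plain;            // %1: loaded even when x != 1
    int r1 = cmp1 ? t1 : 42;     // select i1 %cmp1, i32 %1, i32 42
    return r + r1;
}

// Compare both forms over a few values of a, x and flag.
bool agree_without_concurrency() {
    const int a_values[] = {0, 7, 42};
    for (int av : a_values) {
        for (int xv = 0; xv <= 1; ++xv) {
            for (int f = 0; f <= 1; ++f) {
                a_plain = av;
                x_plain = xv;
                if (readA_branchy(f != 0) != readA_speculated(f != 0))
                    return false;
            }
        }
    }
    return true;
}
```

agree_without_concurrency() returns true, so the two forms can only be told
apart by a concurrent writer to 'a'.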

(2) The Early CSE pass removes the second load of 'a', considering it
redundant.

IR
----
define i32 @_Z5readAb(i1 zeroext %flag) #3 {
entry:
  %0 = load i32* @a, align 4
  %. = select i1 %flag, i32 %0, i32 0
  %1 = load atomic i32* getelementptr inbounds (%"struct.std::atomic"* @x, i64 0, i32 0, i32 0) seq_cst, align 4
  %cmp1 = icmp eq i32 %1, 1
  %r1.0 = select i1 %cmp1, i32 %0, i32 42
  %add = add nsw i32 %., %r1.0
  ret i32 %add
}

As a result of this transformation, the race introduced in step (1) is no
longer benign. 

The shared memory access sequence in the source program is R_sc(x); R_na(a)
when flag=false and x=1.
In the target program the shared memory access sequence is R_na(a); R_sc(x)
when flag=false.

The transformation r=R_na(a); R_sc(x); r1=R_na(a) ~> r=R_na(a); R_sc(x); r1=r
is correct according to C11 because any program that can observe the difference
is racy and therefore has undefined semantics.
However, under the LLVM model where races do not have totally undefined
semantics, the transformation is incorrect.
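
The rewrite can be replayed deterministically with a small sketch. The struct
Shared, the interference hook and all function names are hypothetical. The
hook marks the point between the first load of 'a' and the load of 'x' where
another thread's steps are assumed to take effect, and a stale value stands in
for what LLVM would call undef.

```cpp
#include <cassert>

// Hypothetical model of the shared state.
struct Shared { int a = 0; int x = 0; };
typedef void (*Interference)(Shared&);

void no_interference(Shared&) {}
void writeA_model(Shared& m) { m.a = 42; m.x = 1; }  // mirrors writeA()

// r = R_na(a); R_sc(x); r1 = R_na(a)
int before_cse(Shared& m, bool flag, Interference interfere) {
    int r0 = m.a;   // speculative first load
    interfere(m);   // the other thread's steps are replayed here
    int xv = m.x;
    int r1v = m.a;  // second load sees the value written by the other thread
    return (flag ? r0 : 0) + (xv == 1 ? r1v : 42);
}

// r = R_na(a); R_sc(x); r1 = r
int after_cse(Shared& m, bool flag, Interference interfere) {
    int r0 = m.a;
    interfere(m);
    int xv = m.x;
    int r1v = r0;   // second load replaced by the earlier result
    return (flag ? r0 : 0) + (xv == 1 ? r1v : 42);
}

// Convenience wrappers that start from a fresh state (a == 0, x == 0).
int run_before(bool flag, Interference f) { Shared m; return before_cse(m, flag, f); }
int run_after(bool flag, Interference f) { Shared m; return after_cse(m, flag, f); }
```

With writeA_model as the interference and flag=false, run_before yields 42 and
run_after yields 0; with no_interference both yield 42.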

Summary
--------
One of the two transformations has to be disabled.

The testcase and the IR files are attached.

Best Regards,
soham
