<html>
    <head>
      <base href="https://bugs.llvm.org/">
    </head>
    <body><table border="1" cellspacing="0" cellpadding="8">
        <tr>
          <th>Bug ID</th>
          <td><a class="bz_bug_link 
          bz_status_NEW "
   title="NEW - C++11 atomic exchange operation compiles into AArch64 store instruction"
   href="https://bugs.llvm.org/show_bug.cgi?id=46719">46719</a>
          </td>
        </tr>

        <tr>
          <th>Summary</th>
          <td>C++11 atomic exchange operation compiles into AArch64 store instruction
          </td>
        </tr>

        <tr>
          <th>Product</th>
          <td>libraries
          </td>
        </tr>

        <tr>
          <th>Version</th>
          <td>trunk
          </td>
        </tr>

        <tr>
          <th>Hardware</th>
          <td>PC
          </td>
        </tr>

        <tr>
          <th>OS</th>
          <td>Linux
          </td>
        </tr>

        <tr>
          <th>Status</th>
          <td>NEW
          </td>
        </tr>

        <tr>
          <th>Severity</th>
          <td>enhancement
          </td>
        </tr>

        <tr>
          <th>Priority</th>
          <td>P
          </td>
        </tr>

        <tr>
          <th>Component</th>
          <td>Scalar Optimizations
          </td>
        </tr>

        <tr>
          <th>Assignee</th>
          <td>unassignedbugs@nondot.org
          </td>
        </tr>

        <tr>
          <th>Reporter</th>
          <td>sunghwan.lee@sf.snu.ac.kr
          </td>
        </tr>

        <tr>
          <th>CC</th>
          <td>llvm-bugs@lists.llvm.org
          </td>
        </tr></table>
      <p>
        <div>
        <pre>clang++ compiles a C++11 atomic exchange operation into a single AArch64 store
instruction whenever the value read by the exchange is never used.
This is a miscompilation: an acquire fence may induce synchronization when
it follows a relaxed exchange operation, but not when it follows a store.
The following target code was obtained with &quot;clang++ -std=c++11 -O1&quot;.
source (C++):
================================
#include &lt;atomic&gt;
#include &lt;cstdint&gt;
using namespace std;
void foo(atomic&lt;uint64_t&gt; &amp;X) {
  X.exchange(42, memory_order_relaxed);
}
================================
target (IR):
================================
define dso_local void @_Z3fooRSt6atomicImE(%"struct.std::atomic"* nocapture
nonnull align 8 dereferenceable(8) %X) local_unnamed_addr #0 {
entry:
  %_M_i.i = getelementptr inbounds %"struct.std::atomic",
%"struct.std::atomic"* %X, i64 0, i32 0, i32 0
  store atomic i64 42, i64* %_M_i.i monotonic, align 8
  ret void
}
================================
target (assembly):
================================
_Z3fooRSt6atomicImE:                    // @_Z3fooRSt6atomicImE
// %bb.0:                               // %entry
        mov     w8, #42
        str     x8, [x0]
        ret
================================
The following program demonstrates a new behavior introduced by this
miscompilation.
================================
uint64_t foo(atomic<uint64_t> &X, atomic<uint64_t> &Y) {
  X.exchange(42, memory_order_relaxed);
  atomic_thread_fence(memory_order_acquire);
  return Y.load(memory_order_relaxed);
}
uint64_t bar(atomic<uint64_t> &X, atomic<uint64_t> &Y) {
  Y.store(1, memory_order_relaxed);
  return X.fetch_add(1, memory_order_release);
}
================================
When &quot;foo&quot; and &quot;bar&quot; run in parallel (with both X and Y initialized to 0),
C++11 does not allow both &quot;foo&quot; and &quot;bar&quot; to return 0.
In particular, if &quot;X.fetch_add&quot; in &quot;bar&quot; reads 0 from X and updates X to 1,
then &quot;X.exchange&quot; in &quot;foo&quot; is forced to read 1 and update X to 42, due to the
atomicity of the &quot;fetch_add&quot;.
In this case, the exchange reads from a release operation, so the acquire fence
in &quot;foo&quot; induces a happens-before relation between &quot;Y.store&quot; in &quot;bar&quot; and
&quot;Y.load&quot; in &quot;foo&quot;, and &quot;foo&quot; must return 1.
However, AArch64 allows both functions to return 0 when the exchange
operation in &quot;foo&quot; is optimized into a store instruction.</pre>
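<pre>
The scenario above can be turned into a small stress harness. This is only a
sketch: "foo" and "bar" are taken from the report (with std:: qualification
added), while "Outcome", "is_forbidden", "run_once" and "count_forbidden" are
hypothetical helper names introduced here for illustration. It assumes a
hosted C++11 toolchain with thread support.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>

// The two functions from the report, with std:: qualification added.
std::uint64_t foo(std::atomic<std::uint64_t> &X, std::atomic<std::uint64_t> &Y) {
  X.exchange(42, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_acquire);
  return Y.load(std::memory_order_relaxed);
}

std::uint64_t bar(std::atomic<std::uint64_t> &X, std::atomic<std::uint64_t> &Y) {
  Y.store(1, std::memory_order_relaxed);
  return X.fetch_add(1, std::memory_order_release);
}

// Hypothetical helper (not part of the report): results of one run.
struct Outcome {
  std::uint64_t foo_result;
  std::uint64_t bar_result;
};

// The outcome the report says C++11 forbids: both calls return 0.
bool is_forbidden(const Outcome &o) {
  return o.foo_result == 0 && o.bar_result == 0;
}

// Runs foo and bar once on fresh X and Y (both 0), each on its own thread.
Outcome run_once() {
  std::atomic<std::uint64_t> X(0), Y(0);
  std::atomic<bool> go(false);
  Outcome o{0, 0};
  std::thread t1([&] {
    while (!go.load(std::memory_order_acquire)) {}  // wait for the start signal
    o.foo_result = foo(X, Y);
  });
  std::thread t2([&] {
    while (!go.load(std::memory_order_acquire)) {}  // wait for the start signal
    o.bar_result = bar(X, Y);
  });
  go.store(true, std::memory_order_release);
  t1.join();
  t2.join();
  // Y is only ever 0 or 1, and the fetch_add can only read 0 or 42.
  assert(o.foo_result <= 1);
  assert(o.bar_result == 0 || o.bar_result == 42);
  return o;
}

// Repeats the experiment and counts how often the forbidden outcome appears.
std::size_t count_forbidden(std::size_t iterations) {
  std::size_t n = 0;
  for (std::size_t i = 0; i < iterations; ++i) {
    if (is_forbidden(run_once())) ++n;
  }
  return n;
}
```

A nonzero result from, say, count_forbidden(1000000) would mean the forbidden
outcome was observed. A result of zero proves nothing, since the race window is
narrow and thread start-up cost may hide it on a given machine.
</pre>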
        </div>
      </p>


    </body>
</html>