<html>
<head>
<base href="https://bugs.llvm.org/">
</head>
<body><table border="1" cellspacing="0" cellpadding="8">
<tr>
<th>Bug ID</th>
<td><a class="bz_bug_link
bz_status_NEW "
title="NEW - C++11 atomic exchange operation compiles into AArch64 store instruction"
href="https://bugs.llvm.org/show_bug.cgi?id=46719">46719</a>
</td>
</tr>
<tr>
<th>Summary</th>
<td>C++11 atomic exchange operation compiles into AArch64 store instruction
</td>
</tr>
<tr>
<th>Product</th>
<td>libraries
</td>
</tr>
<tr>
<th>Version</th>
<td>trunk
</td>
</tr>
<tr>
<th>Hardware</th>
<td>PC
</td>
</tr>
<tr>
<th>OS</th>
<td>Linux
</td>
</tr>
<tr>
<th>Status</th>
<td>NEW
</td>
</tr>
<tr>
<th>Severity</th>
<td>enhancement
</td>
</tr>
<tr>
<th>Priority</th>
<td>P
</td>
</tr>
<tr>
<th>Component</th>
<td>Scalar Optimizations
</td>
</tr>
<tr>
<th>Assignee</th>
<td>unassignedbugs@nondot.org
</td>
</tr>
<tr>
<th>Reporter</th>
<td>sunghwan.lee@sf.snu.ac.kr
</td>
</tr>
<tr>
<th>CC</th>
<td>llvm-bugs@lists.llvm.org
</td>
</tr></table>
<p>
<div>
<pre>clang++ compiles a C++11 atomic exchange operation into a single AArch64 store
instruction whenever the value read by the exchange is never used.
This is a miscompilation: an acquire fence may induce synchronization when it
follows a relaxed exchange operation, but not when it follows a store.
The following target code was obtained with "clang++ -std=c++11 -O1".
source (C++):
================================
#include <atomic>
#include <cstdint>
using namespace std;
void foo(atomic<uint64_t> &X) {
  X.exchange(42, memory_order_relaxed);
}
================================
target (IR):
================================
define dso_local void @_Z3fooRSt6atomicImE(%"struct.std::atomic"* nocapture
nonnull align 8 dereferenceable(8) %X) local_unnamed_addr #0 {
entry:
%_M_i.i = getelementptr inbounds %"struct.std::atomic",
%"struct.std::atomic"* %X, i64 0, i32 0, i32 0
store atomic i64 42, i64* %_M_i.i monotonic, align 8
ret void
}
================================
target (assembly):
================================
_Z3fooRSt6atomicImE: // @_Z3fooRSt6atomicImE
// %bb.0: // %entry
mov w8, #42
str x8, [x0]
ret
================================
The following program demonstrates a behavior that this miscompilation newly
allows.
================================
uint64_t foo(atomic<uint64_t> &X, atomic<uint64_t> &Y) {
  X.exchange(42, memory_order_relaxed);
  atomic_thread_fence(memory_order_acquire);
  return Y.load(memory_order_relaxed);
}
uint64_t bar(atomic<uint64_t> &X, atomic<uint64_t> &Y) {
  Y.store(1, memory_order_relaxed);
  return X.fetch_add(1, memory_order_release);
}
================================
When "foo" and "bar" run in parallel (with both X and Y initialized to 0),
C++11 does not allow both of them to return 0.
In particular, if "X.fetch_add" in "bar" reads 0 from X and updates X to 1,
then "X.exchange" in "foo" is forced to read 1 and update X to 42, due to the
atomicity of the "fetch_add".
In this case, the acquire fence in "foo" induces a happens-before relation
between "Y.store" in "bar" and "Y.load" in "foo", so "foo" must return 1.
However, AArch64 allows both functions to return 0 when the exchange
operation in "foo" is optimized into a store instruction.</pre>
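<pre>The scenario above can be exercised with a small harness. This is only a
sketch and not part of the original report: "Outcome", "is_forbidden",
"run_trial" and "count_forbidden" are hypothetical names introduced here, while
"foo" and "bar" are copied from the program above. Thread start-up cost
dominates each trial, so a count of zero does not show that the outcome is
impossible.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

using namespace std;

// The two functions from the report, unchanged.
uint64_t foo(atomic<uint64_t> &X, atomic<uint64_t> &Y) {
  X.exchange(42, memory_order_relaxed);
  atomic_thread_fence(memory_order_acquire);
  return Y.load(memory_order_relaxed);
}

uint64_t bar(atomic<uint64_t> &X, atomic<uint64_t> &Y) {
  Y.store(1, memory_order_relaxed);
  return X.fetch_add(1, memory_order_release);
}

// Hypothetical helper type: the pair of return values from one trial.
struct Outcome {
  uint64_t foo_ret;
  uint64_t bar_ret;
};

// True only for the outcome the report says C++11 forbids: both return 0.
bool is_forbidden(const Outcome &o) {
  return o.foo_ret == 0 && o.bar_ret == 0;
}

// Runs foo and bar once, each in its own thread, on fresh X = Y = 0.
Outcome run_trial() {
  atomic<uint64_t> X(0), Y(0);
  Outcome o{0, 0};
  thread t1([&] { o.foo_ret = foo(X, Y); });
  thread t2([&] { o.bar_ret = bar(X, Y); });
  t1.join();
  t2.join();
  // Whichever write to X comes last, X ends as 42 (exchange last)
  // or 43 (fetch_add last).
  uint64_t final_x = X.load();
  assert(final_x == 42 || final_x == 43);
  return o;
}

// Counts how many of `trials` runs produced the forbidden outcome.
unsigned long count_forbidden(unsigned long trials) {
  unsigned long hits = 0;
  for (unsigned long i = 0; i < trials; ++i) {
    if (is_forbidden(run_trial())) ++hits;
  }
  return hits;
}
```

To try it, call count_forbidden with a large trial count from main, print the
result, and build with "clang++ -std=c++11 -O1 -pthread" for an AArch64 target.
Any non-zero count would be an observation of the forbidden outcome.</pre>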
</div>
</p>
</body>
</html>