[PATCH] [X86] replace (atomic fetch_add of 0) by (mfence; mov)
morisset at google.com
Wed Aug 27 15:01:10 PDT 2014
Mostly useful for implementing seqlocks in C11/C++11, as explained in
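In particular, the replacement can avoid cache-line bouncing, because a plain
load, unlike a locked read-modify-write, does not need exclusive ownership of
the cache line. This brings massive scalability improvements in the
micro-benchmarks of the paper.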
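
For readers unfamiliar with the pattern, here is a minimal sketch of a seqlock
reader that ends its critical section with a fetch_add of 0. It is not taken
from the patch: the SeqLock type, its field names, and the single-writer
protocol are all assumptions made for illustration.

```cpp
#include <atomic>
#include <utility>

// Hypothetical two-field seqlock; every name here is illustrative only.
// Assumes a single writer thread and any number of reader threads.
struct SeqLock {
    std::atomic<unsigned> seq{0};          // even: stable, odd: write in progress
    std::atomic<int> data1{0};
    std::atomic<int> data2{0};

    void write(int a, int b) {
        // Acquire RMW: synchronizes with the release fetch_add(0) of readers
        // that finished before this write started.
        unsigned s = seq.fetch_add(1, std::memory_order_acquire);
        data1.store(a, std::memory_order_relaxed);
        data2.store(b, std::memory_order_relaxed);
        // Publish the new data and make the counter even again.
        seq.store(s + 2, std::memory_order_release);
    }

    std::pair<int, int> read() {
        for (;;) {
            unsigned s0 = seq.load(std::memory_order_acquire);
            int a = data1.load(std::memory_order_relaxed);
            int b = data2.load(std::memory_order_relaxed);
            // A fetch_add of 0 used purely as a read. This is the operation
            // the patch proposes to lower to "mfence; mov" on x86 instead of
            // a locked read-modify-write.
            unsigned s1 = seq.fetch_add(0, std::memory_order_release);
            if (s0 == s1 && (s0 & 1u) == 0) {
                return std::make_pair(a, b);
            }
            // Otherwise a write overlapped the read; retry.
        }
    }
};
```

With a locked instruction, every reader has to take the cache line holding seq
in exclusive state on each read, so concurrent readers contend with each other
even when nobody writes. A fence followed by a plain load leaves the line
shared.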
This cannot be done as a target-independent pass, because it is unsound
to turn a fetch_add(&x, 0, release) into fence(seq_cst); load(&x, seq_cst)
as shown by the following example (from the paper above):
atomic<int> x = y = 0;
r1 = y.fetch_add(0, mo_release);
r2 = x.load(mo_relaxed);
r1 == r2 == 0 is not possible in the above code, but becomes possible if the
fetch_add of thread 0 is turned into a fence followed by a load, even if they
are both seq_cst.
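
The example as archived does not show which statements run on which thread.
The sketch below is one plausible way to complete it as a runnable C++11
litmus test. The store to x in thread 0 and the fetch_add(1, acquire) on y in
thread 1 are assumptions chosen so that the stated property holds; the helper
names run_litmus_once and count_forbidden are hypothetical.

```cpp
#include <atomic>
#include <thread>
#include <utility>

// Runs the two-thread litmus test once and returns {r1, r2}.
std::pair<int, int> run_litmus_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = -1;
    int r2 = -1;

    std::thread t0([&] {
        x.store(1, std::memory_order_relaxed);            // assumed statement
        r1 = y.fetch_add(0, std::memory_order_release);
        // Unsound replacement under the C++11 model:
        //   std::atomic_thread_fence(std::memory_order_seq_cst);
        //   r1 = y.load(std::memory_order_seq_cst);
    });
    std::thread t1([&] {
        y.fetch_add(1, std::memory_order_acquire);        // assumed statement
        r2 = x.load(std::memory_order_relaxed);
    });

    t0.join();
    t1.join();
    return std::make_pair(r1, r2);
}

// Counts how many runs produce the outcome r1 == 0 && r2 == 0.
int count_forbidden(int iterations) {
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        std::pair<int, int> r = run_litmus_once();
        if (r.first == 0 && r.second == 0) {
            ++forbidden;
        }
    }
    return forbidden;
}
```

In this completion, r1 == 0 means thread 1's fetch_add reads the value written
by thread 0's release fetch_add, so the two synchronize and thread 1 must then
observe x == 1. A load in thread 0 writes nothing for thread 1 to read from,
so that synchronization edge disappears at the language level. The x86
lowering to mfence; mov can still be sound because the hardware gives stronger
guarantees than the C++11 model.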