[PATCH] [X86] replace (atomic fetch_add of 0) by (mfence; mov)

JF Bastien jfb at chromium.org
Sat Aug 30 10:29:35 PDT 2014

The code and tests don't really discuss the atomic ordering. It would be good to provide an intuition for why your code works even with different orderings.

Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:1749
@@ +1748,3 @@
+// On x86 an atomic load-add of the constant 0 can be replaced by an mfence
+// followed by a mov. A detailed explanation of this (and exemple of why the
+// mfence is required) is available at

Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:1756
@@ +1755,3 @@
+// being sunk out of the critical section. Replacing the last load by a
+// fetch_add(0, release) accomplishes just that.. but requires this
+// optimization to preserve the desirable property of seqlocks that readers

Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:1758
@@ +1757,3 @@
+// optimization to preserve the desirable property of seqlocks that readers
+// do not cause cache line bouncing.
+// The mfence is required because otherwise the load could be hoisted before
I think it might be clearer to add that the cache line goes from shared to modified state with the fetch_add, whereas it remains shared with the load.

Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:1762
@@ +1761,3 @@
+// fetch_add could not do (since only store-load can be reordered in
+// load-store) on X86.
+SDNode *X86DAGToDAGISel::SelectAtomicAddZero(SDNode *Node, MVT NVT) {
I think it would be good to explain that the mfence is required because the *hardware* can reorder load/store. This code is in a compiler, the reader may assume that "hoisting" is done by the compiler.

Comment at: lib/Target/X86/X86ISelDAGToDAG.cpp:2191
@@ -2111,5 +2190,3 @@
-    SDNode *RetVal = SelectAtomicLoadArith(Node, NVT);
-    if (RetVal)
+  case ISD::ATOMIC_LOAD_OR: {
+    if (SDNode *RetVal = SelectAtomicLoadArith(Node, NVT))
Would it make sense to also select atomic or zero in the same way? I can't think of a reason for someone to use atomic or zero instead of add zero, but they are strictly equivalent so it seems silly to not select both with the same code.


More information about the llvm-commits mailing list